The output of sha512 (or any hash) is a sequence of bits – 512 of them, in this case – which encode a (very large) number, and can be separated into bytes or whatever other divisions are desired.
Hexadecimal representations turn each four-bit chunk into a hexadecimal digit and are a common way of representing hashes to a user, but not inherent to the hash function itself. It could just as well be raw bytes, or decimal, or a long string of 0s and 1s. What's important is that the writer and reader agree on how the number's encoded.
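To make that concrete, here's a minimal Python sketch that writes one digest out in several ways. The input string is an arbitrary placeholder, and this is plain SHA-512, not the salted, iterated scheme crypt uses, so it won't match anything in a shadow file.

```python
import hashlib

# Arbitrary example input (an assumption for illustration only).
digest = hashlib.sha512(b"example input").digest()  # 64 raw bytes = 512 bits

# The same 512 bits viewed as one very large number.
as_int = int.from_bytes(digest, "big")

as_hex = digest.hex()                 # one hex digit per 4 bits
as_decimal = str(as_int)              # ordinary base-10 digits
as_binary = format(as_int, "0512b")   # a long string of 0s and 1s

print(len(digest), len(as_hex), len(as_decimal), len(as_binary))

# Every representation decodes back to exactly the same bytes,
# provided writer and reader agree on the encoding.
assert bytes.fromhex(as_hex) == digest
assert int(as_decimal).to_bytes(64, "big") == digest
assert int(as_binary, 2).to_bytes(64, "big") == digest
```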
In this case, raw bytes won't work: the shadow file uses newlines to separate records and colons to separate fields, so those bytes aren't available, and the result would be hard for a person to read or copy if it ended up with other odd characters in it. That's why ASCII encodings are generally used for hashes that you might see.
ASCII-represented hex, decimal, and binary are all fairly inefficient, however: hex doubles the size of the raw bytes (every 4 bits of input become one ASCII byte of output). For shorter hashes that's less of an issue: MD5 is only 128 bits, or 16 bytes, so 32 hexadecimal digits, which is manageable. For a 512-bit hash like this, though, that'd be 128 bytes just for the hash. Decimal or binary would be even worse, though it probably isn't hugely important how long these are in this case as long as everyone agrees.
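As a rough check on those sizes, this sketch counts how many digits the largest value of a given bit width needs in a given base. The helper name `encoded_length` is made up for this example.

```python
def encoded_length(bits: int, base: int) -> int:
    """Digits needed to write the largest `bits`-bit number in `base`."""
    largest = (1 << bits) - 1
    digits = 0
    while largest:
        largest //= base
        digits += 1
    return digits


for name, bits in [("MD5", 128), ("SHA-512", 512)]:
    sizes = {base: encoded_length(bits, base) for base in (2, 10, 16, 64)}
    print(name, sizes)
```

For 512 bits that gives 512 characters in binary, 155 in decimal, 128 in hex, and 86 in a 64-character alphabet.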
For this specific case, man 3 crypt says that:
The characters in "salt" and "encrypted" are drawn from the set [a-zA-Z0-9./].
a-z (26), A-Z (26), 0-9 (10), "." (1), and "/" (1) make 26+26+10+1+1=64 available characters in total, so a base-64 representation sounds like it's in use. That means each ASCII byte represents 6 bits (2^6 = 64) of the data: four bytes (32 bits) of base64 hold three bytes (24 bits) of the original data, so the output is only about 33% larger than the input. A 512-bit value needs 86 bytes to store in this encoding (512/6 rounded up).
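The 6-bits-per-character packing can be sketched like this. The alphabet is assembled from the character set quoted from the man page, but the ordering of the characters and the most-significant-bits-first layout are assumptions made for illustration; crypt's own routines arrange bits and bytes differently, so this won't reproduce a real shadow entry.

```python
import string

# 64 characters from the set [a-zA-Z0-9./]; this ordering is assumed here.
ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_group(three_bytes: bytes) -> str:
    """Turn 3 bytes (24 bits) into 4 characters of 6 bits each."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    value = int.from_bytes(three_bytes, "big")
    # Take the 24 bits six at a time, highest bits first.
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0))


def chars_needed(bits: int) -> int:
    """Characters needed for `bits` bits at 6 bits per character, rounded up."""
    return -(-bits // 6)


print(encode_group(b"\x00\x00\x00"), encode_group(b"\xff\xff\xff"))
print(chars_needed(512))
```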
Base64 is a good default when a) you need to store or transmit arbitrary binary data within ASCII and b) nobody will ever have to read it out loud. Both of those hold here, so it's a sensible choice. Hexadecimal representations are convenient when you might have to read or check the hash manually, because case is unimportant and there aren't that many distinct values. There is also a little-used, but standard, base32 encoding that sits in the middle (all upper case and digits), but there's not much reason to use it here.
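A small sketch of those trade-offs using Python's standard encoders (the input is again an arbitrary placeholder, and these are the standard alphabets rather than crypt's):

```python
import base64
import hashlib

digest = hashlib.sha512(b"example input").digest()

hex_text = digest.hex()
b32_text = base64.b32encode(digest).decode("ascii")
b64_text = base64.b64encode(digest).decode("ascii")

# Hex survives a change of case, which helps when it is read or typed by hand.
assert bytes.fromhex(hex_text.upper()) == digest
assert bytes.fromhex(hex_text.lower()) == digest

# Standard base32 sticks to upper-case letters and the digits 2-7 (plus "=" padding).
assert set(b32_text) <= set("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567=")

# Base64 is case-sensitive: "aaaa" and "AAAA" decode to different bytes.
assert base64.b64decode("aaaa") != base64.b64decode("AAAA")

for name, text in [("hex", hex_text), ("base32", b32_text), ("base64", b64_text)]:
    print(name, len(text), text[:16] + "...")
```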
You probably have a base64 tool installed which will do these conversions for you in both directions. It may use different characters for some values than crypt does – the MIME base64 encoding uses + and / instead of . and /, for example – but you can see how it turns arbitrary input into slightly longer ASCII-encoded output. There are also online tools to encode and decode, but decoding a password hash is likely to give you unprintable bytes and invalid byte sequences, so they may not be much help there.
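If you'd rather experiment from a script, Python's base64 module does the same job as the command-line tool. This is a sketch with an arbitrary input and the standard MIME alphabet, so the output won't match what crypt would write.

```python
import base64
import hashlib

digest = hashlib.sha512(b"example input").digest()  # arbitrary example input

mime = base64.b64encode(digest)           # standard alphabet: A-Z a-z 0-9 + /
assert base64.b64decode(mime) == digest   # decoding reverses it exactly

print(len(digest), "raw bytes ->", len(mime), "characters")
print(mime.decode("ascii"))

# 64 bytes is not a multiple of 3, so the standard encoding pads the output
# with "==" to reach a multiple of 4 characters. Dropping the padding leaves
# the 86 characters that 512 bits actually need.
unpadded = mime.rstrip(b"=")
print(len(unpadded))
```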
/etc/shadow contents + passwords - https://www.slashroot.in/how-are-passwords-stored-linux-understanding-hashing-shadow-utils - shows how to do it manually - https://unix.stackexchange.com/questions/81240/manually-generate-password-for-etc-shadow. – slm Aug 19 '18 at 03:04
crypt() - https://en.wikipedia.org/wiki/Crypt_(C). – slm Aug 19 '18 at 03:11