12

If I understand the authentication mechanism correctly, when we enter our credentials at the login prompt, the hash of the password is computed and then compared with a hash stored somewhere, and if I am not mistaken that "somewhere" is /etc/shadow. If they match, authentication succeeds; otherwise, it fails.

My question is, what program, I mean which binary, computes the hash of the input password? Or is it implemented by the kernel?

JLC
  • 317

3 Answers

20

No specific binary does the hashing; it's done by the library function crypt(3). So, in theory, any program (even a perl script) can generate the hash. For example:

perl -e 'print crypt("hello","\$6\$0abcdef")'
$6$0abcdef$JhsHMeFo8zYQPa8/HDXGoMWZgIxiCJKu2BQqCGjkyh/NadMax6GiJWQOT5ZL7POkUzrwbvL/Yhbx9f8XtLr.F/

You can read more about this function by running man 3 crypt.

Now a normal Unix login process is a bit more complicated. Typically we use the PAM ("Pluggable Authentication Modules") stack. This, along with NSS ("Name Service Switch"), allows a lot of flexibility in how and where passwords are stored and how authentication is done.

When using PAM, the program itself doesn't call crypt(); it calls the PAM stack, which decides what to use.

A very common pattern is to use pam_unix. In that case it is pam_unix.so that calls crypt() and compares the result against the entry from /etc/shadow (or LDAP or wherever).

This PAM/NSS combination is very powerful because it means new methods of authentication can be added (e.g. joining Active Directory domains) without every program needing to know about it; just update the PAM configuration and "magic" happens.

Lekensteyn
  • 20,830
  • Am I correct if I say that /usr/bin/login, /usr/bin/sudo, /usr/bin/su, /usr/bin/ssh, etc., are examples of applications that may (or may not) use crypt or PAM? – JLC Sep 02 '23 at 18:51
  • 1
    @JLC, in practice, all the programs you mention support PAM. (Well -- it's the SSH daemon, so sshd, that's responsible for PAM support, not the client; but presuming the daemon is what you mean...) – Charles Duffy Sep 02 '23 at 19:46
  • sshd is a special case 'cos it can optionally use PAM or not (the usePAM entry in sshd_config). If you do an ls /etc/pam.d you'll see lots of entries (eg su, sudo, screen, passwd, login and more). These are the PAM configs for those programs. – Stephen Harris Sep 02 '23 at 20:38
  • So brushing GUI under the carpet for the moment, what have we got: getty calls login and login uses PAM? – Mark Morgan Lloyd Sep 03 '23 at 07:21
  • @MarkMorganLloyd Yeah. We can see login uses PAM by seeing that it uses libpam; eg. % ldd /bin/login | grep libpam libpam.so.0 => /lib64/libpam.so.0 (0x00007f87be58e000) libpam_misc.so.0 => /lib64/libpam_misc.so.0 (0x00007f87be38a000) – Stephen Harris Sep 03 '23 at 12:24
  • @MarkMorganLloyd See my answer for more details about this part. – Gilles 'SO- stop being evil' Sep 03 '23 at 12:25
8

when we input the credentials in the login prompt, the hash of the password is computed and then that hash is compared with the hash stored somewhere, and if I am not mistaken that "somewhere" is the /etc/shadow.

This is broadly correct (and more correct than many simplified explanations on the web, so well done). More precisely, the data flow is:

  1. Read some configuration files. I'll assume they result in the common case of a local account with the normal defaults. Other possibilities at this stage can result in querying a network account database, or doing non-password-based authentication, or (but this would be unusual) doing password-based authentication with a different location of the password hash.
  2. Input the user's password P. (This may happen after the next step.)
  3. Find the line with the desired user name in /etc/shadow and extract the “encrypted¹ password” field. Note that this step requires permission to read /etc/shadow.
  4. Split the shadow field into two parts²: the configuration+salt S and the expected hash H. Pass P and S to the crypt library function¹, obtaining a result R. Compare R with H: if they're equal then the password authentication has succeeded, otherwise it's failed.
  5. Continue with the user authentication if there are non-password based methods.
  6. If the authentication has succeeded, log the user in, otherwise error out.
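Steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration under some assumptions, not what login or pam_unix actually run: the names find_shadow_field and check_password are made up for the example, the shadow data is passed in as text rather than read from /etc/shadow, and the hashing function is passed in as crypt_fn, standing for anything that behaves like crypt(3) (Python's own crypt module was removed in Python 3.13). The sketch relies on crypt(3) accepting the complete stored field as its second argument and returning the complete string, setting and salt included, so the result is compared with the whole field.

```python
import hmac
from typing import Callable, Optional

# Hypothetical type for illustration: any callable that behaves like crypt(3).
CryptFn = Callable[[str, str], Optional[str]]


def find_shadow_field(shadow_text: str, user: str) -> Optional[str]:
    """Step 3: return the second colon-separated field of the user's line."""
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2 and fields[0] == user:
            return fields[1]
    return None


def check_password(shadow_text: str, user: str, password: str,
                   crypt_fn: CryptFn) -> bool:
    """Step 4: hash the candidate password and compare with the stored field."""
    stored = find_shadow_field(shadow_text, user)
    # Unknown user, empty field, or a field starting with "!" or "*"
    # (commonly used for locked or password-less accounts): fail.
    if not stored or stored[0] in "!*":
        return False
    result = crypt_fn(password, stored)
    if result is None:
        return False
    # Compare in constant time so timing does not reveal how much matched.
    return hmac.compare_digest(result.encode(), stored.encode())
```

Passing the hash function in as a parameter keeps the sketch independent of where crypt(3) comes from on a given system; a real program would also need permission to read /etc/shadow, as noted in step 3.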

On a typical embedded Linux system, all these steps happen inside the same program: login for a console login, su or sudo when elevating privileges, dropbear for SSH logins, etc. Step 1 may be completely omitted since embedded systems often don't have any runtime configurability at this point. The implementation of the crypt function comes from the system's standard library, e.g. musl. So the code to perform the calculation is stored in something like /lib/libc.so while the code that performs the surrounding configuration and database lookups is in /bin/login and such. The kernel is not involved in authentication except for providing basic input-output primitives (open file, read file, etc.). The kernel only gets involved more directly after authentication, to keep track of the privileges of the process after the authentication. (It may also be involved for temporary privilege escalation to read /etc/shadow if the authentication program doesn't run as root all the time.)

On a typical non-embedded Unix system, most of this process is subcontracted to the PAM library. PAM consists of a main library (/lib/libpam.so.0 — here and elsewhere the exact path is system-dependent) as well as a number of auxiliary libraries and programs. I won't get into details because they're all part of the same software suite. The authenticating program calls a series of functions in the PAM library to authenticate the user as well as decide what to do after a successful authentication (session establishment). I think pam_authenticate is the function that performs parts of step 1 as well as steps 3–5 (I'm not sure as I'm not familiar with this side of PAM).

With PAM, steps 3–4 (finding the password hash and validating the password against it) are specifically handled in /lib/security/pam_unix.so, the PAM module for traditional password-based authentication. The pam_unix module runs in the context of the process that performs the authentication, which might not run with enough privileges to read /etc/shadow (but needs to run with enough privileges to be able to get those privileges³). To minimize the amount of code that can access the password hashes, this part runs in a dedicated program unix_chkpwd. This program is the one that reads the password hash from /etc/shadow, calls the crypt function and verifies its output.
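The general idea of delegating the check to a small helper process can be sketched as follows. This is a hedged illustration of the pattern only: helper_accepts is a made-up name, and the conventions used here (user name as the last argument, NUL-terminated password on standard input, exit status 0 for success) are assumptions for the example, not a description of the actual interface of unix_chkpwd.

```python
import subprocess
from typing import Sequence


def helper_accepts(helper_argv: Sequence[str], user: str, password: str) -> bool:
    """Ask a separate helper process whether `password` is right for `user`.

    The caller never sees the stored hash; it only learns yes or no
    from the helper's exit status.
    """
    proc = subprocess.run(
        [*helper_argv, user],
        # Assumed convention: password on stdin, so it never shows up in
        # the process list the way a command-line argument would.
        input=password.encode() + b"\0",
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return proc.returncode == 0
```

The point of the design is that only the helper needs to be able to read the password hashes, so a bug in the much larger calling program cannot leak them directly.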

¹ A misleading name since the password is hashed, not encrypted — you can't “decrypt” the content to find the original password.

² The interface is slightly weird for historical reasons. The field in /etc/shadow contains 4 parts: an algorithm identifier, some cost parameters, a salt string, and the expected output string. The algorithm identifier selects which password hashing algorithm is used; note that these are not plain cryptographic hash functions, despite what the name suggests. See e.g. “Do all Linux distributions use the same cryptographic hash function?” for more information. The cost parameters depend on the algorithm; a higher cost makes normal authentication slower but also makes cracking the password harder if an attacker manages to retrieve the hash. The salt should be unique per account and protects against multi-account attacks (e.g. trying to get into the account of the employee with the weakest password, to get a foothold into an organization). Internally, the crypt function uses the algorithm identifier to determine which auxiliary function to call, and on some systems this auxiliary function can live in a different library.
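To make the four parts concrete, here is a small Python sketch that splits such a field. It is an illustration under a stated assumption: it only handles the $id$[params$]salt$hash layout used by the SHA-256 ($5$) and SHA-512 ($6$) schemes, and rejects everything else, since other schemes pack their fields differently. The names ShadowHash and split_shadow_field are made up for the example.

```python
from typing import NamedTuple, Optional


class ShadowHash(NamedTuple):
    algorithm: str         # "5" (SHA-256 based) or "6" (SHA-512 based)
    params: Optional[str]  # e.g. "rounds=12345", or None when defaults apply
    salt: str
    digest: str            # the expected output string


def split_shadow_field(field: str) -> ShadowHash:
    """Split a $5$ or $6$ style password field into its four parts."""
    parts = field.split("$")
    # A well-formed field starts with "$", so parts[0] is the empty string.
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not a $id$salt$hash style field")
    algorithm, rest = parts[1], parts[2:]
    if algorithm not in ("5", "6"):
        raise ValueError("layout not handled by this sketch: " + algorithm)
    params = None
    # The optional cost parameter sits between the identifier and the salt.
    if len(rest) == 3 and "=" in rest[0]:
        params, rest = rest[0], rest[1:]
    if len(rest) != 2:
        raise ValueError("unexpected number of $-separated parts")
    salt, digest = rest
    return ShadowHash(algorithm, params, salt, digest)
```

A program that only wants to verify a password does not need this parsing, since crypt does it internally; the sketch is only meant to show where each part lives in the string.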

³ Typically, the program (login, su, …) starts with root privileges, and keeps (at least a part of itself) root as its saved user ID but changes its effective user ID to a dedicated system user (when logging in) or to the invoking user (when elevating privileges). This limits the impact of a security hole in the login program that gives an attacker partial control of the program (e.g. the ability to read files) but not the ability to execute arbitrary code. Elevating privileges requires calling a dedicated system call such as seteuid.
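The effective/saved user ID dance can be sketched with Python's os.geteuid and os.seteuid. This is a minimal illustration, assuming a process that started as root; the context manager name effective_uid is made up for the example, and real login programs are written in C and do considerably more bookkeeping.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def effective_uid(uid: int) -> Iterator[None]:
    """Run a block with the effective UID set to `uid`, then switch back."""
    previous = os.geteuid()
    # seteuid changes only the effective UID; the real and saved UIDs are
    # left alone, which is what makes switching back possible later.
    os.seteuid(uid)
    try:
        yield
    finally:
        os.seteuid(previous)
```

A process whose saved user ID is root but whose effective user ID is unprivileged could wrap only the read of /etc/shadow in `with effective_uid(0):` and run everything else without root's effective privileges.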

  • re. point 4, I've had the impression that it's not even really necessary to split the stored password entry to remove the hash proper: as far as I've tried, crypt() seems to accept it as-is (and presumably ignores the part it doesn't need). Meaning you can reduce a password check to just comparing the result of crypt(entered_password, stored_hash) to the stored hash, and you don't need to even know what the structure of the stored field is. Anyway, the result contains the salt part too, so you can't compare it to just the extracted hash. – ilkkachu Sep 03 '23 at 18:10
  • E.g. in Perl, perl -le '$p = "foobar"; $h = q#$5$rounds=12345$3K4Ah7pu76z$n4i.EB32cHKCBdFfRt4sFiklMu4oflfxKdzr0Do..UB#; $c = crypt($p, $h); print $h; print $c;' prints out the hash $5$...UB twice (since it's the correct password) – ilkkachu Sep 03 '23 at 18:16
3

The binary responsible for computing the hash of the input password is typically part of the authentication system and is not implemented by the kernel. In most Linux distributions, this is handled by a program called crypt or passwd, and the specific binary used may vary by distribution.

Here's a simplified overview of how it works:

  1. When a user logs in or changes their password, the password is passed to the crypt or passwd utility.
  2. The crypt or passwd utility computes the hash of the input password using a one-way cryptographic hash function. The specific hash function used can vary but is commonly based on algorithms like SHA-256 or SHA-512.
  3. The computed hash is then compared to the stored hash in the /etc/shadow file for the corresponding user. Note that this means that the utility needs to have permission to read the shadow file.
  4. If the two hashes match, the authentication is successful, and the user is granted access.

The hashing and password management are typically handled by user-space programs to ensure flexibility and compatibility with various password storage schemes and hashing algorithms. The kernel is only responsible for low-level interactions between the user-space authentication utilities and tracking what user each process runs as.

Swapnil
  • 47
  • 3
    This might have been correct 20 years ago, but these days one has to allow for the almost-universal adoption of PAM. – Mark Morgan Lloyd Sep 03 '23 at 07:19
  • @MarkMorganLloyd This is true even when using PAM. The “crypt” program used by PAM is called unix_chkpwd. – Gilles 'SO- stop being evil' Sep 03 '23 at 10:39
  • 2
    That is to say it's not true that the program would be called crypt. Ok, it might be true in some system, but I've never heard of a distinct program called crypt in any Linux, so if it exists, it's far from universal. The library function crypt() would be a different matter, of course. – ilkkachu Sep 03 '23 at 13:24
  • Browsing the FreeBSD man pages archive, it looks like crypt in FreeBSD is a tool for encrypting files, and a similar program has been there in some other systems too. – ilkkachu Sep 03 '23 at 13:29
  • 2
    I think this answer is not correct as the program is named login, not passwd or crypt. – U. Windl Sep 04 '23 at 12:05