I need to get the unresolved working directory of a shell process, that is, the path with symlinks left intact. For example, given a symlink ~/link -> ~/actual, if bash is launched with a $PWD of ~/link, I need to fetch ~/link from outside the bash process.

Getting the resolved cwd is possible using lsof or /proc as called out in https://unix.stackexchange.com/a/94359/115410 but I'm beginning to think it's not possible to get the unresolved path.

I tried lsof -b so that it would not call readlink, but according to its logging that code path never uses readlink anyway. It does appear possible to read the environment via /proc/.../environ and parse out PWD, but this is slow, /proc may not exist on the system, and I believe there are security implications to reading a process's environment.

Here is the code in question I'm trying to fix:

lsof on macOS:

exec('lsof -OPln -p ' + this._ptyProcess.pid + ' | grep cwd', (error, stdout, stderr) => {
  ...
});

/proc on Linux:

Promises.readlink(`/proc/${this._ptyProcess.pid}/cwd`);

1 Answer

The logical value of the current working directory (logical cwd, what you call “unresolved pwd”) is an internal concept of the shell, not a concept of the kernel. The kernel only remembers the fully resolved path (physical cwd). So you won't get the information you want through generic system interfaces. You have to get the shell's cooperation.

The PWD environment variable is how shells transmit the logical cwd to their subprocesses. When a shell runs another shell, the parent sets PWD to the logical cwd (it does this whenever it runs a program). The child checks that the value of PWD is sensible, meaning that it refers to the same directory as the physical cwd. If so, the child uses it as its logical cwd; if $PWD is missing or wrong, it falls back to the physical cwd.

Observe:

#!/bin/sh
mkdir /tmp/dir
ln -sf dir /tmp/link
cd /tmp/link
sh -c 'echo Default behavior: "$PWD"'
env -u PWD sh -c 'echo Unset PWD: "$PWD"'
PWD=/something/fishy sh -c 'echo Wrong PWD: "$PWD"'
rm /tmp/link
rmdir /tmp/dir

Output:

Default behavior: /tmp/link
Unset PWD: /tmp/dir
Wrong PWD: /tmp/dir

Reading the process's environment doesn't have any particular security implications, but what it tells you is the value of PWD when the process started. If the process has since changed to another directory, the value in the environment is no longer relevant. What you need is the value that would be in the environment if the shell ran another process now. But the only way for this to appear is to actually make the shell run another process.
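For illustration, here is a minimal Node sketch of reading PWD from a process's environment on Linux. The function names parsePwdFromEnviron and readInitialPwd are made up for this example, and it assumes /proc is mounted and the caller is allowed to read the file. Keep in mind that the result is only the value PWD had when the process started.

```javascript
const fs = require('fs');

// /proc/PID/environ holds NAME=value entries separated by NUL bytes.
// Returns the value of PWD, or undefined if PWD is not present.
function parsePwdFromEnviron(raw) {
  for (const entry of raw.toString('utf8').split('\0')) {
    if (entry.startsWith('PWD=')) {
      return entry.slice('PWD='.length);
    }
  }
  return undefined;
}

// Hypothetical helper: reads the start-up environment of `pid`.
// Returns undefined if /proc is missing or the file is not readable.
function readInitialPwd(pid) {
  try {
    return parsePwdFromEnviron(fs.readFileSync(`/proc/${pid}/environ`));
  } catch (err) {
    return undefined;
  }
}
```

If the shell has run cd since it was launched, readInitialPwd still returns the old directory, which is why this approach does not solve the problem on its own.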

The typical way for GUIs to find the cwd of a shell that they run is to make the shell print it out. If you need the information occasionally and want to leave maximum control over the shell configuration to the user, ensure that the shell is displaying a prompt and issue the pwd command. This is simple, works even in “exotic” shells like csh and fish, but is ambiguous in corner cases (e.g. a directory name containing newlines). If it's ok to tweak the shell configuration, you can make the shell print an escape sequence each time it displays a prompt (PS1 for many shells, but the way to make it include the current directory varies), or when it changes directories (chpwd_functions in zsh, more invasive ways in other shells).
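As a sketch of the escape-sequence approach, the following assumes the shell has been configured to emit an OSC 7 sequence (ESC ] 7 ; file://host/path, terminated by BEL or ESC \) with a percent-encoded path each time it shows a prompt. OSC 7 is one convention that several terminal emulators understand; it is my choice for this example, not something your setup necessarily uses, and the function name extractReportedCwd is hypothetical. The terminal side then only has to scan the shell's output for the most recent report:

```javascript
// Matches OSC 7: ESC ] 7 ; <uri>, terminated by BEL or by ESC \ (ST).
const OSC7_PATTERN = /\x1b\]7;([^\x07\x1b]*)(?:\x07|\x1b\\)/g;

// Returns the path from the last well-formed OSC 7 report in `output`,
// or undefined if the output contains none.
function extractReportedCwd(output) {
  let cwd;
  for (const match of output.matchAll(OSC7_PATTERN)) {
    try {
      const url = new URL(match[1]);
      if (url.protocol === 'file:') {
        // Assumes the shell percent-encoded the path when emitting it.
        cwd = decodeURIComponent(url.pathname);
      }
    } catch (err) {
      // Not a valid URI or bad percent-encoding: ignore this report.
    }
  }
  return cwd;
}
```

Because the path is percent-encoded, a directory name containing a newline arrives as %0A and is decoded unambiguously, which avoids the corner case that plain pwd output suffers from.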

Note that if a component of the logical cwd has been moved or removed, or if the symbolic link has been changed to point elsewhere, the value may be wrong. On the other hand the physical cwd will always be correct. (If the directory has been removed, Linux's /proc/PID/cwd, which is where all programs such as ps and lsof get their information, will appear as a broken symlink whose target ends in (deleted). I don't know what lsof reports on macOS for a deleted current directory.) So if you do find out the logical cwd, you should probably ensure that it matches the physical cwd and fall back to the physical cwd if it doesn't.
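The check described above could look roughly like this in Node. The name chooseCwd is hypothetical, and the sketch assumes physicalCwd is the already-resolved path obtained from lsof or /proc/PID/cwd:

```javascript
const fs = require('fs');

// Returns logicalCwd if it still resolves to physicalCwd,
// otherwise falls back to physicalCwd.
function chooseCwd(logicalCwd, physicalCwd) {
  if (!logicalCwd) {
    return physicalCwd;
  }
  try {
    // realpathSync resolves every symlink in the logical path.
    if (fs.realpathSync(logicalCwd) === physicalCwd) {
      return logicalCwd;
    }
  } catch (err) {
    // A component was moved or removed, or the link is dangling.
  }
  return physicalCwd;
}
```

With the example from the question, chooseCwd('/home/user/link', '/home/user/actual') would return the link path as long as the link still points at that directory, and the physical path otherwise.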