7

I want to test if a relative symlink points within the subtree of a certain directory.

This example would yield false since it points outside the foo directory:
/foo>readlink bar
../fie.txt

While this example would yield true:
/foo>readlink bar
fum/fie.txt

Is there an existing utility I can leverage or will I have to code it from scratch? I'm using bash.

Fylke
  • 245

3 Answers

5

I don't think there's such a utility. With GNU readlink, you could do something like:

is_in() (
  needle=$(readlink -ve -- "$1" && echo .) || exit
  haystack=$(readlink -ve -- "$2" && echo .) || exit
  needle=${needle%??} haystack=${haystack%??}
  haystack=${haystack%/} needle=${needle%/}
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)

That resolves all symlinks to end up with a canonical absolute path for both needle and haystack.

Explanation

  • We get the canonical absolute path of both the needle and the haystack. We use -e instead of -f as we want to make sure the files exist. The -v option gives an error message if the files can't be accessed.

  • As always, -- should be used to mark the end of options, and the expansions should be quoted, as we don't want to invoke the split+glob operator here.

  • Command substitution in Bourne-like shells has a misfeature in that it removes all the newline characters from the end of the output of a command, not just the one added by commands to end the last line. What that means is that for a file like /foo<LF><LF>, $(readlink -ve -- "$1") would return /foo. The common work-around for that is to append a non-LF character (here .) and strip that and the extra LF character added by readlink with var=${var%??} (remove the last two characters).

  • The needle is regarded as being in the haystack if it is the haystack or if it is haystack/something. However, that wouldn't work if the haystack was / (/etc for instance is not //something). / often needs to be treated specially because while / and /xx have the same number of slashes, one is a level above the other.

    One way to address it is to replace / with the empty string which is done with var=${var%/} (the only path ending with / that readlink -e may output is /, so removing a trailing / is changing / to the empty string).
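The two less obvious points in the list (the sentinel character and the special handling of /) can be checked without touching the filesystem. This is only a minimal sketch: printf stands in for readlink here, and the names plain, guarded and root_result are made up for the illustration.

```shell
# Simulated readlink output for a path whose name ends in a newline:
# the name's own LF plus the LF that readlink appends.
plain=$(printf '/foo\n\n')               # both trailing LFs are stripped
guarded=$(printf '/foo\n\n' && echo .)   # the sentinel "." protects them
guarded=${guarded%??}                    # drop "." and the LF readlink added

echo "plain length:   ${#plain}"
echo "guarded length: ${#guarded}"

# Root special case: after ${var%/}, "/" becomes the empty string, so the
# pattern "$haystack"/* turns into /* and matches any absolute path.
haystack=/ needle=/etc
haystack=${haystack%/}
case $needle in
  ("$haystack" | "$haystack"/*) root_result=inside;;
  (*) root_result=outside;;
esac
echo "/etc relative to /: $root_result"
```

The plain variable ends up 4 characters long (/foo), while guarded keeps the newline that belongs to the name and is 5 characters long. With the haystack reduced to the empty string, /etc is reported as inside.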

For canonicalizing the file paths, you could use a helper function.

canonicalize_path() {
  # canonicalize paths stored in supplied variables. `/` is returned as 
  # the empty string.
  for _var do
    eval '
      '"$_var"'=$(readlink -ve -- "${'"$_var"'}" && echo .) &&
      '"$_var"'=${'"$_var"'%??} &&
      '"$_var"'=${'"$_var"'%/}' || return
  done
}

is_in() (
  needle=$1 haystack=$2
  canonicalize_path needle haystack || exit
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)
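To try this against the two cases from the question, something like the following could be used. It is only a sketch under a few assumptions: GNU readlink is on the PATH, the scratch directory layout and the link names inside and outside are hypothetical, and the first version of is_in is repeated so the snippet runs on its own.

```shell
# is_in as given in the answer, repeated so this snippet is self-contained.
# Requires GNU readlink for the -e and -v options.
is_in() (
  needle=$(readlink -ve -- "$1" && echo .) || exit
  haystack=$(readlink -ve -- "$2" && echo .) || exit
  needle=${needle%??} haystack=${haystack%??}
  haystack=${haystack%/} needle=${needle%/}
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)

# Hypothetical scratch layout mirroring the question:
#   foo/outside -> ../fie.txt    (target is outside foo)
#   foo/inside  -> fum/fie.txt   (target is inside foo)
demo() (
  tmp=$(mktemp -d) || exit
  mkdir -p "$tmp/foo/fum"
  touch "$tmp/fie.txt" "$tmp/foo/fum/fie.txt"
  ln -s ../fie.txt "$tmp/foo/outside"
  ln -s fum/fie.txt "$tmp/foo/inside"

  for link in inside outside; do
    if is_in "$tmp/foo/$link" "$tmp/foo"; then
      echo "$link: within foo"
    else
      echo "$link: not within foo"
    fi
  done
  rm -rf -- "$tmp"
)

demo
```

Because both arguments go through readlink -e, it does not matter whether the temporary directory itself sits behind a symlink.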

  • I found studying this post very instructive. May I ask you a couple of questions? Is there any significance in the fact that in needle=${needle%??} haystack=${haystack%??} the needle variable is dealt with first, whereas in the next line it is the other way around? Also, how come your return statements don't explicitly return a non-zero value (to indicate error)? Last one: would it make sense to factor out the entire transformation (the call to readlink, plus the two suffix truncations) to a separate _canonicalize_path helper function? – kjo Feb 22 '16 at 12:28
  • 1
    @kjo, 1) no significance 2) return returns by default with the status of the last command. With || return, that allows to return the status as provided by the failing application. 3) sure, but the resulting function will likely not be a pleasant sight. I'll add an example. – Stéphane Chazelas Feb 22 '16 at 12:48
  • Thanks! I see what you mean! Not a pleasant sight at all. Shell programming must be the hardest type of programming I know of... – kjo Feb 22 '16 at 13:11
0

I solved the problem like this:

echo "$abs_link_target" | grep -qe "^$containing_dir"

The $abs_link_target variable contains the absolute path to the symlink target (expanded through readlink -f). I then check whether the beginning of the target path matches $containing_dir.
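One thing to watch with a bare prefix match is that it has no notion of a directory boundary, so /foobar/fie.txt also starts with /foo. A minimal sketch of the difference follows; the helper names prefix_match and boundary_match and the sample paths are made up for the illustration, and both helpers assume already-canonicalized paths without newlines.

```shell
# Prefix match as in the one-liner: does the target start with the directory?
# Note the directory is interpreted as a regular expression by grep.
prefix_match() {
  printf '%s\n' "$1" | grep -qe "^$2"
}

# Boundary-aware match: the target is the directory itself or lies below it.
boundary_match() {
  case $1 in
    ("$2" | "$2"/*) return 0;;
    (*) return 1;;
  esac
}

# Hypothetical sample values.
containing_dir=/foo
abs_link_target=/foobar/fie.txt

if prefix_match "$abs_link_target" "$containing_dir"; then
  echo "prefix match: inside"
else
  echo "prefix match: outside"
fi

if boundary_match "$abs_link_target" "$containing_dir"; then
  echo "boundary match: inside"
else
  echo "boundary match: outside"
fi
```

Here the prefix match reports inside while the boundary match reports outside, since /foobar is a sibling of /foo rather than a subdirectory.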

Fylke
  • 245
0

grep -q "^/foo/bar/" <<< "$(readlink -f "anyfile.ext")"

  • 1
    Assuming the target of anyfile.ext exists and is reachable (otherwise, readlink -f as opposed to readlink -e might not give you the correct path) and that the resulting path doesn't contain newline characters (assumes zsh or bash4 or ksh93m+ or above). Note that if anyfile.ext points to /foo/bar itself, it will say it's not within. – Stéphane Chazelas Jan 10 '14 at 11:56