Consider the following:

/tmp$ mkdir -p src/{a,b} dst/dstsub; touch src/{a,b}/hello
/tmp$ tree src/ dst/
src/
├── a
│   └── hello
└── b
    └── hello
dst/
└── dstsub

Now run

/tmp$ cp -r src/a/.. dst/dstsub/

With GNU cp, this results in the contents of src == src/a/.. being copied into dst/dstsub, like so:

/tmp$ tree dst/
dst/
└── dstsub
    ├── a
    │   └── hello
    └── b
        └── hello

However, repeating the exercise with Busybox results in a different destination structure:

/tmp$ rm -r dst/; mkdir -p dst/dstsub
/tmp$ busybox cp -r src/a/.. dst/dstsub/
/tmp$ tree dst/
dst/
├── a
│   └── hello
├── b
│   └── hello
└── dstsub

Here, the contents of the source directory src == src/a/.. were copied alongside the destination directory dst/dstsub/, not into it.
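For anyone who wants to repeat the comparison without touching /tmp/src, the two runs above can be wrapped in a small script. This is only a sketch: the function name try_cp and the variable work are made up for illustration, mktemp -d is assumed to be available (it is not specified by POSIX), and the busybox run is skipped when no busybox binary is on the PATH. It uses find instead of tree to list the result.

```shell
#!/bin/sh
# Reproduce the experiment in a scratch directory (assumes mktemp -d exists).
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

# try_cp is a hypothetical helper:
#   $1 = label to print; remaining arguments = the cp command to run
try_cp() {
    label=$1; shift
    # Start from the same layout every time.
    rm -rf dst
    mkdir -p src/a src/b dst/dstsub
    touch src/a/hello src/b/hello
    "$@" -r src/a/.. dst/dstsub/
    echo "== $label"
    find dst | sort
}

try_cp "cp" cp
# busybox is optional; skip it if it is not installed.
if command -v busybox >/dev/null 2>&1; then
    try_cp "busybox cp" busybox cp
fi
```

What the listing shows depends on which cp implementations are installed, so no expected output is given here.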

Trying to read the standard text:

 cp -R [-H|-L|-P] [-fip] source_file... target

[...] The cp utility shall copy each file in the file hierarchy rooted in each source_file to a destination path named as follows:

If target exists and names an existing directory, the name of the corresponding destination path for each file in the file hierarchy shall be the concatenation of target, a single <slash> character if target did not end in a <slash>, and the pathname of the file relative to the directory containing source_file.

Here target is dst/dstsub/, which is an existing directory and ends in a slash. source_file is src/a/.., so the directory containing it is src/a/, and the pathnames relative to that directory are ../a, ../b, etc. That would give the destination paths dst/dstsub/../a, dst/dstsub/../b, etc., matching Busybox's behaviour.
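To make that reading concrete, here is a minimal sketch of the naming rule as pure string manipulation. The function literal_dest is hypothetical and copies nothing; it assumes source_file has no trailing slash and that the third argument is a path below source_file. It is my interpretation of the quoted text, not code from any cp implementation.

```shell
#!/bin/sh
# Hypothetical helper: build a destination path by the literal reading of
# the quoted rule. Nothing is copied; this only concatenates strings.
#   $1 = target (assumed to be an existing directory)
#   $2 = source_file operand (assumed to have no trailing slash)
#   $3 = path of a file below source_file, relative to source_file
literal_dest() {
    target=$1 source_file=$2 below=$3
    # Append a single slash only if target does not already end in one.
    case $target in
        */) ;;
        *)  target=$target/ ;;
    esac
    # Last component of source_file, i.e. its name relative to the
    # directory containing it ("src/a/.." -> "..").
    last=${source_file##*/}
    printf '%s%s/%s\n' "$target" "$last" "$below"
}

literal_dest dst/dstsub/ src/a/.. a         # prints dst/dstsub/../a
literal_dest dst/dstsub/ src/a/.. b/hello   # prints dst/dstsub/../b/hello
```

Under this reading the .. survives into the destination path, which is why the files would land next to dstsub rather than inside it.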

However, if we first canonicalize the source path to remove .., it becomes src/ (or rather src/., since src itself is not copied, unlike with cp -r src dst, which would create dst/src). This would result in destination paths like dst/dstsub/./a, which seems to match what GNU cp does. (I don't know if it does exactly that, but it seems a plausible explanation.)
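The alternative reading can be sketched the same way. The function canonicalized_dest below is hypothetical: it treats a final .. component of source_file as if it were . before building the destination path. This is only an illustration of the idea in the paragraph above, under the assumption that source_file has no trailing slash; I am not claiming it is how GNU cp is implemented.

```shell
#!/bin/sh
# Hypothetical helper: same string concatenation as the literal rule, but a
# final ".." component of source_file is replaced by "." first.
#   $1 = target (assumed to be an existing directory)
#   $2 = source_file operand (assumed to have no trailing slash)
#   $3 = path of a file below source_file, relative to source_file
canonicalized_dest() {
    target=$1 source_file=$2 below=$3
    # Append a single slash only if target does not already end in one.
    case $target in
        */) ;;
        *)  target=$target/ ;;
    esac
    last=${source_file##*/}
    # Assumption for illustration: ".." names a directory whose own name
    # should not appear in the destination, just like ".".
    if [ "$last" = ".." ]; then
        last=.
    fi
    printf '%s%s/%s\n' "$target" "$last" "$below"
}

canonicalized_dest dst/dstsub/ src/a/.. a   # prints dst/dstsub/./a
canonicalized_dest dst/dstsub/ src a        # prints dst/dstsub/src/a
```

With this variant the first call yields a path inside dstsub, while an ordinary operand such as src still keeps its own name in the destination.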

The only mention of .. (dot-dot) being special I can see in the standard text is this:

 2. If source_file is of type directory, the following steps shall be taken:

b. If source_file was not specified as an operand and source_file is dot or dot-dot, cp shall do nothing more with source_file and go on to any remaining files.

This doesn't apply here, since the path containing .. was given explicitly as a command-line operand. The operand also wasn't exactly .., in case that matters, but I get the same results with both GNU and Busybox when just .. is the source directory:

/tmp$ cd src/a/
/tmp/src/a$ cp -r .. /tmp/dst/dstsub/

Is the canonicalization of .. described above (or equivalent behaviour) allowed by the standard? Should it be done? Should it not be done? Or was there something else I missed about the behaviour of GNU cp or the interpretation of the standard text?

ilkkachu
    Obviously, that's not what I'd usually do, and "don't do that" would in most cases be a valid answer. This came up when I tried to figure out what the command in "Why does 'cp -r ../* .' work but not 'cp -r .. .'" should do. – ilkkachu Jun 27 '20 at 11:08
    IMO the -v option reveals something interesting. GNU cp says 'src/a/../a' -> 'dst/dstsub/a', while busybox cp says 'src/a/../a' -> 'dst/dstsub/../a'. And the source says that removing trailing dot-dot(s) from source_file before concatenating to target is intended, but it doesn't say why. – fra-san Jun 27 '20 at 16:58
    Adding to the mix, cp on FreeBSD (12.1), whose manual says it's expected to be POSIX.2 compatible, behaves as GNU cp (i.e. copies into dstsub); its verbose output is slightly different, though: src/a/../a -> dst/dstsub/./a. – fra-san Jun 28 '20 at 15:45