
In POSIX-compliant systems, is a path in which every (possibly overlapping) occurrence of /./ has been replaced with / guaranteed to lead to the same target as the original one?

Example:

#!/bin/bash

shopt -s extglob

some_command -- "${@///+(./)//}"


UPDATE:

Given the comments, it is not equivalent, so I'll update the question:

In POSIX-compliant systems, is a path in which every (possibly overlapping) occurrence of /././ has been replaced with /./ guaranteed to lead to the same target as the original path?

Fravadona
  • I guess that a pathname starting with //./ may technically be interpreted differently from the transformed one starting with //. See also "On what systems is //foo/bar different from /foo/bar?". I suppose a real answer would discuss the path canonicalisation that the cd utility does. – Kusalananda Sep 16 '22 at 09:03
  • I see this daily with Cygwin; it's one of the systems that does implement //item/ differently to /item/. – Chris Davies Sep 16 '22 at 09:11
  • @roaima would //./somewhere be the same as /somewhere in Cygwin? Or different from //somewhere? – Fravadona Sep 16 '22 at 10:46
  • @Fravadona no. The first component after a leading // is a hostname. For example, ls -d //./tmp returns ls: cannot access '//./tmp': No such file or directory but ls -d /tmp is successful. Also ls -d //$(hostname)/ is successful (after a fashion). – Chris Davies Sep 16 '22 at 11:07
  • @roaima yes, but //$(hostname) and //./$(hostname) should be equivalent, right? – terdon Sep 16 '22 at 12:05
  • @terdon no. Even if you treat . as localhost (true for parts of Windows networking) then your suggestions could be equivalent to / and /$(hostname). – Chris Davies Sep 16 '22 at 12:18
  • It's a bit like Autofs/NIS/YP with a hostname mapping. Sort of. – Chris Davies Sep 16 '22 at 12:20
  • Sounds like we have an answer then: not in Cygwin. Maybe in regular nixes though? – terdon Sep 16 '22 at 12:20
  • Paths starting with a double slash are reserved in the standard as implementation-defined (and different from normal paths), so in principle it's not just Cygwin, but something that any software that wants to be POSIX-compatible needs to deal with. Apart from modifying paths that already start with a //, it also means that you can't change something like /.//foo into //foo. – ilkkachu Sep 16 '22 at 13:06
  • /./ has special meaning to rsync. – spuck Sep 16 '22 at 17:57

2 Answers


A "POSIX compliant system" is a little vague in the real world, unfortunately. POSIX covers parts of the command set, the shell, the file system, the system calls, a threads interface, and more, all in different sections and subsections of the standard. Many systems are POSIX compliant but with extensions, or have POSIX-compliant parts living alongside non-compliant or extended parts.

At the beginning of the string, // may be treated specially in some systems under some circumstances.

For the rest of the string, replacing /././ with /./ or just / should be equivalent on a POSIX system or a system interfacing to a POSIX-compliant subsystem. Likewise, anywhere except the beginning of the string, two or more consecutive forward slashes can in most cases be collapsed to a single /.
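As a rough illustration of the rule above, here is a minimal bash sketch. The function name collapse_path is hypothetical (it is not a standard utility), and the sketch assumes that only the first component after exactly two leading slashes needs to be left untouched:

```shell
#!/bin/bash

# collapse_path is a hypothetical helper name, used only for illustration.
# It collapses "/./" and repeated slashes, but leaves "//" plus the first
# component alone when the path starts with exactly two slashes.
collapse_path() {
    local path=$1 prefix=

    # Exactly two leading slashes: POSIX leaves the first component
    # implementation-defined, so keep "//<first component>" verbatim.
    if [[ $path == // || $path == //[!/]* ]]; then
        local rest=${path#//}       # drop the leading "//"
        local first=${rest%%/*}     # first component after "//"
        prefix=//$first
        path=${rest#"$first"}       # remainder: empty or starts with "/"
    fi

    # Repeat until stable so that overlapping matches ("/././") collapse too.
    while [[ $path == *//* || $path == */./* ]]; do
        path=${path//\/.\//\/}      # "/./" -> "/"
        path=${path//\/\//\/}       # "//"  -> "/"
    done

    printf '%s\n' "$prefix$path"
}
```

With this sketch, `collapse_path /a/././b` prints /a/b, while `collapse_path //./tmp` prints //./tmp unchanged. Three or more leading slashes are reduced to one, as the standard requires. Trailing /. components are deliberately left alone.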

However, some things will still trip you up. While the shell and FS may treat /foo//bar the same as /foo/bar, something like a web server pointed to serve files out of the file system might not treat them the same on the URL side, and a web cache in front of that probably won't. That's because while mapping from URLs to files in the FS might look straightforward, there are places where one standard and another don't necessarily map exactly as one would naively guess. Other network layers in front of an FS may cause similar edge cases.

In particular I'm reminded of caching with Varnish in front of an Apache server serving static files, when my team discovered /foo//bar and /foo/bar in our configuration would initially cache the same backend file, but to two different cache objects with two different cache TTLs. Using the Varnish config to rewrite to the canonical form solved that.

ilkkachu
  • note that it's just // with exactly two slashes at the start that's special. 3.267 Pathname says "... Multiple successive <slash> characters are considered to be the same as one <slash>, except for the case of exactly two leading <slash> characters." and 4.12 Pathname Resolution says "...although more than two leading <slash> characters shall be treated as a single <slash> character." (see links in "On what systems is //foo/bar different from /foo/bar?", and yes, I think it's a bit iffy.) – ilkkachu Sep 16 '22 at 13:19
  • @ilkkachu a POSIX dirname //foo//bar will output //foo – Fravadona Sep 16 '22 at 13:26
  • @Fravadona, right, on a closer look it says "the first component following the leading [two] <slash> characters may be interpreted in an implementation-defined manner", just before my second quote above. So I guess //foo//bar should be equal to //foo/bar then. Anyway, you need to take care not to touch the first component. Though as for dirname, I get dirname /././foo -> /./. on the few ones I tried, so in itself it doesn't really answer the question about collapsing the single dots – ilkkachu Sep 16 '22 at 13:31
  • @ilkkachu Wouldn't that mean that //././a should be equivalent to //./a? Well, no, you're right. It all comes down to how a single dot in a path is treated in POSIX, if there's a spec for it – Fravadona Sep 16 '22 at 13:34
  • @Fravadona, yeah, I guess, but I also didn't check. Not that I'm sure I'd trust that in general, what with // being a bit special and rare anyway. (i.e., I wouldn't be surprised if some system making use of // chose to ignore the spec on purpose there.) – ilkkachu Sep 16 '22 at 14:27
  • @ilkkachu Should we now discuss systems which map other resource types such as URLs to files, such as file:///foo ? – Christopher E. Stith Sep 16 '22 at 18:52
  • @ChristopherE.Stith, if you like, but I don't think URLs/URIs are POSIX. – ilkkachu Sep 17 '22 at 09:21
  • @ilkkachu unfortunately if you're dealing with a purely POSIX system with nothing surrounding it that's not covered by POSIX you're probably not doing anything of interest with the system. – Christopher E. Stith Sep 22 '22 at 17:05
  • @ChristopherE.Stith, the question was about POSIX compliant systems, so the parts outside of the specification, however useful, would seem to be outside the scope of the question. At least it'd be rather hard to answer about those parts, since there's no clear idea what the specification for them would be. Now, asking if it's ok to map /./ -> / or whatever in file:/// URIs would seem to be a fine question, but might require some details as to what's going to interpret those URIs. – ilkkachu Sep 22 '22 at 17:54

The Pathname Resolution section of POSIX 2018 describes the special case of a path that starts with two slash characters:

If a pathname begins with two successive slash characters, the first component following the leading slash characters may be interpreted in an implementation-defined manner, although more than two leading slash characters shall be treated as a single slash character.

Then, with the definition of a . in a path:

The special filename dot shall refer to the directory specified by its predecessor.

You can conclude that in POSIX compliant systems, it should be possible to replace every run of (possibly overlapping) /./ in a path with a single /, except in paths that start with //./, where the first /./ can't be substituted because the . is the first component after the leading double slash.

Also, replacing the overlapping /././ with a single /./ should work without exception on those systems.
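A minimal bash sketch of that second, safer substitution follows. The name squeeze_dots is hypothetical, and a loop is used instead of an extglob pattern so that overlapping matches are handled without relying on shell options:

```shell
#!/bin/bash

# squeeze_dots is a hypothetical helper name, used only for illustration.
# It rewrites every run of "/././" to a single "/./" and prints the result.
# Because one "/./" always survives, a leading "//./" keeps its first
# component intact.
squeeze_dots() {
    local path=$1

    # A single global replacement can leave "/././" behind when matches
    # overlap (for example in "/./././"), so repeat until nothing matches.
    while [[ $path == */././* ]]; do
        path=${path//\/.\/.\//\/.\/}    # "/././" -> "/./"
    done

    printf '%s\n' "$path"
}
```

For example, `squeeze_dots /a/./././b` prints /a/./b, and `squeeze_dots //././a` prints //./a. Capturing the output with `$(...)` strips trailing newlines, so a path that ends in a newline character would need different handling.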

ilkkachu
Fravadona
  • @ilkkachu Why did you modify "starts with two —or more— slash characters" ? The end of the POSIX quote is in fact specifying what to do with "more than two leading slash characters". – Fravadona Sep 17 '22 at 21:36
  • because it's only the case with exactly two slashes that's special in any way. The end of the quote says "more than two leading slash characters shall be treated as a single slash character.". So if more than two is special, then just one is also special. But it's only the case with exactly two slashes that has the implementation-defined behavior. I commented about it under the other answer too with more references. – ilkkachu Sep 18 '22 at 07:35