Why was colon (:) chosen as path separator?

Note that I mean "path separator" and not "directory separator". The path separator is the symbol placed between the entries in the PATH environment variable.

PATH="/usr/local/sbin:/usr/local/bin:/usr/bin:..."
                     ^ this symbol

Everything in computers and software was once a deliberate decision made by someone somewhere. For example, why the tilde represents the home directory (and why hjkl are the direction keys in vi). I would like to know the background for this particular decision.


Some random facts:

Having the colon as the path separator means that a directory with a colon in its name cannot be added to the path.

From POSIX:

Since <colon> is a separator in this context, directory names that might be used in PATH should not include a <colon> character.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html

It does not seem to be possible to escape the colon. @Random832 from Stack Overflow inspected the source code that handles PATH and found no escape mechanism.

https://stackoverflow.com/questions/14661373/how-to-escape-colon-in-path-on-unix
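
The consequence described above can be illustrated with a minimal sketch. It assumes that a PATH lookup simply splits the value on every colon and has no escape handling, which is what the POSIX text and the linked question suggest. The function name `split_search_path` and the directory `/opt/my:tools/bin` are hypothetical, chosen only for illustration.

```python
def split_search_path(path_value, sep=":"):
    """Split a PATH-style string on every separator, with no escape handling.

    This mirrors the behaviour described above; it is not taken from any
    particular shell or libc implementation.
    """
    return path_value.split(sep)


# Hypothetical PATH value where one directory name contains a colon.
example = "/usr/local/bin:/opt/my:tools/bin:/usr/bin"
entries = split_search_path(example)

# The directory "/opt/my:tools/bin" is cut into two unrelated entries.
print(entries)
```

Under that assumption the intended directory never appears as a single entry, so it can never be searched.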

Lesmana
  • 27,439
  • 2
    That's also the separator for /etc/passwd (that also contains paths in the home and shell columns). – Stéphane Chazelas Sep 21 '16 at 13:06
  • please reopen question as discussed here: http://meta.unix.stackexchange.com/questions/4163/closing-votes-on-a-historic-question-about-plan-9 – Lesmana Sep 22 '16 at 07:46
  • 14
    I spent about half an hour yesterday researching this question. I read the 1971 Unix Programmer's Manual which specifies the use of a colon but not the reason why colon was chosen over (e.g.) pipe symbol. I also read as much as I could about Multics but it, apparently, only had one directory in its PATH (so no need for separator). I doubt we'll get a good answer here but if there's a chance that some veteran Unix user could answer this question, I'd like them to have the opportunity, so I'm voting to re-open. – Anthony Geoghegan Sep 22 '16 at 08:25
  • @AnthonyGeoghegan, note that $PATH was not introduced until Unix V7 (released in 1979), while /etc/passwd was there from the start – Stéphane Chazelas Sep 22 '16 at 09:01
  • @StéphaneChazelas I read in Doug McIlroy's Annotated Excerpts that “Then in v3 /bin overflowed the small (256K), fast fixed-head drive. Thus was /usr/bin born, and the idea of a search path reinstated.” – Anthony Geoghegan Sep 22 '16 at 09:08
  • @AnthonyGeoghegan, yes though until V7, that was hardcoded in the commands that execute commands like the shell. $PATH came with V7 along with the Bourne shell and execvp() – Stéphane Chazelas Sep 22 '16 at 09:54
  • Thanks, @StéphaneChazelas I always enjoy learning this type of history and finding out how Unix-like operating systems evolved to what they are today. (I've just realised that my first comment doesn't make it clear that I was referring to the use of colons in /etc/passwd as stated in your first comment). – Anthony Geoghegan Sep 22 '16 at 10:24
  • 6
    There might not have been a shell/environment variable called PATH before the introduction of Unix Version 7 (in 1979), but there was a :-delimited search path as early as 1977. PWB/Unix (Programmer’s Workbench) used the Mashey shell, written by John R. Mashey, which fell chronologically between the Thompson shell and the Bourne shell. … (Cont’d) – G-Man Says 'Reinstate Monica' Sep 22 '16 at 21:50
  • 5
    (Cont’d) … The Mashey shell supported 26 shell variables (guess what their names were), and variable p was the search path (called “the Shell directory search sequence for command execution”), with directories separated by colons. Fun fact: while the Mashey shell processed the .profile file, it also allowed you to specify an initial $p value in a file called .path. – G-Man Says 'Reinstate Monica' Sep 22 '16 at 22:00
  • Was the colon allowed in file names on the popular file systems at that time? – rudimeier Sep 22 '16 at 22:24
  • @rudimeier: Well, back in the 1970s, there weren't popular file systems; there was *the* Unix file system. Then, when Unix Version 7 came along, there was *the* Unix Version 7 file system. But to answer your question, it has always been the case that all characters are allowed in filenames except for / (slash) and nul. – G-Man Says 'Reinstate Monica' Sep 24 '16 at 01:22
  • I guess the colon was simply chosen because it already served as the separator in /etc/passwd (which also needs to separate paths), so why choose any other character? As a workaround for a directory with a colon in its name, you could create a symlink without a colon in its name that points to that directory, and put the symlink's name in PATH. –  Oct 24 '16 at 21:31

1 Answer

After some digging I don't have a real answer, but I do have some new information to add to this conversation, supported by some historical facts.

In one of his talks about the shell, Peter Chubb explains (around the 19:00 mark) why e is the alias for the default editor in Unix shells: older terminals were not comfortable or easy to use, and typing on them was an unpleasant experience.

He mentions a precise model, the Teletype Model 33 in this case.

After some research I found that this machine only lets you pick from a pool of 64 characters, not even full US ASCII: 2 to the power of 6 characters, a 6-bit combination.

In fact, this machine has nothing to do with ASCII at all, meaning that it doesn't even support just the first 64 characters of ASCII; it uses a totally unrelated set of inputs and a set of characters that is probably not standard (for our modern era).

The ASR 33 teletype can print 64 characters which only allowed for UPPER CASE LETTERS, numbers, and symbols. (Source)

This shows that it is definitely not US ASCII: to support uppercase letters you need more than 6 bits, because the uppercase letters sit beyond the 64-character mark (that is, above the value 63 in decimal, if you want to follow a table).

 0 NUL    16 DLE    32      48 0    64 @    80 P    96 `   112 p 
 1 SOH    17 DC1    33 !    49 1    65 A    81 Q    97 a   113 q 
 2 STX    18 DC2    34 "    50 2    66 B    82 R    98 b   114 r 
 3 ETX    19 DC3    35 #    51 3    67 C    83 S    99 c   115 s 
 4 EOT    20 DC4    36 $    52 4    68 D    84 T   100 d   116 t 
 5 ENQ    21 NAK    37 %    53 5    69 E    85 U   101 e   117 u 
 6 ACK    22 SYN    38 &    54 6    70 F    86 V   102 f   118 v 
 7 BEL    23 ETB    39 '    55 7    71 G    87 W   103 g   119 w 
 8 BS     24 CAN    40 (    56 8    72 H    88 X   104 h   120 x 
 9 HT     25 EM     41 )    57 9    73 I    89 Y   105 i   121 y 
10 LF     26 SUB    42 *    58 :    74 J    90 Z   106 j   122 z 
11 VT     27 ESC    43 +    59 ;    75 K    91 [   107 k   123 { 
12 FF     28 FS     44 ,    60 <    76 L    92 \   108 l   124 | 
13 CR     29 GS     45 -    61 =    77 M    93 ]   109 m   125 } 
14 SO     30 RS     46 .    62 >    78 N    94 ^   110 n   126 ~ 
15 SI     31 US     47 /    63 ?    79 O    95 _   111 o   127 DEL 
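
The arithmetic used in this argument can be checked against the table with a short sketch. It only looks at ASCII code points and makes no claim about how the Teletype actually encoded its characters; the variable names are arbitrary.

```python
import string

SIX_BIT_CODES = 2 ** 6  # number of distinct values that fit in 6 bits

# Code points of the uppercase letters, as listed in the table above.
upper_codes = [ord(c) for c in string.ascii_uppercase]
print(SIX_BIT_CODES, min(upper_codes), max(upper_codes))

# Do the uppercase letters fall within the first 64 codes (0..63)?
fits_in_first_64 = all(code < SIX_BIT_CODES for code in upper_codes)

# For comparison: the 64 consecutive codes 32..95 taken as a window.
window = [chr(code) for code in range(32, 96)]
window_has_upper = all(c in window for c in string.ascii_uppercase)
window_has_lower = any(c in window for c in string.ascii_lowercase)
print(fits_in_first_64, len(window), window_has_upper, window_has_lower)
```

The uppercase letters do lie above code 63, as stated. It is also worth noting that the 64 consecutive codes from 32 to 95 cover the digits, the uppercase letters and most punctuation, but no lowercase letters and no { or }.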

So we know that we get 64 characters out of this machine, with no real standard coded table behind them, and that there are no lowercase letters: just uppercase letters plus symbols and numbers.

Thanks to this website I can show you the input layout of such a keyboard:

ASR33 keyboard layout

and by pressing SHIFT you also get:

ASR33 keyboard layout (second layer)

There is also a bit more information about how the physical connections that generate the characters are coded (the page also clarifies that ASR33 and ASCII chars are different down to the bit level).

I think it's interesting to note that there are no { or } but only ( and ), which probably means that creating subshells was OK but creating new processes was not so easy, or not permitted by the terminal.

In the end I don't think that there is a real scientific answer, it was probably a "free" character waiting for a special meaning; one thing is sure though: shells and terminals are older than ASCII and thinking about ASCII or any coded table as we know them today is probably not going to solve the mystery.

user31223
  • 155
  • more on the : sign and the shell http://stackoverflow.com/questions/3224878/what-is-the-purpose-of-the-colon-gnu-bash-builtin – user31223 Nov 15 '16 at 21:48