The quote from Raymond by @jasonwryan has some useful information, but starts in the middle of the story:
- Keep in mind that Unix started as a reduced-scope version of Multics, and that throughout its history, features in Unix were often imitations or adaptations of features seen and used on other systems.
- The `'-'` option character was used in Multics. Bitsavers has a manual for its user commands.
- Other systems used different characters, some with a better claim to keystroke efficiency (such as `'/'`, used for TOPS and VMS) and some with a worse one (such as `'('`, used in VM/SP CMS).
- Multics options were multi-character, e.g., keywords separated by underscore.
- Longer Multics options frequently had a shorter, abbreviated form, such as `-print` vs `-pr` (page 3-8).
- Unix options were single-character, and after several years, `getopt` was introduced. Because it was not part of the original Unix, there are utilities which did not use `getopt` and were left as-is. But having `getopt` helped with making programs consistent.
On the other hand, Unix options using `getopt` were single-character. Other systems, in particular all larger ones, used keywords. Some (not all) allowed those keywords to be abbreviated, i.e., not every character had to be typed as long as the option was unambiguous. There are pitfalls in that test for ambiguity. For example:
- Early in 1985, I was working on a program which had to be ported to PrimOS. Prime's developers competed with several other companies by offering a command language that (tried to) imitate each of those others, providing the most commonly used commands from each. Of course, they supported abbreviations (as did VMS). After reading the online help, I typed `sta`, thinking to get `status`. That was the abbreviation for `start`, and having given nothing to start, the command interpreter logged me off.
- The X Toolkit (used by xterm) allows abbreviated options. To use this effectively, xterm has to preprocess the command parameters to prefer `-v` (for version) over `-vb` (visual bell). The X Toolkit has no direct way to specify a preferred option when there is an ambiguity.
Because of this potential for ambiguity, some developers prefer to not allow abbreviations. Lynx, for example, uses multi-character options without allowing abbreviations.
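The `sta` mishap can be illustrated with a minimal sketch of unique-prefix matching. The function name `match_option` and the rule that an exact match always wins are assumptions made for illustration; this is not how PrimOS, VMS, or the X Toolkit actually implement abbreviations.

```python
def match_option(typed, known):
    """Resolve a possibly abbreviated keyword against a list of known keywords.

    Hypothetical helper: returns the full keyword, or raises ValueError
    when the abbreviation is unknown or matches more than one keyword.
    """
    # Assumed rule: an exact match is preferred over any longer keyword.
    if typed in known:
        return typed
    # Otherwise collect every keyword that starts with what was typed.
    candidates = [k for k in known if k.startswith(typed)]
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise ValueError(f"unknown keyword: {typed}")
    raise ValueError(
        f"ambiguous keyword {typed!r}: {', '.join(sorted(candidates))}"
    )
```

With the keywords `start` and `status`, `sta` is rejected as ambiguous instead of being silently resolved to one of them, while `stat` resolves to `status`. With `v` and `vb`, the exact-match rule makes `v` resolve to itself, which is roughly the kind of preference xterm has to arrange by preprocessing.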
Not all programs used `getopt`: `tar` and `ps` did not. Nor did `rcs` (or `sccs`), as you can see by noting where the dash was optional, and option values were optional.
Taking all of this into account, GNU developers adapted the keyword options used in other systems by extending `getopt` to provide a long version of each short option. For instance, the textutils 1.0 changelog says:
Tue May 8 03:41:42 1990 David J. MacKenzie (djm at abyss)
* tac.c: Use regular expressions as the record boundaries.
Give better error messages.
Reformat code and make it more readable.
(main): Use getopt_long to parse options.
The change in fileutils was earlier:
Tue Oct 31 02:03:32 1989 David J. MacKenzie (djm at spiff)
* ls.c (decode_switches): Add long options, using getopt_long
instead of getopt.
and someone may find one still earlier, but it seems that the file-header shows the earliest date:
/* Getopt for GNU.
Copyright (C) 1987, 1989 Free Software Foundation, Inc.
which is (for instance) concurrent with the X Toolkit (1987). Most of the Unix utilities with which you are familiar (such as `ls` and `ps`) used the existing single-character options that require periodic visits to the manual. When introducing `getopt_long`, the GNU developers did not start by adding new options; they began by tabulating the existing options and providing a matching long option for each.
Because they were adding to an existing repertoire, there was (again) the problem of conflict with existing options. To avoid this, they changed the syntax, using two dashes before long options.
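The conflict can be seen in a small sketch that uses Python's standard `getopt` module, which is modeled on the C `getopt` interface (it is not the GNU C implementation). The flag letters in `SHORT`, the single long name in `LONG`, and the helper `parse` are hypothetical choices for illustration.

```python
import getopt

# Hypothetical single-character flags and one long option for a made-up tool.
SHORT = "ehlp"
LONG = ["help"]


def parse(argv):
    """Return (option names in the order seen, remaining arguments)."""
    opts, rest = getopt.getopt(argv, SHORT, LONG)
    return [name for name, _value in opts], rest


# One dash: the word is read as a cluster of single-character options.
print(parse(["-help"]))

# Two dashes: the word is read as a single long option.
print(parse(["--help"]))
```

With one dash, `-help` is indistinguishable from the clustered flags `-h -e -l -p`; the second dash is what lets the parser treat `help` as one keyword without changing how existing short options behave.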
These programs continue to use `getopt_long` in this manner for the usual reasons:
- scripts depend upon the options; developers are not eager to break scripts
- there's a written coding standard (which may be effective)
- no one has come up with a competing set of tools which is markedly incompatible (both BSDs and GNU developers copy option names from each other)
Comments:

- `-` is technically called a hyphen. We use the word "dash" to refer to the em dash (—) in most cases, and sometimes the en dash (–), but neither of which is a hyphen (-). – chharvey Jun 20 '15 at 16:20
- […] `java -version` – The Unknown Dev May 22 '16 at 14:37
- […] `find . -delete` – Krzysztof Wende May 24 '16 at 14:29
- […] `-ab` which activates both `a` and `b`. Without the double dash, `-help` would activate the `h`, `e`, `l`, and `p` options. – Aaron Franke Apr 18 '19 at 18:00
- […] `--`?. – Basil Bourque Jan 05 '22 at 01:45
- […] `find` for years before I was familiar with long options. Maybe because I didn't first use Linux (where I first saw it) but maybe not. – Pryftan May 11 '22 at 16:14