6

Many command-line tools (grep, flex, etc.) use the ^ symbol to denote "beginning of line" and the $ symbol to denote "end of line." When did this convention arise? It seems perfectly reasonable to reserve two characters for these purposes, but it's a bit odd that on modern keyboards the $ symbol is to the left of the ^ symbol.

Is this a completely arbitrary decision? Does this come from some older keyboard layout? Is this convention used because some older tool decided to do things this way?

Regardless of the answer, are there primary sources that document this?

  • Related: https://unix.stackexchange.com/a/136524/117549 – Jeff Schaller Apr 05 '19 at 17:36
  • @JeffSchaller The idea of using regular expression syntax, adapted to glob, makes a lot of sense. Perhaps I'm mistaken, though, but does glob actually support ^ and $? – templatetypedef Apr 05 '19 at 17:40
  • I did not mean to imply any connection between wildcards and regexes; I saw the date in Gilles' answer and thought it was a useful addition here. ^ and $ are regular expression tokens (here), which is why I added that tag. – Jeff Schaller Apr 05 '19 at 17:43
  • I’m not sure this is the origin of the symbols, and it doesn’t explain why they were chosen, but QED had them in 1970. – Stephen Kitt Apr 05 '19 at 18:11

2 Answers

3

The QED editor, written in 1965 for the Berkeley Timesharing System, used $ to address the last line in a file, just as ed, ex, vi and vim do today. See page 2-1 in the manual. The original QED editor did not support regular expressions, though.

Ken Thompson later ("late 1960s") wrote a version of QED for Multics, which was the first editor to implement regular expressions. This editor heavily influenced Ken's development of ed in 1969 for Unix (later "finalised" by Dennis Ritchie in about 1971). Bill Joy, out of frustration with ed, implemented ex and vi, and these were part of the first BSD release in 1977 for the PDP-11.

The ^ and $ expressions, together with much of what became the POSIX regular expression syntax, with the semantics they have today, were implemented in Ken's version of QED. See page 4 in the manual.

It is not clear where the choice of these particular symbols came from, but $ already had the meaning "last" from the way it was used to address the last line.

On certain terminals, the ^ character was impossible to generate. Ken's QED editor therefore allowed \' to be used instead of ^ (see Bell Labs manual).

Kusalananda
  • For more history, with unfortunately a myth here and there, see https://unix.stackexchange.com/a/115995/5132 and https://unix.stackexchange.com/a/332494/5132 . – JdeBP Apr 05 '19 at 19:18
2

This copy of ed(I), dated 11/3/71, from the Unix First Edition, confirms that ed is based on QED, and shows that ^ and $ had their current meanings then:

  1. A circumflex (^) at the beginning of a regular expression matches the null character at the beginning of a line.
  2. A currency symbol ($) at the end of a regular expression matches the null character at the end of a line.
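The two rules quoted above can be tried out with a small sketch. It uses Python's `re` module, which is a modern regex dialect and not the ed implementation itself, so treat it only as an illustration of the same anchoring idea. The sample string and variable names are made up for this example.

```python
import re

# Hypothetical sample line; "cat" occurs three times in it.
line = "cat concatenate cat"

# '^' at the start of the pattern only matches at the beginning of the line.
starts = re.findall(r"^cat", line)

# '$' at the end of the pattern only matches at the end of the line.
ends = re.findall(r"cat$", line)

# Without anchors, every occurrence is found.
anywhere = re.findall(r"cat", line)

# The anchors themselves match an empty position, not a real character,
# which is what the manual's "null character" wording describes.
start_span = re.search(r"^", line).span()
end_span = re.search(r"$", line).span()

print(starts, ends, len(anywhere), start_span, end_span)
```

The zero-width spans show that ^ and $ consume no text; they only constrain where the rest of the pattern may match.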

This note and the Wikipedia page for grep indicate that grep was created in the early 1970s and used the same regular expression syntax as ed.


The concept of the regular expression predates its use in computer utilities by nearly two decades.  The Wikipedia page for “regular expression” and this Stack Overflow question credit American mathematician Stephen Cole Kleene with inventing regular expressions, or at least describing them and coining the term.  Many histories refer to his paper, Representation of Events in Nerve Nets and Finite Automata (PDF).  This 101-page document, dated 15 December 1951, is difficult to read, and (as far as I can see) does not mention the ^ and $ syntax.  However, it does present * as meaning “zero or more of the preceding thing” on page 49 of the paper (page 52 of the PDF file).  This is (somewhat) widely known as the “Kleene star”.
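As a quick illustration of the "zero or more of the preceding thing" meaning of *, here is a minimal sketch using Python's `re` module. This is a modern dialect, not Kleene's original notation, and the pattern and candidate strings are invented for the example.

```python
import re

# 'b*' means zero or more occurrences of 'b' between 'a' and 'c'.
pattern = re.compile(r"ab*c")

# Hypothetical candidate strings.
candidates = ["ac", "abc", "abbbc", "adc"]

# fullmatch requires the whole string to match the pattern.
matches = [s for s in candidates if pattern.fullmatch(s)]

print(matches)
```

Here "ac" matches because the star allows zero repetitions, while "adc" does not because 'd' is not covered by the pattern.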