All of the uppercase names in this section refer to (possibly machine-compilable) lex descriptions of the grammar (starting in 2.10 Shell Grammar). The feature asked about is clarified in item 5 of the grammar rules:
[NAME in for]
When the TOKEN meets the requirements for a name (see XBD Name), the token identifier NAME shall result. Otherwise, the token WORD shall be returned.
That is (referring to 3.231 Name), a NAME is a certain type of WORD:
In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set. The first character of a name is not a digit.
Not all words are names: a decimal integer is a word, but not a name.
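The distinction can be sketched as a small check. This is only an illustration of the quoted definition, not code from the standard; the function name classify and the regular expression are assumptions made for the example.

```python
import re

# XBD "Name": underscores, digits, and alphabetics from the portable
# character set, where the first character is not a digit.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(token: str) -> str:
    """Return "NAME" if the whole token qualifies as a name, else "WORD"."""
    return "NAME" if NAME_RE.fullmatch(token) else "WORD"


if __name__ == "__main__":
    for tok in ["foo", "_x1", "42", "1abc", "a-b"]:
        print(tok, classify(tok))
```

Here foo and _x1 come out as NAME, while 42, 1abc, and a-b remain plain WORDs.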
Regarding the grammar, these lines tell yacc what symbolic constants (via #define) lex might return:
%token WORD
%token ASSIGNMENT_WORD
%token NAME
%token NEWLINE
%token IO_NUMBER
while the yacc grammar (rules) begins with
%start complete_command
You may notice occurrences of WORD and NAME in the grammar. yacc expects lex to return those symbolic constants at those points. Conventionally, uppercase names are used for this purpose, with other names being just the rules within the yacc grammar.
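To make the lexer/parser hand-off concrete, here is a toy sketch in Python rather than lex and yacc. The numeric values and the function lex_line are hypothetical: a real yacc picks its own numbers for the #define constants, and a real shell lexer also handles quoting, operators, IO_NUMBER, and the context-dependent promotion of WORD to NAME or ASSIGNMENT_WORD, all of which are omitted here.

```python
from enum import IntEnum


class Token(IntEnum):
    # Hypothetical values standing in for the #define constants that
    # yacc would generate from the %token lines.
    WORD = 257
    ASSIGNMENT_WORD = 258
    NAME = 259
    NEWLINE = 260
    IO_NUMBER = 261


def lex_line(line: str):
    """Yield (token code, text) pairs for one whitespace-separated line.

    Every piece of text is reported as a plain WORD; the line is
    terminated by a NEWLINE token.
    """
    for text in line.split():
        yield Token.WORD, text
    yield Token.NEWLINE, "\n"


if __name__ == "__main__":
    for code, text in lex_line("echo hello world"):
        print(int(code), code.name, repr(text))
```

The parser never looks at the characters directly; it only sees the integer codes, which is why the grammar can be written purely in terms of names like WORD and NEWLINE.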
When interpreting a command, the shell interpreter only cares about the first WORD, which it takes as the name of the command. It passes the other WORDs to the command as parameters, and the command has to decide what those mean. The yacc grammar is vague in this area, but note the reference to "7a". There is no labeled item for that in the written standard, but it leads to 2.9.1 Simple Commands, corresponding to this clump in the grammar:
simple_command : cmd_prefix cmd_word cmd_suffix
               | cmd_prefix cmd_word
               | cmd_prefix
               | cmd_name cmd_suffix
               | cmd_name
(As an exercise, someone might try completing the grammar and making it actually match the standard with respect to terminology.)
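As a rough illustration of how the simple_command alternatives divide up a command line, the following sketch splits a list of already-tokenized words into leading assignments, the command name, and the remaining arguments. It is a simplification under stated assumptions: redirections (which may also appear in cmd_prefix and cmd_suffix), quoting, and reserved words are ignored, the grammar's distinction between cmd_name and cmd_word is collapsed, and split_simple_command is a made-up name.

```python
import re

# Simplified assignment word: a valid name, then "=", then anything.
ASSIGNMENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*=.*", re.DOTALL)


def split_simple_command(words):
    """Split words into (assignments, command name or None, arguments)."""
    i = 0
    # Leading assignment words form the prefix.
    while i < len(words) and ASSIGNMENT_RE.fullmatch(words[i]):
        i += 1
    prefix = words[:i]
    # The first non-assignment word, if any, is the command name.
    name = words[i] if i < len(words) else None
    # Everything after the command name is passed on as arguments.
    suffix = words[i + 1:]
    return prefix, name, suffix


if __name__ == "__main__":
    print(split_simple_command(["FOO=1", "ls", "-l"]))
    print(split_simple_command(["FOO=1"]))
    print(split_simple_command(["ls", "A=b"]))
```

Note that in the last call A=b comes after the command name, so it is treated as an ordinary argument rather than an assignment.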