
From Bash Reference Manual

Rule from Word Splitting section:

The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.

Rule from Filename Expansion section:

After word splitting, unless the -f option has been set (see Section 4.3.1 [The Set Builtin], page 58), Bash scans each word for the characters ‘*’, ‘?’, and ‘[’. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of filenames matching the pattern

So after parameter expansion, command substitution, and arithmetic expansion, word splitting happens, except on the parts within double quotes.
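The two quoted rules can be sketched as follows. This is a minimal illustration, assuming the default IFS and default shell options; the scratch directory, the file names, and the count_args helper are hypothetical and exist only for this example.

```bash
# Work in a scratch directory so the glob has known matches.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a.txt b.txt

# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

var="one two *.txt"

# Unquoted: split on IFS into one, two, *.txt; then *.txt is
# replaced by the matching filenames a.txt and b.txt.
unquoted=$(count_args $var)

# Double quoted: neither word splitting nor filename expansion.
quoted=$(count_args "$var")

echo "unquoted: $unquoted"   # unquoted: 4
echo "quoted:   $quoted"     # quoted:   1

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```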

  1. In [[ ... ]], Giles and John1024 both said that word splitting and filename expansion don't apply to the conditional expression within [[ ... ]]. Which rules in the Bash Reference Manual or POSIX 7 Specifications govern that?

    • The conditional expression within [[ ... ]] isn't double quoted, so why doesn't word splitting apply?

    • The -f option isn't set. Why does filename expansion not apply either?

  2. Besides [[ ... ]], are there other cases where word splitting, filename expansion, or both don't apply? Are the reasons that one or both don't apply the same as for [[ ... ]]?

  3. Do word splitting and filename expansion always go hand in hand, in the sense that they either both apply or both don't apply to each case?

Tim
  • The key difference between [ and [[ is that [ is a command (whether built in or not) while [[ is a shell keyword. See also Differences between keyword, reserved word, and builtin? – Wildcard Mar 16 '16 at 21:48
  • @Wildcard: Can you explain why the difference between a command and a keyword makes word splitting and filename expansion apply or not apply? Can you cite the Bash Reference Manual for the explanation? – Tim Mar 16 '16 at 21:51
  • The basic answer to "why does software behave a certain way" is of course always either "because it was designed that way" or "because there's a bug." glenn jackman's answer already includes the citation from the documentation. – Wildcard Mar 17 '16 at 00:32
  • To more fully answer the question of how/why the shell performs word splitting/filename expansion in some places but not others, check out the shell grammar: LESS='+/^SHELL GRAMMAR' man bash. The command [ begins a "simple command"; the keyword [[ begins a "compound command." They have different rules for how they are expanded, that's all. – Wildcard Mar 17 '16 at 00:35
  • Can you point out what the expansion rules for simple commands are? (I didn't see any rule explicitly claimed to be for simple commands only.) Do you mean the rules for simple commands do not apply to compound commands? – Tim Mar 17 '16 at 01:20
  • glenn jackman already pointed out where the exception is specified for compound commands for [[ (see it for yourself at LESS='+/\[\[ expression \]\]' man bash). For the simple command expansion rules, see LESS='+/^SIMPLE COMMAND EXPANSION' man bash – Wildcard Mar 17 '16 at 01:42
  • "Which rules in the Bash Reference Manual or POSIX 7 Specifications govern that?" - [[ ... ]] is not a POSIX construct, though single square brackets are. – hilcharge Mar 22 '16 at 01:25

2 Answers

5

In the documentation for the [[ command, you'll see

Word splitting and filename expansion are not performed on the words between the [[ and ]]; tilde expansion, parameter and variable expansion, arithmetic expansion, command substitution, process substitution, and quote removal are performed.

(emphasis mine)
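A small sketch of what that sentence means in practice, comparing the [[ keyword with the [ command. The variable names are made up for illustration, and the exact error text printed by [ is deliberately discarded because it can vary.

```bash
var="hello world"

# [[ is a keyword: $var stays one word even though it is unquoted.
if [[ $var == "hello world" ]]; then
    keyword_result="match"
else
    keyword_result="no match"
fi

# [ is a command: unquoted $var is split into two arguments, so
# [ receives too many operands and fails with an error status.
if [ $var = "hello world" ] 2>/dev/null; then
    command_result="match"
else
    command_result="no match or error"
fi

# Filename expansion is also skipped inside [[ ... ]]: the unquoted
# $star is compared as a literal * and never replaced by filenames.
star='*'
if [[ $star == '*' ]]; then
    glob_result="literal star"
else
    glob_result="expanded"
fi

echo "[[ ]]: $keyword_result"
echo "[ ]:   $command_result"
echo "glob:  $glob_result"
```

Quoting "$var" inside [ ... ] makes the second test behave like the first.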

Also the case statement has exemptions

The word undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before matching is attempted. Each pattern undergoes tilde expansion, parameter expansion, command substitution, and arithmetic expansion.

Notable by their absence are word splitting and filename expansion.
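As a rough illustration of the case exemption, the sketch below leaves both the word and a pattern held in a variable unquoted. The variable names are hypothetical.

```bash
word="a b *.txt"

# The unquoted word is neither split nor glob-expanded, so it is
# compared as the single string: a b *.txt
case $word in
    "a b *.txt") word_result="literal match" ;;
    *)           word_result="other" ;;
esac

# A pattern stored in a variable undergoes parameter expansion and is
# then used for matching; it is not replaced by filenames.
pat='a*'
case abc in
    $pat) pattern_result="pattern match" ;;
    *)    pattern_result="other" ;;
esac

echo "$word_result"
echo "$pattern_result"
```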

Additionally, variable assignment (see Shell Parameters)

A variable may be assigned to by a statement of the form

name=[value]

If value is not given, the variable is assigned the null string. All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal

So this is safe:

a="hello world"
b=$a
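A slightly extended version of the same assignment, as a sketch: it uses several spaces so that any word splitting would be visible, and adds a glob character to show that filename expansion is skipped as well. The variable c is an addition for illustration.

```bash
a="hello   world"
b=$a        # no word splitting: all three spaces survive

c=*         # no filename expansion: c holds a literal *

printf '[%s]\n' "$b"   # [hello   world]
printf '[%s]\n' "$c"   # [*]
```

The quotes in the printf arguments still matter, because those are ordinary command arguments rather than assignments.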

Other places where word splitting is not performed:

My trick: search for the word "undergo" in the bash manual.

glenn jackman
  • Thanks. Does the rule about word splitting (see my post, quote from Word Splitting section in Bash Manual) not cover the exemptions in your reply? – Tim Mar 16 '16 at 21:33
  • I don't understand your question due to the awkward wording. Can you rephrase? – glenn jackman Mar 16 '16 at 21:34
  • I believe the bash manual has been written with a great deal of precision. If word splitting is missing from the list of expansions for the case statement, I believe that is on purpose. (This is a matter of faith on my part, I have not checked for any bugs registered against the manual) – glenn jackman Mar 16 '16 at 21:39
  • (1) My question is: does the Word Splitting section in the Bash Manual cover or mention the exemptions in your reply? Or did I misunderstand the rule in the Word Splitting section in the Bash Manual? (2) Also, do word splitting and filename expansion always go hand in hand, in the sense that they both apply or both don't apply to each case? – Tim Mar 16 '16 at 21:43
  • It does not appear to mention the exemptions. This is one reason that the bash manual is so maddening: you really have to read and reread it to understand it as a whole. Consider section 3.2 (Shell Commands) and 3.7 (Executing Commands). To me, those sections should be merged. But I have neither the time nor the energy to rewrite that manual. – glenn jackman Mar 16 '16 at 21:52
  • do word splitting and filename expansion always go hand in hand, in the sense that they either both apply or both don't apply to each case? – Tim Mar 16 '16 at 21:56
  • I haven't seen anywhere where they do not go hand in hand, but that's not proof that they are always together. – glenn jackman Mar 16 '16 at 22:01
  • If you want to know the "why" about all of this, you'll really have to go talk to the inventors. I could only speculate. – glenn jackman Mar 16 '16 at 22:02
  • @glennjackman In a read, there is word split (for each variable) but there is not pathname expansion. –  Mar 16 '16 at 22:14
  • @glennjackman: It seems to me that your example of variable assignment is subject to the rule I quoted: word splitting happens unless within double quotes. Do you have another example of variable assignment that is an exemption to the rule? – Tim Mar 16 '16 at 23:02
  • @tim I quoted the manual for assignment where it is quite explicit about the expansions performed. While it is good practice to always quote variables it is not strictly required here. – glenn jackman Mar 16 '16 at 23:21
  • @binary, reference? – glenn jackman Mar 16 '16 at 23:22
  • @glennjackman Is LESS=+/'^ *read \[' man bash enough?: The characters in IFS are used to split the line into words. Or is POSIX better?: the results shall be split into fields as in the shell for the results of parameter expansion No mention of pathname expansion. True, that is not a confirmation of negation, but it is a clear negation of execution. –  Mar 17 '16 at 02:44
  • @glennjackman Or just try read a b c <<_EOF_ * * * _EOF_ (please add the missing newlines after the first _EOF_ and before the second _EOF_). Then echo "$a" "$b" "$c" will print * * *. No expansion was performed by read, but the input was split into three vars. –  Mar 17 '16 at 02:49
  • @Bi: I think the word splitting in your last comments is done by read, not by the shell. Here we are talking about word splitting and filename expansion done by the shell – Tim Mar 17 '16 at 08:11
  • Also complicated by the lack of filename expansion in here-documents – glenn jackman Mar 17 '16 at 11:14
  • @glennjackman Is the lack of mention of "pathname expansion" for the read command, both in man bash and in POSIX (as already linked), also a non-confirmation? –  Mar 17 '16 at 16:00
  • @Tim And where is the read command implemented? As a built-in? It must be, in order to be able to change shell variables. That means that any shell developer must also implement read (not an external developer of an external command). –  Mar 17 '16 at 16:03
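
The read example from the comments above, written out with its newlines restored. This is only a sketch of that comment's test; the -r flag is an addition here and makes no difference for this input.

```bash
# The here-document body is not subject to filename expansion, and
# read splits the line on IFS without expanding the * characters.
read -r a b c <<_EOF_
* * *
_EOF_

echo "$a" "$b" "$c"   # prints: * * *
```
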
3

The words within [[ and ]] are an extension, which bash uses (among other things) to provide regular expressions:

An additional binary operator, ‘=~’, is available, with the same precedence as ‘==’ and ‘!=’. When it is used, the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)).

Doing filename expansion on a regular expression would not be helpful, since both use the same * and ? meta characters for different purposes.
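To make the clash concrete, here is a minimal sketch that tests the same string against a* b written once as a regular expression and once as a shell pattern. The variable names are hypothetical.

```bash
# As a regular expression, a*b means: zero or more "a", then "b".
re='^a*b$'
if [[ b =~ $re ]]; then
    regex_result="match"
else
    regex_result="no match"
fi

# As a shell pattern, a*b means: "a", then anything, then "b".
if [[ b == a*b ]]; then
    pattern_result="match"
else
    pattern_result="no match"
fi

echo "regex:   $regex_result"     # regex:   match
echo "pattern: $pattern_result"   # pattern: no match
```

In neither test is a*b replaced by filenames from the current directory, which is what keeps the regular expression intact.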


Thomas Dickey