
According to this manual, the -r option for read means:

Do not allow backslashes to escape any characters

I understand that generally, the read shell builtin gets input and creates a variable which holds that input as a string in which backslashes would be just literal components and wouldn't escape anything anyway.

Is read -r only used in rare, exceptional use cases of read (with the common denominator of the output being anything other than a string)?

  • input is input. string is string. read is the link between – alecxs Feb 22 '21 at 16:28
  • Or in other words, it's rather the other way round, you almost never want to use read without -r and without explicitly setting $IFS for that one read invocation to the list of delimiters you want read to use to delimit words (or the empty string, if you don't want splitting). – Stéphane Chazelas Feb 22 '21 at 17:26
  • @Kusalananda I admit I didn't quite understand the linked question itself, let alone the answer, where I felt lost quite fast. Anyway, I have edited my question here to improve it; I invite anyone who read it already to re-read it and consider publishing an answer. – variableexpander Feb 22 '21 at 17:39
  • You say "the output being anything else than a string". The output of what? I don't see read outputting anything; it merely sets variables. I agree that read -r makes sense in the majority of cases; it is the special treatment of backslashes that is the exception. But this is just an opinion. – berndbausch Feb 22 '21 at 17:53
  • @berndbausch by "output" I meant these variables. About "makes sense in the majority of cases": I actually thought it only makes sense in rare, exceptional cases. As a non-professional, amateur sysadmin who only does some of my own small-scale shared-hosting/PaaS system administration, I don't recall ever using it and I never had a problem when I didn't use it; hence I took the opposite approach to that of Stéphane Chazelas and yours, and now I seek to learn why I was wrong in doing so. – variableexpander Feb 22 '21 at 18:07
  • @berndbausch, note that that behaviour was fixed in the Almquist shell in the late 80s, which didn't do that backslash processing unless you passed a -e option (similar to what Dennis Ritchie did to echo in V8 in the early 80s), but unfortunately that was later reverted as portability with the Bourne shell was deemed more important than a cleaner design. – Stéphane Chazelas Feb 22 '21 at 18:18

1 Answer


I understand that generally, the read shell builtin gets input and creates a variable which holds that input as a string in which backslashes would be just literal components and wouldn't escape anything anyway.

Plain read var, without -r, when given the input foo\bar, would store in var the string foobar. It treats the backslash as escaping the following character, and removes the backslash. You'd need to enter foo\\bar to get foo\bar.
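
As a minimal sketch of that behaviour, the same text can be fed to read through here-documents with a quoted delimiter, which pass the input through unchanged. The variable names plain, raw and doubled are made up for this illustration:

```shell
# Same input line, read without and with -r.
read plain <<'EOF'
foo\bar
EOF
read -r raw <<'EOF'
foo\bar
EOF
# A doubled backslash survives plain read as a single backslash.
read doubled <<'EOF'
foo\\bar
EOF

# printf '%s\n' is used instead of echo, since some echo implementations
# interpret backslashes themselves.
printf '%s\n' "read    foo\\bar   -> <$plain>"
printf '%s\n' "read -r foo\\bar   -> <$raw>"
printf '%s\n' "read    foo\\\\bar  -> <$doubled>"
```

Here plain ends up as foobar, while raw and doubled both hold foo\bar.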

read can be used to read multiple values, like so:

$ read a b <<< 'xx yy'; echo "<$a> <$b>"
<xx> <yy>

(<<< is a "here-string": the string that follows is provided to the command as its standard input.)

It uses the characters in IFS as separators, so whitespace by default. A backslash can be used to escape these separators, which makes them regular characters. read removes the backslash in the process, and it also removes a backslash that appears in front of a regular character. So you'd get:

$ read a b <<< 'xx\ yy'; echo "<$a> <$b>"
<xx yy> <>
$ read a b <<< 'xx\n yy'; echo "<$a> <$b>"
<xxn> <yy>

Being able to escape the separators is seldom useful, and removing backslashes can also be annoying if someone wants to enter a string with C-style character escapes.

In addition, a backslash at the end of a line would make read wait for another line to be read as a continuation of the first, similarly to how continuation lines work in C and in the shell.
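
A small sketch of the continuation behaviour, assuming two physical input lines where the first ends in a backslash (the variable names joined and unjoined are only for illustration):

```shell
# Without -r, backslash-newline is removed and the next line is appended.
read joined <<'EOF'
first \
second
EOF

# With -r, read stops at the first newline and keeps the backslash.
read -r unjoined <<'EOF'
first \
second
EOF

printf '%s\n' "without -r: <$joined>"
printf '%s\n' "with -r:    <$unjoined>"
```

Without -r the variable holds first second; with -r it holds first \ and the second line is left unread.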

With read -r, a backslash is just a regular character:

$ read -r a b <<< 'value\with\backslashes\ yy'; echo "<$a> <$b>"
<value\with\backslashes\> <yy>

In many use cases, backslashes aren't something one would expect the user to input, and if there aren't any, read -r is the same as plain read. But in case someone were to (need to) input backslashes, using read -r may reduce the surprises involved. Hence it's probably good to use it, unless you really know you want them to be special for read (in addition to whatever special properties your program might otherwise assign to them).
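
Combined with the advice in the comments on the question to set IFS explicitly for that one read invocation, this gives the commonly seen IFS= read -r form. The loop below is only a sketch of that idiom; the sample input and the names line, first and count are assumptions for illustration:

```shell
first=''
count=0
# IFS= (empty) turns off splitting and whitespace trimming for this one
# read; -r keeps backslashes literal.
while IFS= read -r line; do
    count=$((count + 1))
    if [ "$count" -eq 1 ]; then
        first=$line
    fi
    printf '%s\n' "<$line>"
done <<'EOF'
  C:\temp\new
plain text
EOF
```

Each line arrives in line exactly as written, including the leading spaces and the backslashes of the first one.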

ilkkachu
  • I understood pretty much everything I read besides the $ read a b <<< 'xx yy'; echo "<$a> <$b>" code examples; I am having a hard time understanding these. I humbly suggest changing how the code examples are formatted and/or removing the here-strings and focusing on a more intuitive way of providing input. Especially the use of $ at the start, and not separating input from output into two different blocks, makes it hard for me. – variableexpander Feb 23 '21 at 00:49
  • @variableexpander, the $ at the start marks a shell prompt, it's a somewhat usual custom to separate commands from output. It's hard to do italics or such in code blocks to mark that. And partly related to that, at least I try not to write code blocks where there's user input anywhere other than the command line, since it's not easy to mark which part is input and which part is output. – ilkkachu Feb 23 '21 at 10:55
  • somecommand <<< "foo bar" is pretty much the same as echo "foo bar" | somecommand, except that with read, the latter couldn't store the output variables, so it doesn't work there. And I'm not sure something like echo 'xx yy' | ( read a b; echo "<$a> <$b>" ) would be any clearer. – ilkkachu Feb 23 '21 at 10:56
  • "@variableexpander, the $ at the start marks a shell prompt": I knew that, I just prefer markdown over using that :) – variableexpander Feb 23 '21 at 11:00
  • Thanks for the insight, much appreciated. – variableexpander Feb 23 '21 at 11:01
  • @variableexpander, sure I could write "If you run read a b and give it the input xx yy, then $a and $b contain xx and yy, respectively" and did something like that in the first paragraph (which I actually wrote separately from the rest, so maybe that's why). But it's a lot more to write, and harder to copypaste from the terminal where you actually ran the command. Making sure the command is right is useful too, and no-one wants to spend time for unnecessary stuff (because humans are lazy...). That read <<< blah; echo isn't perfect though, with the input to read in the middle. – ilkkachu Feb 23 '21 at 11:21
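
The pipeline point from the comments above can be sketched as follows. This assumes a shell such as bash with default settings (or dash), where every part of a pipeline runs in a subshell; shells that run the last part of a pipeline in the current shell behave differently. The variable name grouped is made up for the illustration:

```shell
a='' b=''
# read runs in a subshell here, so its assignments are assumed to be lost
# once the pipeline ends.
printf '%s\n' 'xx yy' | read -r a b
printf '%s\n' "after pipeline: <$a> <$b>"

# Grouping read with the command that uses the variables keeps both in the
# same subshell.
grouped=$(printf '%s\n' 'xx yy' | { read -r a b; printf '%s\n' "<$a> <$b>"; })
printf '%s\n' "inside group:   $grouped"
```

Under that assumption, a and b are still empty after the first pipeline, while the grouped version prints the values.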