Mostly, it means what it says, e.g.:
$ read -d . var; echo; echo "read: '$var'"
foo.
read: 'foo'
The read ends as soon as the . is typed; I didn't hit Enter there.
But read -d '' is a bit of a special case. The online reference manual says:
-d delim
The first character of delim is used to terminate the input line, rather than newline. If delim is the empty string, read will terminate a line when it reads a NUL character.
\0 means the NUL byte in printf, so we have e.g.:
$ printf 'foo\0bar\0' | while read -d '' var; do echo "read: '$var'"; done
read: 'foo'
read: 'bar'
In your example, read -d '' is used to prevent the newline from being the delimiter, allowing it to read the multiline string in one go, instead of a line at a time.
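As a rough sketch of that multiline use (not your exact code: the variable name text and the here-document are made up for illustration, and I've added IFS= and -r so that whitespace and backslashes are kept as-is):

```shell
# Requires Bash: read -d is not part of POSIX sh.
# read returns non-zero here because it hits end of input without
# ever seeing a NUL byte, but the variable is still filled in,
# hence the "|| true".
IFS= read -r -d '' text <<'EOF' || true
first line
second line
EOF

printf 'read: [%s]\n' "$text"
```

Both lines end up in text, newlines included. Without -d '', the same read would stop after "first line".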
I think some older versions of the documentation didn't explicitly mention -d ''. The behaviour may originally be an unintended coincidence from how Bash stores strings in the C way, with that trailing NUL byte. The string foo is stored as foo\0, and the empty string is stored as just \0. So, if the implementation isn't careful to guard against it and only picks the first byte in memory, it'll see \0, NUL, as the first byte of an empty string.
Re-reading the question more closely, you mentioned:
The author commented under the answer that -d '' means using the NUL string as delimiter.
That's not exactly right. The null string (in POSIX parlance) means the empty string: a string that contains nothing, of length zero. That's not the same as the NUL byte, which is a single byte with binary value zero (*). If you used the empty string as a delimiter, you'd find it practically everywhere, at every possible position. I don't think that's possible in the shell, but in Perl, for example, it's possible to split a string like that:
$ perl -le 'print join ":", split "", "foobar";'
f:o:o:b:a:r
read -d '', by contrast, uses the NUL byte as the separator.
(* not the same as the character 0, of course.)
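A quick way to see the difference between the empty string and the NUL byte is to count the bytes printf writes in each case. This is only a sketch; the variable names are mine:

```shell
# An empty format string writes nothing; '\0' writes one byte of value zero.
empty_len=$(printf '' | wc -c)
nul_len=$(printf '\0' | wc -c)

# $(( )) strips the padding some wc implementations add around the number.
echo "empty string: $((empty_len)) bytes"
echo "NUL byte:     $((nul_len)) bytes"
```

The first count is 0 and the second is 1.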
Why not use something like -d '\0' or -d '\x0' etc.?
Well, that's a good question. As Stéphane commented, originally, ksh93's read -d didn't support read -d '' like that, and changing it to support backslash escapes would have been incompatible with the original. But you can still use read -d $'\0' (and similarly $'\t' for the tab, etc.) if you like it better. It's just that, behind the scenes, that's the same as -d '' in Bash, since Bash can't store the NUL byte in strings. Zsh can, but it seems to accept both -d '' and -d $'\0'.
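To check that in Bash, here's a small sketch (the names delim, item and items are just for illustration). It shows that $'\0' collapses to the empty string, and that read -d $'\0' still splits on NUL bytes exactly like -d '':

```shell
# Requires Bash (ANSI-C quoting, arrays, process substitution).
delim=$'\0'
echo "length of delimiter: ${#delim}"   # Bash drops the NUL, leaving an empty string

items=()
while IFS= read -r -d $'\0' item; do
    items+=("$item")
done < <(printf 'foo\0bar\0')

printf 'read: [%s]\n' "${items[@]}"
```

The length comes out as 0, and the loop still reads foo and bar as two separate items.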
abc is four bytes: a, b, c, <NUL>. An empty string is one byte: <NUL>. So this isn't really a special case at all in any shell that's written in C and using standard C strings. – Charles Duffy May 07 '22 at 22:17