Bash uses C-style strings internally, which are terminated by null bytes. This means that a Bash string (such as the value of a variable, or an argument to a command) can never actually contain a null byte. For example, this mini-script:
foobar=$'foo\0bar' # foobar='foo' + null byte + 'bar'
echo "${#foobar}" # print length of $foobar
actually prints 3, because $foobar is actually just 'foo': the bar comes after the end of the string.
Similarly, echo $'foo\0bar' just prints foo, because echo doesn't know about the \0bar part.
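One way to check this is to count the bytes that actually reach a pipe. This is only a minimal sketch, assuming Bash and the standard wc utility; the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# $'foo\0bar' is cut off at the null byte before echo ever sees it,
# so only the three bytes of "foo" are written (-n suppresses the newline).
bytes_from_echo=$(echo -n $'foo\0bar' | wc -c)

# printf interprets \0 in its own format string, so all seven bytes
# (f, o, o, NUL, b, a, r) are written to the pipe.
bytes_from_printf=$(printf 'foo\0bar' | wc -c)

# The arithmetic expansion strips the padding some wc implementations add.
echo "echo wrote   $((bytes_from_echo)) bytes"    # 3
echo "printf wrote $((bytes_from_printf)) bytes"  # 7
```

The difference comes entirely from who interprets the \0: Bash (too early, while building the string) or printf (late, while writing output).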
As you can see, the \0 sequence is actually very misleading in a $'...'-style string; it looks like a null byte inside the string, but it doesn't end up working that way. In your first example, your read command has -d $'\0'. This works, but only because -d '' also works! ('' is the empty string, so its terminating null byte comes immediately, and -d delim is documented as using "The first character of delim", which evidently still works when that "first character" is the terminator itself. Newer versions of the Bash manual state this explicitly: if delim is the empty string, read terminates a line when it reads a NUL character.)
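To see that the two spellings really are the same thing to read, you can feed identical null-delimited input to both and compare what comes out. A rough sketch, assuming Bash (it uses arrays and process substitution); the sample records and array names are invented for the test.

```shell
#!/usr/bin/env bash
# Read null-delimited records using the empty-string delimiter.
with_empty=()
while IFS= read -r -d '' item; do
  with_empty+=("$item")
done < <(printf '%s\0' 'one' 'two words' $'three\nlines')

# Read the same input using $'\0', which Bash has already reduced to ''.
with_dollar=()
while IFS= read -r -d $'\0' item; do
  with_dollar+=("$item")
done < <(printf '%s\0' 'one' 'two words' $'three\nlines')

echo "${#with_empty[@]} records with -d ''"
echo "${#with_dollar[@]} records with -d \$'\\0'"
```

Both loops see three records, including the one with an embedded newline, because by the time read runs there is no difference between the two arguments.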
But as you know from your find example, it is possible for a command to print out a null byte, and for that byte to be piped to another command that reads it as input. No part of that relies on storing a null byte in a string inside Bash. The only problem with your second example is that we can't use $'\0' in an argument to a command; echo "$file"$'\0' could happily print the null byte at the end, if only it knew that you wanted it to.
So instead of using echo, you can use printf, which supports the same sorts of escape sequences as $'...'-style strings. That way, you can print a null byte without having to have a null byte inside a string. That would look like this:
for file in * ; do printf '%s\0' "$file" ; done \
| while IFS= read -r -d '' ; do echo "$REPLY" ; done
or simply this:
printf '%s\0' * \
| while IFS= read -r -d '' ; do echo "$REPLY" ; done
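If you want to keep the names rather than just print them, a variation is to collect them into an array. The sketch below assumes Bash and a throwaway directory created with mktemp; the file names and the names array are hypothetical. It reads from a process substitution instead of a pipe, since a loop on the right-hand side of a pipe normally runs in a subshell and its variables would be lost.

```shell
#!/usr/bin/env bash
# Scratch directory with deliberately awkward (made-up) file names.
scratch=$(mktemp -d)
touch "$scratch/plain" "$scratch/with space" "$scratch/"$'with\nnewline'

# Collect every name into an array, one null-terminated record at a time.
names=()
while IFS= read -r -d '' name; do
  names+=("$name")
done < <(cd "$scratch" && printf '%s\0' *)

echo "found ${#names[@]} files"
rm -r "$scratch"
```

Even the name containing a newline arrives as a single array element, because the only byte treated as a separator is the null byte, which cannot appear in a file name.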
(Note: echo actually also has an -e flag that would let it process \0 and print a null byte; but then it would also try to process any special sequences in your filename. So the printf approach is more robust.)
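The difference is easy to demonstrate with a name that happens to contain a backslash sequence. This is just an illustrative sketch, assuming Bash's builtin echo and printf; the file name is hypothetical, and tr strips the null byte so the results can be stored in variables.

```shell
#!/usr/bin/env bash
# A made-up file name containing a literal backslash followed by "t".
file='notes\tdraft'

# echo -e expands escapes everywhere, so the \t inside the name becomes a tab.
via_echo=$(echo -e "$file\0" | tr -d '\000')

# printf only interprets escapes in the format string; %s copies the name as-is.
via_printf=$(printf '%s\0' "$file" | tr -d '\000')

[ "$via_echo" = "$file" ]   || echo "echo -e altered the name"
[ "$via_printf" = "$file" ] && echo "printf preserved the name"
```

With printf the data and the formatting instructions are kept in separate arguments, which is why the contents of "$file" are never reinterpreted.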
Incidentally, there are some shells that do allow null bytes inside strings. Your example works fine in Zsh, for example (assuming default settings). However, regardless of your shell, Unix-like operating systems don't provide a way to include null bytes inside arguments to programs (since program arguments are passed as C-style strings), so there will always be some limitations. (Your example can work in Zsh only because echo is a shell builtin, so Zsh can invoke it without relying on the OS support for invoking other programs. If you used command echo instead of echo, so that it bypassed the builtin and used the standalone echo program on the $PATH, you'd see the same behavior in Zsh as in Bash.)
-d '' already means to delimit on \0? I found an explanation here: http://stackoverflow.com/questions/8677546/bash-for-in-looping-on-null-delimited-string-variable#comment33868451_8677566 – CMCDragonkai Jan 07 '17 at 08:01
echo -e "aaa\x00bbb" does print the bbb part too. You just can’t assign it to variables, unless you escape them (e.g. esc0() { sed 's/\xFF/\xFF\xFF/g; s/\x00/\xFF0/g'; }) – Jul 18 '22 at 01:05