1
$ FILE="$(mktemp)"
$ printf "a\0\n" > "$FILE"
$ od -tx1z "$FILE"
0000000 61 00 0a                                         >a..<
0000003

So far so good.

I wrapped the above into a bash script

#! /bin/bash

cmd=("$@")
FILE="$(mktemp)"
eval "${cmd[@]}" > "$FILE"
od -tx1z "$FILE"

but

$ script printf 'a\0\n' 
0000000 61 30 6e                                         >a0n<
0000003

Why does the output change \0 to literal string? How can I prevent that from happening?


Not very important for this post: my question arises because I am trying to wrap some commands into a bash script, so to prevent a command expansion from removing NUL:

FILE="$(mktemp)"
printf "a\0\n" > "$FILE"
S="$(uuencode -m "$FILE" /dev/stdout)"
uudecode -o /dev/stdout <(printf "$S") | od -tx1
rm "$FILE"
Tim
  • Please use printf '%s' "$S"; the form in your question is unsafe for the contents of $S. –  Nov 21 '18 at 00:09
  • can you please tell me why? – Tim Nov 21 '18 at 00:10
  • Because printf is designed to keep the format and the string separate. When printf is used without an explicit format, the string can be modified by any of the escapes that also affect the echo command. Read this and entry 2. in here for more details. –  Nov 21 '18 at 03:20
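
The difference that comment describes can be seen with a value that contains a backslash escape. This is only a sketch: the value of S below is made up for illustration and is not the uuencode output from the question.

```shell
# Hypothetical data: the two characters backslash and t sit between x and y.
S='x\ty'

# Used as the format, S is interpreted: \t becomes a single TAB byte (09).
as_format=$(printf "$S" | od -An -tx1)

# Used as the argument of %s, S is copied through unchanged (5c 74).
as_data=$(printf '%s' "$S" | od -An -tx1)

printf 'as format:%s\n' "$as_format"
printf 'as data:  %s\n' "$as_data"
```

A % character in S would be affected in the same way, since printf would read it as the start of a conversion specification.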

2 Answers

1

Why does the output change \0 to literal string? How can I prevent that from happening?

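Because you used eval, which adds another level of shell processing. eval joins its arguments with spaces and parses the result again, so it runs the command line printf a\0\n. There the unquoted backslashes escape the 0 and the n, leaving both characters as-is with the backslashes removed, so printf receives a0n.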

You could prevent that by, well, not using eval. Just "$@" > "$FILE" should work to run the command given as arguments to the script. In that case, though, you can't pass redirections or other shell syntax in the arguments, as you could with eval. Or you could redesign the whole thing so that you don't need to pass commands as arguments.
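
A minimal sketch of the script without eval could look like the following. The function name run_and_dump is made up for this example, and od -An is used only to leave out the offset column.

```shell
#!/bin/bash
# Hypothetical helper: run the arguments as a command, capture the output
# in a temporary file, and print the bytes in hex.
run_and_dump() {
    local file
    file="$(mktemp)" || return 1
    "$@" > "$file"        # no eval: every argument reaches the command unchanged
    od -An -tx1 "$file"
    rm -f "$file"
}

run_and_dump printf 'a\0\n'
```

Here printf receives the format a\0\n with its backslashes intact, so the dump should show 61 00 0a instead of 61 30 6e.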

I am trying to wrap some commands into a bash script, so to prevent a command expansion from removing NUL:

S="$(uuencode -m "$FILE" /dev/stdout)"

Is that a problem here? uuencode -m shouldn't produce any NUL bytes. Exactly the opposite, since it encodes binary data to text.

That last script writes an a, a NUL, and a newline to $FILE and passes the same to od, which prints out the hex representation of those, 0000000 61 00 0a, or something like that.

ilkkachu
  • Thank you. Could you explain "though in that case, you couldn't use redirections or other shell syntax there, as you could with eval"? It seems that the redirection in "${cmd[@]}" > "$FILE" works. – Tim Nov 20 '18 at 19:50
  • @Tim, if you run script "echo foo > bar" or script echo foo ">" bar, and the script runs eval "$@", the > passed as an argument to the script will be taken as the redirection operator by eval, and you'll see the output go to the file bar. But if the script runs "$@" without eval instead, then the first one will give an error, and the second will output the string foo > bar. In neither case is the > in the argument interpreted as a redirection operator. – ilkkachu Nov 20 '18 at 20:11
  • Thanks. I found that with cmd=( "ls" "|" "cat"), "${cmd[@]}" doesn't run by itself but needs eval "${cmd[@]}". cmd="ls | cat" also needs eval "$cmd". So when do I need eval, and when do I not, to run a string or an array of strings as a command? – Tim Nov 20 '18 at 21:49
  • When will using eval fail to run a command (here is an example), while not using eval will succeed? – Tim Nov 20 '18 at 21:49
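
The difference ilkkachu describes in the comment above can be tried out with a sketch like this one. It uses a temporary file in place of bar, and it assumes that the path returned by mktemp contains no spaces or shell metacharacters (otherwise the eval line would mis-parse it).

```shell
# Assumption: mktemp returns a path without spaces or shell metacharacters.
out=$(mktemp)

# The arguments as the script would receive them from: script echo foo ">" "$out"
set -- echo foo '>' "$out"

# Without eval, '>' is an ordinary argument, so echo prints it.
plain=$("$@")

# With eval, the joined string is parsed again and '>' becomes a redirection.
eval "$@"
redirected=$(cat "$out")

printf 'without eval, stdout: %s\n' "$plain"
printf 'with eval, file holds: %s\n' "$redirected"
rm -f "$out"
```
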
1

You can write a NUL to stdout or stderr, or receive one from stdin.
Any "internal" capture by the shell, and (almost) any other use of a NUL, is a problem.

You can emit a NUL to stdout:

$ printf 'a\0b' | sed -n l
a\000b$

Or:

$ printf 'a\0b' | cat -A
a^@b

And to a file also:

$ printf 'a\0b' >outfile

And exactly that works in scripts as well.

eval

The problem with your first script is that it is using eval:

$ cmd=(printf 'a\0b\n')
$ echo "${cmd[@]}"
printf a\0b\n

$ eval echo "${cmd[@]}"
printf a0bn

In the second pass through the shell's line parsing, the backslashes were removed.
Just don't use eval:

$ "${cmd[@]}"| sed -n l
a\000b$
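
The two parsing passes can also be compared byte by byte. The following is only a sketch; it uses the positional parameters in place of the cmd array, as the script in the question receives them.

```shell
# The arguments as the script would receive them.
set -- printf 'a\0b\n'

# What eval sees: its arguments joined with spaces into a single string.
printf 'eval parses: %s\n' "$*"

# Direct execution: printf gets the format a\0b\n and emits a real NUL and newline.
direct=$("$@" | od -An -tx1)

# eval: the second parse drops the unquoted backslashes, so printf gets a0bn.
evaled=$(eval "$@" | od -An -tx1)

printf 'direct:%s\n' "$direct"
printf 'eval:  %s\n' "$evaled"
```

The direct run should show 61 00 62 0a, while the eval run should show 61 30 62 6e, the same kind of change as in the question.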

expansions

That goes to prove that both (the builtin) printf and stdout are able to use NULs.

But this fails:

$ printf '%s\n' "$(printf 'a\0b')" | cat -A
bash: warning: command substitution: ignored null byte in input
ab$

There is even (in bash 4.4+) a warning message.

In short, most shells use C strings internally, and a C string ends at the first NUL. Some shells cut the output at the first NUL in a command substitution:

$ ksh -c 'echo $(printf "a\0b\n")|sed -n l'    # also yash
a$

Some remove the NUL:

$ bash -c 'echo $(printf "a\0b\n")|sed -n l'
ab$

And some even change the NUL to a space:

$ zsh -c 'echo $(printf "a\0b\n")|sed -n l'
a b$

Similar problems happen with assignments to variables.
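
As a rough check of the variable case, one can count the bytes that survive. This sketch assumes bash; the stored count depends on the shell, as the outputs above suggest.

```shell
# Through a pipe, all three bytes (a, NUL, b) arrive at wc.
piped=$(printf 'a\0b' | wc -c | tr -d ' ')

# Stored in a variable, the NUL does not survive in bash
# (bash 4.4+ also prints a warning on stderr).
captured=$(printf 'a\0b')
stored=$(printf '%s' "$captured" | wc -c | tr -d ' ')

printf 'piped: %s bytes, stored: %s bytes\n' "$piped" "$stored"
```

In bash the pipe should report 3 bytes and the variable 2; a shell that cuts at the first NUL would store only 1.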

encode

Yes, the answer you link to uses uuencode to encode (and decode) the file content.

A simpler approach seems to be to use xxd (which can reverse hex dumps):

FILE="$(mktemp)"
printf "a\0b\n" > "$FILE"
S=$(xxd -p "$FILE")
xxd -r -p <(printf '%s' "$S") | xxd -p
rm "$FILE"
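
If the temporary file is not needed for anything else, the same round trip can be sketched with pipes only. This assumes xxd is installed (it usually ships with vim); the variable name decoded is made up for the example.

```shell
# Encode: the pipe carries the NUL to xxd, and the hex text is safe in a variable.
S=$(printf 'a\0b\n' | xxd -p)

# Decode: feed the hex text back through a pipe and inspect the resulting bytes.
decoded=$(printf '%s' "$S" | xxd -r -p | od -An -tx1)

printf 'hex text: %s\n' "$S"
printf 'decoded bytes:%s\n' "$decoded"
```

The NUL only ever travels through pipes here; the shell variable holds plain hex digits.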