
The question may sound quite complicated, but in fact it is not! Consider:

% f() { echo "$@"; }
% f a
a
% f cmd -o"value with space"
cmd -ovalue with space
% f cmd -ovalue with space
cmd -ovalue with space
% f cmd -o'value with "quotes"'
cmd -ovalue with "quotes"
% f cmd -ovalue with "quotes"
cmd -ovalue with quotes

So obviously the fact that "value with space" was a single parameter gets lost; likewise, the double quotes are "eaten up" when the output is fed back in as input.

The desired output is output that can be used as input again to produce the same output.

I don't think there's anything built into BASH that does this, right?

Clarification

In case it's not obvious what I want to do: I have a command stored in a shell array, and I want to print that array to stdout so that the user can copy and paste the output at the shell prompt (or into a script) and reproduce exactly the command that was in the array.

Consider this (stupid) example:

> X=(echo "Bob's car is named \"Bobby\"")

A plain echo "${X[@]}" would output

echo Bob's car is named "Bobby"

while one possible correct output could be

echo Bob\'s' car is named "Bobby"'

U. Windl

3 Answers


One of the transformations available during parameter expansion in Bash does exactly this (it seems to be available since Bash 4.4; older versions report a "bad substitution"):

${parameter@operator}
The expansion is either a transformation of the value of parameter or information about parameter itself, depending on the value of operator. Each operator is a single letter:

[...]
Q
The expansion is a string that is the value of parameter quoted in a format that can be reused as input.

bash-5.2$ f() { echo "${@@Q}"; }
bash-5.2$ f cmd -o'value with "quotes"'
'cmd' '-ovalue with "quotes"'
bash-5.2$ f cmd -ovalue with "quotes"
'cmd' '-ovalue' 'with' 'quotes'
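
As a minimal sketch (assuming Bash 4.4 or newer and the default IFS; the names printed and Y are only illustrative), the same operator can be applied to the array from the question, and the result fed back through eval to check that it round-trips:

```bash
# Requires bash 4.4+ for the @Q transformation.
X=(echo "Bob's car is named \"Bobby\"")

# Quote every element, then join them with the first character of IFS
# (a space by default) into one copy-pasteable line.
printed="${X[*]@Q}"
printf '%s\n' "$printed"

# Re-parse the printed line as shell input; Y should end up equal to X.
eval "Y=( $printed )"
printf 'elements: %d\n' "${#Y[@]}"

# Run the restored command.
"${Y[@]}"
```

The eval here only stands in for the user pasting the line at a prompt; it should not be applied to text from an untrusted source.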
muru

The feature you're looking for is called Serialisation (Serialization in American English).

Here a simple command is an array of one or more strings, so it boils down to serialising an array.

If the command is an external command, there's the further limitation that the strings cannot contain the NUL byte, as they are passed as C-style strings to the execve() system call. In most shells, the same limitation applies even to commands that don't involve the execve() system call (such as builtins or functions), the only exception being the zsh shell.

So if you can assume that command arguments won't contain NUL bytes, serialisation is easy: you just need to print them NUL-delimited:

print0() {
  [ "$#" -eq 0 ] ||
    printf '%s\0' "$@"
}
print0 cmd -o"value with space" > file

In bash 4.4 or newer, reading that back as a list of arguments is just:

readarray -td '' args < file

Where -d '' sets the NUL byte as the delimiter and -t strips the delimiter from the values; the latter is not strictly necessary in current versions of bash, which can't store NULs in variables.

And then do:

"${args[@]}"

To execute the command.
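
Put together, the round trip might look like the following sketch. It assumes bash 4.4 or newer (for readarray -d) and uses a temporary file from mktemp; the sample command and the name tmpfile are only illustrative:

```bash
# Serialise: print each argument followed by a NUL byte.
print0() {
  [ "$#" -eq 0 ] ||
    printf '%s\0' "$@"
}

tmpfile=$(mktemp)

# Store a sample command whose arguments contain a space and a newline.
print0 printf '<%s>\n' 'value with space' $'two\nlines' > "$tmpfile"

# Unserialise: split on NUL (-d '') and drop the delimiter (-t).
readarray -td '' args < "$tmpfile"
rm -f -- "$tmpfile"

printf 'restored %d arguments\n' "${#args[@]}"

# Execute the restored command.
"${args[@]}"
```

Because NUL is the only byte that cannot occur in an argument, spaces, quotes and newlines inside the arguments need no special handling here.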

Or even with GNU xargs:

xargs -r0a file env

Beware, however, that except in zsh you can't store the result of that serialisation in a variable: in all other shells, the value of a shell variable can't contain a NUL byte.

JSON, XML and YAML are common formats for serialising complex data structures, but they have their own problems (for instance, JSON strings must be made of characters, while argument strings are arrays of arbitrary bytes), and more importantly, few shells have builtin support for parsing them (the ksh93v- beta version had some experimental support for parsing JSON, but that was very buggy and was dropped in newer versions).

A few languages have a builtin serialisation format. For instance, php has a serialize() and corresponding unserialize() function, but php doesn't have very good APIs to execute commands.

A common approach in interpreted languages is to serialise as code. That's what Data::Dumper in perl does, for instance. If you have an array with cmd and -ovalue with space as arguments, you can just store it as @array = ("cmd", "-ovalue with space") and then it's just a matter of evaluating that perl code to get the array back.

In Korn-like shells such as bash or zsh, it's very easily done as that's exactly what typeset -p does. In zsh, you could do typeset -p argv in your serialise function, but you can't do typeset -p @ as @ is not a variable. In bash, where the positional parameters are not mapped to the argv variable, you can still use a temporary array.

serialise() {
  local args
  args=( "$@" )
  typeset -p args
}
serialised=$(serialise cmd -o"value with space")

Then unserialising is just:

eval "$serialised"

Which will create the $args array (beware that if run in a function, that array will be local to the function).

And then:

"${args[@]}"

again to run the command.
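
As a sketch of the whole round trip in bash (assuming serialisation and unserialisation happen in the same shell and locale, as cautioned below; the sample command is the one from the question):

```bash
# Serialise the arguments as shell code that recreates an "args" array.
serialise() {
  local args
  args=( "$@" )
  typeset -p args
}

serialised=$(serialise echo "Bob's car is named \"Bobby\"")

# Show the generated declaration; it can be stored or pasted elsewhere.
printf '%s\n' "$serialised"

# Unserialise: evaluating the declaration recreates the array.
unset args
eval "$serialised"

# Run the restored command.
"${args[@]}"
```

Unlike the NUL-delimited form, the serialised text contains no NUL bytes, so it can be kept in a shell variable.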

Beware that the unserialisation has to be done with the same shell version, on the same OS and in the same locale as where serialisation was done. See this answer for "Escape a variable for use as content of another script" for further details on how to serialise strings.


For completeness, in ksh93, which has more complex data structures than other shells, including multi-dimensional arrays, structures and objects, there is builtin serialisation and unserialisation support.

  • serialise: print -C var
  • unserialise: read -C var

For instance, you can copy a variable with:

print -C var | read -C var_copy
  • I realized that there is no tag for "serialization" yet ;-) Also please explain the effect of setting an empty delimiter for readarray. – U. Windl Nov 25 '22 at 08:21
  • @U.Windl, see info bash readarray which points to its misnomer equivalent: info bash mapfile. – Stéphane Chazelas Nov 25 '22 at 08:27
  • The info from info is basically the same as in the BASH manual page, but still it does not really explain the effect of an empty delimiter. I could guess that an empty string is represented as a binary zero internally, but maybe some day the argument checker might complain that the argument to -d cannot be empty... – U. Windl Nov 25 '22 at 08:33
  • @U.Windl, maybe the older versions of the manual were not as clear, but the current version doesn't leave any doubt as to what -d with an empty string does. Note that info bash is the same manual as the one you get in man, though using man for a manual this size is not really appropriate IMO. – Stéphane Chazelas Nov 25 '22 at 09:02
  • This seems very strained as an example of serialization, unless you are thinking of a command-line command as some kind of "object". It looks a lot more like ordinary string escaping. – Karl Knechtel Nov 25 '22 at 09:52
  • @KarlKnechtel, as I said, a simple command (without redirection or variable assignment) is an array of NUL-delimited strings. In JSON, you'd serialise it as ["arg1", "arg2"], here ["cmd", "-ovalue with space"]; as shell code cmd -o"value with space" for instance. – Stéphane Chazelas Nov 25 '22 at 09:58

To add to the other answers, bash's (and zsh's) printf has %q for this:

%q causes printf to output the corresponding argument in a format that can be reused as shell input.

$ f() { printf '%q ' "$@"; echo; }
$ f cmd -o"value with space" -o'"'\' -o"'" -o"'\\" -o'\' $'\n'
cmd -ovalue\ with\ space -o\"\' -o\' -o\'\\ -o\\ $'\n' 

The above is how the builtin behaves in those shells. GNU's printf from coreutils also supports %q, but it quotes the output instead of escaping it:

$ f() { /bin/printf '%q ' "$@"; echo; }
$ f cmd -o"value with space" -o'"'\' -o"'" -o"'\\" -o'\' $'\n'
cmd '-ovalue with space' '-o"'\''' "-o'" '-o'\''\' '-o\' ''$'\n' 

Note that printf repeats the format string for as long as there are arguments remaining, so in these examples there's a trailing space at the end of the output.
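
One way to avoid the trailing space is to quote each argument separately and join the results. This is only a sketch, assuming bash's builtin printf (for -v and %q); quote_args and restored are hypothetical names:

```bash
# Print the arguments as one re-usable shell line, without a trailing space.
quote_args() {
  local arg q out=()
  for arg in "$@"; do
    # Store the quoted form of one argument in q instead of printing it.
    printf -v q '%q' "$arg"
    out+=( "$q" )
  done
  # "${out[*]}" joins the elements with the first character of IFS.
  local IFS=' '
  printf '%s\n' "${out[*]}"
}

line=$(quote_args cmd -o"value with space" "it's")
printf '%s\n' "$line"

# Check the round trip by re-parsing the line as shell input.
eval "restored=( $line )"
printf 'restored %d arguments\n' "${#restored[@]}"
```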
JoL