4

In a POSIX sh, or in the Bourne shell (as in Solaris 10's /bin/sh), is it possible to have something like:

a='some var with spaces and a special space'
printf "%s\n" $a

And, with the default IFS, get:

some
var
with
spaces
and
a
special space

That is, protect the space between special and space by some combination of quoting or escaping?

The number of words in a isn't known beforehand, or I'd try something like:

a='some var with spaces and a special\ space'
printf "%s\n" "$a" | while read field1 field2 ...

The context is this bug reported in Cassandra, where OP tried to set an environment variable specifying options for the JVM:

export JVM_EXTRA_OPTS='-XX:OnOutOfMemoryError="echo oh_no"'

In the script executing Cassandra, which has to support POSIX sh and Solaris sh:

JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS"
#...
exec $NUMACTL "$JAVA" $JVM_OPTS $cassandra_parms -cp "$CLASSPATH" $props "$class"

IMO the only way out here is to use a script wrapping the echo oh_no command. Is there another way?
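
A minimal demonstration of the problem, using only the variable from the bug report (the helper names argc, first and second are just for illustration): quotes stored inside a variable are ordinary characters after expansion, so the unquoted expansion still splits at the space.

```sh
# Quotes inside the value are data, not shell syntax.
JVM_EXTRA_OPTS='-XX:OnOutOfMemoryError="echo oh_no"'

# Unquoted expansion: split on the default IFS, as in the exec line above.
set -- $JVM_EXTRA_OPTS

argc=$#      # 2, not 1
first=$1     # -XX:OnOutOfMemoryError="echo
second=$2    # oh_no"
printf '<%s>\n' "$@"
```

The JVM would therefore receive two arguments, each carrying a stray literal double quote.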

muru
  • 72,889

2 Answers

5

Not really.

One solution is to reserve a character as the field separator. Obviously it will not be possible to include that character, whatever it is, in an option. Tab and newline are obvious candidates, if the source language makes it easy to insert them. I would avoid multibyte characters if you want portability (e.g. dash and BusyBox don't support multibyte characters).

If you rely on IFS splitting, don't forget to turn off wildcard expansion with set -f.

tab=$(printf '\t')
IFS=$tab
set -f
exec java $JVM_EXTRA_OPTS …
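
As a rough sketch of how this behaves end to end, assuming the caller separates options with tabs (the sample value and the names argc and second are hypothetical, and the arguments are printed rather than passed to java):

```sh
tab=$(printf '\t')

# Hypothetical caller-side value: two options separated by one tab.
JVM_EXTRA_OPTS="-Xmx1g${tab}-XX:OnOutOfMemoryError=echo oh_no"

IFS=$tab                  # split on tabs only
set -f                    # no wildcard expansion on the split words
set -- $JVM_EXTRA_OPTS    # unquoted on purpose: this is where splitting happens
set +f
unset IFS                 # back to default splitting behaviour

argc=$#
second=$2
printf '<%s>\n' "$@"
```

Here the space inside the second option survives because space is no longer a separator. In the snippet above, IFS and set -f need no restoring because exec replaces the shell.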

Another approach is to introduce a quoting syntax. A very common quoting syntax is that a backslash protects the next character. The downside of using backslashes is that so many different tools use the backslash as a quoting character that it can sometimes be difficult to figure out how many backslashes you need.

set java
eval 'set -- "$@"' $(printf '%s\n' "$JVM_EXTRA_OPTS" | sed -e 's/[^ ]/\\&/g' -e 's/\\\\/\\/g') …
exec "$@"
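
To make the effect of the sed expression concrete, here is a small sketch with a hypothetical value in which the caller writes backslash-space for a protected space. The first sed command puts a backslash before every non-space character, the second collapses each doubled backslash, so only the caller's own backslashes end up escaping something. The set -f is my addition, to keep the unquoted command substitution from being globbed; the arguments are printed instead of exec'd.

```sh
# Hypothetical input: the space in "echo oh_no" is protected by a backslash.
JVM_EXTRA_OPTS='-Xmx1g -XX:OnOutOfMemoryError=echo\ oh_no'

set -f
set -- java
eval 'set -- "$@"' $(printf '%s\n' "$JVM_EXTRA_OPTS" |
    sed -e 's/[^ ]/\\&/g' -e 's/\\\\/\\/g')
set +f

argc=$#     # java plus two options
last=$3
printf '<%s>\n' "$@"
```

With this input the positional parameters become java, -Xmx1g and -XX:OnOutOfMemoryError=echo oh_no.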
2

If you were using Bash or similar, an array would do the trick:

a=(some var with spaces and a 'special space')

But since the POSIX shell does not have these, the best internal approach I can see is to actually use a special space. The non-breaking space (U+00A0) is well-suited to this purpose, but because it lies outside ASCII, this requires agreement on the character encoding of the script.

a="some var with spaces and a special space"
# this is a non-breaking space ------^
echo "$a" \
| while read word; do printf '%s\n' ${word} | sed 's@ @ @g'; done
# this is a non-breaking space ----------------------^

This outputs:

some
var
with
spaces
and
a
special space

At the moment, I am unsure of how to include this in a variable expansion (it will need a subshell), but this should offer a starting point for further investigation.
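
A literal non-breaking space is easy to lose when copying code around, so one possible variation is to build it with printf. This is only a sketch and assumes UTF-8, where U+00A0 is the byte sequence 0xC2 0xA0 (octal 302 240); the names nbsp and words are hypothetical.

```sh
# U+00A0 as UTF-8 bytes; assumes the data is UTF-8 encoded.
nbsp=$(printf '\302\240')

a="some var with spaces and a special${nbsp}space"

set -f    # keep the unquoted $a from being globbed
# Split on the default IFS (NBSP is not in it), then turn NBSP into a space.
words=$(printf '%s\n' $a | sed "s/${nbsp}/ /g")
set +f

printf '%s\n' "$words"
```

This prints the same seven lines as above, ending in "special space", with one word per line in the words variable.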

Fox
  • 8,193
  • 1
    +1 for the NBSP idea. I'd thought of using tabs instead, since tabs are much rarer than spaces; hopefully no one would notice the absence of tabs in IFS. Note: an array wouldn't do the trick, because the original variable is set outside the script and bash doesn't export arrays. – muru Oct 24 '16 at 15:03