
sh on Debian is linked to dash. The following command produces two empty lines:

sh -c 'echo "\n"'

Why two empty lines? Why not one? Why is \n interpreted?

However, on busybox's sh the same command produces the following line:

\n

Is there any uniformity in backslash handling by sh on different platforms? Is there any standard saying how backslash must be handled?

  • If dash prints two empty lines, it is POSIX compliant with respect to echo, since the POSIX base plus the XSI extensions is required for a system that is permitted to use the UNIX branding. – schily May 18 '20 at 06:46
  • Related, if not a duplicate: https://unix.stackexchange.com/questions/65803 – Kusalananda May 18 '20 at 07:19

1 Answer


POSIX+XSI-compliant echo implementations output two newline characters when invoked with \n as an argument: the XSI option requires echo to expand \n into a newline character, and echo then appends its usual trailing newline. The echo builtin of your sh seems to be compliant in that regard.

Some other echo implementations require a -e option for that to happen, even though POSIX requires echo -e to output -e. That will change in future versions of the POSIX specification, where the behaviour will be unspecified instead (except on systems implementing the XSI option).

For those reasons and more, it's better to avoid echo and use printf instead. See Why is printf better than echo? for details.
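As a minimal sketch of that advice (the helper name say is hypothetical, not a standard utility), printf with a fixed %s format prints its arguments literally, while escape sequences are expanded only where you place them in the format string:

```shell
# "say" is a hypothetical helper: print the arguments literally,
# joined by spaces, followed by one newline.
say() {
  printf '%s\n' "$*"
}

say '\n'          # outputs the two characters \ and n, then a newline
say -e            # outputs -e, it is not treated as an option

# Escapes are expanded only in the format string, on request:
printf 'a\nb\n'   # outputs a and b on separate lines
```

Unlike echo, this should behave the same way whether sh is dash, busybox sh or bash, because the data never appears in the format string.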

In any case, in all Bourne-like shells, within double quotes, \ is only special when in front of the \, $, " and ` characters (whose special meaning it removes) and newline (in which case both are removed: \<NL> is a line continuation). It is not special in front of n. So in:

echo '\n'
echo "\n"
echo \\n
echo $'\\n'

It's the string made of the two characters \ and n that is passed to echo (at least when that code is not within `...` for the last 2).
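One way to check this without involving echo at all is to store each form in a variable and look at its length. This is only an illustrative sketch; the variable names a, b and c are arbitrary:

```shell
a='\n'     # single quotes: nothing is special
b="\n"     # double quotes: \ is not special in front of n
c=\\n      # unquoted: \\ yields one backslash

# Each variable holds two characters, a backslash and an n.
printf '%s\n' "${#a}" "${#b}" "${#c}"   # outputs 2, 2 and 2
```

So any expansion of \n into a newline is done by echo itself, not by the shell's quoting.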

For a string made of one newline character to be passed, you'd need:

echo $'\n'

That's ksh93 syntax now supported by many other Bourne-like shells. Portably (in Bourne-like shells):

echo '
'
echo "
"

Though you may want to make it:

# at the top of the script:
NL='
'
# or (with the Bourne shell, use `...` instead):
eval "$(printf 'NL="\n" CR="\r" TAB="\t"')"

# and then:
echo "$NL"

That improves legibility.
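The eval form is used above rather than a plain assignment because command substitution strips trailing newline characters. A short sketch of the difference (the variable name empty is only for illustration):

```shell
# Command substitution removes trailing newlines, so this is an empty string:
empty=$(printf '\n')
printf '%s\n' "${#empty}"          # outputs 0

# Here printf emits the assignments themselves, with the newline
# protected by the surrounding double quotes, and eval runs them:
eval "$(printf 'NL="\n" TAB="\t"')"
printf '%s\n' "${#NL}" "${#TAB}"   # outputs 1 and 1

printf 'first%ssecond\n' "$NL"     # outputs first and second on separate lines
```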

Note that the situation will be different in non-Bourne-like shells. See How to use a special character as a normal one? for details on that.

  • If printf were implemented without bugs in all shells, it would be a useful alternative to echo. But many shells do not implement printf correctly, so the important thing is still to choose the right shell. – schily May 18 '20 at 06:52
  • @schily, what bug(s) are you referring to? By correctly, do you mean as per the POSIX specification? – Stéphane Chazelas May 18 '20 at 06:57
  • Last time I checked, several shells did not handle null bytes correctly and in general, I would expect printf field width specifications to be usable. Also, not all shells correctly distinguish between the escape method in the format string and the %b arguments. – schily May 18 '20 at 07:09
  • BTW: The general problem needs to be seen in a wider scope. gettext on Linux is broken, as it by default does not expand escape sequences, which is needed since gettext needs to do the job of the C compiler if you want to be able to use common entries in the *.mo files. With respect to usability in the l10n area, you need a printf that supports %n$ and this is only true for bosh and ksh93 IIRC. – schily May 18 '20 at 07:23
  • @schily, I'm not sure I see the relevance of gettext wrt the printf utility, especially when it comes to its suitability as an alternative to echo. Note that %n$ is also supported by zsh, but is not POSIX. It's true that the POSIX printf %s field width specification is mostly useless as it works in number of bytes (improved in zsh, ksh93, fish), but again, that's not really relevant to using printf as a better alternative to echo. – Stéphane Chazelas May 18 '20 at 07:48
  • The point seems to be that handling escape sequences as expected is where GNU-based programs have general problems, and where these programs ignore not only POSIX but also their own written specs; see the gettext spec written by Sun and Linux people in collaboration in 2000. – schily May 18 '20 at 09:00
  • @schily Do you have an example of GNU printf or the GNU shell's printf builtin handling escape sequences the wrong (not POSIX) way? (other than the problem with byte 0x5c found in some characters other than backslash in some locales which most echo/printf implementations handle improperly). – Stéphane Chazelas May 18 '20 at 09:07
  • BTW, Linux has no gettext, it's just a kernel, you may be confusing with GNU gettext. – Stéphane Chazelas May 18 '20 at 09:08
  • The file LI18NUX-2000-amd4.pdf explains that you are wrong. Regarding printf.... There are plenty of non-compliances in GNU gettext. \c is expanded in the format string, %b does not support field width. %b expands \33 to ESC while it should not. There are backslash extensions that cause text that is expected to be printed literally to be expanded to something... – schily May 18 '20 at 09:46
  • Sorry, that should read plenty of non-compliances in GNU printf – schily May 18 '20 at 10:03
  • @schily, the behaviour is unspecified per POSIX for printf '\c' or printf %b '\33', so those are conforming behaviours. I see your point about %3b in GNU printf (but not the GNU shell's printf builtin), but you still haven't justified your claim about the handling of escape sequences being broken in GNU printf implementations. – Stéphane Chazelas May 18 '20 at 10:24
  • Unspecified behavior in this area causes unpredictable results. This is different from true enhancements, that are only used in case you explicitly use e.g. a related option. BTW: It is your claim that printf is a predictable replacement for echo. – schily May 18 '20 at 11:34