
I want to clarify that I am not talking about how to escape characters on the shell level of interpretation.

As far as I can tell, only two characters need to be escaped: % and \

To print a literal %, you must escape it with a preceding %:

printf '%%'

To print a literal \ you must escape it with a preceding \:

printf '\\'

Are there any other instances where I would need to escape a character for it to be interpreted literally?

  • looks like \' \" \? .......... a good search engine for this kind of stuff is http://symbolhound.com/ – jsotola Jan 16 '19 at 04:32

2 Answers


In the format argument of printf, only the % and \ characters are special (no, " is not special and \" is unspecified per POSIX).

But there are two important notes.

  1. In most printf implementations¹, it is the byte values of \ and % that are special. The POSIX specification could even be read as requiring this, since it requires the printf utility to be an interface to the printf(3) C function rather than, say, wprintf(3) (just as it requires %.3s to truncate to 3 bytes and not 3 characters).

    In some character encodings including BIG5 and GB18030, there are hundreds of characters that contain the encoding of backslash, and to escape those for printf, you'd need to insert a \ before each 0x5c byte within the encoding of those characters!

    For instance in BIG5-HKSCS, as used for instance in the zh_HK.big5hkscs (Hong Kong) locale, all of Ěαжふ㘘㙡䓀䨵䪤么佢俞偅傜兝功吒吭园坼垥塿墦声娉娖娫嫹嬞孀尐岤崤幋廄惝愧揊擺暝枯柦槙檝歿汻沔涂淚滜潿瀙瀵焮燡牾狖獦珢珮琵璞疱癧礒稞穀笋箤糭綅縷罡胐胬脪苒茻莍蓋蔌蕚螏螰許豹贕赨跚踊蹾躡鄃酀酅醆鈾鎪閱鞸餐餤駹騱髏髢髿鱋鱭黠﹏ contain byte 0x5c (which is also the encoding of \).

    With most printf implementations, in that locale, printf 'αb' doesn't output αb but byte 0xa3 (the first byte of the encoding of α) followed by the BS character (the expansion of \b).

    $ LC_ALL=zh_HK.big5hkscs luit
    $ locale charmap
    BIG5-HKSCS
    $ printf 'αb' | LC_ALL=C od -tx1 -tc
    0000000  a3  08
            243  \b
    0000002
    

    Best is to avoid using (and even installing / making available) those locales as they cause all sorts of bugs and vulnerabilities of that sort.

  2. Some printf implementations support options, and even those that don't are required to support -- as the option delimiter. So printf -- won't output -- but likely report an error about a missing format argument. So if you can't guarantee your format won't start with -, you have to use the -- option delimiter:

     printf -- "$escaped_format" x y...
    
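Both notes can be sketched in a few lines. This is only an illustration under stated assumptions: the variable names are made up, the first part assumes a byte-oriented printf (such as the bash or dash builtin), and the bytes 0xa3 0x5c for α are taken from the od output above. Octal escapes are used to build the format so that the snippet itself stays pure ASCII.

```shell
# Note 1: a format made of bytes 0xa3 0x5c followed by "b", i.e. what
# 'αb' looks like to printf in a BIG5-HKSCS locale (hypothetical variable name).
big5_fmt=$(printf '\243\134b')

# A byte-oriented printf treats 0x5c as a backslash, so "\b" expands to BS.
# Expected with such an implementation: a3 08
printf "$big5_fmt" | od -An -tx1

# Note 2: a format that starts with "-" (hypothetical log marker).
dash_fmt='--- end log entry ---\n'

# The "--" delimiter stops option parsing, so the dashes are printed as text.
printf -- "$dash_fmt"
```

Without the `--`, an implementation that supports options may try to parse the leading dashes of the format as an option and fail.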

In any case, if you want to print arbitrary strings, you'd use:

printf '%s\n' "$data" # with terminating newline
printf %s "$data"     # without

There's no character that is special in the string passed to %s (though note that, with the exception of the printf builtin of zsh, you can't pass the NUL character in any of printf's arguments).
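As a small sketch of that approach, the wrapper below keeps the data out of the format entirely. The function name `print_line` and the sample string are made up for illustration.

```shell
# Hypothetical helper: print one argument verbatim, followed by a newline.
# The format is a fixed string; the data only ever reaches %s.
print_line() {
    printf '%s\n' "$1"
}

# Sample data (made up) containing both format-special characters.
data='100% done\n in C:\temp'

print_line "$data"    # prints: 100% done\n in C:\temp
```

Because the data never reaches the format argument, neither % nor \ needs any printf-level escaping here; only the shell's own quoting rules apply.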

Note that while the canonical way to enter a literal \ is with \\ and a literal % with %%, on ASCII-based systems, you can also use \134 and \45 and with some printf implementations \x5c, \x25, or \x{5c}, \x{25}, or (even on non-ASCII systems): \u005c, \u0025 or \u{5c}, \u{25}.
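If text really has to be embedded in the format, doubling every \ and % is enough. The following is a minimal sketch, not a definitive implementation: `escape_format` is a hypothetical helper, it assumes an ASCII or UTF-8 locale (so the multibyte caveat from note 1 does not apply), and the command substitution drops any trailing newlines from the text.

```shell
# Hypothetical helper: double every backslash and percent sign so that the
# result can be used as (part of) a printf format.
escape_format() {
    printf '%s\n' "$1" | sed 's/[\\%]/&&/g'
}

text='50% of C:\new'            # made-up sample text
fmt=$(escape_format "$text")    # fmt is now: 50%% of C:\\new

# "--" guards against a format that starts with "-".
printf -- "$fmt\n"              # prints: 50% of C:\new

# On ASCII-based systems the octal forms produce the same two characters.
printf '\134 \45\n'             # prints: \ %
```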


¹ yash's printf builtin being the only exception I am aware of.

  • "POSIX requires it to be an interface to printf(3)..." -- that's a bit funny in that the C printf() doesn't interpret backslash-escapes, but it's the compiler that does. Which doesn't make those ASCII-incompatible charsets less of a problem of course. – ilkkachu Nov 15 '19 at 12:43
  • @ilkkachu, yes the idea is that it requires its argument to be interpreted as arrays of bytes as opposed to text strings, like %.3s is meant to truncate to 3 bytes, not 3 characters. POSIX doesn't say in so many words that the format has to be interpreted as an array of bytes though. (and yes the handling of backslash there has nothing to do with the C printf) – Stéphane Chazelas Nov 15 '19 at 14:01
  • @StéphaneChazelas "Some printf implementations support options, and even those that don't are required to support -- as the option delimiter." printf does not adhere to the XBD Utility Syntax Guidelines. I didn't think POSIX required printf to support the -- end-of-options delimiter – Harold Fischer Nov 15 '19 at 21:52
  • @HaroldFischer, see https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/V3_chap01.html#tag_17_04 Standard utilities that do not accept options, but that do accept operands, shall recognize "--" as a first argument to be discarded.... – Stéphane Chazelas Nov 15 '19 at 22:46
  • What about this: !! I get odd messages when I run a printf from the terminal that contains a double exclamation point !! in the string. – Seamus Dec 13 '23 at 05:53
  • @Seamus, that would be history expansion done by your shell, not to do with printf. See How to use a special character as a normal one in Unix shells? for that. – Stéphane Chazelas Dec 13 '23 at 06:35

From the manual:

$ man printf
...
   printf FORMAT [ARGUMENT]...
...
   FORMAT controls the output as in C printf.  Interpreted sequences are:

This lists several interpreted sequences. The following are those where the character itself needs to be escaped.

   \"     double quote
   \\     backslash
   %%     a single %

I tested these three in bash, and they behaved as expected. As per man bash, this implementation of printf uses the "standard printf(1) format specifications" as above, in addition to a few more that aren't relevant here.


However, other shells such as zsh implement printf slightly differently. Here, the double quote shouldn't be escaped.

$ printf '"'
"   
$ printf '\"'
\"
Sparhawk
  • Yeah, same behavior here on dash and bash. For what it's worth, the dash manual makes no mention of needing to escape ", but maybe I'm not reading in between the lines – Harold Fischer Jan 16 '19 at 04:53
  • @HaroldFischer Presumably dash just inherits printf(1) too? I found the zsh manual a bit more opaque, so I didn't quote it here. – Sparhawk Jan 16 '19 at 04:59
  • (edited) backslash-dquote is only needed if the format string is in dquotes, which is usually a bad idea, as then you also need to backslash backquote and (most) dollarsign, and may need to quadruple backslash if followed by a printf special. printf is builtin in bash and dash, but like all nonspecial builtins in a POSIX shell must also be present as an 'external' program. – dave_thompson_085 Jan 16 '19 at 05:09
  • @dave_thompson_085, the question does say I want to clarify that I am not talking about how to escape characters on the shell level of interpretation. – Sparhawk Jan 16 '19 at 05:10
  • The standard lists only \\, \a, \b, \f, \n, \r, \t, \v and \nnn, leaving others unspecified, in particular those \" and \', which would be valid escapes in C. (That first one is the double backslash, of course, the space isn't part of it, I blame the sucky formatting in comments) – ilkkachu Nov 15 '19 at 11:14
  • Assuming this question is about outputting some supplied data: No character has to be escaped in a special way (apart from for the shell) if printf is used properly. For example: printf '%s\n' '\', or printf '%s\n' '%'. Data should not go in the format argument. – Kusalananda Nov 15 '19 at 11:27
  • @Kusalananda: What is improper about this?: printf '--- end log entry ---\n' – Seamus Apr 16 '22 at 07:36
  • @Seamus Nothing? Unless you want the output to contain the character sequence \n rather than a literal newline. If you want \n in the output, you could do printf '--- end log entry ---%s' '\n'. – Kusalananda Apr 16 '22 at 09:54
  • @Kusalananda: Oh, sorry - I should have been more explicit. My printf command (and yours), both throw an error: -bash: printf: --: invalid option – Seamus Apr 16 '22 at 21:32
  • @Seamus In bash, yes, and I tested it in zsh where it doesn't error out. So make the problematic part a string that you format with %s: printf '%s\n' '--- end log entry ---'. – Kusalananda Apr 16 '22 at 22:31
  • @Kusalananda: Perfect! :)) 'tho I still think it's a bit of an "odd duck". I'll guess that zsh didn't have the legacy to deal with that seems to be the point of item 2. in the answer above?? – Seamus Apr 16 '22 at 22:59