
A well-formed printf usually has a format to use:

$ var="Hello"
$ printf '%s\n' "$var"
Hello

However, what could be the security implications of not providing a format?

$ printf "$var"
Hello

Since the variable expansion is quoted, there should not be any issues, should there?

  • 4
    var='%'; printf "$var" – Chris Davies Apr 27 '22 at 00:13
  • @roaima I am failing to follow. What's the security issue generated by var='%'; printf "$var" ? –  Apr 27 '22 at 00:31
  • 2
    are you talking about the shell command printf or the C function printf(); 'cos the answer is different, depending. – Stephen Harris Apr 27 '22 at 00:45
  • 2
    Well ... I tagged the question with shell, so yes, it is the (usually a builtin) shell printf. Should I add some text to the question to clarify? Feel free to add as you see fit. thanks. @StephenHarris –  Apr 27 '22 at 00:48
  • 3
    Technically, it's incorrect to say that you're "not providing a format" when you run printf "$var". The first argument is the format. What you're doing is providing an unpredictable/dynamic format string and no arguments. – bta Apr 27 '22 at 21:40
  • "What's the security issue generated by var='%'; printf "$var"?" - the program outputs (arguably) unexpected errors. If the program is part of a security module then the error output might trigger unwanted consequences. I can't give a definitive answer to a conjectural question, though – Chris Davies Dec 13 '23 at 13:05

2 Answers

31

In:

printf "$var"

There are two problems:

  • Variable data is passed as the format, which is a problem if $var is under the control of an attacker.
  • The option delimiter (--) is missing, so $var could be taken as an option if it starts with -.
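For comparison, here is a minimal sketch of the form that avoids both problems. print_verbatim is a hypothetical helper name used only for illustration; it is not from the question. Because the format is a fixed string, the data can be taken neither as a format nor as an option:

```shell
# Hypothetical helper: print a string verbatim, followed by a newline.
# The first argument to printf is always the fixed format '%s\n',
# so the data in "$1" is never parsed as a format or as an option.
print_verbatim() {
    printf '%s\n' "$1"
}

print_verbatim '100%'    # the percent sign is printed as-is
print_verbatim '-v'      # not taken as an option
print_verbatim 'a\nb'    # the backslash sequence is not expanded
```

If the data really has to be the format, printf -- "$var" at least addresses the second point.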

It would be a lot worse with:

printf $var

There, most Bourne-like shells additionally perform split+glob upon the $var expansion, causing the sort of security vulnerabilities described at Security implications of forgetting to quote a variable in bash/POSIX shells.

Here that can go as far as arbitrary command execution:

$ export var1='-va[1$(uname>&2)] x' var2='%d a[1$(uname>&2)]'
$ bash -c 'printf $var1'
Linux
$ ksh -c 'printf $var2'
Linux
0

The arbitrary uname command (thankfully harmless here) was run by printf.
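The splitting that makes the unquoted form so much worse can be observed harmlessly by counting arguments. This is only a sketch: count_args is a hypothetical helper, and it assumes a Bourne-like shell with the default $IFS (zsh does not split unquoted expansions by default).

```shell
# Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}

var='-v name'

count_args $var      # unquoted: split on the space into two words
count_args "$var"    # quoted: passed as a single argument
```

With the unquoted form, printf would see -v and name as two separate arguments, which is what lets an attacker supply both an option and its operand.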

For

printf "$var"

itself, there are fewer problems I can think of.

The most obvious one is denial of service (DoS), with things like var=%1000000000s, which would flood the output with space characters, or worse, things like %.1000000000f, which would also use up a lot of memory and CPU time:

$ var=%.1000000000f command time -f 'max mem: %MK, elapsed: %E' bash -c 'printf "$var"' | wc -c
max mem: 4885344K, elapsed: 0:12.33
1000000002

Other DoS vectors are $var values that trigger errors because of an invalid format or invalid options, causing printf to fail and, if the errexit option is enabled, the script it's invoked in to abort along with it.
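A minimal sketch of that errexit case follows. run_with_format is a hypothetical wrapper, and the sketch assumes that the system's sh rejects %z as an invalid conversion with a non-zero exit status; the exact error message and status vary between shells.

```shell
# Hypothetical wrapper: run a two-line "script" under errexit (-e) with
# the given string as the printf format, and report if it was cut short.
run_with_format() {
    var=$1 sh -ec 'printf "$var"; echo "reached end"' 2>/dev/null ||
        echo 'aborted'
}

run_with_format 'ok\n'   # valid format: the script runs to completion
run_with_format '%z'     # invalid format: printf fails, errexit aborts
```

A separate sh process is used because errexit is ignored inside a subshell that is itself the left-hand side of ||.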

printf "$var" with var='-va[1$(uname>&2)]' doesn't seem to be a problem for bash, ksh93 and zsh, the only three shells that I know that support that -v varname option, zsh treating it as the format, and the other two as a syntax error (because of the missing format)¹.

There's some minor information disclosure with ksh93 and bash with export var='%(%Z %z)T\n' that reveals the timezone of the script.

$ bash -c 'printf "$var"'
BST +0100

In yash, printf "$var" would call printf with more than one argument if $var was an array with more than one element, but yash's printf doesn't do arithmetic evaluation and anyway its arithmetic evaluations are not affected by the same kind of command injection vulnerabilities affecting ksh's, bash's or zsh's.

ksh93's printf is the one with the most extensions (all the date formatting, regexp format conversion, padding based on grapheme width, URI/HTML encoding...), and it still remains quite experimental. printf "$data" there exposes thousands of lines of code to that data. I wouldn't be surprised if there was a path for arbitrary command execution in there, possibly via some arithmetic expression evaluation or by triggering some bug in its own code². Of course, that could also happen with any printf implementation.

With the printf() C function, the problems with variable external data arise when it contains % sequences that end up dereferencing arbitrary memory areas on the stack. printf(var) when var is %12$s tries to print the string pointed to by the 12th argument passed to printf. Since printf is not passed any other argument, that will be something else that happens to be on the stack, and that could be a pointer to some area of memory holding sensitive information. With %n, printf() would end up writing some number there.

$ tcc -run -w -xc - $'%6$s\n' <<<'f(char*f){char*s="secret";printf(f);}main(int c,char**v){f(v[1]);}'
secret
$ tcc -run -w -xc - $'%p%p%p%p%p\n%s\n' <<<'f(char*f){char*s="secret";printf(f);}main(int c,char**v){f(v[1]);}'
0x7fff1182db380x7fff1182db500x7900000x80x562b5ec0ba6a
secret

printf utilities may end up calling printf() or may implement all of it themselves (they have to at least to some extent as %b is not in printf(), and for numeric formats, they need to convert the arguments to numbers).

If they do call printf(), they will guard against calling it with not enough arguments to cover the format specification. POSIX requires that printf "%s" output nothing and that printf %d output 0, for instance, so printf implementations should pass enough empty-string or 0 arguments to printf().
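That behaviour is easy to observe from the shell. In this small sketch the brackets are only there to make the empty output visible:

```shell
# Conversions with no corresponding argument are treated as if an
# empty string (for %s) or zero (for %d) had been supplied.
printf '[%s]\n'     # prints []
printf '[%d]\n'     # prints [0]
```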

You could imagine poorly written printf implementations failing to do so properly. I'm not aware of any, but I've seen awk implementations in the past where their own printf() was affected (also via OFMT or CONVFMT there which involve printf() processing³).


¹ print "$var" is an arbitrary command injection vulnerability in zsh however via that vector. It's important to use print -- $var there, and even generally print -r -- "$var" is what you want there.

² As an example, I get a SEGV with var='%(%.999999999999s)T' with the ksh93 that comes with Ubuntu 20.04

³ Even today, with my current version of busybox, busybox awk -v OFMT='%#x %#x %#x %#x %g' 'BEGIN {print 1.1}' outputs 0x1 0x4 0x4 0x4624bb30 1.1 and busybox awk -v OFMT='%n %g' 'BEGIN {print 1.1}' segfaults.

  • @IsaaC, -- is necessary with print in zsh to avoid the ACE, but not printf. printf -vx outputs -vx in zsh but gives an error in bash/ksh. In zsh (or any POSIX compliant printf), you'd still need -- for printf -- -- to output --. Other than that, yes, that's the problems I can think of. There may be others. I only singled out ksh93 because its printf implementation is orders of magnitude bigger than most others, but that doesn't mean that you can't have bugs in others as well. – Stéphane Chazelas Apr 27 '22 at 19:35
  • Why is time quoted in your example for var=%.1000000000f? – TooTea Apr 28 '22 at 06:37
  • @TooTea, to make sure the time keyword of the shell is not invoked, but the one from /usr/bin which on my system supports that -f option. The time keyword of my shell (zsh) has equivalent features (with the TIMEFMT variable), but as it happens its %M is broken on systems other than darwin/macos as it gives the mem usage in MiB instead of KiB. – Stéphane Chazelas Apr 28 '22 at 06:40
  • For the example: you use var=%.1000000000f 'time' ...; instead of 'time' (which will bypass aliases and builtins named time), maybe use command time, to ensure it also bypasses any function of the same name? function ls { echo "foo"} ; \ls ; 'ls'; command ls : the first two echo "foo", and the third one really executes the first ls seen in $PATH – Olivier Dulac Dec 13 '23 at 10:18
  • 1
    @OlivierDulac thanks. I've added it in. Note that command (except in zsh) like 'time' would not bypass a builtin (in a shell that has such a builtin) but both bypass the time keyword that Korn-like shells have which was the intent here. – Stéphane Chazelas Dec 13 '23 at 11:25
4

Since you're talking about the shell printf command, not the C printf(3) function, the possible vulnerabilities are more limited with proper quoting of "$var". The shell command won't allow for the traditional stack dumping that could be done in C. But as Stéphane's answer shows, there are some dangers even then, and command injection with unquoted expansion as in the example below.

How the output is used could also affect processing later on.

To create a dumb example:

tst()
{
    # loop until the user enters a non-empty line
    while [ "$x" = "" ]
    do
      read x
    done
    printf $x   # deliberately unquoted, with user data as the format
}

The condition "$x is not empty" will pass, but the output of printf will be empty if the user entered %s. So any routine assuming the output of tst is not empty could potentially fail. That could lead to unexpected code paths being taken.

That could lead to a security issue, depending on the rest of the code.

Note there are a lot of "if" and "could" caveats here. It's very dependent on the application. That's why I said there's not an immediate impact.

So a standard defensive coding style would recommend specifying a format string if the input isn't trusted. If the input is trusted then there's no need.

From a non-security perspective, you normally want output to match input; if the user entered hello%sthere you expect to see that, and not hellothere.
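To make that last point concrete, here is a small sketch; echo_as_format and echo_as_data are hypothetical names chosen only for this comparison:

```shell
# Hypothetical helpers contrasting the two ways of printing user input.
echo_as_format() {
    printf "$1"          # input is interpreted as the format string
    echo
}

echo_as_data() {
    printf '%s\n' "$1"   # input is printed verbatim
}

input='hello%sthere'
echo_as_format "$input"  # prints hellothere (the %s expands to nothing)
echo_as_data "$input"    # prints hello%sthere
```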

  • As Stéphane's answer points out, there's no immediate security vulnerability in printf "$var" from the question, but there is in your example without the double-quotes on the expansion. But +1 anyway for pointing out a different kind of possible attack, via violating possible invariants in program logic instead of via code injection. – Peter Cordes Apr 27 '22 at 07:31
  • @PeterCordes there are immediate security vulnerabilities with both printf "$var" and printf $var, but it's a lot worse with the latter (is what I'm pointing out in my answer). – Stéphane Chazelas Apr 27 '22 at 08:12
  • @StéphaneChazelas: Oh right, security yes in terms of DoS, but not simple command injection is what I was thinking. – Peter Cordes Apr 27 '22 at 08:13