To reassure a few: I didn't find the bug by observing exploits, and I have
no reason to believe it was exploited before being disclosed
(though of course I can't rule it out). I did not find it by
looking at bash's code either.
I can't say I remember exactly my train of thought at the time.
It more or less came from reflecting on some behaviours of
some software that I find dangerous (the behaviours, not the
software). The kind of behaviour that makes you think: that
doesn't sound like a good idea.
In this case, I was reflecting on the common configuration of
ssh that allows passing environment variables unsanitised from
the client provided their name starts with LC_. The idea is
that people can keep using their own language when sshing into
other machines. A good idea until you start to consider
how complex localisation handling is, especially when UTF-8 is
brought into the equation (and seeing how badly it's handled by
many applications).
Back in July 2014, I had already reported a vulnerability in
glibc localisation handling which, combined with that sshd
config and two other dangerous behaviours of the bash shell,
allowed (authenticated) attackers to hack into git servers
provided they were able to upload files there and bash was
used as the login shell of the git unix user (CVE-2014-0475).
I was thinking it was probably a bad idea to use bash as the login
shell of users offering services over ssh, given that it's quite
a complex shell (when all you need is to parse a very simple
command line) and has inherited most of the misdesigns of ksh.
Since I had already identified a few problems with bash being
used in that context (to interpret ssh ForceCommands), I was
wondering if there were potentially more there.
AcceptEnv LC_* allows any variable whose name starts
with LC_, and I had the vague recollection that bash exported
functions (a dangerous albeit at times useful feature) were
using environment variables whose name was something like
myfunction(), and was wondering if there was not something
interesting to look at there.
I was about to dismiss it on the grounds that the worst thing one
could do would be to redefine a command called LC_something
which could not really be a problem as those are not existing
command names, but then I started to wonder how bash
imported those environment variables.
What if the variables were called LC_foo;echo test; f()
for instance? So I decided to have a closer look.
A:
$ env -i bash -c 'zzz() { :;}; export -f zzz; env'
[...]
zzz=() { :
}
revealed that my recollection was wrong in that the variables
were not called myfunction() but myfunction (and it's the
value that starts with ()).
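For anyone wanting to repeat that check, here is a minimal sketch that filters the same command down to the line of interest. The helper name show_exported_function is made up for illustration. Note that the output depends on the bash version: the bash of the time printed the plain zzz=() { form shown above, while bash versions patched after the disclosure use a decorated variable name instead of plain zzz.

```shell
# Hypothetical helper: show how the installed bash encodes an exported
# function named zzz in the environment it passes to child processes.
show_exported_function() {
  # Start from an empty environment, define and export zzz, then dump
  # the environment as seen by a child process and keep the zzz line.
  env -i bash -c 'zzz() { :;}; export -f zzz; env' | grep zzz
}

show_exported_function
```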
And a quick test:
$ env 'true;echo test; f=() { :;}' bash -c :
test
bash: error importing function definition for `true;echo test; f'
confirmed my suspicion that the variable name was not sanitized,
and the code was evaluated upon startup.
Worse, a lot worse, the value was not sanitized either:
$ env 'foo=() { :;}; echo test' bash -c :
test
That meant that any environment variable could be a vector.
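The test above can be wrapped in a small helper to check a given bash binary. This is only a sketch: the function name check_bash_import and its output strings are made up for illustration, and it only covers this exact vector (code trailing a function definition in a variable's value), not the related issues found afterwards.

```shell
# Hypothetical helper: report whether a bash binary runs code placed
# after a function definition in an environment variable's value.
check_bash_import() {
  shell=${1:-bash}
  # Same vector as above: the value starts with "() {" and is followed
  # by an extra command. An affected bash runs "echo test" at startup.
  out=$(env 'foo=() { :;}; echo test' "$shell" -c : 2>/dev/null)
  if [ "$out" = test ]; then
    echo vulnerable
  else
    echo "not vulnerable"
  fi
}

check_bash_import bash
```

Passing the path of another bash binary as the first argument checks that one instead of the bash found in PATH.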
That's when I realised the extent of the problem, confirmed that it was
exploitable over HTTP as well (HTTP_xxx/QUERY_STRING... env vars),
through other services such as mail processing, later DHCP (and probably
a long list), and reported it (carefully).