We have this code in a shell script that pipes output for Apache to log.

declare -a values=( $taintRequestVals )

for item in ${!values[@]}
do
    cat $apacheLog | sed "s/${values[$item]}=[^&\t\n]*/${values[$item]}=***/g" | /bin/grep ${values[$item]}=
done

However, it's extremely inefficient. Within seconds, access.log grew exponentially, to the point where the server's root slice filled up. I'm looking for a better way to obfuscate sensitive data such as passwords while Apache is writing to access.log.

Jeff Schaller

2 Answers

The problem here is that you're reading from the Apache log and writing to it at the same time. Whatever you add to the log also makes its way back into the pipeline through the cat call (no wordplay intended :) ). This creates a nasty positive feedback loop that keeps going until your file system fills up. The answer to this question may be of interest if you want to know why this happens.

How should you go about it then? A naive solution would be to modify the file in place like so:

for item in "${!values[@]}"; do
    sed -i "..." "$apacheLog"  # cat isn't needed here
done

and don't pipe the output anywhere: the script itself will modify the file in situ. Also see terdon's answer for how to make the sed call only once (without a loop) so as to improve efficiency.

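The problem with this approach, however, is that a live Apache server will likely be logging to the file while you're working on it, and weird things can start happening. For instance, sed -i does not edit the file in place: it writes a new file and renames it over the original, while Apache keeps writing to the old one through its open file handle, so new entries can be lost. A better solution would be to look in the Apache documentation for ways to keep sensitive information out of the logs.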

Incidentally, what you're doing doesn't even sanitize the logs: it appends the sanitized lines back into the (still tainted) log file.
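One way to keep the values out of the log in the first place is to mask each line before it is ever written, for example with a small filter that reads log lines on stdin. The sketch below is only an illustration under assumptions: the parameter names in `sensitive`, the function names, and the sample request line are all made up, and the names are assumed to contain no regex metacharacters.

```shell
#!/bin/bash
# Hypothetical parameter names to mask; adjust to your application.
sensitive=(password token)

# Build a single sed script with one substitution per name.
build_mask_script() {
    local name script=""
    for name in "$@"; do
        script+="s/${name}=[^&[:space:]]*/${name}=***/g;"
    done
    printf '%s' "$script"
}

# Read log lines on stdin, write masked lines to stdout.
mask_stream() {
    sed "$(build_mask_script "${sensitive[@]}")"
}

# Demo on a made-up request line.
printf '%s\n' 'GET /login?user=bob&password=hunter2&x=1 HTTP/1.1' | mask_stream
```

Apache can send log entries to a program instead of a file (piped logs), so a script like this could end with `mask_stream >> /path/to/access.log` and be referenced from a `CustomLog "|/path/to/script" combined` line; both paths here are placeholders. With GNU sed, adding `-u` keeps the output unbuffered so entries show up promptly.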

Joseph R.

As it stands, there are various improvements you can make. First, and least important, you have a useless use of cat. Much more important is that you are running sed multiple times, and each run prints out the entire file. I'm also not sure what you are doing with grep. Are you trying to print only those lines that contain the specific variable?

Anyway, one way of doing things better would be to run sed once and have it do all the replacements. Something like:

replace=""
for item in "${!values[@]}"
do
    ## build the sed script, one substitution per parameter name
    replace="s/${values[$item]}=[^&\t\n]*/${values[$item]}=***/g;$replace"
done

### run the replacement using sed's -i option so it
### changes the original file; no eval is needed, since a
### quoted variable reaches sed as a single argument
sed -i "$replace" "$apacheLog"
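As a variation on the single-pass idea, the substitutions can be collected as separate `-e` arguments in a bash array, which avoids gluing them into one string. This is a rough sketch, not a drop-in replacement: the temporary file stands in for a rotated (no longer live) log, and the sample lines, the `values` contents, and the `sed_args` name are assumptions made for the demo.

```shell
#!/bin/bash
# Stand-in for a rotated log; the file and its contents are made up.
log=$(mktemp)
printf '%s\n' \
    'GET /login?user=bob&password=hunter2 HTTP/1.1' \
    'GET /api?token=abc123&page=2 HTTP/1.1' > "$log"

# Hypothetical parameter names to mask.
values=(password token)

# One -e expression per name; the array keeps each expression
# intact as its own argument.
sed_args=()
for name in "${values[@]}"; do
    sed_args+=(-e "s/${name}=[^&[:space:]]*/${name}=***/g")
done

# A single pass over the file, written to a separate output file
# so the input is never read and written at the same time.
sed "${sed_args[@]}" "$log" > "$log.masked"
cat "$log.masked"
```

Writing to a separate file rather than back into the input is what breaks the feedback loop; the masked copy can replace the original once the result has been checked.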
terdon