I often need to tail -f Apache access logs for websites to troubleshoot issues. One thing that makes it annoying is that anyone loading a page once may cause 12+ lines to be written to the log, and since they're long lines, each one wraps across multiple lines in my terminal.
tail -f seems to play nicely with piping to grep and awk, and I came up with a pretty simple way to filter out duplicates when one IP address makes many requests in a particular second (and to trim the output to the fields I usually need):
tail -f log.file | awk ' { print $1 " " $4 " " $9}' | uniq
The problem is, this doesn't work. I just get no output at all, even when I know there should be tons of lines printed.
I've tried some troubleshooting, but haven't been able to get things to really work:
tail -f log.file | awk ' { print $1 " " $4 " " $9}'
This works exactly as I think it should, and prints the lines as they happen (but with many duplicates) like so:
12.34.56.78 [10/May/2016:18:42:01 200
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
12.34.56.78 [10/May/2016:18:42:02 304
tail log.file | awk ' { print $1 " " $4 " " $9}' | uniq
This also works exactly as I think it should, and filters out any duplicate lines. But for my troubleshooting I really need the real-time updates of tail -f.
How can I make tail -f filter out duplicate lines?
stdbuf, e.g. stdbuf -oL uniq. – Mikel May 10 '16 at 23:58
Edit: turns out the stdbuf -oL needs to go before the awk, not the uniq. – Yex May 11 '16 at 00:14
This works exactly as I want things to. The filtering isn't perfect (sometimes you'll get alternating pairs of duplicates, but no double duplicates), but it's good enough. – Yex May 11 '16 at 00:21
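For reference, combining the comments above gives a pipeline along these lines (a sketch, reusing the question's log.file and awk fields; the key point from the comments is that stdbuf -oL has to wrap awk, not uniq, so awk line-buffers its output even when it is writing to a pipe):
# stdbuf -oL makes awk flush each output line immediately instead of block-buffering into the pipe
tail -f log.file | stdbuf -oL awk ' { print $1 " " $4 " " $9}' | uniq
The leftover alternating duplicates Yex mentions are expected with this approach: uniq only collapses adjacent identical lines, so when two different request lines interleave within the same second, each switch starts a new run.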