I am currently piecing together a tool to work with the syslogs generated in my network. One of the requirements is to convert the timestamp from the format used in the syslog (%b %d %Y %T) to epoch time. In essence, this is what I am trying to achieve:

Original Syslog format:

1:      Jul 02 2019 15:14:19: %ASA-6-106015: <message> 
2:      Jul 02 2019 15:14:49: %ASA-6-106015: <message>

Final Log:

1:      1562080489   %ASA-6-106015  <message>
2:      1562080529   %ASA-6-106015  <message>

I know that I can do this by iterating through the entire log and running date -d on each line, but that is something I want to avoid. I would prefer to use gawk's time functions.

Here is my approach:

gawk -F: '{ print strftime("%s", timestamp) }' syslog.log

But here the timestamp must already be in the form returned by the systime() function (seconds since the epoch), which it isn't.

Also, I cannot use the mktime() function to convert syslog timestamp to the required format since it accepts input only if it is in a specific format [YYYY MM DD HH MM SS]

I feel there is a method to do this, but I am missing it. Any alternate methods will also be appreciated.

  • what local time is the log file using? I've tried converting it with TZ=UTC, but the result is 30 minutes off. –  Jul 04 '19 at 14:34
  • @mosvy It's actually GMT+5:30 – CodeHumor Jul 09 '19 at 16:33
  • s/minutes/seconds/ in my comment above. Converting with GMT+5:30 will not result in those values. And I'm pretty sure that not passing each line back and forth between two processes is faster than doing it ;-) –  Jul 09 '19 at 17:33

4 Answers

2

With GNU date, you can run date once and have it read its input from stdin. Using gawk's coprocess feature, a single instance each of awk and date can process all the dates:

% awk -v cmd='stdbuf -oL date +%s -f-' -F': ' 'BEGIN{OFS=FS} {print $2 |& cmd; cmd |& getline $2} 1' foo
1: 1562048059: %ASA-6-106015: <message>
2: 1562048089: %ASA-6-106015: <message>

Note that date's output needs to be unbuffered (hence the stdbuf -oL), otherwise the coprocess will hang.

muru
  • Why coprocess? What's wrong with: awk -F': ' 'BEGIN{cmd="date +%s -d "; OFS=FS} {exe=cmd "\"" $2 "\"";exe |& getline $2}1' file –  Jul 04 '19 at 20:34
  • Why execute date so many times, when just once will do? That's why OP is avoiding it - they're going to run it on large log files (see other questions by them) – muru Jul 04 '19 at 23:49
  • Got it, thanks. @muru. –  Jul 05 '19 at 21:37
1

Just like the date(1) utility, gawk's mktime() assumes that the date spec is using the local time.

To force it to use UTC, set the TZ environment variable:

$ TZ=UTC gawk -F'[: ]+' '{sub(/([^:]+:){4} */, mktime(sprintf("%s %02d %s %d %d %d", $3, index("  JanFebMarAprMayJunJulAugSepOctNovDec",$1)/3, $2, $4, $5, $6))"\t"$7"\t"); print}'
1562080459      %ASA-6-106015   <message>
1562080489      %ASA-6-106015   <message>
0

Here is a typical way of converting the month names into numbers: an associative array where the index is the month name and the value is the month number, e.g. mon["Jul"] is 7. It is set up once in the BEGIN block.

awk 'BEGIN { 
       split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",months," ")
       for(i=1;i<=12;i++)mon[months[i]] = i }
     { m = $2; d = $3; y = $4; t = $5; gsub(":"," ",t)
       print mktime(y " " mon[m] " " d " " t) }'

Then, for each line, the various fields are rearranged into the order mktime() expects and concatenated with intervening spaces. The colons in the time field t are converted to spaces. The above prints only the epoch time; you still need to add the rest of the data.

meuh
-1

Perhaps perl:

perl -MTime::Piece -i.bak -pe '
    # Capture the timestamp so it can be reused in the substitution below.
    if ( my ($timestamp) = /([[:upper:]][[:lower:]]{2} \d{2} \d{4} \d\d:\d\d:\d\d)/ ) {
        my $epoch = Time::Piece->strptime($timestamp, "%b %d %Y %T")->epoch;
        s/\Q$timestamp\E/$epoch/;
    }
' log_file
glenn jackman