
When I run awk '/2017-12-05T12:07:33.941Z/{y=1;next}y' file.json it works as expected, printing everything after the timestamp.

I'm trying to follow the syntax in this Q/A, but my variable isn't being expanded correctly:

awk -v last_log="$last_log" '/{print last_log}/{y=1;next}y' file.txt

Also trying:

awk -v last_log="$last_log" '/$0 ~ last_log/{y=1;next}y' file.txt

Example: given the following input, return all logs after the last processed one (2017-12-05T12:07:33.941Z):

{ "name": "PeriodicWork", "hostname": "myHost", "pid": 12189, "level": 20, "msg": "Executing [CheckFailedTask NodeId=8]", "time": "2017-12-05T10:07:33.941Z", "v": 0 }

{ "name": "PeriodicWork", "hostname": "myHost", "pid": 12188, "level": 50, "msg": "Executing [CheckFailedTask NodeId=8]", "time": "2017-12-05T12:07:33.941Z", "v": 0 }

{ "name": "PeriodicWork", "hostname": "myHost", "pid": 12187, "level": 40, "msg": "Executing [CheckFailedTask NodeId=8]", "time": "2017-12-05T12:57:33.941Z", "v": 0 }

2 Answers


Here, I don't think it was your intention to treat 2017-12-05T12:07:33.941Z as a regexp (where . would match any character instead of just a literal .).

For a substring match, as opposed to a regexp match, you can use index():

LAST_LOG="$last_log" awk 'index($0, ENVIRON["LAST_LOG"]) {y=1;next};y' file.txt
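As a rough sketch of how that command behaves, the snippet below runs it against a shortened version of the question's input. The temporary file and the trimmed-down JSON lines are assumptions made for illustration only.

```shell
# Build a small sample log (abbreviated from the question's input).
log=$(mktemp)
cat > "$log" <<'EOF'
{ "pid": 12189, "time": "2017-12-05T10:07:33.941Z" }
{ "pid": 12188, "time": "2017-12-05T12:07:33.941Z" }
{ "pid": 12187, "time": "2017-12-05T12:57:33.941Z" }
EOF

last_log='2017-12-05T12:07:33.941Z'

# Literal substring match: the timestamp is passed through the environment
# and compared with index(), so "." is never treated as a regexp wildcard.
# The matching line sets y and is skipped; later lines are printed.
after=$(LAST_LOG="$last_log" awk 'index($0, ENVIRON["LAST_LOG"]) {y=1;next};y' "$log")
printf '%s\n' "$after"

rm -f "$log"
```

Only the line with pid 12187 should be printed, since it is the only one after the matching record.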

I prefer ENVIRON over -v as -v mangles the content of the variable if it contains backslashes (as already noted in the Q&A you referenced).
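A quick way to see that mangling, assuming a made-up value that contains a backslash (the variable names here are illustrative only):

```shell
value='a\tb'   # four characters: a, backslash, t, b

# -v processes escape sequences, so \t is turned into a single tab.
via_v=$(awk -v v="$value" 'BEGIN { print length(v) }')

# ENVIRON hands the string to awk unchanged.
via_env=$(V="$value" awk 'BEGIN { print length(ENVIRON["V"]) }')

echo "-v sees $via_v characters, ENVIRON sees $via_env characters"
```

With -v the value shrinks to three characters because the backslash sequence has been interpreted, while ENVIRON keeps all four.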

About why yours fail:

/{print last_log}/

as a condition, matches on records that match the {print last_log} regular expression, but that's not a valid regular expression ({ is a regexp operator that needs to be used as {2} or {1,5}...).

In:

/$0 ~ last_log/

again, that tries to match on $0 ~ last_log as a regexp. Here that means lines that contain 0 ~ last_log after the end of the line ($), so it will never match. You probably meant:

awk -v last_log="$last_log" '$0 ~ last_log {y=1;next};y' file.txt

That is where the condition is the $0 ~ last_log expression as opposed to just one regexp /foo/. /foo/ is short for $0 ~ /foo/, that is match the foo regexp against the full record.
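To illustrate the difference between this dynamic regexp match and the literal index() match, here is a small sketch using a hypothetical line where the "." in the timestamp has been replaced by "x":

```shell
last_log='2017-12-05T12:07:33.941Z'
line='"time": "2017-12-05T12:07:33x941Z"'   # "x" where the "." should be

# Regexp match: "." in last_log matches any character, including "x".
re_hit=$(printf '%s\n' "$line" |
  awk -v last_log="$last_log" '$0 ~ last_log { print "match" }')

# Substring match: the "." must be a literal ".", so nothing is printed.
idx_hit=$(printf '%s\n' "$line" |
  LAST_LOG="$last_log" awk 'index($0, ENVIRON["LAST_LOG"]) { print "match" }')

echo "regexp: [$re_hit] index: [$idx_hit]"
```

The regexp version reports a match on the altered line; the index() version does not.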

You can do $0 ~ var, but you can't use var alone: that wouldn't be a regexp match, but an expression that resolves to true if var contains a number other than 0 or a non-empty non-numerical string (as with your y variable).
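A minimal sketch of that truth test, using made-up two-line input:

```shell
# A bare variable as a pattern is a truth test, not a regexp match.
# A non-empty, non-numeric string is true, so every line is printed.
truthy=$(printf 'one\ntwo\n' | awk -v var='2017-12-05T12:07:33.941Z' 'var')

# The number 0 and the empty string are false, so nothing is printed.
zero=$(printf 'one\ntwo\n' | awk -v var=0 'var')
empty=$(printf 'one\ntwo\n' | awk -v var='' 'var')

printf 'truthy:\n%s\nzero: [%s] empty: [%s]\n' "$truthy" "$zero" "$empty"
```

Neither input line contains the timestamp, yet both are printed in the first case, which shows that no matching against $0 takes place.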


With awk intervals

Everything from last_log to the end of the file can also be obtained with the "," (range) operator:

 awk  '$0 ~ last, 0'  last="$last_log" file.txt

With last_log values like the one presented, this also works:

 awk "/$last_log/,0" file.txt

Both commands also print the matching line itself; to skip it, append | tail -n +2.
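A short sketch of the range pattern, assuming a three-line sample file invented for illustration. It shows that the range starts at the matching line (inclusive) and that tail -n +2 drops that first line:

```shell
log=$(mktemp)
printf '%s\n' 'before' 'time 2017-12-05T12:07:33.941Z' 'after' > "$log"
last_log='2017-12-05T12:07:33.941Z'

# The range starts at the first line matching last and never ends (0 is false).
inclusive=$(awk '$0 ~ last, 0' last="$last_log" "$log")

# tail -n +2 starts output at the second line, dropping the matching line.
exclusive=$(awk '$0 ~ last, 0' last="$last_log" "$log" | tail -n +2)

printf 'inclusive:\n%s\nexclusive:\n%s\n' "$inclusive" "$exclusive"
rm -f "$log"
```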

With Sed:

sed  "1,/$last_log/d"  file.txt
JJoao
  • That prints the first matching line though which it seems the OP wanted to skip. – Stéphane Chazelas Dec 05 '17 at 15:52
  • @StéphaneChazelas, you are right. | tail -n +2 ? – JJoao Dec 05 '17 at 15:59
  • Or awk 'NR==1, $0 ~ last{next}; 1' though that wouldn't be simpler than the OP's $0 ~ last {y=1;next};y – Stéphane Chazelas Dec 05 '17 at 16:02
  • @StéphaneChazelas awk "1,/$last_log/{next} 1" – JJoao Dec 05 '17 at 16:07
  • Those are best avoided as it's a command injection vulnerability if the content of the variable is not under your control. It's just as dangerous as passing variable data to eval. – Stéphane Chazelas Dec 05 '17 at 16:08
  • @StéphaneChazelas, right again. The same problem appears with the other uses of "$last_log" – JJoao Dec 05 '17 at 16:15
  • In the other cases (LAST_LOG="$last_log" awk... or awk -v last_log="$last_log"), except for bugs in awk implementations, you wouldn't get a command injection vulnerability. You could get a DoS in the regexp case for values of regexps that are very expensive like a (x*){5000} that eats all your memory and time of one CPU with GNU awk, but normally not arbitrary command execution. – Stéphane Chazelas Dec 05 '17 at 16:34
  • @StéphaneChazelas, thank you. (In this situation this does not constitute a problem, but in other situations it may be dangerous) – JJoao Dec 05 '17 at 16:42
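
The injection risk discussed in the comments above can be sketched harmlessly. The value of last_log below is a contrived assumption: it closes the regexp early and smuggles in an extra awk rule that only prints a word, where a hostile value could just as well call system().

```shell
# Contrived value: ends the regexp, adds a rule, then reopens a regexp.
last_log='x/{print "injected"};/x'

# Unsafe: the shell pastes the value into the awk program text, which
# becomes  /x/{print "injected"};/x/,0  and so runs the extra rule.
unsafe=$(printf 'x\n' | awk "/$last_log/,0")

# Safer: the value travels as data via ENVIRON and is compared literally.
safe=$(printf 'x\n' | LAST_LOG="$last_log" awk 'index($0, ENVIRON["LAST_LOG"]), 0')

printf 'unsafe:\n%s\nsafe: [%s]\n' "$unsafe" "$safe"
```

The first command prints "injected" because the variable's content was executed as awk code; the second prints nothing, since the input line simply does not contain that string.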