I have a very large text file, recon.txt, in which every line holds two comma-separated values (an ID and a version):
78414387,10033
78769989,12668
78771319,13677
78771340,13759
80367563,16336
81634533,10025
82878571,10196
110059366,10218
110059411,10812
110059451,10067
I need to search for these values in a log file that looks like this:
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:52 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c737e3d80d7def296c7| id=278832702| version=28| timestamp=1467432051000
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:52 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c732f18c26fe604fd04| id=284057302| version=9| timestamp=1467432051000
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:52 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c747e3d80d7def296c8| id=357229| version=1151| timestamp=1467432052000
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:52 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c742f18c26fe604fd05| id=279832706| version=35| timestamp=1467432052000
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:52 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c744697ddb976cf5a95| id=354171| version=503| timestamp=1467432052000
- delivery-AMC_prod_product 231825855936862016-07-02 00:00:53 c.c.i.d.s.d.DeliveryTopologyFactory$$anon$1$$anon$2 [INFO] ack: uid=57773c754697ddb976cf5a96| id=355638| version=1287| timestamp=1467432053000
My script:
#!/bin/bash
COUNT=0
while IFS=',' read ID VERSION; do
    VERSION=`echo $VERSION |col -bx`
    if (grep "id=${ID}| version=$VERSION" worker-6715.log.2016-$1.log.* > /dev/null); then
        let "COUNT++"
    else
        echo "$ID, $VERSION FAIL"
        exit 2
    fi
done < recon.txt
echo "All OK, $COUNT checked"
- If I strip the unnecessary fields from the log file, would that speed up execution?
- If I create a RAM device and copy the log file there, would that speed up execution, or is my Red Hat Linux 6 (Hedwig) caching the file anyway? Any other suggestions?
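One possible direction for the "other suggestions" part, sketched under assumptions rather than measured: the script above starts a fresh grep over all the log files for every line of recon.txt, so the logs are re-read once per pair. A single awk pass could load the pairs first and then scan each log file only once. The function name check_recon is hypothetical, and the sketch assumes that the stray character removed by col -bx is a trailing carriage return and that every ack line has the form "id=<digits>| version=<digits>|". Unlike the original, it reports every missing pair instead of stopping at the first one.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script):
#   check_recon RECON_FILE LOG_FILE...
# Loads all "id,version" pairs, then scans the log files once.
check_recon() {
    local recon=$1
    shift
    awk '
        # First file: the recon list. Remember each distinct pair.
        FILENAME == ARGV[1] {
            sub(/\r$/, "")                  # assumption: stray char is a CR
            if ($0 != "" && !($0 in want)) { want[$0] = 1; total++ }
            next
        }
        # Log files: extract "id=N| version=M|" and tick the pair off.
        match($0, /id=[0-9]+[|] version=[0-9]+[|]/) {
            split(substr($0, RSTART, RLENGTH), part, /[=|]/)
            delete want[part[2] "," part[4]]
        }
        END {
            # Whatever is still in want[] was never seen in the logs.
            for (pair in want) { print pair " FAIL"; missing++ }
            if (missing) exit 2
            print "All OK, " (total + 0) " checked"
        }
    ' "$recon" "$@"
}
```

It would be called as check_recon recon.txt worker-6715.log.2016-"$1".log.* in place of the loop. Memory use grows with the number of pairs in recon.txt, not with the size of the logs, and the trailing "|" in the pattern means version=28 no longer matches version=280.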
-E to match it explicitly then that would speed up too. – Keyshov Borate Jul 04 '16 at 11:29
| is part of the pattern, not an OR. – terdon Jul 04 '16 at 11:30
What does ...| col -bx achieve? It spawns a subshell and an external process, each time round the loop, but doesn't look like it should have any effect on VERSION. – JigglyNaga Jul 04 '16 at 14:58
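To illustrate the last comment: if the purpose of col -bx is only to drop a trailing carriage return (an assumption, since the question does not say what it removes), bash parameter expansion can do that inside the loop without a subshell or an external process. The function name print_pairs is hypothetical and exists only to make the sketch self-contained.

```shell
#!/bin/bash
# Hypothetical helper: print each "ID VERSION" pair from a recon-style file,
# stripping a trailing carriage return with parameter expansion (no fork).
print_pairs() {
    local id version
    while IFS=',' read -r id version; do
        version=${version%$'\r'}    # assumption: the stray character is a CR
        printf '%s %s\n' "$id" "$version"
    done < "$1"
}
```

In the original loop, the equivalent change would be replacing the backtick line with VERSION=${VERSION%$'\r'}.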