I've got a massive file looking similar to this:
H2,3,5,9,ef,ty,i;
H2,7,5,6,rt,hg,j;
T2,5,5,0,207,3.7,00,...,2023:46:18:14:31,76;
T2,5,5,0,207,3.5,00,...,2023:46:18:14:31,76;
T2,5,5,0,119,3.5,00,...,2023:46:18:14:32,10;
T2,5,5,0,207,3.5,00,...,2023:46:18:14:32,15;
T2,5,5,0,186,3.4,00,...,2023:46:18:14:32,16;
T2,5,5,0,207,4.6,00,...,2023:46:18:14:32,30;
....
I need to get rid of the lines that:
- start with T2,5,5,0,207, and
- have a repeating time mark in field 15,
and leave all other lines untouched.
I tried this in different combinations, but nothing I checked has worked so far:
awk -F ',' ' x!=$15 { if ($1 == T2 && $5 == 207) {x=$15; print$0} else print$0} ' test > test1
I would really appreciate any advice!! Thanks
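A minimal sketch of the logic the question seems to be after, assuming the time mark really is in field 15 of the full records (in the truncated sample it would be field 9), that "repeating" means equal to the time mark of the previous matching record, and that "starting with T2,5,5,0,207" means the first five fields match exactly. Note that the attempt above compares $1 against an unset awk variable T2 rather than the string "T2", which is one reason the condition never matches:

awk -F',' '
    # Records whose first five fields are T2,5,5,0,207:
    $1 == "T2" && $2 == 5 && $3 == 5 && $4 == 0 && $5 == 207 {
        if ($15 == prev) next   # time mark repeats the previous match: drop
        prev = $15              # new time mark: remember it, then print below
    }
    { print }                   # all other records pass through untouched
' test > test1

Tracking the last-seen value in prev mirrors the x != $15 idea from the attempt and removes only consecutive repeats; a seen[$15]++ array would instead drop every later record carrying an already-seen time mark.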
Comments:

- … use uniq instead. – user10489 Feb 18 '23 at 13:48
- Should the deleted records have T2,5,5,0,207 specifically as their first fields, or should any record with non-unique first five fields be deleted? Is the second condition dependent on or independent of the first? I.e., should all records with duplicated timestamps be deleted, or only those that also have duplicated first five fields? – Kusalananda Feb 18 '23 at 13:53
- Is 2023:46:18:14:31 or 76 the "repeating time mark in field 15"? You omitted a whole load of fields, so it was impossible to count. (You could have referenced field 9 or maybe field 10 so that the description matched the data.) Next time you have a question, please try to ensure you have example data that can be used for testing. – Chris Davies Feb 18 '23 at 18:17