This is intended to remember each unique line of the input.
As Jeff Schaller pointed out, $0 is undefined in the BEGIN block.
A more correct version would be
{
if (data[$0]++ == 0)
lines[++count] = $0;
}
END {
for (i = 1; i <= count; i++)
print lines[i];
}
or, more compactly,
!data[$0]++ { lines[++count] = $0; }
END {
for (i = 1; i <= count; i++)
print lines[i];
}
The first time a line appears, data[$0] is unset and evaluates to 0, so the test succeeds and lines[] receives the line.
After the test, data[$0] is incremented (++ is a post-increment), so the test evaluates to false for any later line with the same content.
The END block prints all the stored lines in the order they were first seen.
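As a quick check of the behaviour described above, here is a minimal sketch that wraps the compact form in a shell function and feeds it a few sample lines. The function name dedup and the sample input are made up for illustration; the loop uses i <= count so that the last stored line is printed as well.

```shell
# dedup: hypothetical helper name, not part of the original answer.
# Prints each distinct input line once, in first-seen order.
dedup() {
  awk '
    # Runs for a line only the first time its content is seen.
    !data[$0]++ { lines[++count] = $0 }
    END {
      # i <= count so that the last stored line is included.
      for (i = 1; i <= count; i++)
        print lines[i]
    }
  '
}

# Sample input with repeated lines; prints b, a, c (one per line).
printf 'b\na\nb\nc\na\n' | dedup
```

Storing the lines and printing them in END is only needed if something else has to happen before output; if the lines can be printed as they arrive, the pattern !data[$0]++ on its own gives the same result.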
See also: How does awk '!a[$0]++' work?