I don't want to sort my file, just filter out duplicate lines while keeping the original order. Is there a way to use sort's unique function without its sort function (something like what cat -u would give, if it existed)? Using uniq without sort does nothing worthwhile, because uniq only looks at adjacent lines, so the file has to be sorted first.
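To make the adjacency problem concrete, here is a minimal sketch. The three-line input is made up for illustration and is not taken from the pastebin file below; the variable names are likewise hypothetical.

```shell
# Made-up input: the two "b" lines are duplicates, but they are not adjacent.
input=$(printf 'b\na\nb\n')

# uniq alone removes nothing, because no two equal lines are neighbours.
uniq_only=$(printf '%s\n' "$input" | uniq)

# sort -u removes the duplicate, but the original order (b, a) is lost.
sorted_unique=$(printf '%s\n' "$input" | sort -u)

printf 'uniq alone:\n%s\n' "$uniq_only"
printf 'sort -u:\n%s\n' "$sorted_unique"
```

What I am after is the output b, a: the duplicate dropped, with the first occurrence left where it was.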
Also, incidentally, what in hell is the difference between uniq and uniq --unique? Here are some commands run on a random file from pastebin:
wget -qO - http://pastebin.com/0cSPs9LR | wc -l
350
wget -qO - http://pastebin.com/0cSPs9LR | sort -u | wc -l
287
wget -qO - http://pastebin.com/0cSPs9LR | sort | uniq | wc -l
287
wget -qO - http://pastebin.com/0cSPs9LR | sort | uniq -u | wc -l
258
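The same kind of discrepancy shows up on a much smaller input. This is only a sketch with made-up sample lines (not the pastebin file), in which "a" occurs twice and "b" and "c" once each:

```shell
# Made-up input: four lines, one of which ("a") is repeated.
sample=$(printf 'a\nb\na\nc\n')

# Count the lines before and after each pipeline.
total=$(printf '%s\n' "$sample" | wc -l)
with_uniq=$(printf '%s\n' "$sample" | sort | uniq | wc -l)
with_uniq_u=$(printf '%s\n' "$sample" | sort | uniq -u | wc -l)

# The two uniq variants report different counts for the same sorted input.
echo "lines: $total, sort | uniq: $with_uniq, sort | uniq -u: $with_uniq_u"
```

Here sort | uniq leaves 3 lines while sort | uniq -u leaves 2, so the two clearly do not mean the same thing by "unique".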
In summary:
- How do I filter duplicates greedily without sorting?
- How is uniq not unique enough that there is also uniq --unique?
p.s. This question looks like a duplicate of the following q's, but it isn't:
sort or uniq at all. And "How is uniq not unique enough that there is also uniq --unique?" really should be a separate question. – muru Jun 18 '15 at 10:06
As for the separate question, I just posted it here: http://unix.stackexchange.com/questions/210528/how-is-uniq-not-unique-enough-that-there-is-also-uniq-unique – enfascination Jun 18 '15 at 10:21