Grep remove line with 0 but not 0.2?

Question

I have a file whose content is similar to the following one.

I need to remove all the lines with a single zero.
I was thinking of using grep -v "0", but this also removes the line containing 0.2. I saw I could use the -w option, but this doesn't seem to work either.

How can I remove all the lines containing just a single 0 and keep all those lines starting with a 0?

@JulienLopez It's not a dupe of that question. That question is about matching a word, and answered with -w, which fails here. — Sparhawk, Feb 14 '19 at 10:16
Why are you forced to use grep for this task? And what exactly do you mean by a single zero? This sounds very much like an XY problem. — Roland Illig, Feb 15 '19 at 12:41
@RolandIllig it was 1 hour before bedtime and I wanted to start processing a series of 500,000 strings to check if they were bitcoin private keys and if so get balance. Next time I had time to look at it I had processed many thousands of the strings and I just wanted to parse for any non-zero values. — Philip Kirkbride, Feb 15 '19 at 16:02

score 38 · Accepted Answer · answered Feb 14 '19 at 07:19

grep -vx 0

From man grep:

-x, --line-regexp
       Select only those matches that exactly match the whole line.
       For a regular expression pattern, this is like parenthesizing
       the pattern and then surrounding it with ^ and $.

-w fails because the first 0 in 0.02 is considered a "word", and hence this line is matched. This is because it is followed by a "non-word" character. You can see this if you run the original command without -v, i.e. grep -w "0".
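
A quick way to see the difference, as a minimal sketch: the sample values below are made up (the question's actual file isn't shown), and the variable names are just for illustration.

```shell
# Hypothetical sample data standing in for the question's file.
sample=$(printf '0\n0.02\n1.5\n0\n0.2\n')

# -x: the pattern must match the entire line, so only bare "0" lines are dropped.
kept_x=$(printf '%s\n' "$sample" | grep -vx 0)

# -w: "0" only has to be a whole word; the 0 before the dot qualifies,
# so the decimal lines starting with "0." are dropped as well.
kept_w=$(printf '%s\n' "$sample" | grep -vw 0)

printf 'with -x:\n%s\n' "$kept_x"
printf 'with -w:\n%s\n' "$kept_w"
```

With this input, -x should keep 0.02, 1.5 and 0.2, while -w keeps only 1.5.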

answered Feb 14 '19 at 07:19

Sparhawk

You could also use the -F option since we're not using regex patterns, just plain string matching – glenn jackman Feb 14 '19 at 13:54
@glennjackman Maybe I've read this earlier, but I can't seem to find it now. Running with -F (surprisingly to me) appears to take a similar amount of time or even slightly slower (~5–10%). Hence, I'm not sure what the advantage would be. – Sparhawk Feb 14 '19 at 21:53
It's possible that the RegEx engine is so widely used that they have implemented a very efficient version of it, whereas the "plain search" has probably not been upgraded for 30 years. – Nelson Feb 15 '19 at 03:31
@Sparhawk: grep presumably has a special case for regexes with no metacharacters, because that's a common use-case. It's surprising that fgrep would be slower, but it's not surprising that the overhead of noticing this special case while compiling a short pattern is negligible vs. the time to scan a large file. (If it requires a special case at all to go that fast, vs. a pattern with a character class or x.*y.) – Peter Cordes Feb 15 '19 at 14:54
But that's maybe an oversimplification because the input is actually many short lines (not one giant string). I forget if grep recognizes any character other than \n newline as a line separator. If not, the implicit ^ and $ can still turn into a fixed-string search like strstr(big_buf, "\n0\n"). (Or 0\n at the start of a buffer.) But we're not just looking for the first match potentially far into a big buffer, we want to efficiently filter. But anyway, in theory yes it's just a 2-byte memcmp at the start of each line, and you'd hope that both fgrep and grep would see that. – Peter Cordes Feb 15 '19 at 15:04

score 30 · Answer 2 · edited Oct 13 '22 at 08:17

With grep:

grep -v '^0$' file

^ means beginning of the line, $ means end of the line.

edited Oct 13 '22 at 08:17

Stéphane Chazelas

answered Feb 14 '19 at 07:20

Arkadiusz Drabczyk

This is what the user asked for: avoid any lines containing only 1 "0". – Olivier Dulac Feb 14 '19 at 15:27
I would not put a literal dollar sign inside double quotes like that. – user541686 Feb 15 '19 at 18:42
@mehrdad It's not that big a problem with regexes, as the $ is usually either the last character or followed by one that isn't in [a-Z0-9] – Sampo Sarrala Feb 17 '19 at 06:16

score 14 · Answer 3 · answered Feb 14 '19 at 11:18

While grep can be used for this (as other answers clearly show), let’s take a step back and think about what you actually want:

You have a file containing numbers
You want to perform filtering based on the numeric value.

Regexes interpret character-sequence data. They don’t know about numbers, only about individual digits (and regular combinations thereof). Although in your particular case there’s a simple hack around this limitation, it’s ultimately a requirement mismatch.

Unless there’s a very good reason to use grep here (e.g. because you’ve measured it, and it’s vastly more efficient, and efficiency is crucial in your case), I recommend using a different tool.

awk, for instance, can filter based on numeric comparisons, e.g.:

awk '$1 == 0' your_file

But also, to get all lines containing numbers greater than zero:

awk '$1 > 0' your_file
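
For the task in the question (hiding the zero lines rather than selecting them), the equality test has to be negated. A rough sketch with made-up input, contrasting a numeric test with a string test; the sample values and variable names are assumptions for illustration only:

```shell
# Hypothetical input, chosen to show where numeric and string tests differ.
input=$(printf '0\n0.2\n0.0\n-0.0\n3\n')

# Numeric test: drops every line whose first field evaluates to zero,
# which includes 0.0 and -0.0 as well as a bare 0.
numeric=$(printf '%s\n' "$input" | awk '$1 != 0')

# String test: drops only lines that are exactly the single character "0".
string=$(printf '%s\n' "$input" | awk '$0 != "0"')

printf 'numeric test keeps:\n%s\n' "$numeric"
printf 'string test keeps:\n%s\n' "$string"
```

Which of the two is right depends on whether 0.0 should count as "zero" for your purposes.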

I love regex, it’s a great tool. But it’s not the only tool. As the saying goes, if all you have is grep, everything looks like a regular language.

answered Feb 14 '19 at 11:18

Konrad Rudolph

I wholeheartedly agree that awk may be more elegant here. However, it may also match a little more than the user expects (every numeric value that evaluates to 0). I.e., printf '0\n1\n-1\na\nb\n0\n0 also\n0.0\n-0.0\n0*0\n' | awk '($1 == 0)' will match 0, 0.0 and -0.0, and also "0 also", not just "0" (which is sometimes what's needed, sometimes not). If the user wants only "0": awk '/^0$/' (or grep '^0$'). Also, you should edit: the user needs to add ! to negate the test, so that it hides 0 (and other zeroes) and displays the rest, i.e. awk '!( $0 == 0)' – Olivier Dulac Feb 14 '19 at 15:20
@Olivier, or check the string value: $1 == "0" – glenn jackman Feb 14 '19 at 18:04
@OlivierDulac I explicitly used > rather than != (or, equivalently, ! (… == …)) to highlight that this is an arbitrary numerical comparison, not just equality. As for your other comment, this is entirely true but then we’re essentially back in string comparison territory and the existing solution using grep works (though awk of course also works). – Konrad Rudolph Feb 15 '19 at 11:39
@KonradRudolph fair points :) – Olivier Dulac Feb 15 '19 at 18:57
@glennjackman: nice trick indeed. But then OP would rather do test $0=="0" – Olivier Dulac Feb 15 '19 at 18:59

score 6 · Answer 4 · answered Feb 14 '19 at 07:34

grep's -w is a bit convoluted in that it splits the input into word constituents (letters, digits and underscores) and non-word constituents (anything else). Since the 0 in 0.02 is a valid word on its own, the line counts as a match, and the -v negation removes it.

Using sed is a bit easier in this context: it simply deletes the lines that consist of nothing but the match.

sed '/^0$/d' file
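
As a small sketch of what that does, using made-up input (the question's file isn't shown) fed through a pipe instead of a file:

```shell
# Hypothetical sample data.
sample=$(printf '0\n0.02\n10\n0\n')

# /^0$/ selects lines that are exactly "0"; d deletes them.
# Lines that merely contain a 0 (0.02, 10) pass through untouched.
kept=$(printf '%s\n' "$sample" | sed '/^0$/d')

printf '%s\n' "$kept"
```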

score 4 · Answer 5 · edited Apr 05 '19 at 22:55

When the lines you want to delete contain only a 0 followed by a newline, you can exclude them with the following command:

grep -v "^0$"

The pattern ^0$ matches only a 0 that is at the beginning and at the end of a line at the same time, i.e. a line consisting of a single 0. The -v option then inverts the selection, so every other line is printed.

edited Apr 05 '19 at 22:55

Rui F Ribeiro

answered Feb 14 '19 at 07:35

majesticLSD

This answer is almost identical to Arkadiusz Drabczyk's, but you forgot the -v, so it doesn't work. – Sparhawk Feb 14 '19 at 08:01
You're right. I was typing while he posted his answer so I didn't see it has already been given. I've misread that part with the -v option, thanks! – majesticLSD Feb 14 '19 at 08:10

score 0 · Answer 6 · edited Oct 13 '22 at 08:07

\b matches a word boundary:
```
grep -v "\b0\b"
```
Or match the beginning of the line, your pattern, and the end of the line:
```
grep -v "^0$"
```
Or, as @Sparhawk suggested, use -vx (--line-regexp).

Note that -w works as designed, but in your case 0.2 counts as two words because the dot character is a word separator.
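
To make the difference concrete, here is a small sketch comparing the two patterns on made-up input. It assumes a grep that understands \b (GNU grep does; other implementations may differ), and the variable names are only for illustration.

```shell
# Hypothetical sample data.
sample=$(printf '0\n0.2\n1.5\n')

# \b0\b matches a 0 standing as its own word, and the 0 in "0.2" qualifies
# because the dot ends the word, so that line is dropped too.
boundary=$(printf '%s\n' "$sample" | grep -v '\b0\b')

# ^0$ matches only when the whole line is "0".
whole=$(printf '%s\n' "$sample" | grep -v '^0$')

printf 'word-boundary pattern keeps:\n%s\n' "$boundary"
printf 'anchored pattern keeps:\n%s\n' "$whole"
```

So for the question as asked, the anchored form (or -vx) is the one that keeps 0.2.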

edited Oct 13 '22 at 08:07

AdminBee

answered Feb 14 '19 at 07:23

Jakub Jindra

grep -v "\b0\b" doesn't really work here. What version of grep do you use? – Arkadiusz Drabczyk Feb 14 '19 at 07:26
works with grep (BSD grep) 2.5.1-FreeBSD on macOS and grep (GNU grep) 2.16 on ubuntu – Jakub Jindra Feb 14 '19 at 08:02
GNU regexes use \< and \> as word boundaries, but that will have the same effect as -w – glenn jackman Feb 14 '19 at 13:53

score 0 · Answer 7 · answered Feb 14 '19 at 10:01

Another answer for the sake of variety, assuming you have a PCRE-enabled grep

grep -Pv "^0(?!\.)"

This uses a negative lookahead to match lines that start with a 0 that is not followed by a dot. The -v option then discards those matching lines and prints the rest.
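
A small sketch of how this behaves, on made-up input. It assumes a grep built with PCRE support (-P is a GNU grep option and may be unavailable elsewhere). Note that the pattern is not anchored at the end, so it also drops lines where the leading 0 is followed by more digits.

```shell
# Hypothetical sample data, including a line with a leading 0 and no dot.
sample=$(printf '0\n0.2\n0123\n7\n')

# Lines starting with a 0 that is not immediately followed by "." are matched,
# and -v removes them: both "0" and "0123" go.
kept_pcre=$(printf '%s\n' "$sample" | grep -Pv '^0(?!\.)')

# For comparison, -vx removes only the lines that are exactly "0".
kept_x=$(printf '%s\n' "$sample" | grep -vx 0)

printf 'lookahead keeps:\n%s\n' "$kept_pcre"
printf '%s\n' '-vx keeps:' "$kept_x"
```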

answered Feb 14 '19 at 10:01

mrbolichi

This will also remove lines such as 0123, which is not what the OP wants – iruvar Feb 14 '19 at 23:51

score 0 · Answer 8 · answered Feb 16 '19 at 09:12

Assuming any line which is not just a single 0 has a period

grep '\.' file

answered Feb 16 '19 at 09:12

Roger Mungo
