I have certain questions regarding grep.
Why does the following command match '<Hello'?
$ grep -E "\<H" test
Hello World
<Hello
H<ello
What needs to be done to match '<Hello' only?
To prevent grep from interpreting the pattern as a regular expression, use -F (or --fixed-strings):
$ cat test
one < two
Hello World
X<H
A <H A
I said: <Hello>
$ grep -F '<H' test
X<H
A <H A
I said: <Hello>
Remember to quote the search pattern properly; otherwise your shell may interpret it. For example, if you ran grep -F <H test instead, the shell would open a file named "H" and connect it to grep's standard input, and grep would search that stream for the string "test". The following commands are roughly equivalent to each other, but not to the quoted command above:
grep -F <H test
grep -F test <H # location of `<H` does not matter
grep -F test H # reads the file H directly instead of via stdin
cat H | grep -F test # useless cat award
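The difference between the quoted and unquoted forms can be checked with a small sketch. The scratch directory and the file contents below are made up for illustration; only the two grep invocations come from the text above.

```shell
# Hypothetical scratch directory holding a file named "H" and a file named "test".
dir=$(mktemp -d)
printf 'this is a test\n' > "$dir/H"
printf 'X<H\nHello World\n' > "$dir/test"

# Unquoted: the shell redirects stdin from H, and grep searches it for "test".
unquoted=$(cd "$dir" && grep -F <H test)

# Quoted: grep searches the file "test" for the literal string "<H".
quoted=$(cd "$dir" && grep -F '<H' test)

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

rm -r "$dir"
```

The unquoted form prints a line from H, while the quoted form prints the line of test that contains <H.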
As for matching words only, have a look at the manual page grep(1):
-w, --word-regexp
Select only those lines containing matches that form whole words. The
test is that the matching substring must either be at the beginning of
the line, or preceded by a non-word constituent character. Similarly,
it must be either at the end of the line or followed by a non-word
constituent character. Word-constituent characters are letters,
digits, and the underscore.
Example usage (using the above test file):
$ grep -F -w '<H' test
A <H A
(-F is optional here as <H does not have a special meaning, but it may be useful if you intend to extend this literal pattern.)
To match the beginning of a word, you do need regular expressions though:
$ grep -w '<H.*' test # match words starting with `<H` followed by anything
A <H A
I said: <Hello>
I tried grep '\<<' test to match < at the beginning of each word, but it didn't work. Any idea why?
– user3539
Mar 06 '13 at 00:19
< is not considered a word character. See the manual page under "The Backslash Character and Special Expressions".
– Lekensteyn
Mar 06 '13 at 09:57
< is not a special character in any grep. However, in GNU grep, \< is special and matches the beginning of a word (a zero-width boundary), so \<H matches the H at the start of a word in every one of your input lines.
In all greps, \ is special. It can escape a special character to remove its special meaning (so it's matched literally), add a special meaning to an ordinary character (that's typically used to introduce new operators without breaking existing scripts; another way is to use things that would otherwise be invalid, like *? or (?), or introduce ANSI C escape sequences like \n, \t...
To remove the special meaning of \, as with the other special characters, you need another \.
So to match <Hello, you need:
grep -E '<Hello'
And to match \<Hello, you need:
grep -E '\\<Hello'
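As a rough check of the three patterns discussed so far, the sketch below runs them against the question's input plus one extra line containing a literal backslash. It assumes a grep that supports the \< word-boundary operator (GNU grep does); the temporary file is only for illustration.

```shell
f=$(mktemp)
# The three lines from the question, plus a line that starts with a backslash.
printf '%s\n' 'Hello World' '<Hello' 'H<ello' '\<Hello' > "$f"

# \<H : an H at the start of a word (GNU extension); counts matching lines.
word_start=$(grep -c -E '\<H' "$f")

# <Hello : the literal string "<Hello".
literal_lt=$(grep -E '<Hello' "$f")

# \\<Hello : a literal backslash followed by "<Hello".
literal_bs=$(grep -E '\\<Hello' "$f")

printf 'lines with a word starting in H: %s\n' "$word_start"
printf 'lines containing <Hello:\n%s\n' "$literal_lt"
printf 'lines containing \\<Hello:\n%s\n' "$literal_bs"

rm -f "$f"
```

Every line has an H at the start of a word, which is why the original \<H pattern was not selective.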
Note that both < and \ are special to the shell as well, so they also need quoting for the shell, hence the single quotes above. (\ is also special to the shell inside double quotes, though only in front of the characters that remain special there: newline, double quote, backslash, dollar and backtick. So you'd need grep -E "\\\<Hello" or grep -E "\\\\<Hello" to match \<Hello.)
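To see what grep actually receives after the shell has processed the quotes, one can print the three spellings side by side. This is only a sketch; the variable names are made up for illustration.

```shell
# The same regular expression written with single and double quotes.
single='\\<Hello'
double3="\\\<Hello"     # \\ becomes \, and \< stays \< inside double quotes
double4="\\\\<Hello"    # each \\ becomes \

# All three print the same two-backslash pattern.
printf '%s\n' "$single" "$double3" "$double4"

# The pattern matches a line containing a literal backslash before <Hello.
matched=$(printf '%s\n' '\<Hello' '<Hello' | grep -E "$double3")
printf 'matched: %s\n' "$matched"
```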
To make the pattern match the full line, add the -x option to grep:
grep -xE '<Hello'
would match only lines whose content is exactly "<Hello".
To match at the beginning of the line:
grep -E '^<Hello'
(would match "<Hello" and "<Hello world>", but not "World <Hello").
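The difference between -x and the ^ anchor can be sketched on a small sample file. The file contents here are the example strings mentioned above, written to a temporary file for illustration.

```shell
f=$(mktemp)
printf '%s\n' '<Hello' '<Hello world>' 'World <Hello' > "$f"

# -x: the whole line must be exactly "<Hello".
whole=$(grep -xE '<Hello' "$f")

# ^: the line only has to start with "<Hello".
start=$(grep -E '^<Hello' "$f")

printf 'whole-line match:\n%s\n' "$whole"
printf 'start-of-line match:\n%s\n' "$start"

rm -f "$f"
```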
To match <Hello only when it is not preceded by a non-blank character (my interpretation of your "at the beginning of a word"):
grep -E '(^|[[:blank:]])<Hello'
or with BRE:
grep '^\(.*[[:blank:]]\)\{0,1\}<Hello'
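A quick way to convince yourself that the ERE and BRE forms select the same lines is to run both on a few sample inputs. The sample lines below are assumptions chosen for illustration, not from the question.

```shell
f=$(mktemp)
printf '%s\n' '<Hello' 'World <Hello' 'X<Hello' 'Hello' > "$f"

# ERE: <Hello at the start of the line or right after a blank.
ere=$(grep -E '(^|[[:blank:]])<Hello' "$f")

# BRE: same idea, written as an optional "anything ending in a blank" prefix.
bre=$(grep '^\(.*[[:blank:]]\)\{0,1\}<Hello' "$f")

printf 'ERE:\n%s\n' "$ere"
printf 'BRE:\n%s\n' "$bre"

rm -f "$f"
```

Both select "<Hello" and "World <Hello" and skip "X<Hello", where < directly follows a non-blank character.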
grep \' inputfile – Rahul Patil Mar 05 '13 at 07:43
"<Hello" (no quotes) instead of "'<Hello'". Please clarify. – l0b0 Mar 05 '13 at 16:32
<Hello and Hello <World – user3539 Mar 05 '13 at 16:40