With GNU grep at least, in a UTF-8 locale, the pattern ,[^,]*, will not match ,something, if something contains sequences of bytes that don't form valid characters. For instance:
$ printf '1,\200,3,4,5,6,Nature Life,8\n' |
grep -cE '^([^,]*,){6}[^,]*Nature Life'
0
For awk field splitting, by contrast, it does not matter:
$ printf '1,\200,3,4,5,6,Nature Life,8\n' | awk -F, '$7 ~ /Nature Life/'
1,�,3,4,5,6,Nature Life,8
Run grep under LC_ALL=C to avoid issues with text in the wrong encoding (as long as the search string and the separator, here a comma, are ASCII):
$ printf '1,\200,3,4,5,6,Nature Life,8\n' |
LC_ALL=C grep -cE '^([^,]*,){6}[^,]*Nature Life'
1
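If the same check is needed for other fields, the LC_ALL=C command above could be wrapped in a small shell function. This is only a sketch: count_field_matches is a made-up name, and it assumes the pattern is an ASCII extended regex that contains no comma.

```shell
# count_field_matches N REGEX: hypothetical helper, not part of grep.
# Reads comma-separated text on stdin and prints the number of lines
# in which REGEX matches inside the Nth field.
# Assumption: REGEX is ASCII and contains no comma.
count_field_matches() {
  n=$1 regex=$2
  # LC_ALL=C makes grep work on bytes, so invalid UTF-8 in earlier
  # fields cannot stop [^,]* from matching.
  LC_ALL=C grep -cE "^([^,]*,){$((n - 1))}[^,]*${regex}"
}

printf '1,\200,3,4,5,6,Nature Life,8\n' | count_field_matches 7 'Nature Life'
```

With the sample line used above this prints 1. Like grep -c itself, the function returns a non-zero exit status when the count is 0.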