9

So I should probably use grep for this. Since I need a recursive search, I should use grep -r. But then I don't know what to do next ;)

How can I do that?

pushandpop

5 Answers

14

With greps that support the -r (recursive) and -P (PCRE) options (or pcregrep with -r):

grep -rP '^(?=.{101}).*?if' .

Or POSIXly:

find . -type f -exec awk 'length > 100 && /if/ {
   print FILENAME ": " $0}' {} +

Note that the behaviour varies between implementations for non-text files (files containing byte sequences that don't form valid characters, zero byte values, overly long lines, or data after the last newline). Also note that some grep implementations search non-regular files or follow symbolic links.
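To see what the POSIX variant selects, here is a minimal sketch that runs it against a throwaway directory. The directory layout, the file name sample.txt and the sample lines are made up for illustration; only the find/awk command comes from the answer.

```shell
# Build a throwaway tree with three kinds of lines (hypothetical file names).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
long=$(printf '%0120d' 0)            # 120 zeros: long, no "if"
{
  printf '%s\n' "if (short)"         # contains "if" but is short
  printf '%s\n' "$long"              # long but lacks "if"
  printf '%s if (x)\n' "$long"       # long and contains "if"
} > "$dir/sub/sample.txt"

# The POSIX find + awk command from the answer, run against the sample tree.
matches=$(find "$dir" -type f -exec awk 'length > 100 && /if/ {
   print FILENAME ": " $0}' {} +)
printf '%s\n' "$matches"
rm -rf "$dir"
```

Only the third sample line should be reported, prefixed with the path of sample.txt.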

  • I'm only aware of one grep that does, and even that one requires its support built-in to the lib c to work as I understand it. Are there more? – mikeserv Dec 11 '15 at 18:18
  • 1
    @mikeserv, not sure what you mean with lib c. PCRE regexp support is generally provided by libpcre. An exception is the grep from ast-open that has its own implementation of perl-like regexps. grep implementations that support grep -rP do include GNU, FreeBSD/OS/X (a rewrite of GNU grep now diverging) and ast-open's – Stéphane Chazelas Dec 11 '15 at 18:27
12

Use awk to check the length of $0 and the presence of the substring if:
awk '( length($0) > 100 && index($0,"if") ){print}' file

If "if" should be a word (as opposed to a simple substring), you could use awk '( length($0) > 100 && match($0,/\<if\>/) ){print}' file
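If \< and \> are not available in your awk, a rough portable approximation is to require a non-word character (or the start or end of the line) on each side of if. The sketch below assumes that "word characters" means ASCII letters, digits and underscore, and the sample lines are invented for the demonstration.

```shell
pad=$(printf '%0100d' 0)             # 100 zeros, used to make lines long

# Four sample lines piped into awk; only long lines with "if" as a whole
# word should survive.
words=$(printf '%s\n' \
  "$pad if (x)" \
  "$pad diff only" \
  "if $pad" \
  "short if" |
  awk 'length > 100 && /(^|[^A-Za-z0-9_])if([^A-Za-z0-9_]|$)/')
printf '%s\n' "$words"
```

The first and third lines should be printed. The second is rejected because its only if sits inside diff, and the fourth because it is too short.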

Dani_l
  • 6
    The idiomatic way to write it in awk would be more like awk 'length > 100 && /if/' file. Note that for /\<if\>/, you need GNU awk. – Stéphane Chazelas Dec 10 '15 at 17:39
  • @StéphaneChazelas yeah, I saw your syntax and learned the right way, but since you already posted an answer I didn't see the point in modifying mine. I did upvote your answer, though. – Dani_l Dec 10 '15 at 17:44
8

You can use two greps connected by a pipe:

grep -r '.\{100\}' /path | grep 'if'

To exclude files with if in their paths or names, use ':.*if' instead of 'if' (could still break if your filenames or paths contain colons).
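The file-name pitfall is easy to reproduce. This is a small sketch, assuming a grep that supports -r; the file names diff.txt and code.txt are hypothetical and chosen so that one name contains if while its content does not.

```shell
dir=$(mktemp -d)
pad=$(printf '%0100d' 0)             # 100 zeros, used to make lines long
printf '%s\n' "$pad no keyword here" > "$dir/diff.txt"   # "if" only in the name
printf '%s\n' "$pad if (x)"          > "$dir/code.txt"   # "if" in the line

# Plain 'if' also matches the "diff.txt:" prefix that grep -r prints.
naive=$(grep -r '.\{100\}' "$dir" | grep 'if')
# ':.*if' only accepts an "if" that appears after the first colon.
strict=$(grep -r '.\{100\}' "$dir" | grep ':.*if')
printf 'naive:\n%s\nstrict:\n%s\n' "$naive" "$strict"
rm -rf "$dir"
```

The naive pipeline should report both files, while the stricter pattern should report only code.txt.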

choroba
3

Adapted from "Find any lines exceeding a certain length": any of the following will find lines of 100 characters or more (use 101 in the patterns, or >100 in the length tests, to require strictly more than 100):

grep '.\{100\}' file

perl -nle 'print if length$_>99' file

awk 'length($0)>99' file

sed -n '/.\{100\}/p' file

Choose your preferred method and pipe its output through grep if.

David King
  • 2
    The sed version can do both checks at once; example using GNU extensions: sed -n '/.\{100\}/{/if/p}' file. Similarly the perl one: perl -nle 'print if length$_>99 && /if/' – Toby Speight Dec 10 '15 at 17:59
3

With a single grep:

grep -vxE '.{0,99}|([^i]|i+[^if])*i*' <in >out

With -x, each alternative has to describe a line from head to tail, and -v then selects only the lines that neither alternative can describe. Any line of between 0 and 99 characters is matched by the first alternative and so is not selected. Any longer line that does not contain at least one if is matched by the second alternative (a run of characters in which no i is directly followed by f) and is likewise not selected.

printf '^%-100b$\n' 'if\nif' 'hey if' i if |
grep -nvxE '.{0,99}|([^i]|i+[^if])*i*'

3:^hey if                                                                                              $
5:^if                                                                                                  $

You might do better just to use two greps, though.

mikeserv