
I am trying to add an include of the standard C header stdint.h to every file that contains the word LARGE_INTEGER, as part of converting Windows drivers to Linux, as discussed here for datatypes. I know the previous threads about find and xargs, here and here.

My code, whose GNU find part is based mainly on this thread:

gfind /tmp/ -type f                                                      \
    \( -name "*.h" -o -name "*.cpp" \)                                   \
    -exec ggrep -Pl "LARGE_INTEGER" {} +

and a pseudocode extension of it, which should edit only the files that contain the word LARGE_INTEGER:

gfind /tmp/ -type f                                                \
    \( -name "*.h" -o -name "*.cpp" \)                             \
    -and -exec ggrep -Pl "LARGE_INTEGER" {} \;                     \
    | xargs gsed -i '1s/^/#include <stdint.h>\n/'

I am uncertain about the -and, and the command gives

gsed: can't read /tmp/: No such file or directory
...

I followed examples in commandlinefu here.

How can I chain a GNU sed command onto the results of find?

2 Answers


I'd use [1] find with two -exec actions, e.g.:

find . -type f -exec grep -qF SOME_STRING {} \; -exec sed 'COMMAND' {} \;

The second command runs only if the first one evaluates to true, i.e. exits with code 0, so sed processes a file only if it contains SOME_STRING. It's easy to see how this works:

find . -type f -exec grep -qF SOME_STRING {} \; -print

This should list only those files that contain SOME_STRING. You can chain more than two expressions and also use operators like ! (negation), e.g.:

find . -type f -exec grep -qF THIS {} \; ! -exec grep -qF THAT {} \; -print

will list only those files that contain THIS but don't contain THAT.
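To try the chaining behaviour without touching real files, here is a minimal sandbox sketch. The temporary directory and the three file names are hypothetical; only find, grep and sort are assumed to be available.

```shell
#!/bin/sh
# Throwaway sandbox; every file name below is made up for illustration.
dir=$(mktemp -d)
printf 'THIS\n'       > "$dir/only_this.txt"
printf 'THIS\nTHAT\n' > "$dir/both.txt"
printf 'nothing\n'    > "$dir/neither.txt"

# -exec ... \; acts as a test: find moves on to the next expression
# only when the command exits with status 0.
with_this=$(find "$dir" -type f -exec grep -qF THIS {} \; -print | sort)

# Negating the second test keeps files that contain THIS but not THAT.
this_not_that=$(find "$dir" -type f -exec grep -qF THIS {} \; \
    ! -exec grep -qF THAT {} \; -print)

printf 'contain THIS:\n%s\n' "$with_this"
printf 'contain THIS but not THAT:\n%s\n' "$this_not_that"
rm -rf "$dir"
```

The first list should name both.txt and only_this.txt, the second only only_this.txt.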
Anyway, in your case:

gfind /tmp/ -type f \( -name "*.h" -o -name "*.cpp" \) \
-exec ggrep -qF LARGE_INTEGER {} \; \
-exec gsed -i '1s/^/#include <stdint.h>\n/' {} \;
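One thing to keep in mind is that running the command above twice adds the include line twice. A possible guard, sketched here under the assumption that GNU sed is installed as plain sed (use gsed, gfind and ggrep on your system), is a negated third test that skips files which already contain the include. The function name add_include and the sandbox files are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch; assumes GNU sed, which understands -i and \n in the replacement.
dir=$(mktemp -d)
printf 'LARGE_INTEGER counter;\n' > "$dir/driver.h"
printf 'int unrelated;\n'         > "$dir/other.h"

# Hypothetical helper: patch files that mention LARGE_INTEGER
# and do not yet include stdint.h.
add_include() {
    find "$dir" -type f \( -name "*.h" -o -name "*.cpp" \) \
        -exec grep -qF LARGE_INTEGER {} \; \
        ! -exec grep -qF '#include <stdint.h>' {} \; \
        -exec sed -i '1s/^/#include <stdint.h>\n/' {} \;
}

add_include
add_include   # the second run changes nothing because of the negated test

driver=$(cat "$dir/driver.h")
other=$(cat "$dir/other.h")
printf '%s\n' "$driver"
rm -rf "$dir"
```

After both runs driver.h should start with a single include line, and other.h should be unchanged.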

[1] I assume your xargs doesn't support the -0 or --null option. If it does, use the following construct:

find . -type f -exec grep -lFZ SOME_STRING {} + | xargs -0 gsed -s -i 'COMMAND'

i.e. in your case:

gfind /tmp/ -type f \( -name "*.h" -o -name "*.cpp" \) \
-exec ggrep -lFZ LARGE_INTEGER {} + | \
xargs -0 gsed -s -i '1s/^/#include <stdint.h>\n/'

It should be more efficient than the first one, since grep and sed are each started once per batch of files rather than once per file.
Also, both will work with all kinds of file names. Note that I'm using grep with -F (fixed string) because it is faster, so remove it if you're planning to use a regex instead.
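A small sandbox sketch of the NUL-delimited pipeline, with a file name that contains a space. It assumes GNU grep (for -Z), GNU sed (for -i) and an xargs that supports -0, all under their plain names; the file names are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch; assumes GNU grep, GNU sed and an xargs with -0.
dir=$(mktemp -d)
printf 'LARGE_INTEGER a;\n' > "$dir/with space.h"
printf 'LARGE_INTEGER b;\n' > "$dir/plain.cpp"
printf 'int c;\n'           > "$dir/skip.h"

# grep -lZ prints each matching name followed by a NUL byte and
# xargs -0 splits on NUL, so the space in the first name is harmless.
find "$dir" -type f \( -name "*.h" -o -name "*.cpp" \) \
    -exec grep -lFZ LARGE_INTEGER {} + |
    xargs -0 sed -s -i '1s/^/#include <stdint.h>\n/'

# Record the first line of every file before cleaning up.
spaced=$(head -n 1 "$dir/with space.h")
plain=$(head -n 1 "$dir/plain.cpp")
skipped=$(head -n 1 "$dir/skip.h")
printf '%s\n%s\n%s\n' "$spaced" "$plain" "$skipped"
rm -rf "$dir"
```

The two files that mention LARGE_INTEGER should now begin with the include line, while skip.h keeps its original first line.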

don_crissti
  • Thank you for the update! Your codes 4-5 are great! The -size -5k flag is a time-saver because I have had binary data files larger than 200 MB. Why do you use -print in codes 2-3? I do not find it intuitive. – Léo Léopold Hertz 준영 Jun 30 '15 at 19:36

Just pipe the output of gfind to xargs:

gfind /tmp/ -type f \( -name "*.h" -o -name "*.cpp" \) \
    -exec ggrep -l "LARGE_INTEGER" {} + | xargs gsed -i '1s/^/#include <stdint.h>\n/'

Notice that I've removed the -P option from ggrep, since you're matching a fixed string.

However, this solution doesn't deal well with filenames containing whitespace or newlines; a safer way is to make gfind output NUL-terminated filenames and to process them in a while loop:

#!/bin/bash

gfind /tmp/ -type f \( -name "*.h" -o -name "*.cpp" \) -print0 | while IFS= read -r -d '' filepath; do
    # grep -q exits 0 on a match, so gsed runs only for matching files
    ggrep -qF "LARGE_INTEGER" "$filepath" && gsed -i '1s/^/#include <stdint.h>\n/' "$filepath"
done

If you like one-liners:

gfind /tmp/ -type f \( -name "*.h" -o -name "*.cpp" \) -print0 | while IFS= read -r -d '' filepath; do ggrep -qF "LARGE_INTEGER" "$filepath" && gsed -i '1s/^/#include <stdint.h>\n/' "$filepath"; done
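The loop above relies on read -d, which is a bash feature. If the script has to run under a plain POSIX sh, one alternative (a different technique from the loop above, sketched under the assumption that GNU sed is installed as sed) is to hand the file names to an inline sh script through -exec ... {} +. The sandbox file names are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch; assumes GNU sed for -i and \n in the replacement.
dir=$(mktemp -d)
printf 'LARGE_INTEGER a;\n' > "$dir/odd name.h"
printf 'int b;\n'           > "$dir/skip.cpp"

# find passes the names as positional parameters, so no name is ever
# split on whitespace; "for filepath do" iterates over "$@".
log=$(find "$dir" -type f \( -name "*.h" -o -name "*.cpp" \) -exec sh -c '
    for filepath do
        if grep -qF LARGE_INTEGER "$filepath"; then
            sed -i "1s/^/#include <stdint.h>\n/" "$filepath"
            printf "patched: %s\n" "$filepath"
        fi
    done
' sh {} +)

first_line=$(head -n 1 "$dir/odd name.h")
untouched=$(cat "$dir/skip.cpp")
printf '%s\n' "$log"
rm -rf "$dir"
```

The body of the for loop is an ordinary shell script, so further per-file processing can be added there much as in the bash version.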
kos
  • I like the while loop approach a lot since it is easy to read and edit afterwards. Any idea about its efficiency compared to don_crissti's approach? – Léo Léopold Hertz 준영 Jun 30 '15 at 05:12
  • @Masi I think don_crissti's first method is more efficient than my second method. However, as you said, a bash script might fit better if you plan to do further processing of the input. If you don't, just use don_crissti's first or second method (or my first method, which matches don_crissti's second); otherwise use my second method, which will probably accommodate further extensions a little better. – kos Jun 30 '15 at 11:42