3

To replace runs of multiple empty lines with a single one, I first need to find out which files in a big repository contain such runs. How do I do that?

mirabilos
  • 1,733

2 Answers

4

The pcregrep utility supports matching multi-line patterns (its -M option), so this is easy.

First, you need a list of files to search within; in a git repository, my own git find utility can be useful for this, but regular find(1) and other tools will also do.

Pass the list of files to pcregrep, dump its output into a temporary file, then hand-review the file list (e.g. to remove binaries that were present in the first list) before acting on it:

# easy to type version
git find | xargs pcregrep -l -M $'\n\n\n' >/tmp/x
# more secure version
git find -print0 | xargs -0r pcregrep -l -M $'\n\n\n' >/tmp/x
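If the git find helper is not available, a plain find(1) stand-in could look roughly like the sketch below. The function name list_files0 is made up for illustration, and -print0 is assumed to be supported by your find (GNU and BSD find have it).

```shell
# Hypothetical helper: list regular files below the current directory,
# NUL-terminated, without descending into the .git directory.
list_files0() {
    find . -name .git -prune -o -type f -print0
}

# Possible usage, mirroring the "more secure version" above:
#   list_files0 | xargs -0r pcregrep -l -M $'\n\n\n' >/tmp/x
```

Unlike a git-aware tool, this also lists untracked and ignored files, so the hand-review step matters even more.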

Note: the $'…' syntax needs support from your shell: GNU bash, AT&T ksh93, mksh, and zsh have it, as will POSIX sh from the upcoming version of the standard. Otherwise, type ', press Return three times, then type ' again.
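For scripts that must run in a shell without $'…', one way to avoid typing literal newlines is to build the pattern in a variable. This is only a sketch; the variable name nl3 is arbitrary.

```shell
# Build a string of exactly three newlines without $'…'.
# Command substitution strips trailing newlines, so a sentinel "x"
# is appended first and removed again afterwards.
nl3=$(printf '\n\n\nx')
nl3=${nl3%x}

# Possible usage, mirroring the commands above:
#   git find -print0 | xargs -0r pcregrep -l -M "$nl3" >/tmp/x
```

The quotes around "$nl3" are needed; unquoted, the shell would split the newlines away.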

If your initial list is sane enough, you can act on the result list directly:

# easy to type version
$EDITOR $(git find \*.java | xargs pcregrep -l -M $'\n\n\n')
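# somewhat more secure version (-d '\n' is a GNU xargs extension;
# without it, the last xargs would also split names on spaces)
git find -print0 | xargs -0r pcregrep -l -M $'\n\n\n' | xargs -d '\n' $EDITOR --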

⚠ The “more secure” version is required as soon as any filename contains “funny” characters, and even a space counts! “Somewhat more secure” in the last example refers to the fact that pcregrep’s -l option always LF-terminates its output and has no option to NUL-terminate it, so filenames with embedded newlines are never handled safely by this solution.

mirabilos
  • 1,733
2

With awk implementations that support nextfile:

... -print0 | xargs -r0 awk '
    FNR == 1 {n = 0}
    $0 == "" {
      if (++n == 2) {
        print FILENAME
        nextfile
      }
      next
    }
    {n = 0}'

Change print FILENAME to printf "%s\0", FILENAME for NUL-delimited filenames. Change $0 == "" to !NF to match blank lines (empty or whitespace-only) instead of only truly empty lines.
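If your awk lacks nextfile, a variant along the following lines should behave the same, at the cost of reading each matching file to the end. It is a sketch, not a drop-in from the answer above: the function name list_double_empty and the hit flag are invented here.

```shell
# Hypothetical helper: print the name of every file argument that
# contains at least two consecutive empty lines, without using nextfile.
list_double_empty() {
    awk '
        FNR == 1 { n = 0; hit = 0 }   # new file: reset counter and flag
        hit      { next }             # file already reported: skip the rest
        $0 == "" {
            if (++n == 2) {           # second empty line in a row
                print FILENAME
                hit = 1
            }
            next
        }
        { n = 0 }                     # non-empty line ends the run
    ' "$@"
}

# Possible usage:
#   ... -print0 | xargs -r0 sh -c '...'   # or simply:
#   list_double_empty src/*.java
```

Because it is a shell function, xargs cannot call it directly; either pass the files as arguments or paste the awk program into the xargs command line as in the answer.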