29

I have files that end in one or more newlines and should end in only one newline. How can I do that with Bash/Unix/GNU tools?

Example bad file:

1\n
\n
2\n
\n
\n
3\n
\n
\n
\n

Example corrected file:

1\n
\n
2\n
\n
\n
3\n

In other words: There should be exactly one newline between the EOF and the last non-newline character of the file.

Reference Implementation

Read the file contents, chop off one newline at a time until the content no longer ends in two newlines, then write it back:

#! /bin/python

import sys

with open(sys.argv[1]) as infile:
    content = infile.read()

# Drop one newline at a time until at most one remains at the end.
while content.endswith("\n\n"):
    content = content[:-1]

with open(sys.argv[2], 'w') as outfile:
    outfile.write(content)

Clarification: Of course, piping is allowed, if that is more elegant.

slm
  • 369,824
Bengt
  • 741

10 Answers

31

From useful one-line scripts for sed.

# Delete all trailing blank lines at end of file (only).
sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba' file
  • 6
    Thanks, I used the following to do it in place for multiple files:

    find . -type f -name '*.js' -exec sed --in-place -e :a -e '/^\n*$/{$d;N;};/\n$/ba' {} \;

    – jakub.g Nov 22 '13 at 09:48
  • @jakub.g in place and recursive is exactly what I needed. thank you. – Buttle Butkus Dec 13 '15 at 10:41
  • 1
    To add to the excellent comment from @jakub.g you can invoke the command like this on OS X: find . -type f -name '*.js' -exec sed -i '' -e :a -e '/^\n*$/{$d;N;};/\n$/ba' {} \; – davejagoda Feb 19 '18 at 18:35
21
awk '/^$/ {nlstack=nlstack "\n";next;} {printf "%s",nlstack; nlstack=""; print;}' file
Hauke Laging
  • 90,279
19

Since you already have answers with the more suitable tools sed and awk, you could take advantage of the fact that $(< file) strips off trailing blank lines.

a=$(<file); printf '%s\n' "$a" > file

That cheap hack only removes trailing empty lines; it won't remove trailing blank lines that contain spaces or other non-printing characters. It also won't work if the file contains null bytes.

In shells other than bash and zsh, use $(cat file) instead of $(<file).
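
For comparison, here is a minimal Python sketch of the same effect: strip every trailing newline, then add exactly one back. The function name strip_trailing_newlines is made up for this illustration, and the empty-input guard is an assumption modelled on the special case suggested in the comments; the one-liner above does not have it.

```python
def strip_trailing_newlines(text: str) -> str:
    # Roughly what a=$(<file); printf '%s\n' "$a" does to the file contents:
    # command substitution drops all trailing newlines, printf adds one back.
    stripped = text.rstrip("\n")
    # Assumption: keep an empty result empty instead of turning it into "\n".
    return stripped + "\n" if stripped else ""
```

As with the shell version, a trailing line that contains only spaces counts as content and is kept.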

llua
  • 6,900
  • +1 to point out what looks like a bug to me : $(<file) isn't really reading the file? why does it discard trailing newlines? (it does, i just tested, thanks for pointing it out!) – Olivier Dulac Jul 04 '13 at 09:23
  • 3
    @OlivierDulac $() discards trailing newlines. That's a design decision. I assume that this shall make the integration in other strings easier: echo "On $(date ...) we will meet." would be evil with the newline that nearly every shell command outputs at the end. – Hauke Laging Jul 04 '13 at 11:31
  • @HaukeLaging: good point, it's probably the source of that behaviour – Olivier Dulac Jul 04 '13 at 12:13
  • I added a special case to avoid appending "\n" to empty files: [[ $a == '' ]] || printf '%s\n' "$a" >"$file". – davidchambers Apr 14 '14 at 19:11
  • To strip multiple newlines off the start of a file, insert tac into the process (I use gnu coreutils on Mac, so gtac for me) : a=$(gtac file.txt); printf '%s\n' "$a" | gtac > file.txt – r_alex_hall Oct 03 '18 at 12:53
5

This question is tagged with ed, but nobody has proposed an ed solution.

Here's one:

ed -s file <<'ED_END'
a

.
?.?+1,$d
w
ED_END

or, equivalently,

printf '%s\n' a '' . '?.?+1,$d' w | ed -s file

ed will place you at the last line of the editing buffer by default upon startup.

The first command (a) adds an empty line to the end of the buffer (the empty line in the editing script is the line being added, and the dot (.) just returns to command mode).

The address of the second command (?.?) looks for the nearest previous line that contains something (even white-space characters), and then deletes (d) everything to the end of the buffer from the next line on.

The third command (w) writes the file back to disk.

The added empty line protects the rest of the file from being deleted in the case that there aren't any empty lines at the end of the original file.

Kusalananda
  • 333,661
5

You can use this trick with cat & printf:

$ printf '%s\n' "`cat file`"

For example

$ printf '%s\n' "`cat ifile`" > ofile
$ cat -e ofile
1$
$
2$
$
$
3$

The $ denotes the end of a line.

Bengt
  • 741
slm
  • 369,824
4

Here's a Perl solution that doesn't require reading more than one line into memory at a time:

my $n = 0;
while (<>) {
    if (/./) {
        print "\n" x $n, $_;
        $n = 0;
    } else {
        $n++;
    }
}

or, as a one-liner:

perl -ne 'if (/./) { print "\n" x $n, $_; $n = 0 } else { $n++ }'

This reads the file a line at a time and checks each line to see if it contains a non-newline character. If it doesn't, it increments a counter; if it does, it prints the number of newlines indicated by the counter, followed by the line itself, and then resets the counter.

Technically, even buffering a single line in memory is unnecessary; it would be possible to solve this problem using a constant amount of memory by reading the file in fixed-length chunks and processing it character by character using a state machine. However, I suspect that would be needlessly complicated for the typical use case.
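
A rough Python sketch of that chunked state-machine idea, for illustration only. The function name squeeze_trailing_newlines and the choice to feed it an iterable of byte chunks are assumptions of this sketch, not part of the Perl answer. The only state carried between bytes is a counter of newlines seen since the last non-newline byte.

```python
def squeeze_trailing_newlines(chunks):
    """Yield the bytes of `chunks` with the final run of newlines collapsed to one."""
    pending = 0  # newlines seen since the last non-newline byte
    for chunk in chunks:
        for byte in chunk:              # iterating over bytes yields ints
            if byte == 0x0A:            # "\n": hold it back for now
                pending += 1
            else:
                # A later byte proves the held newlines were not trailing.
                for _ in range(pending):
                    yield b"\n"
                pending = 0
                yield bytes([byte])
    # End of input: emit one newline if the input ended in any.
    if pending:
        yield b"\n"
```

A chunk source for standard input could be built with iter(functools.partial(sys.stdin.buffer.read, 65536), b""). Emitting one byte at a time is slow in Python; the sketch favours clarity over speed. Unlike the line-based Perl version, it leaves a file that consists only of newlines with a single newline.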

2

If your file is small enough to slurp into memory, you can use this

perl -e 'local($/);$f=<>; $f=~s/\n*$/\n/;print $f;' file
terdon
  • 242,166
0

This one is quick to type, and, if you know sed, easy to remember:

tac < file | sed '/[^[:blank:]]/,$!d' | tac

It uses the sed script to delete leading blank lines from useful one line scripts for sed, referenced by Alexey, above, and tac (reverse cat).

In a quick test on an 18MB, 64,000-line file, Alexey's approach was faster (0.036 vs 0.046 seconds).
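
To make the reverse, trim, reverse idea concrete, here is a small Python sketch of the same three steps. The function name drop_trailing_blank_lines is hypothetical, and it works on a list of lines without their newline terminators, which is a simplification of what the pipeline does on a stream.

```python
def drop_trailing_blank_lines(lines):
    """Return `lines` without the blank lines at the end (blank = only spaces or tabs)."""
    reversed_lines = lines[::-1]                       # first tac
    start = 0
    # sed '/[^[:blank:]]/,$!d': delete lines until one has a non-blank character.
    while start < len(reversed_lines) and reversed_lines[start].strip(" \t") == "":
        start += 1
    return reversed_lines[start:][::-1]                # second tac
```

Note that, like the sed expression, this also drops trailing lines that contain only spaces or tabs, not just empty ones.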

freeB
  • 111
0

In Python (I know it is not what you asked for, but it is more efficient, and it is a prelude to the Bash version). This truncates the file in place, without rewriting it and without reading the whole file, which is a good thing if the file is very large:

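#!/usr/bin/env python3
import sys

# Binary mode: Python 3 does not allow relative seeks on text-mode files.
with open(sys.argv[1], 'rb+') as f:
    size = pos = f.seek(0, 2)       # start at the end of the file
    # Walk backwards over the run of trailing newlines.
    while pos > 0:
        f.seek(pos - 1)
        if f.read(1) != b'\n':
            break
        pos -= 1
    # Keep one newline if the file ended in any; never grow the file.
    if pos < size:
        f.truncate(pos + 1)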

Note that it does not work on files where the EOL character is not '\n'.

jfg956
  • 6,336
0

A Bash version implementing the Python algorithm; it is less efficient because it spawns many processes:

#!/bin/bash
n=1
while test "$(tail -n $n "$1")" == ""; do
  ((n++))
done
((n--))
truncate -s $(($(stat -c "%s" "$1") - $n)) "$1"
jfg956
  • 6,336