How to remove empty lines from beginning and end of file?

Question

I would like to remove empty lines from the beginning and the end of a file, but keep the empty lines between non-empty lines in the middle. I think sed or awk would be the solution.

Source:

1:
2:
3:line1
4:
5:line2
6:
7:
8:

Output:

1:line1
2:
3:line2

Those numbers (1:, 2:, ...) are not actually in the file, right? – RomanPerekhrest Nov 14 '19 at 15:05
@RomanPerekhrest Right, they only indicate the line number. – Feriman Nov 14 '19 at 15:06

Stack EG · Accepted Answer · 2019-11-19T05:37:33.913

34

Try this,

To remove blank lines from the beginning of a file:

sed -i '/./,$!d' filename

To remove blank lines from the end of a file:

sed -i -e :a -e '/^\n*$/{$d;N;ba' -e '}' file

To remove blank lines from the beginning and end of a file:

sed -i -e '/./,$!d' -e :a -e '/^\n*$/{$d;N;ba' -e '}' file

From man sed,

-e script, --expression=script -> add the script to the commands to be executed

: label -> Label for b and t commands. In the commands above, :a defines a label named a; it is not the a (append text) command.

b label -> Branch to label; if label is omitted, branch to end of script.

$ -> Match the last line.

d -> Delete pattern space; start next cycle.

N -> Add a newline to the pattern space, then append the next line of input to the pattern space. If there is no more input then sed exits without processing any more commands.
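
For anyone who wants to try the combined command without touching a file, here is a minimal sketch that runs it as a stream filter (no -i). The function name trim_blank_lines is my own, purely for illustration. Roughly: '/./,$!d' deletes everything before the first non-empty line; then, for each empty line, the loop at label a keeps appending following lines with N, and if the last line is reached while the pattern space still holds nothing but newlines, $d discards the whole run.

```shell
# Stream filter: reads stdin, writes the trimmed text to stdout.
# trim_blank_lines is a hypothetical helper name, not part of sed.
# Only lines that are completely empty are treated as blank here.
trim_blank_lines() {
    sed -e '/./,$!d' \
        -e ':a' \
        -e '/^\n*$/{$d;N;ba' \
        -e '}'
}

# Two leading and three trailing empty lines around the sample text.
printf '\n\nline1\n\nline2\n\n\n' | trim_blank_lines
```

On the sample input this should print line1, one empty line, and line2. Because it reads stdin and writes stdout, it also sidesteps the -i portability question.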

edited Nov 19 '19 at 05:37

answered Nov 14 '19 at 15:26

Stack EG

2

Note that -i is a non-portable extension to the POSIX sed utility and will not be available on all systems. – Andrew Henle Nov 15 '19 at 11:27
I see that these commands work, but I'm not quite sure how. Could you explain them in more detail? In particular, in the second example, why doesn't the first clause delete embedded blank lines? Why does the second clause need to loop? It looks like it gets a bunch of newlines at once. Does any of this work on white space-only lines or are you considering them non-blank? – Joe Nov 16 '19 at 16:41
Please explain the individual commands, how they are working and what is the meaning of those flags. – Prvt_Yadav Nov 17 '19 at 10:26
1

It's better to do something like ^[[:space:]]$ instead of just a newline since there are DOS, Linux, and Mac kinds of newlines that will mess you up if you just try to strip out one kind of them. – labyrinth Sep 13 '20 at 02:58
In regards to @AndrewHenle's caveat, the command works just as well for streaming, if you don't want to worry about the difference between GNU -i and BSD -i '' – Gordon Jul 18 '21 at 17:29
1

sed '/[^[:space:]]/,$!d', sed -e :a -e '/^[[:space:]]*$/{$d;N;ba' -e '}' and sed -e '/[^[:space:]]/,$!d' -e :a -e '/^[[:space:]]*$/{$d;N;ba' -e '}' can also remove lines with only spaces. (re-comment to fix a bug of the previous one) – Míng Dec 20 '22 at 10:03

glenn jackman · Answer 2 · 2019-11-15T18:19:10.550

10

This little awk program will remove empty lines at the start of a file:

awk 'NF {p=1} p'

So we can combine that with tac that reverses lines and get:

awk 'NF {p=1} p' file | tac | awk 'NF {p=1} p' | tac

line1

line2

Stealing @guillermo chamorro's command substitution trick:

awk 'NF {p=1} p' <<< "$(< file)"
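
In case the awk idiom looks cryptic, here is the same idea spelled out as a small sketch. The function and variable names (strip_leading_blank, strip_both_ends, seen) are mine, chosen for illustration only. It assumes tac is available; tac comes from GNU coreutils and is not specified by POSIX.

```shell
# Drop leading blank lines: once a line with at least one field shows up,
# set a flag and print every line from there on.
strip_leading_blank() {
    awk '
        NF { seen = 1 }   # NF is non-zero when the line has non-whitespace
        seen              # a bare true pattern prints the current line
    '
}

# Reverse the line order so the trailing blank lines become leading ones,
# strip them the same way, then reverse back.
strip_both_ends() {
    strip_leading_blank | tac | strip_leading_blank | tac
}

printf '\n\nline1\n\nline2\n\n\n' | strip_both_ends
```

With the default field separator, lines holding only spaces or tabs have NF equal to 0, so they count as blank here too.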

edited Nov 15 '19 at 18:19

answered Nov 14 '19 at 15:35

glenn jackman

2

Does this require that the lines are truly empty, or is it enough that they are blank? – Kusalananda Jul 18 '21 at 17:30
That's a good question. It seems that if we use the default FS, blank lines get ignored: echo $' \t \t ' | awk '{print NF}' prints 0, but if we specify a field separator: echo $' \t \t ' | awk -F '\t' '{print NF}' prints 3 – glenn jackman Jul 18 '21 at 18:26

score 8 · Answer 3 · answered Nov 14 '19 at 15:16

8

If the file is small enough to fit memory requirements:

$ perl -0777 -pe 's/^\n+|\n\K\n+$//g' ip.txt
line1

line2

-0777 to slurp entire input file
^\n+ one or more newlines from start of string
\n\K to prevent deleting newline character of last non-empty line
\n+$ one or more newlines at end of string

answered Nov 14 '19 at 15:16

Sundeep

3

or with (\s*\n)+ instead of \n+ to also remove lines that only contain whitespace. – ilkkachu Nov 14 '19 at 16:04

score 7 · Answer 4 · edited Jul 19 '21 at 20:33

7

I propose this:

printf '%s\n' "$(cat file)" | sed '/./,$!d'

It will print the whole text except the leading and trailing blank lines. So, if we extend the example:

(blank)
(blank)
line1
line2
line1
line2
line1
line2
line1
line2
(blank)
(blank)

It will output:

line1
line2
line1
line2
line1
line2
line1
line2
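
Why this works, as a small sketch: command substitution removes every trailing newline from the captured text, printf '%s\n' then puts exactly one back, and sed only has to deal with the leading blank lines. The function name trim_via_cmdsub and the temporary sample file are just for illustration; mktemp is widely available but is not a POSIX utility.

```shell
# trim_via_cmdsub is a hypothetical helper name; $1 is the file to read.
trim_via_cmdsub() {
    # The command substitution drops all trailing newlines;
    # printf appends a single one again before sed trims the front.
    printf '%s\n' "$(cat "$1")" | sed '/./,$!d'
}

# Build a sample with two leading and three trailing empty lines.
sample=$(mktemp)
printf '\n\nline1\n\nline2\n\n\n' > "$sample"
trim_via_cmdsub "$sample"
rm -f "$sample"
```

Keep in mind that the whole file is held in the shell's memory, and that shell variables cannot store NUL bytes, so this is only suitable for reasonably small text files.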

edited Jul 19 '21 at 20:33

Quasímodo

answered Nov 14 '19 at 15:57

schrodingerscatcuriosity

2

Clever. The trick here is that command substitution ($(cat file)) strips off trailing newlines. I'd offer 2 suggestions: 1) use the bash builtin $(< file) instead of cat; 2) use a here string: sed '/[^[:blank:]]/,$!d' <<< "$(<file)" – glenn jackman Nov 15 '19 at 18:17

jubilatious1 · Answer 5 · 2021-07-20T04:35:36.187

Using Raku (formerly known as Perl_6):

If the file is read into Raku with lines, then clever use of the trim function can be used to clean-up blank lines (i.e. whitespace) at the beginning and end of the file:

$ raku -e 'lines.join("\n").trim.put;' start_end.txt
lineX
line1
line2
line1
line2
line1
line2
line1
line2

The input file is the same one used by @schrodingerscatcuriosity (two blank lines at the start of the file, two blank lines at the end of the file). And if you only need to clean up the beginning or the end of the file(s), then trim-leading and trim-trailing are your friends.

Alternatively, below is a pretty straightforward translation of @Sundeep's Perl5 code, using a few Raku features:

raku -e 'S:g/ ^\n+ || \n+$ //.put given slurp;' start_end.txt

For the Perl5-to-Raku translation: the file is slurp-ed in and Raku's S/// non-destructive substitution operator is used to return the resultant string. Alternation is accomplished with Raku's || 'first-matching' alternation operator, since Raku's | alternation operator denotes Longest Token Matching (LTM, an improvement).

The Raku equivalent of Perl5's \K escape is simply <( ... )>, used singly or as a paired set. These capture markers instruct the regex engine to exclude anything matched before <( or after )> from the final match. [Note, however, that the \K equivalent in Raku appears unnecessary for the problem at hand.]

https://raku.org

score 2 · Answer 6 · answered Jul 20 '21 at 13:40

GNU sed has no limit on line length.

sed -z 's/^\n*\|\n*$//g' file

The -z flag tells sed to split its input on NUL characters instead of newlines. Since the file contains no NUL, the entire file is read as a single line.

For portability, however, it is recommended to keep the pattern and hold spaces to no more than 4000 bytes, because some other sed implementations limit line length. Also:

recursion is used to handle subpatterns and indefinite repetition. This means that the available stack space may limit the size of the buffer that can be processed by certain patterns.

score 1 · Answer 7 · answered Nov 16 '19 at 19:33

A simple 2 pass approach just for completeness:

$ awk 'NR==FNR{if (NF) { if (!beg) beg=NR; end=NR } next} FNR>=beg && FNR<=end' file file
line1

line2

The above treats lines of only blank chars as empty. If instead you only want lines with no chars at all to be considered empty then just change NF to /./.
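
The same two-pass idea written out with comments, as a sketch. The function name trim_two_pass and the variable names first and last are mine. It needs a regular file, since the file is read twice; it will not work on a pipe.

```shell
# trim_two_pass is a hypothetical helper name; $1 is the file to read.
trim_two_pass() {
    awk '
        # Pass 1: NR == FNR holds only while the first copy is being read.
        # Remember the first and last line numbers that have any fields.
        NR == FNR {
            if (NF) {
                if (!first) first = NR
                last = NR
            }
            next
        }
        # Pass 2: print only the lines inside that range.
        FNR >= first && FNR <= last
    ' "$1" "$1"
}

sample=$(mktemp)
printf '\n\nline1\n\nline2\n\n\n' > "$sample"
trim_two_pass "$sample"
rm -f "$sample"
```

If the file has no non-blank line at all, first and last stay unset and nothing is printed.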

score 1 · Answer 8 · answered Jul 18 '21 at 18:08

1

Expanding on @schrodingerscatcuriosity's command substitution trick:

cat <<< "$(tac <<< "$(tac file)")"

I guess there's still more room for shell magic.

answered Jul 18 '21 at 18:08

markgraf

1

This would also store the complete file in the shell's memory twice. – Kusalananda Jul 20 '21 at 06:40

score 0 · Answer 9 · answered Aug 30 '21 at 11:23

Ed and Ex are POSIX editors that can handle this task.

They are quite similar and in the solutions presented here ed and ex are 100% interchangeable¹.

General solution

printf '%s\n' a '' . 0a '' . '?.?+1,$d' '1,/./-1d' w q | ex -s file

If the file is known to have empty lines at the beginning and end

printf '%s\n' '?.?+1,$d' '1,/./-1d' w q | ex -s file

If you actually meant blank² lines

printf '%s\n' a '' . 0a '' . '?[^[:blank:]]?+1,$d' '1,/[^[:blank:]]/-1d' w q | ex -s file

Explanation and breakdown

The manual is always the best explanation, but here is an overview:

Ed and Ex always start with the last line selected (so if we were to issue an unadorned d, the delete command, it would delete the last line), and they can look for lines matching regular expressions.

Some commands take addresses ("line numbers"), e.g. 3,6d deletes from lines 3 to 6.

/regex/ looks ahead for the first line matching "regex".
?regex? looks behind for the first line matching "regex".

Guess what? A regex can be an address too.

# Insert an empty line at the end
a

.
# Insert an empty line at the beginning
0a

.
# Delete from line L1 up to line L2, where
# L1 is the line below the last non-empty line: ?.?+1
# L2 is the last line: $
?.?+1,$d
# Delete from line L3 up to line L4, where
# L3 is the first line: 1
# L4 is the line above the first non-empty line: /./-1
1,/./-1d
# Write the changes to the file and quit
w
q

Why do we need to temporarily add two empty lines for the general solution? Because otherwise 1,/./-1d would always delete the first line and ?.?+1,$d the last, even if not empty.

¹ But IIRC a clean Debian installation lacks Ed, so I'm going with Ex.
² I.e., lines that are visually empty but may contain spaces and tabs.

Praveen Kumar BS · Answer 10 · 2019-11-17T09:23:07.860

-1

command

sed -n '/[a-zA-Z]/,/[a-zA-Z]/p' file| awk 'OFS=":"{$2=$1;$1=NR;print }'

output

1 line1
2 
3 line2

edited Nov 17 '19 at 09:23

answered Nov 17 '19 at 08:53

Praveen Kumar BS

6

There's no need to use sed here as awk could easily do the same test. On the other hand, the line numbers are only for illustration in the question, which means that you may not need awk at all. In any case, there's almost never a reason to use both sed and awk in the same pipeline. – Kusalananda Nov 17 '19 at 09:47
