4

How do I replace the blank lines in a tab-delimited text file with the content of the row above on a Linux machine? For example:

101 abc group1
765 efg group2
345 hij group4

456 gfd group9
762 ert group7


554 fgt group11

Expected Output:

101 abc group1
765 efg group2
345 hij group3
345 hij group3
456 gfd group9
762 ert group7
762 ert group7
762 ert group7
554 fgt group11
αғsнιη
  • 41,407
LiNi
  • 101
  • perl -lpe '$_ ||= $last; $last = $_' works. By the way, your expected output has "group3" on two lines, but there was no "group3" in the input. –  Dec 12 '20 at 14:30

6 Answers

9

Here is one way with awk (p holds the most recent non-empty line, which is what gets printed when NF is zero).

awk 'NF {p = $0} {print p}' file

When the line is not empty, we store the line into p (for future use) and print p.

When NF==0 (for empty lines) we only print p.
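
To make the behaviour concrete, here is a small sketch that rebuilds the sample from the question in a temporary file and runs the command on it. The tab separators and the exact positions of the blank lines are assumptions inferred from the question text and its expected output.

```shell
# Recreate the sample input (tab-separated; blank-line positions are assumed).
input=$(mktemp)
printf '101\tabc\tgroup1\n765\tefg\tgroup2\n345\thij\tgroup4\n\n456\tgfd\tgroup9\n762\tert\tgroup7\n\n\n554\tfgt\tgroup11\n' > "$input"

# NF is 0 on blank lines, so p is only updated on non-empty lines;
# {print p} runs for every line and therefore repeats the last non-empty one.
out=$(awk 'NF {p = $0} {print p}' "$input")
printf '%s\n' "$out"

rm -f "$input"
```

With the default field separator, a line that contains only spaces or tabs also has NF equal to 0, so it is filled in the same way as a truly empty line.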

thanasisp
  • 8,122
  • Can you add a little more explanation? I am not good at awk: NF? p? $0? I like your solution. – codeholic24 Dec 17 '20 at 13:23
  • NF is the number of fields per line. p is just a variable, $0 is the whole line. You may find this read useful about the basic builtin awk variables and how to use them. – thanasisp Dec 17 '20 at 13:33
5

In awk (note that this one will print any empty lines that come before the first non-empty one):

$ awk '{ if(! NF){$0=last}else{last=$0;}}1' file
101 abc group1
765 efg group2 
345 hij group4 
345 hij group4 
456 gfd group9 
762 ert group7 
762 ert group7 
762 ert group7 
554 fgt group11  

Explanation:

NF holds the number of fields. If the line is empty, there are no fields so the variable will be 0.

  • if(! NF){$0=last}: if the number of fields is 0 (empty line), set the current line ($0) to the value of the variable last.
  • else{last=$0;}: if there are fields, so this line is not empty, set last to hold the contents of this line.
  • 1: the lone 1 at the end is an awk idiom: a pattern with no action prints the current line whenever the pattern evaluates to true, and any non-zero number is true (0 is false). So that 1 is equivalent to print $0.
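
A short sketch of two points made above: the trailing 1 behaves like an explicit print, and a blank line at the very top is printed unchanged because last is still empty at that point. The sample lines are made up for illustration.

```shell
# Made-up input: one leading blank line, then data with a gap in it.
make_input() { printf '\nfirst\n\nsecond\n'; }

# Variant from the answer: the bare 1 is a pattern that is always true.
with_one=$(make_input | awk '{ if (!NF) { $0 = last } else { last = $0 } } 1')

# Same logic with the print written out explicitly.
with_print=$(make_input | awk '{ if (!NF) { $0 = last } else { last = $0 } } { print $0 }')

printf '%s\n' "$with_one"
```

Both variables should hold the same text, starting with the untouched empty first line.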
$ awk '! NF ? $0=last : last=$0;' file
101 abc group1
765 efg group2 
345 hij group4 
345 hij group4 
456 gfd group9 
762 ert group7 
762 ert group7 
762 ert group7 
554 fgt group11  

Explanation: This is the same idea as above, written more concisely with the ternary operator. If NF is zero (an empty line), we set $0 to last; otherwise we set last to $0. Because the ternary expression is used as a pattern with no action, awk prints the line whenever the value of the chosen assignment is true, which is the case for ordinary non-empty text. The exceptions are an empty line that comes before any non-empty line has been seen, and a line whose content is numerically zero, such as a lone 0; those are not printed.

Since the above will not print lines that are just 0, you can use this instead if that is a problem for you:

awk '{! NF ? $0=last : last=$0};1' file
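
The difference between the two forms can be checked with a line that contains only 0. This is a sketch on made-up input; the parentheses around the assignments are added here for readability and are not part of the commands above.

```shell
# Made-up input: a blank line to fill, plus a line that is just 0.
make_input() { printf 'a\n\n0\nb\n'; }

# Ternary used as the pattern: a line is printed only if the assigned value is true,
# so the line "0" (numerically zero) is dropped.
as_pattern=$(make_input | awk '!NF ? ($0 = last) : (last = $0)')

# Ternary inside an action, followed by an unconditional 1: every line is printed.
in_action=$(make_input | awk '{ !NF ? ($0 = last) : (last = $0) } 1')

printf 'pattern form:\n%s\n\naction form:\n%s\n' "$as_pattern" "$in_action"
```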
terdon
  • 242,166
  • The two variants here are subtly different. In the second, if the very first input line is blank, last=$0 and the whole expression will also be blank, and that's falsy, so the line will not be printed. The first one will print an initial empty line, too. – ilkkachu Dec 11 '20 at 16:38
  • @ilkkachu thanks, I hadn't considered the case where the 1st line is empty. You're right that the 1st variant will print that first empty line, but the second won't. – terdon Dec 11 '20 at 17:33
  • awk '! NF ? $0=last : last=$0;' file will only print lines that don't evaluate numerically to zero (add a line that's just 0 somewhere to see that) since it's testing the value of $0 as a condition to decide whether to print the line or not. That seems like it'd probably be OK for the OPs input but YMMV - I'd always avoid relying on the result of some expression evaluating to decide whether to print or not unless you need to do so. – Ed Morton Dec 11 '20 at 18:17
  • 1
    @EdMorton thanks, I was trying to figure out how to test the return value of the ternary but couldn't. I added a note mentioning this. – terdon Dec 11 '20 at 18:50
  • 1
    d'oh, how did I not think of zeroes. – ilkkachu Dec 11 '20 at 22:06
5

Using the supplied input and sed:

$ sed -n '/^$/{g;};h;p' infile
101 abc group1
765 efg group2
345 hij group4
345 hij group4
456 gfd group9
762 ert group7
762 ert group7
762 ert group7
554 fgt group11 
$

Note: '/^$/{g;};h;p' is obviously more commonly/properly written as '/^$/g;h;p'. Just a style of mine!

As guest_7 pointed out (thank you), the sed command can also be written more simply as sed '/^$/g;h' infile
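
How the short form works may not be obvious, so here is the same sed script spread over several lines with comments. The three-line input is made up for illustration.

```shell
out=$(printf 'one\n\ntwo\n' | sed '
# On an empty line, replace the pattern space with the hold space,
# which contains the last line that was saved.
/^$/g
# On every line, copy the (now filled) pattern space to the hold space.
h
')
printf '%s\n' "$out"
```

Without -n, sed prints the pattern space at the end of each cycle, which is why no explicit p is needed in this form.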

As terdon pointed out, and something I did not think of initially, the "empty" lines may contain spaces or tabs (whitespace). In that case, a more robust solution would be the following (note that \s is a GNU sed extension):

$ sed '/^\s*$/g;h' infile

and a more portable solution supporting the various locales is:

$ sed '/^[[:blank:]]*$/g;h' infile
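
A small sketch of the difference, using made-up input whose second line holds a single tab instead of being truly empty:

```shell
# Made-up input: line 2 contains one tab character and nothing else.
make_input() { printf 'a\tb\n\t\nc\td\n'; }

# /^$/ does not match the tab-only line, so it is passed through unchanged.
strict=$(make_input | sed '/^$/g;h')

# [[:blank:]]* also matches lines made of spaces and tabs only.
lenient=$(make_input | sed '/^[[:blank:]]*$/g;h')

printf 'strict:\n%s\n\nlenient:\n%s\n' "$strict" "$lenient"
```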
fpmurphy
  • 4,636
3
$ awk '!NF{$0=p} {p=$0} 1' file
101 abc group1
765 efg group2
345 hij group4
345 hij group4
456 gfd group9
762 ert group7
762 ert group7
762 ert group7
554 fgt group11
Ed Morton
  • 31,617
  • 1
    This looks like a simple inversion of the logic in thanasisp's answer. Knowing you, I'm guessing this one will deal with certain edge cases better but I can't see which ones. Could you explain? – terdon Dec 12 '20 at 11:22
  • 1
    @terdon I wrote it before I noticed that @thanasisp's answer was very similar, so I upvoted theirs. I didn't delete mine because I prefer to have the record that is printed at the end of the script stored in $0 rather than in p: if you do anything else in the script, my approach already has $0, $1, etc. populated. With @thanasisp's you'd need to make it awk 'NF{p=$0} {$0=p} 1', which is less efficient since it does field splitting twice for every line instead of only for the empty lines, or add an extra duplicate-negated test: awk 'NF{p=$0} !NF{$0=p} 1'. – Ed Morton Dec 12 '20 at 14:49
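
To illustrate the point about $0 being populated: because the empty record is overwritten by assigning to $0, awk re-splits it, so later rules can use $1, $2 and so on for the filled-in lines as well. A minimal sketch that prints only the third column of a made-up, tab-separated sample:

```shell
# The blank second line is filled from the first, so $3 is available for it too.
out=$(printf '345\thij\tgroup4\n\n554\tfgt\tgroup11\n' |
  awk '!NF {$0 = p} {p = $0} {print $3}')
printf '%s\n' "$out"
```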
0

Adding to the existing answers:

awk 'NR<2 && !NF{next} NF{print} !NF{print line} NF{line=$0}' test

... will also take care of an empty first line (by ignoring it, since it has no previous input).
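
Note that NR<2 only skips a blank line in the very first position; a second consecutive blank line at the top would still be printed as an empty line. If the file may start with several blank lines, one possible variant (not from the answer above; the flag name seen is arbitrary) prints nothing until the first non-empty line has been read:

```shell
# Made-up input with two leading blank lines and one gap further down.
out=$(printf '\n\nx\n\ny\n' | awk 'NF {p = $0; seen = 1} seen {print p}')
printf '%s\n' "$out"
```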

0

Here's a shell-only solution:

while read -r l; do if test -z "$l"; then echo "$p"; else echo "$l"; p="$l"; fi; done < file

Tested with zsh and bash.
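
A possible hardening of the loop above, offered as a sketch rather than a tested replacement: IFS= stops read from trimming leading and trailing tabs or spaces (which matters for tab-delimited data with an empty first or last field), and printf avoids the backslash handling that differs between echo implementations. The function name fill_blanks is made up for this example.

```shell
# fill_blanks: copy stdin to stdout, replacing empty lines
# with the previous non-empty line.
fill_blanks() {
  prev=
  # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
  while IFS= read -r line; do
    if [ -z "$line" ]; then
      printf '%s\n' "$prev"
    else
      printf '%s\n' "$line"
      prev=$line
    fi
  done
}

# Made-up sample: the first field of the first line is empty (leading tab).
out=$(printf '\tabc\tgroup1\n\n765\tefg\tgroup2\n' | fill_blanks)
printf '%s\n' "$out"
```

For a file, it would be called as fill_blanks < file.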

Alexander
  • 9,850