
Ex:

Input file

******************
.WER
+ aaa bbb ccc
+ ddd eee 
+ fff ggg hhh
******************
.SDF
+ zzz xxx yyy 
+ iii  
+ kkk lll
******************
.XCV
+ uuu vvv ggg 
+ hhh qqq 
+ rrr ttt jjj
******************

Desired Output:

******************
.WER aaa bbb ccc ddd eee fff ggg hhh
******************
.SDF zzz xxx yyy iii kkk lll
******************
.XCV uuu vvv ggg hhh qqq rrr ttt jjj
******************

I want to append each line that starts with "+" to the previous line, replacing the "+" with a space.

Can anyone solve this problem using awk or sed (or even grep)?

I am a beginner in Linux, so please explain every part of the command line.

Jeff Schaller
yen

3 Answers


You tagged your question with /linux, so you are probably using GNU sed. In that case you can use the -z option to process the whole file in one buffer:

sed -z 's/\n+//g'

This replaces every line break (\n) that is followed by a + sign with nothing. In other words, each line starting with + is joined to the previous line and the + is dropped.
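As a small sketch of the behaviour described above, the snippet below wraps the command in a helper and feeds it a shortened sample. The function name join_plus and the sample text are made up for illustration, and -z is a GNU sed extension, so this assumes GNU sed is the sed on your PATH.

```shell
# Hypothetical helper (the name join_plus is an assumption, not from the answer).
# -z makes GNU sed split input on NUL bytes instead of newlines, so a text
# file without NUL bytes arrives in the pattern space as one big "line".
# s/\n+//g then deletes every newline that is directly followed by a "+".
join_plus() {
  sed -z 's/\n+//g'
}

# Shortened sample input in the same shape as the question.
printf '.WER\n+ aaa bbb\n+ ccc\n.SDF\n+ zzz\n' | join_plus
```

Because only the newline and the + are removed, the space that follows each + in the input is what ends up separating the joined words.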

Philippos
  • With POSIX sed, use sed 'H;1h;$!d;x;s/\n+//g' as explained in https://unix.stackexchange.com/questions/533277/how-do-i-process-the-whole-file-in-one-buffer-in-sed-without-gnu-z-option. Otherwise you can't avoid a loop like sed -e :1 -e 'N;s/\n+//;t1' -e 'P;D' – Philippos Aug 28 '19 at 17:06
  • I'm very surprised that this doesn't remove the last newline in the input, since -z tells sed to "Treat the input as a set of lines, each terminated by a zero byte (the ASCII ‘NUL’ character) instead of a newline" (from the man page), so the \n at the end of the file should be treated the same as every other \n. The POSIX scripts you posted remove the final \n as expected. Any idea why it doesn't get removed with the GNU sed -z version? – Ed Morton Sep 03 '19 at 13:27
  • I'm sorry, but I never really understood this description. Obviously they tried to express that the input is not changed, just treated differently. No newline is changed, not even the final one, but the parser looks for 0x00 instead of 0x0A to separate the input (and each file of input is followed by a null byte). I always believed I was too dumb to understand the manual, but maybe I'm not the only one confused by that sentence. (-; – Philippos Sep 03 '19 at 14:18
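Since the question asks for the commands to be explained, here is a sketch of the hold-space script from the first comment, split into one -e fragment per command. The function name join_plus_posix and the sample input are assumptions for illustration. The script itself is the one quoted in the comment.

```shell
# Hypothetical helper (the name join_plus_posix is an assumption).
#   H         append a newline plus the current line to the hold space
#   1h        on line 1, overwrite the hold space instead, so that it does
#             not start with the stray newline that H just added
#   $!d       on every line except the last, delete the pattern space and
#             start the next cycle, so nothing is printed yet
#   x         on the last line, swap in the hold space (the whole file)
#   s/\n+//g  delete every newline that is directly followed by a "+"
join_plus_posix() {
  sed -e 'H' -e '1h' -e '$!d' -e 'x' -e 's/\n+//g'
}

# Shortened sample input in the same shape as the question.
printf '.WER\n+ aaa bbb\n+ ccc\n.SDF\n+ zzz\n' | join_plus_posix
```

Like the -z variant, this collects the whole file in memory before the substitution runs.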

I came up with this in awk:

awk 'BEGIN {RS=""}{gsub(/\n\+/,"", $0); print $0}' file

Output:

******************
.WER aaa bbb ccc ddd eee  fff ggg hhh
******************
.SDF zzz xxx yyy  iii   kkk lll
******************
.XCV uuu vvv ggg  hhh qqq  rrr ttt jjj
******************

Using any awk in any shell on every UNIX box and only reading 1 line at a time into memory (the other solutions posted so far read the whole input file into memory at once):

$ awk '{printf "%s%s", (sub(/^\+/,"") ? "" : ors), $0; ors=ORS} END{print ""}' file
******************
.WER aaa bbb ccc ddd eee fff ggg hhh
******************
.SDF zzz xxx yyy iii kkk lll
******************
.XCV uuu vvv ggg hhh qqq rrr ttt jjj
******************
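To make the one-line-at-a-time approach easier to follow, the same logic can be sketched in a spread-out, commented form. The function name join_plus_awk and the variable names continued and sep are chosen here for illustration and are not from the answer.

```shell
# Hypothetical helper (the name join_plus_awk is an assumption).
join_plus_awk() {
  awk '
    {
      # sub() returns 1 if it removed a leading "+" from this line, else 0.
      continued = sub(/^\+/, "")
      # A continuation line is printed right after the previous text.
      # Any other line is preceded by sep, which is empty for the very
      # first line and a newline (ORS) from then on.
      printf "%s%s", (continued ? "" : sep), $0
      sep = ORS
    }
    # Terminate the last output line with a newline.
    END { print "" }
  '
}

# Shortened sample input in the same shape as the question.
printf '.WER\n+ aaa bbb\n+ ccc\n.SDF\n+ zzz\n' | join_plus_awk
```

The point of the design is that a newline is written before a line rather than after it, so awk never needs to look ahead to know whether the next line is a continuation.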
Ed Morton