6

I am looking for a way to concatenate lines based on the next line. So far the only way I see is to create a shell script that reads the file line by line and does something along these lines:

while read line
    if $line does not start with "," and $curr_line is empty
        store $line in $curr_line
    if $line does not start with "," and $curr_line is not empty
        flush $curr_line to file
        store $line in $curr_line
    if $line starts with ","
        append $line to $curr_line, flush $curr_line to file, empty $curr_line
done < file
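For reference, the pseudocode above could be turned into a runnable POSIX shell loop roughly as follows. This is only a sketch: the function name `join_comma_lines` is made up for illustration, it reads standard input and writes to standard output, and it also flushes the pending line at end of input, a step the pseudocode leaves out.

```shell
#!/bin/sh
# join_comma_lines is a hypothetical helper name, not part of the question.
# It appends every line that starts with "," to the line before it.
join_comma_lines() {
    curr_line=
    have_line=0
    # "|| [ -n "$line" ]" also handles a final line with no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ,*)
                # continuation line: glue it onto the pending line
                curr_line=$curr_line$line
                have_line=1
                ;;
            *)
                # ordinary line: flush the pending line, then hold this one
                if [ "$have_line" -eq 1 ]; then
                    printf '%s\n' "$curr_line"
                fi
                curr_line=$line
                have_line=1
                ;;
        esac
    done
    # flush whatever is still pending once the input ends
    if [ "$have_line" -eq 1 ]; then
        printf '%s\n' "$curr_line"
    fi
}

# Example run on a small sample
printf '%s\n' line0 line1 ,line2 line3 | join_comma_lines
```

It would be called as `join_comma_lines < file > newfile`.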

So I am trying to understand whether this could be achieved with sed, or even grep with redirection. The rules of the file are simple: a line is followed by at most one line starting with ",", and that line needs to be appended to the previous line.

ex:

line0
line1
line2
,line3
line4
line5
,line6
line7
,line8
line9
line10
line11

The result file would be

line0
line1
line2,line3
line4
line5,line6
line7,line8
line9
line10
line11
don_crissti
BitsOfNix

4 Answers

9

I'd do:

awk -v ORS= '
  NR>1 && !/^,/ {print "\n"}
  {print}
  END {if (NR) print "\n"}' < file

That is, only print the newline character that delimits the previous line if the current line does not start with a ,.
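To see why the pattern has to be anchored to the start of the line, here is a small sketch that wraps the same awk program in a shell function (the name `join_awk` is just for illustration) and feeds it a line with a comma in the middle. Only lines that begin with a comma should be joined.

```shell
#!/bin/sh
# join_awk is a hypothetical wrapper name; it reads stdin and writes stdout.
join_awk() {
    awk -v ORS= '
      # end the previous line unless the current one starts with a comma
      NR > 1 && !/^,/ { print "\n" }
      # print the current line with no trailing newline (ORS is empty)
      { print }
      # end the last line, but only if there was any input at all
      END { if (NR) print "\n" }
    '
}

# "a,b" contains a comma but does not start with one, so it stays on its own line
printf '%s\n' 'a,b' 'c' ',d' | join_awk
```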

In any case, I wouldn't use a while read loop.

8

This is a classic use-case for sed, as explained in Sed One-Liners Explained, Part I: File Spacing, Numbering and Text Conversion and Substitution, 40. Append a line to the previous if it starts with an equal sign "=". (with the obvious modification of , for =)

sed -e :a -e '$!N;s/\n,/,/;ta' -e 'P;D' file
line0
line1
line2,line3
line4
line5,line6
line7,line8
line9
line10
line11
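For anyone unfamiliar with the N/P/D idiom, the same script can be spread over several lines with a comment per command. This is my own annotated sketch of the one-liner, wrapped in a function whose name (`join_sed`) is made up for illustration; the behaviour should be identical.

```shell
#!/bin/sh
# join_sed is a hypothetical wrapper name; it reads stdin and writes stdout.
join_sed() {
    sed '
        :a
        # append the next input line to the pattern space, unless on the last line
        $!N
        # if the appended line starts with a comma, drop the newline before it
        s/\n,/,/
        # the substitution succeeded: loop, in case more lines need joining
        ta
        # print the pattern space up to the first newline
        P
        # delete up to that newline and restart the cycle with the remainder
        D
    '
}

printf '%s\n' line0 ,line1 line2 | join_sed
```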
steeldriver
7

All you need to do is slurp the file and remove any newlines before commas:

$ perl -0777pe 's/\n,/,/g' file
line0
line1
line2,line3
line4
line5,line6
line7,line8
line9
line10
line11
terdon
  • perl -0007 is for BEL delimited records. Use perl -777 to slurp the whole file in, though any byte other than newline and , as the record delimiter would also work here. – Stéphane Chazelas Nov 15 '16 at 15:16
  • @StéphaneChazelas thanks, I always get this confused. Personally, I tend to just use -0pe for slurping so I don't remember the canonical one. – terdon Nov 15 '16 at 15:31
  • (sorry, meant -0777 above). Yes, it can be confusing. -0 is for NUL-delimited records, -00 is for paragraph mode (I sometimes get confused between those two), and -0<octal-greater-than-0> delimits on the corresponding byte, unless the number is greater than 255 (0377), in which case, as it's not a valid byte value, it slurps the whole file in. – Stéphane Chazelas Nov 15 '16 at 15:36
  • @StéphaneChazelas OK, so unless the file actually contains \0, using -0 to slurp should be fine, right? I've even asked a question about this, it's really confusing the way the numbers swing back. – terdon Nov 15 '16 at 15:52
2

This is a perfect use case for ex.

If you haven't heard of it, ex is the predecessor to vi and, like vi, it is specified by POSIX and available essentially* everywhere.

ex is actually designed for file editing, but you can use it without saving changes as well.


Print changes, do not save to file:

printf '%s\n' 'g/^,/-j!' %p | ex file.txt

Make changes and save to file:

printf '%s\n' 'g/^,/-j!' x | ex file.txt

Explanation:

I use the printf command as a wrapper for ex for scripted file edits. This form has the advantage that on any failure (e.g. you pass a command that isn't a real command, or you try to address a line number that doesn't exist), the command simply exits (without saving any changes) rather than waiting for other input.

You can view the exact commands passed by printf to ex by running the printf command by itself:

$ printf '%s\n' 'g/^,/-j!' %p
g/^,/-j!
%p

Okay, what do those commands do?

Well, g is the "global" command, and it runs the following command on all lines in the "buffer" (file) which match the regex ^, (start of line followed by a comma).

The command in this case is -j!. The - is an address, and it means "execute the following command on the line before the current line." (In other words, on the lines before lines starting with a comma.)

j is for "join" and it joins the line with the following line. The exclamation mark (!) suppresses using a space character to separate the original line from the line joined to it.

% is an address meaning "the entire buffer," and p means to "print."

x means to save changes and exit.

As I said, it's a perfect example of a use case for ex.


*Except Windows. :P

Wildcard
  • Downvoter please comment. This answer does exactly what is requested and will work on any Unix or Linux system. – Wildcard Nov 16 '16 at 22:44