How to print two consecutive lines separated by one blank line into one line separated by "," (comma)

Question

How can I turn the text

St1
number1
1234

St2
number2
456

into the following one?

st1,number1,1234
st2,number2,1234

In the sample output the second line has 1234 where one would expect 456 - Is that a copy-paste error? — FelixJN, Mar 21 '22 at 07:37
Will your input file contain commas? In particular, will any single line in your input file contain an end-of-line comma, or two-or-more internal commas? — jubilatious1, Mar 21 '22 at 07:59

Stéphane Chazelas · Answer 1 · 2022-03-18T09:56:49.510

16

With the standard paste command:

$ paste -sd ',,\n\0' file
St1,number1,1234
St2,number2,456

serially pastes the lines of file, cycling through the delimiters ",", ",", newline, and nothing (\0 stands for the empty string, not a NUL character as one might think).

Or:

$ paste -d ',,\0' - - - - < file
St1,number1,1234
St2,number2,456

pastes stdin 4 times, with ",", ",", and nothing as the delimiters between the four columns.
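Both invocations rely on every record being exactly three lines followed by one empty line. A minimal sketch that reproduces them without a file, assuming GNU paste (where \0 means an empty delimiter); the sample function is a hypothetical helper that only emits the question's input:

```shell
# Hypothetical helper: the question's input, with one empty line between records.
sample() { printf '%s\n' St1 number1 1234 '' St2 number2 456; }

# -s joins all lines of one input; the delimiter list is reused cyclically:
# "," "," newline, empty, "," "," ...
sample | paste -sd ',,\n\0' -

# Without -s, each "-" operand reads the next line of stdin, so four operands
# consume four input lines (three fields plus the empty line) per output row.
sample | paste -d ',,\0' - - - -
```

If a record had two or four lines instead of three, the delimiter cycle would drift out of step with the input.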

edited Mar 18 '22 at 09:56

answered Mar 18 '22 at 09:29

Stéphane Chazelas

544,893

thank you very much, I did it – aphabeta Mar 22 '22 at 02:56
@aphabeta then you should accept this answer – DanieleGrassini Mar 27 '22 at 13:31

terdon · Answer 2 · 2022-03-18T08:48:15.643

A Perl way:

$ perl -00 -ne 'print join(",",split(/\n/)) . "\n"; ' file
St1,number1,1234
St2,number2,456

The -00 turns on perl's "paragraph mode", where "lines" are delimited by \n\n, so each paragraph is treated as a single "line". The -n means "read the input file line by line and apply the script given by -e to each line". The script splits the input "line" on \n (newline, the end-of-line character) and then joins the resulting elements with a comma. The result is printed with a trailing newline.

You could write the same thing like this for clarity:

$ perl -00 -ne '@fields=split(/\n/); $out=join(",",@fields); print "$out\n" ' file
St1,number1,1234
St2,number2,456

Or like this, for fun:

$ perl -00 -pe '$_=join(",",split(/\n/))."\n";' file
St1,number1,1234
St2,number2,456

RudiC · Answer 3 · 2022-03-18T10:38:20.260

6

Try awk:

awk '$1=$1' OFS=, RS= file
St1,number1,1234
St2,number2,456

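The empty record separator RS puts awk in paragraph mode, so records are delimited by the blank line, and assigning to a field (here: $1) makes awk rebuild the entire record from its fields using OFS. Because $1=$1 also serves as the pattern, the record is printed only when the assigned value is true, so this assumes $1 is never 0 or empty.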

edited Mar 18 '22 at 10:38

answered Mar 18 '22 at 09:23

RudiC

8,969

You at least need {$1=$1; print } or {$1=$1};1 in case the first field is zero, or force the string conversion with $1=$1"". – αғsнιη Mar 18 '22 at 09:49
I'd add FS='\n' to the variable list: not all whitespace should be replaced by commas, I'd imagine. – glenn jackman Mar 18 '22 at 13:54
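The two caveats raised in the comments can be reproduced with a few made-up inputs. This is only a sketch; the trailing - operand makes awk read standard input explicitly:

```shell
# First field is 0: the pattern $1=$1 evaluates to 0 (false), so nothing is printed.
printf '%s\n' 0 number1 1234 | awk '$1=$1' OFS=, RS= -

# Separating the assignment from the always-true pattern 1 prints the record.
printf '%s\n' 0 number1 1234 | awk '{$1=$1};1' OFS=, RS= -
# prints: 0,number1,1234

# With the default FS, a space inside a line also becomes a comma.
printf '%s\n' 'St 1' number1 1234 | awk '{$1=$1};1' OFS=, RS= -
# prints: St,1,number1,1234

# FS='\n' restricts field splitting to newlines.
printf '%s\n' 'St 1' number1 1234 | awk '{$1=$1};1' FS='\n' OFS=, RS= -
# prints: St 1,number1,1234
```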

Philippos · Answer 4 · 2022-03-18T12:05:25.923

4

For the sake of completeness, with any sed you could use:

sed 'N;N;y/\n/,/;n;d' file

With N;N you append the Next two lines to the pattern space.
y/\n/,/ replaces the newlines between the lines with commas (thanks @Stéphane for the hint to use y instead of s).
n prints the current pattern space and loads the next line (the empty one), which you then delete with d.
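The same script can be spread over several lines with one comment per step. This is a sketch tested only in thought against GNU sed; the sample function is a hypothetical stand-in for the input file:

```shell
# Hypothetical helper: the question's input, with one empty line between records.
sample() { printf '%s\n' St1 number1 1234 '' St2 number2 456; }

sample | sed '
  # N appends the next input line to the pattern space, after a newline
  N
  N
  # y turns every embedded newline into a comma
  y/\n/,/
  # n prints the joined record and loads the next line (the empty separator)
  n
  # d discards that empty line and starts the next cycle
  d
'
```

Like the one-liner, this assumes every record has exactly three lines.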

edited Mar 18 '22 at 12:05

answered Mar 18 '22 at 11:27

Philippos

13,453

score 4 · Answer 5 · answered Mar 18 '22 at 14:46

Making use of paragraph mode (-00) in Perl:

perl -pals -F'\n' -00e 's/.*/@F/s' -- -\"=, file

Perl in line by line mode:

perl -pe '
  chomp($_.=<>.<>),tr/\n/,/ if/./;
  eof && s/.\K$/\n/;
' file

POSIX sed:

sed '
  /./{H;$!d;}
  x;y/\n/,/;s/.//
' file

We can also do this with a GNU csplit + xargs paste pipeline:

csplit --suppress-matched -sz file '/^$/' '{*}'
printf '%s\n' xx* | xargs -r paste -sd,

Using the groupby function from Python's itertools module along with a list comprehension:

python3 -c 'import sys, itertools as it
ofs,(rs,ors) = ",","\n" * 2
g = lambda x: not len(x)
h = lambda x: x.rstrip(rs)
with open(sys.argv[1]) as f:
  print(*[ofs.join(igrp) for k,igrp in it.groupby(map(h,f),g) if not k],sep=ors)
' file

Output:

St1,number1,1234
St2,number2,456

Or perl -F\\n -als00eprint@F -- -,=,... – Stéphane Chazelas Mar 18 '22 at 17:55

Ed Morton · Answer 6 · 2022-03-18T18:20:24.583

4

Using any awk in any shell on every Unix box and no matter what your input values are and no matter how many lines are in each empty-line-separated record:

$ awk -v RS= -F'\n' -v OFS=',' '{$1=$1}1' file
St1,number1,1234
St2,number2,456
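A quick check of the "no matter how many lines" claim, using made-up input with records of different lengths and a line that contains a space:

```shell
# Hypothetical input: one record of two lines, one of four lines.
printf '%s\n' a 'b c' '' d e f g |
  awk -v RS= -F'\n' -v OFS=',' '{$1=$1}1'
# prints:
# a,b c
# d,e,f,g
```

Because FS is a newline, the space in "b c" is left alone, and each record is joined whatever its length.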

edited Mar 18 '22 at 18:20

answered Mar 18 '22 at 17:51

Ed Morton

31,617

It's more empty-line-separated records: blank lines that are not empty will not be considered as delimiters with RS="". Note that POSIXly, you need {$1=$1};1. I'm not aware of any implementation where the ; is required, but when I asked for the requirement to be relaxed, it got rejected based on input from the gawk maintainer. – Stéphane Chazelas Mar 18 '22 at 18:02
@StéphaneChazelas Fair enough, I changed the word blank to empty. I'm aware of that ; issue and choose to ignore it since, documented or not, an implementation that enforced it would break many existing awk scripts and so it's just never going to happen. – Ed Morton Mar 18 '22 at 18:21

score 2 · Answer 7 · answered Mar 21 '22 at 07:41

Replace all newlines with commas. Each empty line then shows up as two consecutive commas: replace those with a proper newline again. Also take care of the dangling comma at the end of the last line.

tr '\n' ',' <infile | sed 's/,,/\n/g;$s/,$//'
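The two stages are easier to follow when run separately. A sketch with a made-up three-record input, assuming GNU sed (which accepts \n in the replacement part of s); the records function is hypothetical:

```shell
# Hypothetical three-record input.
records() { printf '%s\n' a b '' c d '' e f; }

# Stage 1: every newline becomes a comma, so an empty line shows up as ",,".
records | tr '\n' ','; echo
# prints: a,b,,c,d,,e,f,

# Stage 2: turn every ",," back into a newline and drop the trailing comma.
records | tr '\n' ',' | sed 's/,,/\n/g;$s/,$//'; echo
# prints:
# a,b
# c,d
# e,f
```

The g flag matters once there are more than two records, since a plain s/,,/\n/ only replaces the first pair of commas on the single line that tr produces.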

answered Mar 21 '22 at 07:41

FelixJN

13,566

jubilatious1 · Answer 8 · 2022-04-02T15:59:01.660

Using Raku (formerly known as Perl_6)

Probably the most robust solution simulates Perl's -00 paragraph mode:

raku -e '.put for lines.join("\n").split(/\n**2..*/).map(*.trans: "\n" => ",");'

Sample Input

St1
number1
1234

St2
number2
456

Sample Output

St1,number1,1234
St2,number2,456

The answer above uses Raku's lines routine, which reads input lines lazily and strips newlines by default. Basically, with the code above, lines are read in, join-ed back together with newlines, and split wherever \n**2..* (two or more consecutive newlines) is found. Then each element is map-ped over, and any remaining single \n newlines are translated to commas. Finally, the resulting elements are output using Raku's .put for... idiom ('put' stands for 'print-using-[newline]-terminator').

Again, the above is the most robust solution. However, you can get a similar result with simpler code, assuming you don't mind output lines that end with a trailing comma. The code exploits the difference between print (no terminal newline) and put (adds a terminal newline):

raku -ne 'if .chars { print $_~"," } else { put $_ };'

The code above preserves line-spacing where sections are separated by multiple consecutive blank lines (unlike the first solution above, which removes all blank lines).

https://unix.stackexchange.com/a/686651/227738
https://docs.raku.org/type/Cool#routine_lines
https://raku.org

DanieleGrassini · Answer 9 · 2022-03-20T22:35:19.577

0

If the consecutive lines to join are just 2:

sed '/^$/d;N;N;y/\n/,/' test

Remove all the empty lines
Get the next two lines
Replace every occurrence of \n with a comma

Join all lines, not just two, separated by one or more empty lines:

sed '
    /^$/d;:a
    H;N;$!{
    /\n$/!ba
    }
    y/\n/,/
    s/,$//
' test

edited Mar 20 '22 at 22:35

answered Mar 20 '22 at 10:43

DanieleGrassini

2,824

@StéphaneChazelas thanks. See if it is ok now. – DanieleGrassini Mar 20 '22 at 10:54
Thanks again @StéphaneChazelas ! – DanieleGrassini Mar 20 '22 at 22:41

score -1 · Answer 10 · answered Mar 18 '22 at 10:04

countoflineinfile=$(wc -l < gh.txt)
for ((i=1; i<=countoflineinfile; i+=4)); do sed -n "$i,$((i+3))p" gh.txt | awk 'ORS=","'; echo ""; done | sed 's/,*$//'

output

St1,number1,1234
St2,number2,456

answered Mar 18 '22 at 10:04

Praveen Kumar BS

5,211
