4

How can I turn the text

St1
number1
1234

St2
number2
456

into the following one?

st1,number1,1234
st2,number2,1234
  • 1
    In the sample output the second line has 1234 where one would expect 456 - Is that a copy-paste error? – FelixJN Mar 21 '22 at 07:37
  • Will your input file contain commas? In particular, will any single line in your input file contain an end-of-line comma, or two-or-more internal commas? – jubilatious1 Mar 21 '22 at 07:59

10 Answers

16

With the standard paste command:

$ paste -sd ',,\n\0' file
St1,number1,1234
St2,number2,456

This serially pastes the lines of file, using a comma, a comma, a newline, and nothing as delimiters in turn (\0 stands for the empty string, not a NUL character as one might think).

Or:

$ paste -d ',,\0' - - - - < file
St1,number1,1234
St2,number2,456

This pastes stdin four times (once per -), so each output line consumes four input lines, with a comma, a comma, and nothing as the delimiters between them.
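To make the delimiter cycling in the first command easier to follow, here is a rough Python sketch of what paste -s does with a delimiter list. paste_serial is a hypothetical helper written for illustration; it is not how paste is implemented and only covers the single-file case.

```python
from itertools import cycle


def paste_serial(lines, delimiters):
    """Join lines into one string, taking the separators from
    `delimiters` in turn and starting over when the list runs out."""
    delims = cycle(delimiters)
    pieces = []
    for index, line in enumerate(lines):
        if index > 0:
            # A delimiter goes between two lines, never before the first.
            pieces.append(next(delims))
        pieces.append(line)
    # Like paste -s, finish the output with a single newline.
    return "".join(pieces) + "\n"


sample = ["St1", "number1", "1234", "", "St2", "number2", "456"]
# "" plays the role of \0 (the empty delimiter) in the paste command.
print(paste_serial(sample, [",", ",", "\n", ""]), end="")
```

The fourth, empty delimiter is what glues the blank line onto the start of the next record without leaving a trace.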

6

A Perl way:

$ perl -00 -ne 'print join(",",split(/\n/)) . "\n"; ' file
St1,number1,1234
St2,number2,456

The -00 turns on perl's "paragraph mode", where "lines" are delimited by \n\n (one or more blank lines), so each paragraph is treated as a single "line". The -n means "read the input file line by line and apply the script given by -e to each line". The script splits the input "line" on \n (newline, the end-of-line character) and then joins the resulting elements with a comma. The result is printed along with a trailing newline.

You could write the same thing like this for clarity:

$ perl -00 -ne '@fields=split(/\n/); $out=join(",",@fields); print "$out\n" ' file
St1,number1,1234
St2,number2,456

Or like this, for fun:

$ perl -00 -pe '$_=join(",",split(/\n/))."\n";' file
St1,number1,1234
St2,number2,456
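For readers who don't know Perl, the paragraph-mode logic can be sketched in Python roughly as follows. join_paragraphs is a hypothetical name, and the regular expression is only my approximation of what -00 does.

```python
import re


def join_paragraphs(text, sep=","):
    """Split text into blank-line-separated paragraphs and join the
    lines of each paragraph with `sep`."""
    # Drop leading/trailing newlines so they don't create empty paragraphs.
    body = text.strip("\n")
    # Two or more consecutive newlines end a paragraph.
    paragraphs = re.split(r"\n{2,}", body)
    return [sep.join(paragraph.split("\n")) for paragraph in paragraphs]


text = "St1\nnumber1\n1234\n\nSt2\nnumber2\n456\n"
for row in join_paragraphs(text):
    print(row)
```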
terdon
  • 242,166
6

Try awk:

awk '$1=$1' OFS=, RS= file
St1,number1,1234
St2,number2,456

The empty record separator RS selects paragraph mode, in which blank lines separate records, and assigning to a field (here $1) makes awk rebuild the whole record from its fields using OFS. Because the assignment also serves as the pattern, a record is printed only if $1 is neither 0 nor empty.

RudiC
  • 8,969
4

For the sake of completeness, with any sed you could use:

sed 'N;N;y/\n/,/;n;d' file
  • With N;N you join the Next two lines in the pattern space
  • y/\n/,/ replaces the newlines between the lines with commas (thanks @Stéphane for hinting to use y instead of s)
  • n prints the current pattern space while loading the next line (the empty one), which you then delete
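The steps above rely on a fixed layout: three data lines followed by one blank line. A minimal Python sketch of the same stride-of-four idea, with join_fixed_stride as a made-up name, might look like this:

```python
def join_fixed_stride(lines, data_lines=3, sep=","):
    """Join every `data_lines` lines and skip the single line after them.

    Assumes the strict layout the sed script assumes: each record is
    exactly `data_lines` lines long and followed by one separator line.
    """
    stride = data_lines + 1
    rows = []
    for start in range(0, len(lines), stride):
        # Take the data lines; the line after them is simply ignored.
        rows.append(sep.join(lines[start:start + data_lines]))
    return rows


sample = ["St1", "number1", "1234", "", "St2", "number2", "456"]
for row in join_fixed_stride(sample):
    print(row)
```

If a record has more or fewer than three lines, both the sed script and this sketch fall out of step with the input.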
Philippos
  • 13,453
4

Making use of paragraph mode (-00) in Perl:

perl -pals -F'\n' -00e 's/.*/@F/s' -- -\"=, file

Perl in line by line mode:

perl -pe '
  chomp($_.=<>.<>),tr/\n/,/ if/./;
  eof && s/.\K$/\n/;
' file

POSIXly sed :

sed '
  /./{H;$!d;}
  x;y/\n/,/;s/.//
' file

We can do this using GNU csplit + xargs paste pipeline

csplit --suppress-matched -sz file '/^$/' '{*}'
printf '%s\n' xx* | xargs -r paste -sd,

Using groupby method from the itertools module in Python along with list comprehension:


python3 -c 'import sys, itertools as it

ofs,(rs,ors) = ",","\n" * 2
g = lambda x: not len(x)
h = lambda x: x.rstrip(rs)

with open(sys.argv[1]) as f:
  print(*[ofs.join(igrp)
    for k,igrp in it.groupby(map(h,f),g) if not k],sep=ors)
' file


Output:

St1,number1,1234
St2,number2,456

guest_7
  • 5,728
  • 1
  • 7
  • 13
4

Using any awk in any shell on every Unix box and no matter what your input values are and no matter how many lines are in each empty-line-separated record:

$ awk -v RS= -F'\n' -v OFS=',' '{$1=$1}1' file
St1,number1,1234
St2,number2,456
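As far as field splitting goes, the difference between this command and the shorter awk '$1=$1' OFS=, RS= variant is which characters separate fields. Here is a hedged Python sketch of that difference; awk_join is a hypothetical helper and it does not model the rest of awk's behaviour.

```python
import re


def awk_join(text, newline_only, ofs=","):
    """Join the fields of each blank-line-separated record with `ofs`.

    newline_only=True  mimics -F'\n': one field per line.
    newline_only=False mimics the default FS: any run of whitespace
    (including the newlines inside a record) separates fields.
    """
    rows = []
    for record in re.split(r"\n{2,}", text.strip("\n")):
        fields = record.split("\n") if newline_only else record.split()
        rows.append(ofs.join(fields))
    return rows


# Note the space inside the first line of the first record.
text = "St 1\nnumber1\n1234\n\nSt2\nnumber2\n456\n"
print(awk_join(text, newline_only=False))  # space becomes a comma too
print(awk_join(text, newline_only=True))   # space is kept as data
```

With -F'\n' a line such as "St 1" stays one field, which is why this version copes with values that contain blanks.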
Ed Morton
  • 31,617
2

Replace all newlines with commas. Each empty line then shows up as two consecutive commas; turn every such pair back into a newline. Finally, remove the dangling comma at the end of the last line.

tr '\n' ',' <infile | sed 's/,,/\n/g;$s/,$//'
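The same flatten-then-split idea, sketched in Python so the intermediate string is visible. flatten_and_split is a hypothetical name used only for this illustration.

```python
def flatten_and_split(text):
    """Turn newlines into commas, then turn double commas back into newlines."""
    # Step 1: everything ends up on one line, e.g. "a,b,,c,d,"
    flat = text.replace("\n", ",")
    # Step 2: a double comma marks where an empty line used to be.
    rows = flat.replace(",,", "\n")
    # Step 3: drop the single dangling comma left by the final newline.
    if rows.endswith(","):
        rows = rows[:-1]
    return rows


text = "St1\nnumber1\n1234\n\nSt2\nnumber2\n456\n"
print(flatten_and_split(text))
```

One caveat follows directly from step 2: a line that already contains ,, in the data would be split as well.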
FelixJN
  • 13,566
0

Using Raku (formerly known as Perl_6)

Probably the most robust solution simulates Perl's -00 paragraph mode:

raku -e '.put for lines.join("\n").split(/\n**2..*/).map(*.trans: "\n" => ",");'  

Sample Input

St1
number1
1234

St2
number2
456

Sample Output

St1,number1,1234
St2,number2,456

The answer above uses Raku's lines routine, which reads input lines lazily and strips newlines by default. Basically, lines are read in, join-ed back together with newlines, and split wherever \n**2..* (two or more consecutive newlines) is found. Each resulting element is then map-ped so that any remaining single \n newlines are translated to commas. Finally, the resulting elements are output using Raku's .put for ... idiom ('put' stands for 'print-using-[newline]-terminator').

Again, the above is the most robust solution. However, you can get a similar result with simpler code, assuming you don't mind output with a trailing "," on each line. The code exploits the difference between print (no terminal newline) and put (adds terminal newline):

raku -ne 'if .chars { print $_~"," } else { put $_ };'    

The code above preserves line-spacing where sections are separated by multiple consecutive blank lines (unlike the first answer, which removes all blank lines).
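For comparison, here is a small Python sketch of that simpler print/put logic. The function name is made up for illustration, and the sketch returns a string instead of printing as it goes.

```python
def join_with_trailing_commas(lines):
    """Append a comma to every non-empty line; emit a newline for every
    empty line. Mirrors the print-vs-put trick, trailing commas included."""
    pieces = []
    for line in lines:
        if line:
            pieces.append(line + ",")   # like print: no newline added
        else:
            pieces.append("\n")         # like put on an empty line
    return "".join(pieces)


sample = ["St1", "number1", "1234", "", "St2", "number2", "456"]
print(join_with_trailing_commas(sample))
```

Every run of blank lines in the input comes out as the same number of newlines, which is why the line spacing is preserved.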

https://unix.stackexchange.com/a/686651/227738
https://docs.raku.org/type/Cool#routine_lines
https://raku.org

jubilatious1
  • 3,195
  • 8
  • 17
0

If each record consists of exactly three lines (the current line plus the next two):

sed '/^$/d;N;N;y/\n/,/' test
  • Remove all the empty lines
  • Get the next two lines
  • Replace every occurrence of \n with a ,

To join all lines of each record, however many there are, where records are separated by one or more empty lines:

sed '
    /^$/d;:a
    H;N;$!{
    /\n$/!ba
    }
    y/\n/,/
    s/,$//
' test
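The loop in the second script boils down to "collect lines until an empty one turns up". A rough Python equivalent, with join_until_blank as a hypothetical name:

```python
def join_until_blank(lines, sep=","):
    """Collect consecutive non-empty lines and join each run with `sep`.

    Records may have any number of lines and may be separated by one or
    more empty lines.
    """
    rows = []
    current = []
    for line in lines:
        if line:
            current.append(line)
        elif current:
            # An empty line closes the record collected so far.
            rows.append(sep.join(current))
            current = []
    if current:
        # The last record has no empty line after it.
        rows.append(sep.join(current))
    return rows


sample = ["St1", "number1", "1234", "", "", "St2", "number2", "456", "extra"]
for row in join_until_blank(sample):
    print(row)
```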
-1
for ((i=1;i<=countoflineinfile;i++)); do  sed -n "$i,$((i+3))p" gh.txt|awk 'ORS=","' ;echo "";i=$((i+3)); done|sed "s/,*$//g"

output

St1,number1,1234
St2,number2,456