How can I turn the text
St1
number1
1234

St2
number2
456
into the following one?
st1,number1,1234
st2,number2,1234
With the standard paste command:
$ paste -sd ',,\n\0' file
St1,number1,1234
St2,number2,456
This serially (-s) pastes the lines of file, using ",", ",", newline and nothing (not a NUL character as one might think) as the delimiters (-d) in turn.
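To make the delimiter cycling concrete, here is a minimal Python sketch of what paste -s does with a delimiter list. It only models the behaviour described above and is not how paste is implemented; the name paste_serial is made up for illustration, and the input is assumed to have one empty line after each three-line record.

```python
from itertools import cycle

def paste_serial(lines, delimiters):
    """Model of `paste -s -d LIST`: put the next delimiter from the
    list after each line except the last, cycling through the list."""
    delims = cycle(delimiters)
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        if i < len(lines) - 1:
            out.append(next(delims))
    # The joined result is always terminated by a newline.
    return "".join(out) + "\n"

# Assumed input: records of three lines, separated by one empty line.
lines = ["St1", "number1", "1234", "", "St2", "number2", "456"]

# ',,\n\0' means: comma, comma, newline, nothing.
print(paste_serial(lines, [",", ",", "\n", ""]), end="")
```

The empty fourth delimiter is what makes the empty separator line vanish: it is glued to the start of the next record with nothing in between.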
Or:
$ paste -d ',,\0' - - - - < file
St1,number1,1234
St2,number2,456
This pastes stdin 4 times, with ",", "," and nothing as the delimiters (-d) between them.
A Perl way:
$ perl -00 -ne 'print join(",",split(/\n/)) . "\n"; ' file
St1,number1,1234
St2,number2,456
The -00 turns on perl's "paragraph mode", where "lines" are delimited by runs of blank lines (two or more consecutive \n) instead of a single \n, so each paragraph is treated as a single "line". The -n means "read the input file line by line and apply the script given by -e to each line". The script splits the input "line" on \n (newline: the end-of-line character) and then joins the resulting elements with a ,. This is all printed along with a trailing newline.
You could write the same thing like this for clarity:
$ perl -00 -ne '@fields=split(/\n/); $out=join(",",@fields); print "$out\n" ' file
St1,number1,1234
St2,number2,456
Or like this, for fun:
$ perl -00 -pe '$_=join(",",split(/\n/))."\n";' file
St1,number1,1234
St2,number2,456
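The paragraph-mode idea behind the Perl commands above can be sketched in Python as follows. This is an illustrative model under the assumption that records are separated by one or more empty lines; join_paragraphs is a hypothetical helper name, not part of any library.

```python
import re

def join_paragraphs(text, sep=","):
    """Join the lines of each blank-line-separated paragraph with sep."""
    # Like perl -00, treat any run of two or more newlines as one separator.
    paragraphs = re.split(r"\n{2,}", text.strip("\n"))
    return [sep.join(p.split("\n")) for p in paragraphs]

sample = "St1\nnumber1\n1234\n\nSt2\nnumber2\n456\n"
for line in join_paragraphs(sample):
    print(line)
```

Because the split is on runs of newlines, the number of lines per record does not matter, which is the main advantage over the fixed-count paste approach.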
Try awk:
awk '$1=$1' OFS=, RS= file
St1,number1,1234
St2,number2,456
The empty record separator RS targets the "blank line", and assigning a field (here: $1) makes awk reassemble the entire record from the fields using OFS.
It assumes $1 is never 0 or NULL/empty.
{$1=$1; print } or {$1=$1};1 in case the first field is zero, or force the string conversion with $1=$1"". – αғsнιη Mar 18 '22 at 09:49
[...] FS='\n' to the variable list: not all whitespace should be replaced by commas, I'd imagine. – glenn jackman Mar 18 '22 at 13:54
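The pitfall raised in the comments can be illustrated with a rough Python model. It is a simplification and an assumption on my part: real awk has more detailed rules for deciding whether a field looks numeric, and the names awk_truthy and rebuild are invented for this sketch.

```python
def awk_truthy(field):
    """Rough model of a field used as an awk pattern: empty is false,
    a numeric-looking field is false when it equals zero."""
    if field == "":
        return False
    try:
        return float(field) != 0
    except ValueError:
        return True  # any other non-empty string counts as true

def rebuild(record, ofs=","):
    """Model of $1=$1 with the default FS: split on runs of whitespace
    (newlines included when RS is empty) and rejoin with OFS."""
    return ofs.join(record.split())

records = ["St1\nnumber1\n1234", "0\nnumber2\n456"]

# awk '$1=$1': the assignment doubles as the pattern, so a record whose
# first field is 0 is silently dropped.
as_pattern = [rebuild(r) for r in records if awk_truthy(r.split()[0])]

# awk '{$1=$1};1': the rebuild happens in the action, 1 always prints.
always_print = [rebuild(r) for r in records]

print(as_pattern)
print(always_print)
```

The second form prints both records, which is why the commenters recommend it when the first field may be zero.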
For the sake of completeness, using any sed you could go
sed 'N;N;y/\n/,/;n;d' file
N;N joins the next two lines in the pattern space.
y/\n/,/ replaces the newlines between the lines with commas (thanks @Stéphane for hinting to use y instead of s).
n prints the current pattern space while loading the next line (the empty one), which you then delete with d.
Making use of paragraph mode (-00) in Perl:
perl -pals -F'\n' -00e 's/.*/@F/s' -- -\"=, file
Perl in line by line mode:
perl -pe '
chomp($_.=<>.<>),tr/\n/,/ if/./;
eof && s/.\K$/\n/;
' file
POSIXly, with sed:
sed '
/./{H;$!d;}
x;y/\n/,/;s/.//
' file
We can do this using a GNU csplit + xargs paste pipeline:
csplit --suppress-matched -sz file '/^$/' '{*}'
printf '%s\n' xx* | xargs -r paste -sd,
Using the groupby function from the itertools module in Python, along with a list comprehension:
python3 -c 'import sys, itertools as it
ofs,(rs,ors) = ",","\n" * 2
g = lambda x: not len(x)
h = lambda x: x.rstrip(rs)
with open(sys.argv[1]) as f:
  print(*[ofs.join(igrp) for k,igrp in it.groupby(map(h,f),g) if not k],sep=ors)
' file
Output:
St1,number1,1234
St2,number2,456
Using any awk in any shell on every Unix box and no matter what your input values are and no matter how many lines are in each empty-line-separated record:
$ awk -v RS= -F'\n' -v OFS=',' '{$1=$1}1' file
St1,number1,1234
St2,number2,456
[...] RS="". Note that POSIXly, you need {$1=$1};1. I'm not aware of any implementation where the ; is required, but when I asked for the requirement to be relaxed, it got rejected based on input from the gawk maintainer. – Stéphane Chazelas Mar 18 '22 at 18:02
[...] ; issue and choose to ignore it since, documented or not, an implementation that enforced it would break many existing awk scripts and so it's just never going to happen. – Ed Morton Mar 18 '22 at 18:21
Replace all newlines by commas. Empty lines will create two consecutive commas: re-replace every such pair with a proper newline. Also take care of the dangling comma at the end of the last line.
tr '\n' ',' <infile | sed 's/,,/\n/g;$s/,$//'
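The three steps just described can be sketched in Python as below. This is a sketch of the idea rather than a translation of the pipeline; join_records is a hypothetical name, and the input is assumed to contain no empty fields and exactly one empty line between records.

```python
def join_records(text):
    """Flatten to commas, then turn each doubled comma back into a newline."""
    flat = text.replace("\n", ",")     # step 1: every newline becomes a comma
    flat = flat.replace(",,", "\n")    # step 2: a doubled comma marks an empty line
    if flat.endswith(","):             # step 3: drop the single dangling comma
        flat = flat[:-1]
    return flat

sample = "St1\nnumber1\n1234\n\nSt2\nnumber2\n456\n"
print(join_records(sample))
```

Note that step 2 has to handle every doubled comma, not only the first one, or inputs with more than two records stay partly joined.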
Using Raku (formerly known as Perl_6)
Probably the most robust solution simulates Perl's -00 paragraph mode:
raku -e '.put for lines.join("\n").split(/\n**2..*/).map(*.trans: "\n" => ",");'
Sample Input
St1
number1
1234

St2
number2
456
Sample Output
St1,number1,1234
St2,number2,456
The answer above uses Raku's lines routine, which reads input lines lazily and strips newlines by default. Basically, with the code above, lines are read in, joined back together with newlines, and split wherever two or more consecutive newlines (\n**2..*) are found. Then each element is mapped over, and any remaining single \n newlines are translated to commas. Finally, the resulting elements are output using Raku's .put for... idiom ('put' stands for 'print-using-[newline]-terminator').
Again, the above is the most robust solution. However, you can get a similar result with simpler code, assuming you don't mind receiving output with terminal "," commas. The code exploits the difference between print (no terminal newline) and put (adds terminal newline):
raku -ne 'if .chars { print $_~"," } else { put $_ };'
The code above preserves line-spacing where sections are separated by multiple consecutive blank lines (unlike the first answer, which removes all blank lines).
https://unix.stackexchange.com/a/686651/227738
https://docs.raku.org/type/Cool#routine_lines
https://raku.org
If the consecutive lines to join are just 2:
sed '/^$/d;N;N;y/\n/,/' test
[...] \n with an ,
Join all lines, not just two, separated by one or more empty lines:
sed '
/^$/d;:a
H;N;$!{
/\n$/!ba
}
y/\n/,/
s/,$//
' test
for ((i=1;i<=countoflineinfile;i++)); do sed -n "$i,$((i+3))p" gh.txt|awk 'ORS=","' ;echo "";i=$((i+3)); done|sed "s/,*$//g"
output
St1,number1,1234
St2,number2,456
[...] 1234 where one would expect 456 - Is that a copy-paste error? – FelixJN Mar 21 '22 at 07:37