I have a file roll.txt containing the data below: comma-delimited values, all on a single line with no newlines in between.

'123456789','987651234','129873645','213456789','987612345','543216789','432156789','876543291','213465789','542637819','123456','23456','22234','3456','7890543','34567891,'2345','567'

I need to break the line at every 6th comma, so that each output line holds six values and does not end with a comma.

Below is the expected output:

'123456789','987651234','129873645','213456789','987612345','543216789'
'432156789','876543291','213465789','542637819','123456','23456'
'22234','3456','7890543','34567891,'2345','567'

I am using the sed command below, which is not working.

sed 's/[^,]//g'

3 Answers


At least with GNU sed and assuming your fields cannot contain embedded comma separators, you could do

sed 's/,/\n/6; P; D' roll.txt

which repeatedly attempts to replace the 6th comma with a newline, print, and then delete the portion of pattern space up to the newline.

NOTE: it is not necessary to implement an explicit labelled test/branch, since the D command implicitly "restarts the cycle" on the remainder of the line:

D
If pattern space contains no newline, start a normal new cycle as if the d command was issued. Otherwise, delete text in the pattern space up to the first newline, and restart cycle with the resultant pattern space, without reading a new line of input.

(credit to @RakeshSharma for clarifying this).

Example:

sed 's/,/\n/6; P; D' roll.txt 
'123456789','987651234','129873645','213456789','987612345','543216789'
'432156789','876543291','213465789','542637819','123456','23456'
'22234','3456','7890543','34567891,'2345','567'
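
For a sed that does not understand \n in the replacement text, the same idea can be sketched portably by writing the newline as a backslash followed by an actual newline. The wrapper name split_every_6 is only illustrative, and the sketch still assumes that fields contain no embedded commas:

```shell
# Hypothetical helper: break a comma-separated line after every 6th field.
# The replacement is a backslash followed by a literal newline, so the script
# does not rely on the \n escape in the replacement part of s///.
split_every_6() {
  sed 's/,/\
/6
P
D' "$@"
}

# Usage: reads the named file(s), or standard input if none are given.
# split_every_6 roll.txt
```

Each command sits on its own line here, rather than being separated by semicolons, simply to stay on the conservative side of what older sed implementations accept.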

Alternatively, with Perl's Text::CSV module:

perl -MText::CSV -ne '
  BEGIN{$p = Text::CSV->new()} 
  @fields = $p->fields() if $p->parse($_); 
  do {
    print join ",", splice @fields, 0, 6; print "\n";
  } while @fields
' roll.txt
'123456789','987651234','129873645','213456789','987612345','543216789'
'432156789','876543291','213465789','542637819','123456','23456'
'22234','3456','7890543','34567891,'2345','567'
steeldriver
  • The "test" t command in sed is not really needed. sed 's/,/\n/6;P;D' will suffice. The "D" command when operating on a pattern space without newline(s) behaves just like its lowercase counterpart. – Rakesh Sharma May 28 '18 at 09:05
  • If you replace the \n with a backslash followed by an actual newline, this would even be POSIX compliant, so the answer is not limited to GNU sed anymore. – Philippos Oct 17 '23 at 10:01

With tr and paste:

tr ',' '\n' <infile |paste -sd',,,,,\n'

Or, for better readability, the same command with GNU paste's long options:

tr ',' '\n' <infile |paste --serial --delimiters=',,,,,\n'

If you want a newline after every Nth field for a large N (say N=100), you probably don't want to type 99 commas by hand (',,,,,,,,,, ... ,\n'); instead, let printf generate them with the shell's brace expansion (available in bash, among others):

tr ',' '\n' <infile |paste -sd $(printf '%.1s' ,{1..99})'\n'

from man paste:

-d, --delimiters=LIST
       reuse characters from LIST instead of TABs

-s, --serial
       paste one file at a time instead of in parallel
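
As a rough sketch of the same approach for an arbitrary N without relying on brace expansion, the delimiter list can be built with printf padding and tr. The function name wrap_fields and its single argument are made up for illustration; it reads from standard input and assumes no field contains a comma:

```shell
# Hypothetical helper: wrap_fields N
# Reads comma-separated data on stdin and writes N fields per line.
wrap_fields() {
  n=$1
  # Build N-1 commas: pad an empty string to width N-1, then turn the
  # padding spaces into commas.
  delims=$(printf '%*s' "$((n - 1))" '' | tr ' ' ',')
  # One field per line, then join serially; paste cycles through the list,
  # so every Nth separator is the newline at the end of the list.
  tr ',' '\n' | paste -sd "${delims}\n" -
}

# Usage:
# wrap_fields 6 <infile
```

The explicit - operand tells paste to read standard input, which some implementations require.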
αғsнιη
  • paste -d, - - - - - - and pr -6ats, would work too – Sundeep May 28 '18 at 06:05
  • @Sundeep your tr ...| pr -6ats, works, but tr ...| paste -d, - - - - - - does not (it leaves trailing commas); also, I suggest you post that as an answer instead of a comment. – αғsнιη May 28 '18 at 06:33
  • good point about trailing commas, forgot about that.. anyway, didn't add an answer as I feel the question is duplicate.. – Sundeep May 28 '18 at 07:53

A variation on αғsнιη's answer:

$ tr ',' '\n' <file | paste -d, - - - - - -
'123456789','987651234','129873645','213456789','987612345','543216789'
'432156789','876543291','213465789','542637819','123456','23456'
'22234','3456','7890543','34567891,'2345','567'

This assumes that none of the fields have embedded commas in them.

If the input does not have a multiple of six fields, you may get output like

'123456789','987651234','129873645','213456789','987612345','543216789'
'432156789','876543291','213465789','542637819','123456','23456'
'22234','3456','7890543','34567891,'2345','567'
hello,world,,,,
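
If the padding commas on the last line are unwanted, one possible workaround is to strip them with sed afterwards. This is only a sketch: the function name six_per_line is made up, and it assumes the data never legitimately ends with empty fields, since those would be removed as well:

```shell
# Hypothetical helper: six fields per line, without comma padding on the
# final, possibly shorter, line.
six_per_line() {
  # The $ address limits the substitution to the last line of paste's output.
  tr ',' '\n' | paste -d, - - - - - - | sed '$s/,*$//'
}

# Usage:
# six_per_line <file
```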
Kusalananda
  • This won't give the expected result unless the number of fields on the long line is a multiple of 6; add ,'something' at the end of the line and see the result. You will get trailing commas (6 minus the number of columns printed on the last line). – αғsнιη May 28 '18 at 07:44
  • @αғsнιη Thanks, I've added a note about this. It may actually be preferable this way since there will always be six comma-separated columns in the output no matter what. – Kusalananda May 28 '18 at 07:49