7

Is there a way to split a single line into multiple lines of 3 columns each? Newline characters are missing at the end of all the lines in the file.

I tried using awk, but it puts each column on its own row instead of putting 3 columns in each row.

awk '{ gsub(",", "\n") } 1' filename

where filename's content looks like:

A,B,C,D,E,F,G,H,I,J,K,L,M,N,O

Desired output has 3 columns in each line:

A,B,C
D,E,F
G,H,I
J,K,L
M,N,O
Jeff Schaller
Rakesh K

2 Answers

13

Using awk

$ awk -v RS='[,\n]' '{a=$0;getline b; getline c; print a,b,c}' OFS=, filename
A,B,C
D,E,F
G,H,I
J,K,L
M,N,O

How it works

  • -v RS='[,\n]'

    This tells awk to use any occurrence of either a comma or a newline as a record separator.

  • a=$0; getline b; getline c

    This tells awk to save the current record in variable a, the next record in variable b, and the record after that in variable c.

  • print a,b,c

    This tells awk to print a, b, and c, separated by the output field separator.

  • OFS=,

    This tells awk to use a comma as the field separator on output.
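
One caveat with the getline approach: at end of input, getline returns 0 and leaves its variable unchanged, so if the number of fields is not a multiple of 3, the last row can pick up stale values from the previous row. The following is a minimal sketch that checks getline's return value. It assumes an awk that accepts a regex RS (gawk and mawk do); the function name rows_checked and the variable names row and fld are made up for illustration.

```shell
# Hypothetical helper: group comma-separated fields three per row,
# stopping a row early when the input runs out.
rows_checked() {
    awk -v RS='[,\n]' -v OFS=, '{
        row = $0    # first field of the row
        # Append up to two more fields, but only while getline succeeds.
        for (i = 2; i <= 3 && (getline fld) > 0; i++)
            row = row OFS fld
        print row
    }' "$@"
}

# Seven fields: the last row holds only the leftover field.
printf 'A,B,C,D,E,F,G' | rows_checked
```

With the seven-field sample this should print A,B,C, then D,E,F, then G on a line of its own.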

Using tr and paste

$ tr , '\n' <filename | paste -d, - - -
A,B,C
D,E,F
G,H,I
J,K,L
M,N,O

How it works

  • tr , '\n' <filename

    This reads from filename while converting all commas to newlines.

  • paste -d, - - -

    This tells paste to read three lines from stdin (one for each -) and paste them together, each separated by a comma (-d,).
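
Since paste takes one column per -, the same pipeline can be generalized to any column count by building the list of dashes in a loop. This is only a sketch: the function name to_columns and its argument are hypothetical, and it reads from stdin rather than from filename.

```shell
# Hypothetical helper: to_columns N reads comma-separated text on stdin
# and prints it N fields per line.
to_columns() {
    cols=$1
    set --                      # clear the positional parameters
    i=0
    while [ "$i" -lt "$cols" ]; do
        set -- "$@" -           # add one "-" per output column
        i=$((i + 1))
    done
    tr , '\n' | paste -d, "$@"  # same pipeline as above, with N dashes
}

printf 'A,B,C,D,E,F' | to_columns 2
```

Note that when the number of fields is not a multiple of N, paste still emits the delimiters for the missing fields, so the last line ends with trailing commas.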

Alternate awk

$ awk -v RS='[,\n]' '{printf "%s%s",$0,(NR%3?",":"\n")}' filename
A,B,C
D,E,F
G,H,I
J,K,L
M,N,O

How it works

  • -v RS='[,\n]'

    This tells awk to use any occurrence of either a comma or a newline as a record separator.

  • printf "%s%s",$0,(NR%3?",":"\n")

    This tells awk to print the current record followed by either a comma or a newline, depending on the value of the current record number, NR, modulo 3.
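
If the number of fields is not a multiple of 3, the printf version above ends its output with a trailing comma and no final newline. One possible variation, sketched below, prints the separator before each record instead of after it and adds the final newline in an END block. The function name rows_of and the awk variable n are assumptions for illustration, and a regex RS again requires gawk or mawk.

```shell
# Hypothetical helper: rows_of N prints N fields per line from stdin.
rows_of() {
    awk -v RS='[,\n]' -v n="$1" '
        # Before every record except the first, emit the separator:
        # a newline when a row was just completed, a comma otherwise.
        NR > 1 { printf "%s", ((NR - 1) % n ? "," : "\n") }
        { printf "%s", $0 }
        # Terminate the last row, whether or not it is complete.
        END { if (NR) printf "\n" }
    '
}

printf 'A,B,C,D,E,F,G' | rows_of 3
```

With the seven-field sample this should print A,B,C, then D,E,F, then G, with no dangling comma.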

John1024
5
sed 's/\(\([^,]\+,\)\{3\}\)/\1\n/g;s/,\n/\n/g' filename

I know that you asked for an awk solution, but for me a sed solution was simpler. I intended to add an awk version as an edit to this answer, but user John1024 beat me to it with a fine awk solution; see that answer. His tr and paste solution is probably the most classic Unix-style answer.

  1. This solution relies on GNU sed extensions to basic regular expressions: \+ in the pattern and \n in the replacement.

  2. \(..\) is a regex capture group. Note that the solution uses two, one nested within the other.

  3. [^,]\+, is one or more characters that are not a comma, followed by a comma. In your case, a column or field.

  4. \{3\} is a regex repetition count, indicating that the preceding group must match three times in a row.

  5. \1 is a regex back-reference to the text matched by the first (outer) group.

  6. g means do it for all instances on the line.

  7. s/,\n/\n/g removes the trailing comma. It's necessary to include the newline character here, because sed is still considering the input as a single line.
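
For comparison, here is a sketch of the same two substitutions written with extended regular expressions (the -E option), which drops most of the backslashes. It assumes GNU sed, since \n in the replacement text is not portable to every sed; the function name split_with_sed is made up for illustration.

```shell
# Hypothetical helper: same logic as the command above, in ERE syntax.
split_with_sed() {
    # 1st s: after every run of three "field," groups, insert a newline.
    # 2nd s: drop the comma left dangling in front of each newline.
    sed -E 's/(([^,]+,){3})/\1\n/g; s/,\n/\n/g' "$@"
}

printf 'A,B,C,D,E,F,G,H,I\n' | split_with_sed
```

With the nine-field sample this should print A,B,C, then D,E,F, then G,H,I.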

user1404316