2

I have a file like so

Hello,Hi,Hullo,Hammers,Based,Random

For n=2, output must be like so

Hello,Hi
Hullo,Hammers
Based,Random

For n=3, output must be like so

Hello,Hi,Hullo
Hammers,Based,Random

How could I accomplish this using awk/sed?

Edit: n is a factor of the number of fields.

  • 4
    For n=4, should the last line be Based,Random or Based,Random,,? The former does not fulfill "specific number of fields", but the latter does not entirely come from "splitting a single line of fields". – Kamil Maciorowski Jul 01 '22 at 23:48
  • Ed Morton is correct; I forgot to mention that the number of fields can be assumed to divide evenly by 'n'. – SeetheMoar Jul 19 '22 at 14:44

9 Answers

4
$ awk -v n=2 -F',' '{for (i=1;i<=NF;i++) printf "%s%s", $i, (i%n ? FS : ORS)}' file
Hello,Hi
Hullo,Hammers
Based,Random

$ awk -v n=3 -F',' '{for (i=1;i<=NF;i++) printf "%s%s", $i, (i%n ? FS : ORS)}' file
Hello,Hi,Hullo
Hammers,Based,Random

In your question you didn't address how to handle cases where the number of fields doesn't divide evenly by n, so I haven't addressed it here either.
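
For the uneven case that the answer above leaves open, here is a minimal Python sketch of the two behaviours raised in the comments on the question: leaving the last line short, or padding it with empty fields. The function name `split_fields` and the `pad` flag are illustrative assumptions, not part of the awk answer.

```python
def split_fields(line, n, sep=",", pad=False):
    """Regroup the fields of one line into rows of n fields each."""
    fields = line.rstrip("\n").split(sep)
    # Slice the field list n items at a time; the last slice may be shorter.
    rows = [fields[i:i + n] for i in range(0, len(fields), n)]
    if pad and rows and len(rows[-1]) < n:
        # Pad the final row with empty fields so every row has n fields.
        rows[-1] += [""] * (n - len(rows[-1]))
    return [sep.join(row) for row in rows]


line = "Hello,Hi,Hullo,Hammers,Based,Random"
print("\n".join(split_fields(line, 4)))            # last row is left short
print("\n".join(split_fields(line, 4, pad=True)))  # last row is padded
```

When n divides the number of fields evenly, both variants give the same result as the awk command.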

Ed Morton
  • 31,617
3

Another approach with tr and paste:

For n=2,

$ <input tr ',' '\n' | paste  -d ',' - -
Hello,Hi
Hullo,Hammers
Based,Random

For n=3,

$ <input tr ',' '\n' | paste  -d ',' - - -
Hello,Hi,Hullo
Hammers,Based,Random
r_31415
  • 516
2

Using perl:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' | 
    perl -F, -le '
      BEGIN { $n = shift };
      for ($i=0; $i < @F; $i += $n) {
         print join(",", @F[$i .. ($i + $n - 1)]);
      }' 2
Hello,Hi
Hullo,Hammers
Based,Random

This uses the first argument as the number of entries printed per output line (using variable $n). STDIN and any filename arguments are used as the input.

Due to the -F, option (which implicitly enables the -a and -n options), it automatically reads each input line and splits it on commas into array @F, then iterates over the indices of the array, $n items at a time. $n elements are printed on each output line.

NOTE: use the Text::CSV module if you need to parse actual CSV with quoted fields and commas embedded in quotes rather than simple comma-delimited input.

Output with an argument of 3 instead of 2:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' | perl -F, -le 'BEGIN{$n = shift};for($i=0;$i<@F;$i+=$n){print join(",",@F[$i..($i+$n-1)])}' 3
Hello,Hi,Hullo
Hammers,Based,Random

And again with 4:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' | perl -F, -le 'BEGIN{$n = shift};for($i=0;$i<@F;$i+=$n){print join(",",@F[$i..($i+$n-1)])}' 4
Hello,Hi,Hullo,Hammers
Based,Random,,
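
To illustrate the note above about real CSV, here is a rough Python sketch that uses the standard `csv` module rather than Perl's Text::CSV. The function name `regroup_csv` and the sample line with a quoted field are assumptions made up for the example. Unlike the Perl one-liner, it leaves a short last row unpadded.

```python
import csv
import io


def regroup_csv(text, n):
    """Parse CSV text and re-emit its fields n per row, preserving quoting."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for record in csv.reader(io.StringIO(text)):
        # Write each input record back out in slices of n fields.
        for i in range(0, len(record), n):
            writer.writerow(record[i:i + n])
    return out.getvalue()


# Hypothetical input: the second field contains a comma inside quotes.
sample = 'Hello,"Hi, there",Hullo,Hammers,Based,Random\n'
print(regroup_csv(sample, 2), end="")
```

A plain split on commas would cut the quoted field in two; the parser keeps it as one field and the writer re-quotes it on output.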
cas
  • 78,579
2
sed 's/,/\n/2;P;D' 
m=3
sed "s/,/\\n/$m;P;D"
guest_7
  • 5,728
  • 1
    I can't explain to myself how someone could have downvoted this answer. It's really elegant. – DanieleGrassini Jul 03 '22 at 16:11
  • 1
    @DanieleGrassini Assumes a particular non-standard implementation of sed, and contains a code injection vulnerability unless one assumes full control over the value in $m. – Kusalananda Jul 05 '22 at 11:48
  • @Kusalananda What isn't standard in this sed? – DanieleGrassini Jul 10 '22 at 23:10
  • 1
    @DanieleGrassini Inserting a newline using \n in the replacement string of the s command. – Kusalananda Jul 10 '22 at 23:15
  • @Kusalananda How can we speak about security implications without knowing the full environment? Any of the other code could be considered potentially dangerous in some circumstances. To me, in regard to this question, which doesn't state anything other than "How to split..", this seems to be a good solution. – DanieleGrassini Jul 10 '22 at 23:17
  • @Kusalananda I think it is well supported anyway. It works with GNU sed with the --posix flag and with busybox sed. And if some implementation of sed does not support it, it is just a matter of putting the newline on a separate line, I think. – DanieleGrassini Jul 10 '22 at 23:25
  • @DanieleGrassini Your original enquiry was about how someone could have downvoted this answer, not whether the code could run on some number of various systems. I know how to modify it to work (and be safe), so there's no need to convince me of anything. Note too that using --posix does not remove all non-POSIX features from GNU sed (EDIT: It does not remove behaviour that is explicitly left unspecified in the POSIX spec.) – Kusalananda Jul 10 '22 at 23:29
2

awk again: the input is any sequence of values separated by commas and newlines, and the output is CSV with a fixed number of fields per line:

awk '{printf "%s%s", (FNR>1 ? ((FNR-1)%n ? "," : ORS) : ""), $0} END{print ""}' RS='[,\n]' n=4 <<END
Hello
Hi,Hullo,Hammers,Based
Random
END

Hello,Hi,Hullo,Hammers
Based,Random
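
The same idea (treat commas and newlines alike as separators, then emit n values per line) can be sketched in Python. The function name `regroup_stream` is an assumption for illustration; the sample text is the here-document from the awk command above.

```python
import re


def regroup_stream(text, n):
    """Split on commas and newlines, then join the values n per output line."""
    # Drop trailing newlines so the split does not yield an empty last value.
    values = re.split(r"[,\n]", text.rstrip("\n"))
    rows = [values[i:i + n] for i in range(0, len(values), n)]
    return "\n".join(",".join(row) for row in rows)


text = "Hello\nHi,Hullo,Hammers,Based\nRandom\n"
print(regroup_stream(text, 4))
```

Because the values are flattened before regrouping, the line breaks of the input have no effect on where the output lines break.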

1

With perl (for n=2):

echo 'Hello,Hi,Hullo,Hammers,Based,Random' | perl -ne '
    @L = (/,?([^,]*,[^,]*)/g);
    $"="\n" ; print "@L"
'

This question makes me think of Python's zip/iter builtin functions:

python3 -c 'from sys import argv as F; J = "\n".join
_, sep, data, sz = F
L = [*map(sep.join, zip(*[iter(data.split(sep))]*int(sz)))]
print(J(L))
' , "Hello,Hi,Hullo,Hammers,Based,Random" 2
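
The one-liner above is dense, so here is the same zip/iter idiom unpacked as a small sketch. The helper names `groups_strict` and `groups_padded` are made up for illustration. One point worth knowing: plain `zip` silently drops a trailing partial group, while `itertools.zip_longest` pads it instead.

```python
from itertools import zip_longest


def groups_strict(fields, n):
    """Group fields n at a time; a trailing partial group is dropped."""
    it = iter(fields)
    # The same iterator appears n times, so each tuple that zip builds
    # consumes n consecutive items from it.
    return list(zip(*[it] * n))


def groups_padded(fields, n, fill=""):
    """Group fields n at a time; a trailing partial group is padded."""
    it = iter(fields)
    return list(zip_longest(*[it] * n, fillvalue=fill))


fields = "Hello,Hi,Hullo,Hammers,Based,Random".split(",")
for group in groups_strict(fields, 2):
    print(",".join(group))
```

With the six sample fields and n=4, `groups_strict` returns only the first four fields, whereas `groups_padded` also returns a second group filled out with empty strings.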
1

Using Raku (formerly known as Perl_6)

~$ raku -ne '.put for .split(",").rotor(3);'  file

Sample Input:

Hello,Hi,Hullo,Hammers,Based,Random

Sample Output with .rotor(3) (from above):

Hello Hi Hullo
Hammers Based Random

Sample Output changing above to .rotor(2):

Hello Hi
Hullo Hammers
Based Random

The code above is a bare-bones implementation in Raku (returning single whitespace between columns). The rotor() call determines the number of columns [ see discussion below regarding the difference between rotor() and batch() ]. Just add a call to .join() if you want to join columns using commas, tabs, pipes, etc.:

~$ raku -ne '.join(",").put for .split(",").rotor(2);'  file
Hello,Hi
Hullo,Hammers
Based,Random

Note that by default rotor() only returns full groups and drops a partial group at the very end. So performing a rotor(4) call on the above six-element sample will result in a single line of output, 4 elements long. To ensure no loss of data, use rotor(4, :partial) or batch(4).

~$ raku -ne '.join(",").put for .split(",").rotor(4);'  file
Hello,Hi,Hullo,Hammers

#COMPARE TO:

~$ raku -ne '.join(",").put for .split(",").batch(4);' file
Hello,Hi,Hullo,Hammers
Based,Random


Processing by an authentic CSV-parser (e.g. Raku's Text::CSV module) will validate the resulting CSV file. See the URL below for examples.

https://unix.stackexchange.com/a/701805/227738
https://raku.org

jubilatious1
  • 3,195
0

Using sed

$ sed -E 's/([^,]*,[^,]*),/\1\
/g' input_file
Hello,Hi
Hullo,Hammers
Based,Random
$ sed -E 's/(([^,]*,){2}[^,]*),/\1\
/g' input_file
Hello,Hi,Hullo
Hammers,Based,Random
sseLtaH
  • 2,786
-1

Apart from the obvious awk/Perl/sed approaches, I can recommend Miller.

It can extract and modify text-based data very well and is more intuitive to use than its counterparts.

jmk
  • 137