0

I have a file where the first column is the key. The rows can have up to 2800 delimiters. I need to pivot the data from rows to columns. Below is the sample input and required output.

Source File

123,A,B,,,,AC,DF,,,,,,,,,,,,n 
567,A,B,,C,D,,,,,,,,, 7
89,C,B

Output

123,A
123,B
123,
123,
.
.
123,AC
123,DF
567,A
567,B
567,C
567,D
567,7
89,C
89,B

Please advise.

sourcejedi
  • 50,249
Yarlly
  • 33
  • 1
    I am not really sure why we have a CSV tag. IMO the easiest tool to jump to for standard CSV would be python, but this site is not really supposed to be for programming questions except for "shell scripting". Maybe a bit of a grey area. https://unix.stackexchange.com/questions/7425/is-there-a-robust-command-line-tool-for-processing-csv-files – sourcejedi May 20 '19 at 20:53

2 Answers2

2
$ cat file
123,A,B,,,,AC,DF,,,,,,,,,,,,n
567,A,B,,C,D,,,,,,,,, 7
89,C,B
$ awk -F, 'BEGIN { OFS = FS } { for (i = 2; i <= NF; ++i) if ($i != "") print $1, $i }' file
123,A
123,B
123,AC
123,DF
123,n
567,A
567,B
567,C
567,D
567, 7
89,C
89,B

This assumes that the data is in a simple CSV format that does not require any quotes (no embedded commas or newlines in any field). The awk code simply iterates over the comma-delimited fields of each line, from field 2 onwards, and prints them along with the first field on new lines. Empty fields are ignored. If you don't want to ignore empty fields (it's unclear in the question), remove the if ($i != "") bit in the code.

sourcejedi
  • 50,249
Kusalananda
  • 333,661
2

Using Miller (http://johnkerl.org/miller/doc/) with

mlr --nidx --fs "," reshape -r '[^1]' -o item,value \
then filter -S -x '$value==""' \
then cut -f 1,value input.txt

you have

123,A
123,B
123,AC
123,DF
123,n
567,A
567,B
567,C
567,D
567, 7
89,C
89,B

If you want also null value is

mlr --nidx --fs "," reshape -r '[^1]' -o item,value \
then cut -f 1,value input.txt

And you will have

123,A
123,B
123,
123,
123,
123,AC
123,DF
123,
123,
123,
123,
123,
123,
123,
123,
123,
123,
123,n
567,A
567,B
567,
567,C
567,D
567,
567,
567,
567,
567,
567,
567,
567, 7
89,C
89,B
aborruso
  • 2,855