An example of my file looks like this:
201012,720,201011,119,201710,16
Output I want:
201012,720
201011,119
201710,16
Using a sed loop:
sed -e 's/,/\n/2' -e 'P;D' file
Example:
$ echo '201012,720,201011,119,201710,16' | sed -e 's/,/\n/2' -e 'P;D'
201012,720
201011,119
201710,16
This replaces the second `,` with `\n`, then prints and deletes up to the `\n`, repeatedly until the substitution is no longer successful.
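To make the `P;D` loop easier to follow, here is a sketch that traces the pattern space in comments. It assumes GNU sed (where `\n` in the replacement means a newline); `split_pairs` is just a hypothetical wrapper name for illustration.

```shell
# Pattern-space trace for the sample line:
#
#   cycle 1: 201012,720,201011,119,201710,16
#            s/,/\n/2  ->  201012,720<NL>201011,119,201710,16
#            P prints "201012,720"
#            D deletes through <NL> and restarts without reading input
#   cycle 2: 201011,119,201710,16  ->  P prints "201011,119"
#   cycle 3: 201710,16  ->  no second comma, so s does nothing;
#            P prints "201710,16", D empties the pattern space
split_pairs() {
    sed -e 's/,/\n/2' -e 'P;D' "$@"
}

printf '%s\n' '201012,720,201011,119,201710,16' | split_pairs
```

Because the last cycle simply prints whatever is left, a line with an odd number of fields ends with the lone field on its own line, without a trailing comma.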
BSD sed doesn't understand `\n` as a newline in the right-hand side of `s` commands; this is a workaround for the ksh, bash, and zsh shells:
sed -e :a -e $'s/,/\\\n/2' -e 'P;D' file
Or, a general solution for (old) seds:
sed '
:a
s/,/\
/2
P;D
' file
201012,720n201011,119,201710,16
– jesse_b
May 18 '19 at 17:38
sed -e :a -e $'s/,/\\\n/2' -e 'P;D;ta' file
–
May 18 '19 at 17:51
`ta` is never executed, since the pattern space never survives the `D`. We can add an extra command after the `ta` to verify. So this will suffice: `sed -e 's/,/\n/2' -e 'P;D'`
– Rakesh Sharma
May 18 '19 at 19:28
`sed -e 'y/,/\n/' -e 's/\n/,/' -e 'P;D'` works around the newline limitation.
– Rakesh Sharma
May 18 '19 at 19:40
$ paste -d, - - < <( tr ',' '\n' <file )
201012,720
201011,119
201710,16
or, without the process substitution,
$ tr ',' '\n' <file | paste -d, - -
201012,720
201011,119
201710,16
This replaces all commas in the file with newlines using `tr`, then uses `paste` to create two columns separated by a comma from that.
If `tr` feels a bit too simple, you may replace it with `sed 'y/,/\n/'`, which does the same thing.
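As a rough illustration of the two stages, the sketch below first shows the one-field-per-line stream that `tr` produces and then the paired result. The function name `pair_fields` is made up for this example.

```shell
# Stage 1: one field per line.
printf '%s\n' '201012,720,201011,119,201710,16' | tr ',' '\n'

# Stage 2: paste reads two lines at a time from standard input (the two
# "-" operands) and joins them with a comma.
pair_fields() {
    tr ',' '\n' | paste -d, - -
}

printf '%s\n' '201012,720,201011,119,201710,16' | pair_fields
```

Note that with an odd number of fields, `paste` still emits the delimiter for the missing second column, so the last line ends in a comma.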
I was able to accomplish this with the following awk command:
awk -F, -v OFS=, '{for (i=1;i<=NF;i=i+2) {j=i+1; print $i,$j}}' input
This will loop through each column in the input (incrementing by 2 each iteration) and print that column plus the next adjacent column on a line before moving to the next.
$ cat input
201012,720,201011,119,201710,16
$ awk -F, -v OFS=, '{for (i=1;i<=NF;i=i+2) {j=i+1; print $i,$j}}' input
201012,720
201011,119
201710,16
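With an odd number of fields, the loop above prints the last field followed by a comma, because `$j` is empty. If that trailing comma is unwanted, a variant along these lines might do; it is only a sketch, and `pair_awk` is a hypothetical name.

```shell
pair_awk() {
    awk -F, '{
        for (i = 1; i <= NF; i += 2) {
            if (i < NF)
                print $i "," $(i + 1)   # a complete pair
            else
                print $i                # lone last field, no comma
        }
    }' "$@"
}

printf '%s\n' '1,2,3' | pair_awk
```

Whether the trailing comma should be kept depends on whether the output is meant to stay a two-column CSV.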
`1,2,3,""` is not the same as `1,2,3`. ... And the sed solution does not generate that trailing comma.
–
May 18 '19 at 16:07
`1` is not the same as `1,""`. One is a CSV and the other is not.
– jesse_b
May 18 '19 at 16:10
`,,value1,value2` would be necessary to ensure the values end up in the proper column
– jesse_b
May 18 '19 at 16:14
Using `xargs` and `printf`:
xargs -d, printf '%s,%s\n' < file
Output:
201012,720
201011,119
201710,16
The above code assumes each line has an even number of fields. If not, `xargs` will print lone numbers and dangling commas. But this somewhat slower code should plow through most anything:
tr , '\n' < file | xargs -n2 printf '%s,%s\n' | sed '$s/,$//'
This can be sped up by increasing `-n2` to some reasonable maximum even number, e.g. suppose no number in the input is longer than 15 digits:
m=$(getconf ARG_MAX) m=$(( (m/16) + (m%2) ))
tr , '\n' < file | xargs -n"${m}" printf '%s,%s\n' | sed '$s/,$//'
Another `sed` solution:
sed 's/\([^,]*,[^,]*\),/\1\n/g' file
This replaces each second comma with a newline.
awk -F'\n' -vRS=, '{l=$1; $0=""; getline; print l RS $1}'
or
awk -F'\n' -vRS=, '{print $1 RS (getline > 0 ? $1 : "")}'
You can omit the `-F'\n'` if the fields don't contain spaces. Or set it to the same value as the record separator (e.g. with `-F,`) if your fields may also contain newlines (e.g. if in the output of `echo 1,2,3,4` the last field should be `4\n`, not `4`).
echo $'201012,720,201011,119,201710,16,201705\n201709,115,201708,23\n201707' | awk -F'\n' -vRS=, '{l=$1; $0=""; getline; print l","$1}'
.
–
May 18 '19 at 18:07
You could use `grep`, but beware that not all versions support `-o`, and that this will not work for an odd number of fields:
grep -E -o '[^,]+,[^,]+' file
Or some overkill
gawk 'BEGIN{FPAT="[^,]+,[^,]+";OFS="\n"}; {$1=$1; print}' file
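The `grep` pattern above silently drops a final unpaired field. One possible tweak, assuming a `grep` that supports `-o` and input with no empty fields, is to make the second half of the pair optional. `pair_grep` is a hypothetical name used only here.

```shell
pair_grep() {
    # Match a field, optionally followed by a comma and a second field,
    # so a lone last field is still printed.
    grep -E -o '[^,]+(,[^,]+)?' "$@"
}

printf '%s\n' '1,2,3' | pair_grep
```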