I'm using the following gawk script to read values from the first column of the CSV file file.csv.
I use gawk because I don't want commas embedded in quoted fields to be treated as separators.
col=`gawk '
BEGIN {
FPAT="([^,]+)|(\"[^\"]+\")"
}
{print $1 }' file.csv`
For example, file.csv is:
col1,col2
"a,a","a,a1"
,"b1"
"c","c1"
The problem is that the first column of the second data row is empty, so when gawk reads that row it takes the value of the second column as the first field.
echo "$col"
returns
a,a
b1
c
but I would like it to acknowledge the empty string as follows:
a,a

c
How could I achieve such behaviour?
Thank you!
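One possible approach, sketched here on the assumption that gawk 4.0 or later is available (FPAT is a gawk extension), is to replace + with * in the pattern so that a field may match zero characters. The function name first_col is hypothetical and only used for illustration.

```shell
# Print the first CSV field of every line read from standard input.
# Usage: first_col < file.csv
first_col() {
    gawk '
    BEGIN {
        # Using * rather than + allows a field to be the empty string,
        # so a leading comma yields an empty first field.
        FPAT = "([^,]*)|(\"[^\"]*\")"
    }
    { print $1 }'
}
```

With this pattern the surrounding double quotes remain part of the field, so a quoted value is printed together with its quotes.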
UPDATE:
I noticed that if the empty string/space is in the last row, this method ignores it.
col=`gawk '
BEGIN {
FPAT="([^,]*)|(\"[^\"]*\")+"
}
{print $1 }' file.csv`
For example, if the file.csv is the following:
col1,col2
"a,a","a,a1"
"b","b1"
,"c1"
The result would be
col1
a,a
b
instead of the following, which ends with an empty line for the last row:
col1
a,a
b
What can I do to fix this issue?
col that's causing your problem (try printf 'foo\nbar\n\n' then col=$(printf 'foo\nbar\n\n'); echo "$col" to see what I mean) but that would be a different question. – Ed Morton Jul 29 '21 at 21:30
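The effect the comment points at can be reproduced without gawk: command substitution strips trailing newlines, so an empty last row never reaches the variable. The sketch below shows this, together with one common workaround (appending a sentinel character and removing it afterwards). The variable name kept and the sentinel x are illustrative choices, not part of the original question.

```shell
# Command substitution strips every trailing newline from the captured output.
col=$(printf 'foo\nbar\n\n')
echo "${#col}"    # 7: "foo", a newline, "bar"; both trailing newlines are gone

# Workaround sketch: emit a sentinel character after the real output,
# then remove it again, so the trailing newlines survive.
kept=$(printf 'foo\nbar\n\n'; printf x)
kept=${kept%x}
echo "${#kept}"   # 9: the two trailing newlines are preserved
```

The same sentinel trick could be applied to the gawk command, at the cost of having to remember to strip the sentinel before using the value.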