
I am generating a CSV file which should ideally contain one or two rows of data, but it comes out like this:

11,ABC,3,4,5,6
,,,,,

I have tried sed and awk commands to remove that empty row, but they didn't work out:

sed -i 's/,,//g' fileName.csv

(this isn't a dynamic solution when there is an odd number of commas)

awk -F, 'length>NF+1' fileName.csv > fileName.csv

(when I try to redirect the output back into the same file, it removes all the rows from the file. Maybe my inexperience with sed and awk is the issue.)

  • How are you generating the CSV data? It seems that it should be possible to produce the data correctly from the start instead of relying on a post-processing step to correct it after the fact. – Kusalananda Sep 24 '21 at 06:40
  • It is coming through a tool with which I can't do much. Editing the data through the shell script for sending it ahead would be an easier solution – Mayank Parihar Sep 24 '21 at 06:41
  • There are 4 files which need to be treated, and each of them has a different number of fields – Mayank Parihar Sep 24 '21 at 06:49
  • Having suffered this issue myself, you might need to beware of additional junk such as completely blank lines, trailing whitespace, and carriage returns. The problem seems to arise from slapdash selection of data during Excel exports. – Paul_Pedant Sep 24 '21 at 06:58
  • Can your data also have lines of all commas BEFORE the last line that contains non-commas and, if so, do you want those lines removed too or only the lines after the last line that contains non-commas? See the sample input/output in my answer for example. – Ed Morton Sep 24 '21 at 17:50

3 Answers


Passing the data through

grep '[^,]'

would extract the lines that contain at least one character that is not a comma, effectively removing lines that consist only of commas.

For example:

grep '[^,]' filename.csv >newname.csv

Note that you can't redirect to the same name you are reading from as the shell would truncate (empty) the file to which you are redirecting before executing grep.
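If you want the result to end up back in the original file without sed's -i, the usual pattern is to filter into a temporary file and move it over the original. A minimal sketch (the file name and temp name are just examples):

```shell
# Example data (assumed file name "filename.csv"):
printf '11,ABC,3,4,5,6\n,,,,,\n' > filename.csv

# Filter into a temporary file, then replace the original only if grep
# succeeded (i.e. found at least one line containing a non-comma character):
grep '[^,]' filename.csv > filename.csv.tmp && mv filename.csv.tmp filename.csv

cat filename.csv    # → 11,ABC,3,4,5,6
```

The `&&` guard means the original file is left untouched if grep finds nothing (or fails), which avoids accidentally emptying it.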

You could also do this as an in-place edit with sed if your sed implementation supports this syntax:

sed -n -i '/[^,]/p' filename.csv

or

sed -i '/[^,]/!d' filename.csv

In-place editing using GNU awk:

awk -i inplace '/[^,]/' filename.csv
Kusalananda

A simple way to do what you asked for, "Remove the last rows with no data but just commas from a CSV file through shell script", would be:

tac file | awk '/[^,]/{f=1} f' | tac

For example:

$ cat file
a,b,c
d,e,f
,,,
g,h,i
,,,
,,,

$ tac file | awk '/[^,]/{f=1} f' | tac
a,b,c
d,e,f
,,,
g,h,i
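If tac is not available, the same "keep everything up to the last real data line" logic can be sketched by reading the file twice with awk (this assumes the input is a regular file, not a pipe):

```shell
# Example data matching the answer above (assumed file name "file"):
printf 'a,b,c\nd,e,f\n,,,\ng,h,i\n,,,\n,,,\n' > file

# Pass 1 (NR==FNR): remember the number of the last line that contains a
# non-comma character.  Pass 2: print lines up to and including that one.
awk 'NR==FNR { if (/[^,]/) last = FNR; next } FNR <= last' file file
```

This prints the same four lines as the tac pipeline, at the cost of reading the file twice.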
Ed Morton

The answer provided by Kusalananda,

sed -n -i '/[^,]/p' filename.csv

worked totally fine for me.

Also, I found a workaround using awk and a temporary file:

awk -F ',' '$2>" " {print $0}' file.csv > temp.csv && mv temp.csv file.csv

This is not the ideal solution, but it works too.
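Following Paul_Pedant's comment about blank lines and carriage returns, a more defensive variant of the grep filter might look like this (a sketch; the tr step assumes CRLF line endings such as an Excel export might produce, and "file.csv" is just an example name):

```shell
# Example data with CRLF endings, an all-comma line, and a blank line:
printf '11,ABC,3,4,5,6\r\n,,,,,\r\n\r\n' > file.csv

# Strip carriage returns, then keep only lines that contain something
# other than commas and whitespace:
tr -d '\r' < file.csv | grep '[^,[:space:]]'    # → 11,ABC,3,4,5,6
```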

  • -i does not make sense by itself with GNU awk. Your awk code could be shortened to just $2 > " " (it would print the current line by default), but this looks odd. Why would you arbitrarily test the 2nd field and no other field? – Kusalananda Sep 24 '21 at 07:28
  • sorry, not the -i – Mayank Parihar Sep 24 '21 at 07:30
  • Don't post a new answer that contains someone else's answer (the sed script) and that awk script doesn't make any sense. – Ed Morton Sep 24 '21 at 16:33