I'm using the command line to combine hundreds of CSV files (with identical columns) in the manner described in these questions.
The problem is that some of my CSVs have an NA value in cell A1. In these cases, the values from the first row of the file are added to the last row of the previous file.
Here is a simple example.
csv1
col1,col2
csv2
,16
17,18
cat *.csv >merged.csv
yields this output:
col1,col2,16
17,18
My desired output:
col1,col2
,16
17,18
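For what it's worth, here is a small shell sketch that reproduces the same output. It assumes that csv1 ends without a trailing newline, which is only my guess and not something I have verified on the real files. The temporary directory and the file names csv1.csv and csv2.csv are made up for illustration.

```shell
# Reproduction sketch; the directory and file names are illustrative only.
dir=$(mktemp -d)

# csv1.csv: header row, deliberately written WITHOUT a trailing newline
# (assumption: this is what the real files look like).
printf 'col1,col2' > "$dir/csv1.csv"

# csv2.csv: first cell (A1) is empty.
printf ',16\n17,18\n' > "$dir/csv2.csv"

# Plain concatenation: cat copies bytes as-is, so the rows run together.
merged=$(cat "$dir/csv1.csv" "$dir/csv2.csv")
printf '%s\n' "$merged"

# For comparison: awk prints every input record followed by a newline,
# including a final record that had none in the file.
fixed=$(awk 1 "$dir/csv1.csv" "$dir/csv2.csv")
printf '%s\n' "$fixed"
```

Under that assumption, the first command shows the joined row and the second shows the rows kept apart. The `awk 1` line is only a point of comparison, not a change to `cat` itself.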
One option is to modify the source files so that the first column never contains missing values. Due to the quantity of data involved, I'd like to avoid that if possible. Is it possible to fix this behavior within the cat command?