I am trying to use sed to change the format of decimal numbers in a large CSV file before importing it into a SQLite database. The numbers all have two decimal places, may be negative, use a comma as the decimal separator, and are therefore enclosed in double quotes. I tried the following:

sed 's/"(-?)([:digit:]+),([:digit:]{2})"/$1$2.$3/g' input.csv > output.csv

The regex seems to work in a text editor on a sample of the file, but when I run it through sed, the output is identical to the input. What am I doing wrong?

2 Answers

Since -r is unavailable, use this leaning toothpick forest:

sed 's/"\(-\?[[:digit:]]\+\),\([[:digit:]]\{2\}\)"/\1.\2/g' input.csv > output.csv

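sed -r is a GNU extension. Without it, sed uses basic regular expressions, where grouping parentheses and braces must be backslash-escaped (GNU sed also accepts \+ and \? in that form). A character class such as [:digit:] is only valid inside a bracket expression, hence [[:digit:]], and back-references in the replacement are written \1, not $1. Sadly, most tools that use regular expressions (grep/sed, awk, perl, ...) implement the language slightly differently.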
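If the sed at hand is not GNU sed, \+ and \? may not be understood either. A minimal sketch of a variant that sticks to POSIX interval expressions is shown below; the function name fix_decimals and the sample rows are made up for illustration and are not from the question.

```shell
# fix_decimals is a hypothetical helper name; it reads CSV on stdin.
# \{0,1\} stands in for the GNU-only \? and \{1,\} for the GNU-only \+.
fix_decimals() {
  sed 's/"\(-\{0,1\}[[:digit:]]\{1,\}\),\([[:digit:]]\{2\}\)"/\1.\2/g'
}

# Made-up sample rows: two quoted decimals and one field that does not
# have exactly two decimal places and should be left untouched.
printf '%s\n' '1,"-12,34",foo' '2,"5,00",bar' '3,"7,5",baz' | fix_decimals
```

For the real file this would be run as fix_decimals < input.csv > output.csv. Only quoted fields with exactly two digits after the comma are rewritten; anything else passes through unchanged.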

glenn jackman

I find Perl's syntax simpler for such things (I am ignoring the quotes but you can add them if you wish):

perl -pe 's/(-?)(\d+),(\d{2})/$1$2.$3/g' input.csv > output.csv

You can also use the -i option to edit the original file directly.

terdon