0

How can I use sed to remove the digit-grouping commas, and the surrounding quotes themselves, from the second-to-last column?

Please note that in the sample below the target column is not always enclosed in double quotes.

0,1,,,"10,815,197",
6,7,010202,,"5,589",
6,7,010202,,589,

The expected result would be:

0,1,,,10815197,
6,7,010202,,5589,
6,7,010202,,589,
jherran
  • 3,939

3 Answers

2

Awk will be the best for your scenario.

$ awk -F'"' '{gsub(",", "", $2);print}' file.txt 
0,1,,, 10815197 ,
6,7,010202,, 5589 ,
6,7,010202,,589,

How it works

-F'"' - causes awk to use the double quote ( " ) as the field separator.

gsub(",","",$2) - the gsub function replaces every comma in the second field with an empty string.

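print - prints the modified record. Because changing $2 makes awk rebuild the record with the default output field separator (a single space), the double quotes show up as spaces in the output.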
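
As a rough illustration of the steps above, the following Python sketch models how awk splits the record on the double quote and rejoins it with the output field separator. The function name rebuild_record and its ofs parameter are hypothetical names chosen for this example, and the model assumes awk only rebuilds the record when a substitution actually happens.

```python
def rebuild_record(line, fs='"', ofs=" "):
    """Rough model of: awk -F'"' '{gsub(",", "", $2); print}'."""
    fields = line.split(fs)
    # Assumption: the record is left untouched when gsub changes nothing.
    if len(fields) < 2 or "," not in fields[1]:
        return line
    fields[1] = fields[1].replace(",", "")
    # Rejoining with the output separator is what replaces the quotes.
    return ofs.join(fields)


for sample in ['0,1,,,"10,815,197",', '6,7,010202,,589,']:
    print(rebuild_record(sample))          # default separator: a single space
    print(rebuild_record(sample, ofs=""))  # empty separator: no padding
```

With the default separator the quoted sample comes back padded with spaces, as in the output above. With an empty ofs the sketch returns the line in the form the question asks for.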

Kannan Mohan
  • 3,231
1

I think it's easier with awk. You can try something like this:

$ awk -v v='"' 'BEGIN{FS=OFS=v}{gsub(",","",$2);gsub("\"","",$0);print }' file.txt
0,1,,,10815197,
6,7,010202,,5589,
6,7,010202,,589,
  • -v v='"' defines an awk variable v that holds a double quote character; it is not a regular expression.
  • FS=OFS=v sets both the input field separator and the output field separator to that ".
  • gsub(",","",$2) removes every , from the second field $2 (the text between the first pair of ").
  • gsub("\"","",$0) then removes every " from the whole line before it is printed.
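
The four steps listed above can be mirrored one by one in Python. This is only a sketch for following the logic, not a replacement for the awk command; the function name strip_quoted_commas is a hypothetical name chosen for this example.

```python
def strip_quoted_commas(line, sep='"'):
    """Step-by-step Python counterpart of the awk one-liner above."""
    fields = line.split(sep)                    # FS = '"'
    if len(fields) > 1:
        fields[1] = fields[1].replace(",", "")  # gsub(",", "", $2)
    rebuilt = sep.join(fields)                  # OFS = '"'
    return rebuilt.replace(sep, "")             # gsub("\"", "", $0)


for sample in ['0,1,,,"10,815,197",', '6,7,010202,,"5,589",', '6,7,010202,,589,']:
    print(strip_quoted_commas(sample))
```

Like the awk version, this only cleans the first quoted field on each line, which is enough for the sample data.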
jherran
  • 3,939
0

sed is not the right tool for this.

$ perl -pe 's|"([\d,]+)"(?=[^"]*$)|$1=~y/,//dr|eg' file
0,1,,,10815197,
6,7,010202,,5589,
6,7,010202,,589,

The same can be done in Python:

#!/usr/bin/python3
import sys
import re
file = sys.argv[1]
with open(file, 'r') as f:
    for line in f:
        print(re.sub(r'"([\d,]+)"(?=[^"]*$)', lambda m: m.group(1).replace(',', ''), line), end = "")

Save the above script to a file, say script.py, and then run it with the following command in the terminal.

$ python3 script.py inputfile
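
A different technique from the regex script above, shown only as a sketch: Python's standard csv module can parse the quoted field itself, so the second-to-last column can be addressed by position. The function name strip_group_commas is hypothetical, and the sketch assumes every data line ends with a trailing comma, as in the sample, so that the target column is row[-2].

```python
import csv
import io


def strip_group_commas(text):
    """Remove commas from the second-to-last column of CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:
            # The csv reader has already removed the surrounding quotes.
            row[-2] = row[-2].replace(",", "")
        writer.writerow(row)
    return out.getvalue()


sample = '0,1,,,"10,815,197",\n6,7,010202,,"5,589",\n6,7,010202,,589,\n'
print(strip_group_commas(sample), end="")
```

Once the commas are gone the writer has no reason to quote the field again, so the quotes disappear without any extra step.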
Avinash Raj
  • 3,703