How to pass variable in pattern while using sed command?

Question

I have a file abc.sh:

search_dir='dummy'
filename='numbers.txt'

for entry in "$search_dir"/*
do
  while read p;
  do 
    sed -i '' "/$p/d" $entry
  done < $filename
done

I am trying to delete every line that matches a pattern. The pattern is just a string that I read from the file. Unfortunately, it is not working.

From what I have been able to debug, I am not passing the variable into the pattern the correct way.

EDIT: numbers.txt

2018061300006178
2018061300006179
2018061300006325
2018061300006326
2018061400006505

The content of the files present in search_dir is:

1888~2018061400006505~0101~1~OWNED~SELF EMPLOYED~~~~3~~AGRICULTURE~~~OTHERS~AGRICULTURIST~~~AGRICULTURE~~~~~~~~N~N~Y~N~N~~300000-500000~~~49582E95361D5FA0C10C4C419B2940591C17E94EF329C31047A6B7DE26E68638
1889~2018061400006505~0101~2~OWNED~SELF EMPLOYED~~~~32~~AGRICULTURE~~~OTHERS~AGRIC

So numbers.txt contains 2018061400006505, and the file also contains data related to that number, so I want to delete the lines that match the given numbers.

@steeldriver I tried double quotes things but it still not working, I have added the example in the edit section of the question. Hope it helps. — Ankur_009, Jun 17 '18 at 17:49
@Ankur_009 there is a single quote after -i in sed. is that a typo? — Siva, Jun 17 '18 at 18:01
@SivaPrasath not the typo, as somewhere i have read that is necessary while using sed cmd. But I have tried without using that too. — Ankur_009, Jun 17 '18 at 18:04
@SivaPrasath this is not duplicate, as I have tried solution mentioned there too. — Ankur_009, Jun 17 '18 at 18:07
@Ankur_009 i hope its not required, can u try this ...sed -i '/'"$p"'/d' — Siva, Jun 17 '18 at 18:08
@SivaPrasath that is giving me error "sed: -i may not be used with stdin" — Ankur_009, Jun 17 '18 at 18:09
@SivaPrasath, it doesn't matter one bit if the slashes are single-quoted, double-quoted or not-at-all-quoted. The thing with -i is the only issue here, GNU sed wants the backup filename suffix as part of the same argument, so you'd use -i.bak, or just -i for no backup. BSD sed is different here. — ilkkachu, Jun 17 '18 at 18:12
Related: https://unix.stackexchange.com/questions/92895/how-can-i-achieve-portability-with-sed-i-in-place-editing — Kusalananda, Jun 17 '18 at 18:19
Note, this question is not actually related to passing variables into sed, it's about the portability of sed -i and/or about working with DOS text files. — Kusalananda, Jun 18 '18 at 10:26

Kusalananda · Answer 1 · 2018-06-18T09:51:17.803

As long as the numbers in your example do not contain the delimiter that sed is using (by default /), the $p in your code will be interpreted as a regular expression (with all that that implies).

Your code:

search_dir='dummy'
filename='numbers.txt'

for entry in "$search_dir"/*
do
  while read p;
  do 
    sed -i '' "/$p/d" $entry
  done < $filename
done

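Here, you want to delete all lines in the files under $search_dir that contain any of the numbers in $filename. Whether this works depends on how your sed treats -i ''. BSD sed (for example on macOS) takes the backup suffix as a separate argument, so -i '' means "no backup". GNU sed expects the suffix attached to the option (as in -i.bak), so there you would use a bare -i for no backup.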

Related to sed -i and portability: How can I achieve portability with sed -i (in-place editing)?
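If you do want to keep using -i, one common workaround is to probe which sed you have before choosing the flag form. This is only a sketch: it assumes that GNU sed accepts --version and that BSD sed rejects it, and the temporary directory and data.txt file are made up for the demonstration.

```shell
# Sketch: choose the in-place flag form based on the installed sed.
# Assumption: GNU sed accepts --version, BSD/macOS sed does not.
workdir=$(mktemp -d)
printf 'keep\n2018061400006505\n' >"$workdir/data.txt"
p='2018061400006505'

if sed --version >/dev/null 2>&1; then
    sed -i "/$p/d" "$workdir/data.txt"       # GNU form: no separate suffix argument
else
    sed -i '' "/$p/d" "$workdir/data.txt"    # BSD form: empty suffix as its own argument
fi

result=$(cat "$workdir/data.txt")
rm -rf "$workdir"
echo "$result"
```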

It is safer to write the result to a temporary file and then to move that file to the original filename:

for entry in "$search_dir"/*
do
  while IFS= read -r p;
  do 
    sed "/$p/d" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
  done <"$filename"
done

This ensures that it will work regardless of what sed implementation you happen to be working with. In general, it's a bad idea to make in-place changes to files while testing out a script, so you may well want to comment out that mv until you are happy with the way the script otherwise works.

This is still a bit unsafe as a general solution though, since you're actually "using data as code" (the numbers are data, and you use them as part of your sed script). This means that you could easily cause a syntax error in the sed script just by inserting a / in one of the numbers in your numbers file.
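To see the problem in isolation, here is a small sketch with a hypothetical bad entry, 1/2, standing in for a line of the numbers file. The slash ends the address early, so sed is left with an invalid command and refuses to run the script.

```shell
# A value containing "/" terminates the address early and breaks the script.
p='1/2'    # hypothetical bad entry in the numbers file
if printf '1/2\nother\n' | sed "/$p/d" >/dev/null 2>&1; then
    status='ok'
else
    status='sed rejected the script'
fi
echo "$status"
```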

Since the operation is so simple, we may instead use grep. This also gets rid of the inner while loop:

for entry in "$search_dir"/*
do
  grep -Fv -f "$filename" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
done

This will cause grep to read its patterns from $filename and to apply these to the $entry file. The -v means we'll discard any line containing the pattern and -F means grep will not interpret the numbers as regular expressions but as fixed strings. With -f "$filename" we get grep to read the strings from $filename.
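A quick way to try this without touching your real data is to run it on throwaway files. The sketch below uses a temporary directory and two made-up files (numbers.txt and data.csv) with shortened records in the format from the question.

```shell
# Demo of grep -Fv -f on throwaway files (all names here are made up).
workdir=$(mktemp -d)
printf '2018061300006178\n2018061400006505\n' >"$workdir/numbers.txt"
printf '1888~2018061400006505~0101\n1890~2018061500000001~0101\n' >"$workdir/data.csv"

# Keep only the lines that contain none of the fixed strings.
grep -Fv -f "$workdir/numbers.txt" "$workdir/data.csv" >"$workdir/data.csv.tmp" &&
    mv "$workdir/data.csv.tmp" "$workdir/data.csv"

remaining=$(cat "$workdir/data.csv")
rm -rf "$workdir"
echo "$remaining"
```

One thing to be aware of: grep exits with a non-zero status when it outputs no lines at all, so if every line of a file matches, the mv is skipped and that file is left unchanged.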

If there may be directories under $search_dir we would want to skip these:

for entry in "$search_dir"/*
do
  [ ! -f "$entry" ] && continue
  grep -Fv -f "$filename" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
done

Another, even safer way to do your operation is to use awk. Since with both the sed and grep solutions above, the number is matched anywhere on the line, it is conceivable that we might delete the wrong lines. With awk it's easy to match just the second ~-delimited field in the data:

for entry in "$search_dir"/*; do
    [ ! -f "$entry" ] && continue
    awk -F '~' 'NR==FNR { num[$0]; next } !($2 in num)' "$filename" "$entry" >"$entry.tmp" &&
    mv "$entry.tmp" "$entry"
done

The awk program first populates an associative array/hash with the numbers as keys, and then prints every line from the $entry file whose second ~-delimited column is not a key in that hash.
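The difference from the grep approach shows up when a number occurs somewhere other than the second field. In this sketch (temporary, made-up files again, with shortened records), the second record carries the number in its third field only, so it should be kept.

```shell
# Demo: only the second ~-delimited field is compared against the numbers.
workdir=$(mktemp -d)
printf '2018061400006505\n' >"$workdir/numbers.txt"
printf '%s\n' \
    '1888~2018061400006505~0101' \
    '1889~2018061500000001~2018061400006505' >"$workdir/data.csv"

kept=$(awk -F '~' 'NR==FNR { num[$0]; next } !($2 in num)' \
    "$workdir/numbers.txt" "$workdir/data.csv")
rm -rf "$workdir"
echo "$kept"
```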

thanks for the detailed info. I appreciate. But I have tried both the way(grep and sed) , it doesn't work. — Ankur_009, Jun 17 '18 at 18:15
@Ankur_009 Would you mind telling me what actually happens? It's a bit difficult to see your screen from here. — Kusalananda, Jun 17 '18 at 18:17
@Ankur_009 I also added an awk variation at the end. Please let me know if this works. — Kusalananda, Jun 17 '18 at 18:27
i have tried the awk it has deleted the line matching with the last string only , line matching with the other number are still there. — Ankur_009, Jun 17 '18 at 18:34
@Ankur_009 Is one or both of your files DOS text files? In that case run dos2unix on them and try again. — Kusalananda, Jun 17 '18 at 18:35
no, one is csv file and other which containing the numbers is txt file. — Ankur_009, Jun 17 '18 at 18:37
@Ankur_009 Were they at any point in time created or edited on a Windows system? — Kusalananda, Jun 17 '18 at 18:37
csv file might be open in windows system, but numbers file is created in linux environment.. — Ankur_009, Jun 17 '18 at 18:39
@Ankur_009 Then they may well have trailing carriage return characters at the end of each line. This would confuse sed, grep and awk. Just run dos2unix on the files to make sure they are Unix text files, then try again. — Kusalananda, Jun 17 '18 at 18:39
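
The carriage-return effect described in the comments above can be reproduced on throwaway files. This is a sketch with made-up file names; it uses tr -d '\r' as a stand-in for dos2unix, on the assumption that the only difference that matters here is the trailing carriage return.

```shell
# Demo: a trailing carriage return in the pattern file prevents a match.
workdir=$(mktemp -d)
printf '2018061400006505\r\n' >"$workdir/numbers.txt"   # simulated DOS line ending
printf '1888~2018061400006505~0101\n' >"$workdir/data.csv"

# Count matching lines while the pattern still ends in a carriage return.
before=$(grep -cF -f "$workdir/numbers.txt" "$workdir/data.csv" || true)

# Strip the carriage returns, then count again.
tr -d '\r' <"$workdir/numbers.txt" >"$workdir/numbers.unix"
after=$(grep -cF -f "$workdir/numbers.unix" "$workdir/data.csv" || true)

rm -rf "$workdir"
echo "before=$before after=$after"
```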
