Performing Prime Key function using 'Sed' in Bash

Question

I am using bash and trying to use the sed command to match strings in one file and replace them with values looked up in another file.

Objective: To replace every string [sp_*] in File1 with the matching name from another file containing [ID Var_Name] pairs. Please note: 1. The entries are in the same order in both files but not on consecutive lines, so paste cannot be used. 2. The functionality is similar to a primary-key lookup in MySQL.

File 1

+--sp_O00574_
|
+--sp_Q9TV16_
|
|     +--sp_O18983_
|  +--| (52)
|  |  |  +--sp_Q9BDS6_
|  |  +--| (26)
|  |     |  +--sp_O19024_
|  |     +--| (29)
|  |        +--sp_Q9XT45_

File2

O00574  CXCR6_HUMAN
Q9TV16  CXCR6_PANTR
O18983  CXCR6_CHLAE
Q9BDS6  CXCR6_MACFA
O19024  CXCR6_MACNE
Q9XT45  CXCR6_MACMU

Purpose: To run the equivalent of sed -ie 's/sp_O00574/CXCR6_HUMAN/g' File1 for every line of File2.

Inline Bash script:

cat File2 | while read id; do upID=`echo $id | cut -d " " -f1`; upName=`echo $id | cut -d " " -f2`; sed -ie 's/sp_$upID/$upName/g' File1; done

Script.sh

#!/bin/bash

cat File2 | while read id;
do
    upID=`echo $id | cut -d " " -f1`
    upName=`echo $id | cut -d " " -f2`

    sed -ie 's/sp_$upID/$upName/g' File1
done

Problem: The sed command does not work in the loop. No change at all is observed in File1. If I echo the sed command from the script and then run it in the terminal, it works as expected. I cannot figure out what the problem could be.

Thank you for your valuable comments and solution.

You are using single quotes (' ') in your sed statement, which prevent variable expansion. Use double quotes (" ") instead. You can also simplify it with "while read upID upName", which removes the need for the two cut statements. – Stephen Harris, Jun 02 '16 at 15:38
Dear Stephen, thanks for pointing out the error and for the added information. It will be very useful in solving many other cases. Thanks :) Somehow I just could not see the single quotes. ;P – Tarun JaiRaj Narwani, Jun 02 '16 at 16:06
With "while read upID upName", the upName overwrites the upID every time. – Tarun JaiRaj Narwani, Jun 02 '16 at 16:08
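
Putting the two suggestions from the comments together, a minimal sketch of the corrected loop might look like the following. It assumes GNU sed (for -i) and that the IDs and names contain no characters that are special to sed, such as / or &. The function name replace_ids is hypothetical and not part of the original post.

```shell
#!/bin/bash
# Hypothetical helper: replace sp_<ID> in a tree file using an ID/name map.
# $1 = map file (File2 layout: "ID NAME" per line)
# $2 = file to edit in place (File1 layout)
replace_ids() {
    local map=$1 tree=$2 upID upName
    # read splits each line on whitespace into the two variables,
    # so the two cut calls are no longer needed
    while read -r upID upName; do
        [ -n "$upID" ] || continue    # skip blank lines
        # double quotes let the shell expand the variables before sed runs
        sed -i "s/sp_${upID}/${upName}/g" "$tree"
    done < "$map"
}
```

It would be called as replace_ids File2 File1. Note that GNU sed parses -ie as -i with the backup suffix "e", so the command in the question also leaves a File1e backup behind; the sketch uses plain -i.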

Rob · Answer 1 · 2016-06-02T17:03:44.510


Generate a sed script from your index file (File2) instead of using a loop, then run that script against your File1. It will be MUCH faster :).

 awk '{ print "s/sp_"$1"/"$2"/g"}' File2.txt > transform.sed

then do:

 sed -i -f transform.sed File1.txt

so your entire script could be:

awk '{ print "s/sp_"$1"/"$2"/g"}' File2.txt > transform.sed
sed -i -f transform.sed File1.txt

## if you want to remove your transformation file
rm transform.sed
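
As a rough sketch only, the same awk + sed idea can be wrapped in a function that writes the generated sed script to a unique temporary file instead of a fixed name. It assumes GNU sed (for -i) and names without / or & in them; apply_map is a hypothetical name chosen for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk + sed approach; assumes GNU sed (-i).
# $1 = map file ("ID NAME" per line), $2 = file to edit in place
apply_map() {
    local map=$1 target=$2 script status
    script=$(mktemp) || return 1    # unique temp file for the sed script
    # one substitution command per map line, e.g. s/sp_O00574/CXCR6_HUMAN/g
    awk 'NF >= 2 { print "s/sp_" $1 "/" $2 "/g" }' "$map" > "$script"
    sed -i -f "$script" "$target"   # single pass over the target file
    status=$?
    rm -f "$script"                 # clean up the generated script
    return "$status"
}
```

Calling apply_map File2.txt File1.txt would then edit File1.txt in place with one sed invocation rather than one per ID.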

edited Jun 02 '16 at 17:03

answered Jun 02 '16 at 16:48


Nice, but if this is scripted you shouldn't use a fixed temp file name. Since the Original Poster specified bash, I'd opt for process substitution instead: sed -i -f <(awk '{print "s/sp_"$1"/"$2"/g"}' File2.txt) File1.txt – Wildcard Jun 02 '16 at 23:36
So, you use the powerful awk to generate an input file for the less powerful sed, instead of using awk directly? – Michael Vehrs Jun 03 '16 at 07:42
The real question is: does it work? :) There are always ten different ways to do something. awk is neither better nor worse than sed. They are tools to get something done. Can you do it in awk in two easy-to-read lines? – Rob Jun 03 '16 at 10:45
Thanks everyone for the responses. As pointed out in the comments by +Stephen Harris, the problem was the single quoting in 's///g'. With double quotes it works perfectly fine. If I try awk + sed, it takes longer than running the while loop with sed -i directly on the file. – Tarun JaiRaj Narwani Jun 03 '16 at 12:03
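
On the question of doing it in awk directly, one possible sketch is shown below. It loads the map file into an array and applies every substitution to each line of the tree file, writing the result to standard output rather than editing in place. The function name replace_ids_awk and the array name map are illustrative only, and the IDs are used as regular expressions, which is harmless for plain alphanumeric IDs like the ones in File2.

```shell
#!/bin/bash
# Hypothetical awk-only variant; prints the rewritten file to stdout.
# $1 = map file ("ID NAME" per line), $2 = file whose sp_<ID> strings are replaced
replace_ids_awk() {
    awk '
        # first file: remember ID -> name, keyed with the sp_ prefix
        NR == FNR { if (NF >= 2) map["sp_" $1] = $2; next }
        # second file: apply every known substitution, then print the line
        { for (id in map) gsub(id, map[id]); print }
    ' "$1" "$2"
}
```

It could be used as replace_ids_awk File2 File1 > File1.new, leaving the original File1 untouched.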
