
I have 3 text files. I want to search file3 for a string from file2 and, if it is found, replace it with a string from file1. In other words, I need to put a custom tag from file1 at the end of each line in file3, replacing the partial string matched from file2.

file3

aws ec2 create-tags --region us-east-1 --resourcesi-XXXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=arn:aws:iam::XXXXX:root
aws ec2 create-tags --region us-east-1 --resourcesi-XXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=arn:aws:iam::XXXXX:user/user

file2

arn:aws:iam::XXXXX:root 
arn:aws:iam::XXXXX:user/user

file1

my_custom_tag_1
my_custom_tag_2

Desired output:

aws ec2 create-tags --region us-east-1 --resourcesi-XXXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=my_custom_tag_1
aws ec2 create-tags --region us-east-1 --resourcesi-XXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=my_custom_tag_2

I've tried loading the lines from the file into an array and including the index in a sed replace.

sed "s|${file2array[0]}|${file1array[0]}|g" file3.txt

But this returns a "no previous regular expression" error. I've also tried writing the array elements to individual variables with a for loop and using the same approach with those variables:

sed "s|$var2|$var1|g" file3.txt

This also fails.

Interestingly,

sed "s|${file2array[0]}|customtext|g" file3.txt

fails but

sed "s|customtext|${file1array[0]}|g" file3.txt

succeeds.

Any help is greatly appreciated. Been working on this for dozens of hours now.

Satō Katsura

2 Answers


Try:

awk 'FNR==NR{a[FNR]=$0; next} NR<=length(a)+FNR{b[FNR]=$0; next} {for (i=1;i<=length(a);i++) gsub(a[i], b[i])} 1' file2 file1 file3

For example:

$ awk 'FNR==NR{a[FNR]=$0; next} NR<=length(a)+FNR{b[FNR]=$0; next} {for (i=1;i<=length(a);i++) gsub(a[i], b[i])} 1' file2 file1 file3
aws ec2 create-tags --region us-east-1 --resourcesi-XXXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=my_custom_tag_1
aws ec2 create-tags --region us-east-1 --resourcesi-XXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=my_custom_tag_2

How it works

  • FNR==NR{a[FNR]=$0; next}

    This saves all the lines in file2 in array a.

    FNR is the number of lines read from the current file. NR is the number of lines read in total. Thus, if FNR==NR, we are reading the first named file, file2. a[FNR]=$0 adds the current line, denoted $0, into array a under the key FNR.

    The command next tells awk to skip the remaining commands and start over on the next line.

  • NR<=length(a)+FNR{b[FNR]=$0; next}

    This saves all the lines of file1 in array b.

    Here, we use a similar test, NR<=length(a)+FNR, to determine if we are reading the second file. b[FNR]=$0 adds the current line, denoted $0, into array b under the key FNR.

    The command next tells awk to skip the remaining commands and start over on the next line.

  • for (i=1;i<=length(a);i++) gsub(a[i], b[i])

    If we get here, we are reading the third file. This replaces any text matching a line in file2 with the corresponding text from file1.

    The loop for (i=1;i<=length(a);i++) iterates over the index of every line stored in array a.

    gsub(a[i], b[i]) replaces any occurrence of text a[i] with the text b[i].

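    Note that each line of file2 is treated as a regular expression. If a line contains regex-active characters (such as . * [ or \) that should match literally, they need to be escaped. Similarly, an & in a replacement line from file1 is special to gsub: it stands for the matched text.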

  • 1

    This is awk's cryptic short-hand for print-the-line.
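If escaping file2 by hand is impractical, one possible workaround is to match the file2 lines as literal strings with index and substr instead of gsub. The following is only a sketch under assumptions, not part of the original answer: the third sample line (Value=axb), the a.b pattern, and the variable names (workdir, nfile, from, to) are made up for illustration, and the FNR == 1 file counter assumes that none of the input files is empty.

```shell
#!/bin/sh
# Sketch: literal (non-regex) replacement. Sample data is illustrative only.
workdir=$(mktemp -d)
printf '%s\n' 'arn:aws:iam::XXXXX:root' 'a.b' > "$workdir/file2"
printf '%s\n' 'my_custom_tag_1' 'my_custom_tag_2' > "$workdir/file1"
printf '%s\n' \
  'Key=Resource Group,Value=arn:aws:iam::XXXXX:root' \
  'Key=Resource Group,Value=a.b' \
  'Key=Resource Group,Value=axb' > "$workdir/file3"

result=$(awk '
  FNR == 1   { nfile++ }                        # a new input file has started
  nfile == 1 { from[FNR] = $0; n = FNR; next }  # search strings (file2)
  nfile == 2 { to[FNR] = $0; next }             # replacements (file1)
  {
    for (i = 1; i <= n; i++) {
      out = ""; rest = $0
      # index() does a plain substring search, so "." matches only "."
      while (from[i] != "" && (p = index(rest, from[i])) > 0) {
        out  = out substr(rest, 1, p - 1) to[i]
        rest = substr(rest, p + length(from[i]))
      }
      $0 = out rest
    }
    print
  }
' "$workdir/file2" "$workdir/file1" "$workdir/file3")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With gsub, the pattern a.b would also match axb; here the third line is left unchanged because only the literal text a.b is replaced.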

John1024
  • Thanks for the well thought out answer. After replacing the generic file names at the end with my actual file names, it doesn't seem to work. It just echoes the file3 file to the screen. Let me know if I have overlooked something in your answer. Truly appreciate the help. – Jeff Carson Sep 21 '16 at 04:41
  • @JeffCarson First, did it work for you on the test files shown in your question? As you can see above, it worked for me. "Replacing the generic file names at the end with the corresponding file names it doesn't seem to work." It is important that the files be given in the same order as specified here: file2 file1 file3. The first file specified, file2, is the one with the text that we want to remove. – John1024 Sep 21 '16 at 04:46
  • Actually, file3 (the very last part of each line) is what I am needing to remove and replace with a line from file1. File 2 holds the string that we are searching for in file3. If that string is found (which will be a part of the line in file 3) I want to delete that match and replace with a line from file1, or the custom tag. Does that make sense? – Jeff Carson Sep 21 '16 at 04:49
  • @JeffCarson Did it work for you, as it did for me, on the test files shown in your question? – John1024 Sep 21 '16 at 04:50
  • checking now... – Jeff Carson Sep 21 '16 at 04:52
  • It does seem to work for the 2nd line, but not the first. My output is: aws ec2 create-tags --region us-east-1 --resourcesi-XXXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=arn:aws:iam::XXXXX:root aws ec2 create-tags --region us-east-1 --resourcesi-XXXX --tags Key=Developer Name,Value=XXXXX Key=Resource Group,Value=my_custom_tag_2 so the second one replaces but the first one doesn't. – Jeff Carson Sep 21 '16 at 04:54
  • @JeffCarson In that case, look for extraneous characters, like trailing blanks in the first line of file2. – John1024 Sep 21 '16 at 04:58
  • Genius, there was one extra white space after the first line in file2. Very much appreciated... – Jeff Carson Sep 21 '16 at 05:00
  • @JeffCarson Excellent! If there are any troubles with your real files, let me know. – John1024 Sep 21 '16 at 05:02
  • Nope, same troubles there. Extra white space on first line of file 2. All working perfectly now. I can not thank you enough... – Jeff Carson Sep 21 '16 at 05:03
  • @JeffCarson Glad it worked for you. – John1024 Sep 21 '16 at 05:14
  • This is working perfectly. I needed to edit the code to handle 2 occurrences, on the same line, of the string for which I'm processing this substitution. It works for the first but not the second. Shouldn't gsub be global and replace all occurrences? I also tried processing twice, since after the command is run the first time the string match won't be present on the first occurrence anymore. But I can't get it to make the substitution on both occurrences. I tried adding 'g' to the gsub(a[i], b[i]) but it still only processed the first occurrence. – Jeff Carson Sep 29 '16 at 13:31
  • Apologies. I've worked on it for hours before I ask, and one minute after I ask I find the issue. The second occurrence was missing a letter, hence not an exact match. Thank you once again. – Jeff Carson Sep 29 '16 at 13:34
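
The trailing blank that caused the mismatch in the comments above is hard to spot by eye. A minimal sketch of one way to make it visible and strip it before running the awk command: sed's l command prints each line unambiguously and marks its end with $. The workdir, visible, and cleaned names are illustrative, and the sample data simply reproduces file2 from the question with its trailing blank.

```shell
#!/bin/sh
# Sketch: expose and remove trailing whitespace in file2.
workdir=$(mktemp -d)
printf '%s\n' 'arn:aws:iam::XXXXX:root ' 'arn:aws:iam::XXXXX:user/user' > "$workdir/file2"

# "l" shows the line end as "$", so a stray blank appears as "root $"
visible=$(sed -n 'l' "$workdir/file2")
printf '%s\n' "$visible"

# Strip trailing blanks, tabs, and carriage returns from every line
cleaned=$(sed 's/[[:space:]]*$//' "$workdir/file2")
printf '%s\n' "$cleaned"

rm -rf "$workdir"
```

The cleaned text can be written to a new file and passed to awk in place of the original file2.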

I'd write

awk '
  BEGIN { FS = OFS = "=" }
  FILENAME == "file1" {
    tag[FNR] = $0
  }
  FILENAME == "file2" {
    str[$0] = FNR
  }
  FILENAME == "file3" {
    if ($NF in str) $NF = tag[str[$NF]]
    print
  }
' file1 file2 file3

I think it's pretty straightforward. Let me know if you have questions.
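
Because the script above compares FILENAME against the literal names file1, file2, and file3, it only matches files with exactly those names. A possible variant, sketched here under assumptions, passes the names in with -v so that other paths work too. The tags.txt, arns.txt, and commands.txt names, the tags and arns awk variables, and the shortened sample commands are hypothetical. As in the original, the replacement happens only when the text after the last = matches a line of the second file exactly.

```shell
#!/bin/sh
# Sketch: same idea as the answer above, with file names passed in as variables.
workdir=$(mktemp -d)
printf '%s\n' 'my_custom_tag_1' 'my_custom_tag_2' > "$workdir/tags.txt"
printf '%s\n' 'arn:aws:iam::XXXXX:root' 'arn:aws:iam::XXXXX:user/user' > "$workdir/arns.txt"
printf '%s\n' \
  'aws ec2 create-tags --tags Key=Resource Group,Value=arn:aws:iam::XXXXX:root' \
  'aws ec2 create-tags --tags Key=Resource Group,Value=arn:aws:iam::XXXXX:user/user' \
  > "$workdir/commands.txt"

result=$(awk -v tags="$workdir/tags.txt" -v arns="$workdir/arns.txt" '
  BEGIN { FS = OFS = "=" }                  # last field = text after the final "="
  FILENAME == tags { tag[FNR] = $0; next }  # replacement tag, by line number
  FILENAME == arns { str[$0] = FNR; next }  # search string -> its line number
  {
    if ($NF in str) $NF = tag[str[$NF]]     # exact match on the last field only
    print
  }
' "$workdir/tags.txt" "$workdir/arns.txt" "$workdir/commands.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Since the lookup is an exact array-key match rather than a regular expression, regex-active characters in the search strings need no escaping, but trailing whitespace in either file will still prevent a match.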

glenn jackman