Search and replace occurrences of pattern with sequential variable

Question

I have a requirement to update a variable across multiple files/sub-directories. The variable to be replaced starts with the same 6 characters, everything after this is random. I will use these first 6 characters as the pattern to find/replace on. I will replace the random characters following with a sequential variable.

I'm not sure what utility is best to achieve this but I imagine sed in some kind of loop? I'm struggling to visualise how best to achieve this. I imagine it could be done with something like;

#!/bin/bash  
i=0  
grep -r '/parent/sub/' -e 'pattern' | while read line  
do  
sed 's/pattern*/pattern$i/g'  
((i++))  
done

My first issue is that I don't know if sed can be used this way. Secondly, as it's nested in the loop, how can I feed it the required lines from the grep command (or is there a better method than grep to be used here)?

Thanks

I think the first thing you would need to identify is how deeply nested your directory structure is. Everything else can probably be done with sed or awk - however, note that the search pattern you indicated is that of "wildcards" (globs in the shell language), whereas for sed you will need a "regular expression". Also, be careful that in your example, you would be replacing the pattern with literal $i as variable expansion is disabled inside single quotes. — AdminBee, Jul 09 '20 at 10:01
yes, please [edit] your question and show us an example of your input files, the output you would desire from that file. How do you decide where the "random characters" end? Can the pattern be found more than once on the same line? More than once in the same file? — terdon, Jul 09 '20 at 10:03
Never use the word pattern in that context as it's ambiguous, always use string or regexp, whichever one you mean. — Ed Morton, Jul 10 '20 at 01:14
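To illustrate the two points raised in the comments above (glob versus regular expression, and quoting), here is a minimal sketch. The sample line and the prefix "abcdef" are made up for the demonstration, and it assumes the random part ends at the next whitespace character.

```shell
#!/bin/bash
# Hypothetical sample line; "abcdef" stands in for the fixed 6-character prefix.
line='id=abcdefXyZ9 other'
i=7

# Single quotes: $i is not expanded, and "f*" means "zero or more f", not "anything".
single=$(printf '%s\n' "$line" | sed 's/abcdef*/abcdef$i/')

# Double quotes expand $i; [^[:space:]]* consumes the random suffix up to whitespace.
double=$(printf '%s\n' "$line" | sed "s/abcdef[^[:space:]]*/abcdef$i/")

printf '%s\n%s\n' "$single" "$double"
```

The first command leaves the random characters in place and inserts a literal $i. The second gives the intended replacement.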

Giuseppe Clemente · Accepted Answer · 2020-07-09T10:57:23.503

1

The following script does the job:

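#!/bin/bash
i=0
# List every file under . that contains "pattern", skipping this script itself.
grep -rl --exclude="${0:2}" . -e pattern | while read -r line
do
    # Replace "pattern" and the rest of that line with "pattern" plus the counter.
    sed -i 's/pattern\(.*\)/pattern'"$i"'/' "$line"
    ((i++))
done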

The while loop iterates over the names of all files in the current directory (and, recursively, its subdirectories) that contain "pattern". In each file, sed replaces 'pattern' and everything that follows it on the line, captured by the group \(.*\), with 'pattern' followed by the current value of the counter i.

Notes:

This also works if 'pattern' occurs on several different lines, but not if it occurs more than once on the same line, since everything that follows the first 'pattern' on that line is replaced.
sed -i edits the files in place. If you just want to check that it replaces everything correctly, remove -i and sed will print the result instead.
I added the option --exclude=${0:2} to the grep call because, when you search the current directory, grep would otherwise also match the script file itself, which contains the string 'pattern'. (${0:2} is the script name with its first two characters removed, so this assumes the script is run as ./name.)
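The same-line limitation in the first note can be seen with a short sketch. The sample line and the prefix "abcdef" are invented. The bounded variant is only one possible workaround and assumes the random part never contains whitespace.

```shell
#!/bin/bash
# Hypothetical line holding two variables that share the fixed prefix "abcdef".
line='a=abcdefQQ1 b=abcdefZZ2 end'

# Greedy .* swallows the rest of the line, including the second variable.
greedy=$(printf '%s\n' "$line" | sed 's/abcdef.*/abcdef0/')

# Assumption: the random part never contains whitespace, so stop there.
bounded=$(printf '%s\n' "$line" | sed 's/abcdef[^[:space:]]*/abcdef0/g')

printf '%s\n%s\n' "$greedy" "$bounded"
```

With the bounded expression and the g flag, both occurrences on the line are rewritten and the trailing text survives.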

edited Jul 09 '20 at 10:57

answered Jul 09 '20 at 10:12

Giuseppe Clemente

Please note that your approach may suffer from the same problem that leads to the discouragement of parsing the output of ls, in that the loop may choke on filenames with special characters (or even spaces, which are not so uncommon anymore) - grep doesn't have the equivalent of the -print0 option that GNU find has, as far as I remember ... – AdminBee Jul 09 '20 at 10:16
You are right, I missed the whitespace management, I will edit the answer. – Giuseppe Clemente Jul 09 '20 at 10:30
Thanks for this Giuseppe, I've marked this as the answer as after initial testing it's definitely on the right path for what I'm attempting to achieve. – Gtt Jul 09 '20 at 10:40
You are welcome! I noticed a funny loophole, however. If you search for the pattern in the current directory ., the script itself is also included in the loop, so everything after 'pattern' in it gets replaced!! If you type the above as commands in the shell it will work, but running it as a script would be a problem. I'm trying to figure out how to handle that. – Giuseppe Clemente Jul 09 '20 at 10:44
Ok, fixed that. – Giuseppe Clemente Jul 09 '20 at 10:57
That would strip leading/trailing white space from every file name. If you're going to use a while-read loop, you should always use while IFS= read -r unless you have a specific reason not to set IFS= or not to use -r. – Ed Morton Jul 10 '20 at 01:11
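For the file name concerns raised in these comments, one possible variant of the grep loop is sketched below. It assumes GNU grep (whose -Z option ends each file name with a NUL byte), GNU sed for -i, and bash for read -d ''. The scratch directory, the prefix "abcdef" and the whitespace boundary are all made up for the demonstration.

```shell
#!/bin/bash
# Scratch directory with a file name that has leading and trailing spaces.
tmp=$(mktemp -d)
printf 'x=abcdefR4nd0m\n' > "$tmp/ spaced name "
printf 'nothing here\n'   > "$tmp/plain"

i=0
# -Z (GNU grep) ends each file name with a NUL byte; read -d '' splits on NUL.
# IFS= keeps leading/trailing blanks, -r keeps backslashes literal.
while IFS= read -r -d '' file; do
    sed -i "s/abcdef[^[:space:]]*/abcdef$i/" "$file"
    i=$((i + 1))
done < <(grep -rlZ -e 'abcdef' "$tmp")

result=$(cat "$tmp/ spaced name ")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, so the final value of i is still available after done.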

score 1 · Answer 2 · answered Jul 10 '20 at 01:26

The tool to find files is named find, not grep. There is a big clue in the name of the tool :-). grep is to Globally find a Regular Expression within a file/stream and Print the result - it's the letters from the ed command g/re/p. The GNU guys really messed up by giving grep options to find files - hopefully they don't have any plans to have it take on the functions of sort, tr, sed, wc, etc. next!

Here's one way to do what you want robustly (making assumptions about when you want i incremented, what you mean by pattern, which characters pattern can contain, etc.):

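i=0
# NUL-delimited names from find survive spaces and newlines in file names.
# Assumption: the random part runs up to the next whitespace character.
while IFS= read -rd '' file; do
    grep -q 'pattern' "$file" &&
    sed -i 's/pattern[^[:space:]]*/pattern'"$i"'/g' "$file"
    ((i++))  # advances for every file visited, whether or not it matched
done < <(find '/parent/sub/' -type f -print0)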
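If the counter should instead advance only for files that actually contain the string, the loop could be arranged as in the sketch below. This is just one reading of the requirement. The scratch tree, the prefix "abcdef" and the whitespace boundary are invented, and GNU sed is assumed for -i.

```shell
#!/bin/bash
# Scratch tree: two files contain the prefix "abcdef", one does not.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf 'v=abcdefAAA\n' > "$tmp/a.txt"
printf 'no match\n'    > "$tmp/b.txt"
printf 'v=abcdefBBB\n' > "$tmp/sub/c.txt"

i=0
while IFS= read -r -d '' file; do
    # Only edit the file, and only advance the counter, when it matches.
    if grep -q 'abcdef' "$file"; then
        sed -i "s/abcdef[^[:space:]]*/abcdef$i/g" "$file"
        i=$((i + 1))
    fi
done < <(find "$tmp" -type f -print0)

# Which file receives which number depends on the order find returns them in.
summary=$(cat "$tmp/a.txt" "$tmp/b.txt" "$tmp/sub/c.txt" | sort)
printf '%s\n' "$summary"
rm -rf "$tmp"
```

Every occurrence inside one file still receives the same number. Numbering each occurrence separately would need a different tool or a per-line loop.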
