Duplicate and replace a pattern in a text file

Question

Let’s consider an input text file like this:

some text …
% BEGIN
blabla
foo bar
blabla
blabla
% END
some text …

and a foobar.txt file like this:

2 3
8 9 
1 2

What is the simplest way, using sed (or maybe awk), to obtain this output text file:

some text …
% BEGIN
blabla
2 3
blabla
blabla
% END
% BEGIN
blabla
8 9
blabla
blabla
% END
% BEGIN
blabla
1 2
blabla
blabla
% END
some text …

Does this help? https://unix.stackexchange.com/questions/73969/how-can-i-use-sed-or-awk-to-replace-placeholders-in-a-template-file-with-variabl#74132 — Stephen Rauch, May 13 '17 at 20:51
So, the input text file contains only one %BEGIN...%END block, and that block must be duplicated as many times as foobar.txt has values (with the replacement of foo bar changing for each copy)? And the text outside that %BEGIN...%END block must be left as-is? — ilkkachu, May 13 '17 at 22:13
@ilkkachu My understanding is exactly the same. It looks like homework :), but it is interesting. – MiniMax, May 14 '17 at 11:10

score 2 · Answer 1 · answered May 13 '17 at 23:38

Here's a pure awk way to do it, using getline:

awk '
  /% BEGIN/ {
    s = 1;
  }

  s == 1 {
    b = b == "" ? $0 : b ORS $0
  }

  /% END/ {
    while ((getline repl < "foobar.txt") > 0) {
      tmp = b;
      sub(/foo bar/, repl, tmp);
      print tmp;
    }
    b = "";
    s = 0;
    next;
  }

  s == 0 {
    print;
  }' input

With GNU awk, you can make the substitution without a temporary variable by using gensub:

gawk '
  /% BEGIN/ {
    s = 1;
  }

  s == 1 {
    b = b == "" ? $0 : b ORS $0
  }

  /% END/ {
    while ((getline repl < "foobar.txt") > 0) {
      print gensub(/foo bar/, repl, 1, b);
    }
    b = "";
    s = 0;
    next;
  }

  s == 0 {
    print;
  }' input

Testing:

$ gawk '
>   /% BEGIN/ {s = 1;}
>   s == 1 {b = b == "" ? $0 : b ORS $0}
>   /% END/ {while ((getline repl < "foobar.txt") > 0) {print gensub(/foo bar/, repl, 1, b);} s = 0; next;}
>   s == 0 {print}' input
some text …
% BEGIN
blabla
2 3
blabla
blabla
% END
% BEGIN
blabla
8 9 
blabla
blabla
% END
% BEGIN
blabla
1 2
blabla
blabla
% END
some text …

Good solution; I analyzed it to learn awk. The b = ""; in the /% END/ section is not needed in this script, because the b variable is not used again once the BEGIN to END block has been processed. Right? I removed it and the script works as expected. Or is there a reason to keep it? – MiniMax, May 14 '17 at 18:27
@MiniMax yes you're correct - in fact I didn't include that originally, then added it as a kind of "best practice" thing (to make the solution extensible to the case of more than a single % BEGIN . . . % END block) — steeldriver, May 14 '17 at 18:33
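
The extension to several % BEGIN ... % END blocks mentioned in this comment needs one more step besides resetting b: once the getline loop has read foobar.txt to its end, a later loop gets nothing unless the file is closed first. A minimal sketch under that assumption; the two-block sample input, the temporary directory, and the awk variable names list and copy are made up for illustration:

```shell
# Hypothetical sample data: two blocks instead of the single block in the question.
tmp=$(mktemp -d)
printf '%s\n' 'head' '% BEGIN' 'foo bar' '% END' 'middle' \
              '% BEGIN' 'x foo bar' '% END' 'tail' > "$tmp/input"
printf '%s\n' '2 3' '8 9' > "$tmp/foobar.txt"

result=$(awk -v list="$tmp/foobar.txt" '
  /% BEGIN/ { s = 1 }
  s == 1    { b = (b == "" ? $0 : b ORS $0) }   # collect the block line by line
  /% END/ {
    while ((getline repl < list) > 0) {
      copy = b
      sub(/foo bar/, repl, copy)
      print copy
    }
    close(list)   # rewind: the next block reads the list from the top again
    b = ""        # start the next block with an empty buffer
    s = 0
    next
  }
  s == 0 { print }
' "$tmp/input")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Without the close call, only the first block would be duplicated and the second one would disappear from the output, because its getline loop would find the list already exhausted.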

score 1 · Answer 2 · answered May 14 '17 at 12:59

perl -nMFatal=open -e '$l = $_;
   @ARGV and open my $fh, "<", $ARGV[0];
   print +(/^%\hBEGIN/ ? $a=0 : $a++) == 1 ? $l : $_ while <$fh>;
' foobar.txt input.txt

Working

For every line read from the foobar.txt file, we open a lexical filehandle $fh on input.txt. The filehandle is lexical so that it is closed automatically when the next line of foobar.txt is read in. As a consequence, the whole of input.txt, including the lines outside the block, is printed once per line of foobar.txt.
We reset the counter $a to 0 when we see the % BEGIN line in input.txt. On the second line after it (when the counter equals 1), we print the line from foobar.txt instead of the line from input.txt.
The order of arguments is foobar.txt, then input.txt.
The -MFatal=open option loads the Fatal module, which makes a failed open die with an error message instead of failing silently.
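
The counter idea does not depend on Perl. As a rough illustration, here is the same logic in awk for a single replacement value. The names n, seen, and repl are made up, and the seen flag is an addition so that lines before % BEGIN are never counted:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'some text' '% BEGIN' 'blabla' 'foo bar' 'blabla' '% END' 'some text' \
  > "$tmp/input.txt"

result=$(awk -v repl='2 3' '
  /^% BEGIN/       { n = 0; seen = 1; print; next }  # reset the counter at the block start
  seen && ++n == 2 { print repl; next }              # second line after % BEGIN: print the replacement
                   { print }                         # every other line passes through unchanged
' "$tmp/input.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```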

Results

some text --
% BEGIN
blabla
2 3
blabla
blabla
% END
some text --
some text --
% BEGIN
blabla
8 9
blabla
blabla
% END
some text --
some text --
% BEGIN
blabla
1 2
blabla
blabla
% END
some text --

John1024 · Answer 3 · 2017-05-13T21:19:30.347

Try this:

while read line; do awk -v f="$line" '{gsub(/foo bar/, f)} 1' input; done <foobar.txt

This reads line by line from foobar.txt. For each line in foobar.txt, the file input is read and the line from foobar.txt is substituted in for each occurrence of foo bar.

How it works

while read line; do

This starts a while-loop that reads lines from foobar.txt.

awk -v f="$line" '{gsub(/foo bar/, f)} 1' input

This reads the file input and substitutes in $line everywhere that foo bar occurs.

In more detail:
- -v f="$line"
  
  This creates an awk variable f whose value is the contents of shell variable line.
- gsub(/foo bar/, f)
  
  For each line that awk reads in, this looks for occurrences of the regex foo bar and substitutes in the value of f.
- 1
  
  This is awk's shorthand for print-the-line.

The reason for using awk here, rather than sed, is that awk has better handling for capturing the value of shell variables.

done <foobar.txt

This signals the end of the while-loop and tells the loop to use the file foobar.txt as its standard input.
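
One caveat, stated here as a general awk property rather than something from the answer: -v processes backslash escapes in the assigned value, and gsub treats & in the replacement as the matched text. The sketch below is a variant, not the answer's method. It passes the line through the environment (ENVIRON) and splices it in with index and substr, so the text is inserted literally. It replaces only the first occurrence per line, and the sample replacement line is invented:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'blabla' 'foo bar' 'blabla' > "$tmp/input"
printf '%s\n' 'a & b\n' > "$tmp/foobar.txt"   # contains a literal ampersand and a literal backslash-n

result=$(while IFS= read -r line; do           # IFS= and -r keep the line exactly as written
  f="$line" awk '
    (i = index($0, "foo bar")) {
      # Rebuild the line around the match; ENVIRON["f"] is used verbatim.
      $0 = substr($0, 1, i - 1) ENVIRON["f"] substr($0, i + length("foo bar"))
    }
    1
  ' "$tmp/input"
done < "$tmp/foobar.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```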

Multi-line version

For those who like their commands spread out over multiple lines:

while read line
do
    awk -v f="$line" '{gsub(/foo bar/, f)} 1' input
done <foobar.txt

Nice solution. I prefer the one with sed but thanks a lot for the contribution. — Jean-Pierre, May 14 '17 at 22:59

RomanPerekhrest · Accepted Answer · 2017-05-13T21:28:23.257

0

Complex bash + sed solution:

foobar_replacer.sh script:

#!/bin/bash
head -n1 "$2"  # print the first line

while read -r line
do
    sed '1d;$d;{s/^foo bar$/'"$line"'/g}' "$2"
done < "$1"

tail -n1 "$2" # print the last line

Usage:

bash foobar_replacer.sh foobar.txt input.txt

The output:

some text …
% BEGIN
blabla
2 3
blabla
blabla
% END
% BEGIN
blabla
8 9
blabla
blabla
% END
% BEGIN
blabla
1 2
blabla
blabla
% END
some text …

sed command details:

1d;$d; - drop the first and the last line of input.txt from sed's output (they are printed once by head and tail)

s/^foo bar$/'"$line"'/g - replace a line consisting exactly of foo bar with the current item $line from foobar.txt
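
Because $line is pasted into the sed program text, a value containing /, & or a backslash would change the command. A common workaround, sketched here with made-up sample data and a hypothetical variable named escaped, is to escape those characters first:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'blabla' 'foo bar' 'blabla' > "$tmp/input.txt"
printf '%s\n' '1/2 & 3' > "$tmp/foobar.txt"   # a value that would break the unescaped command

result=$(while IFS= read -r line; do
  # Prefix /, & and \ with a backslash so sed treats them as plain text.
  escaped=$(printf '%s\n' "$line" | sed 's|[/&\\]|\\&|g')
  sed 's/^foo bar$/'"$escaped"'/' "$tmp/input.txt"
done < "$tmp/foobar.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```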

edited May 13 '17 at 21:28

answered May 13 '17 at 21:22


@Jean-Pierre Really? :) Have you tested it? Try adding a couple of lines to the input file after the some text line, such as: first line one, second line two, etc. Even without running the script, I can see that it does not fulfil your task. – MiniMax May 15 '17 at 12:47
It does not check where the BEGIN ... END block is located, though it should. – MiniMax May 15 '17 at 12:54
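
The location check asked for in these comments could be done by looking up the block's line numbers instead of assuming the block spans everything between the first and the last line. A sketch under the assumption that the input contains exactly one block; the sample input with extra leading lines follows the suggestion in the comment above, and the variable names are made up:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'first line' 'second line' '% BEGIN' 'blabla' 'foo bar' '% END' 'last line' \
  > "$tmp/input.txt"
printf '%s\n' '2 3' '8 9' > "$tmp/foobar.txt"

# Line numbers of the first % BEGIN and the first % END.
start=$(awk '/^% BEGIN$/ { print NR; exit }' "$tmp/input.txt")
end=$(awk '/^% END$/ { print NR; exit }' "$tmp/input.txt")

result=$(
  head -n "$((start - 1))" "$tmp/input.txt"        # everything before the block
  while IFS= read -r line; do
    # Print only the block, with the placeholder line replaced.
    sed -n "${start},${end}"'{s/^foo bar$/'"$line"'/;p;}' "$tmp/input.txt"
  done < "$tmp/foobar.txt"
  tail -n "+$((end + 1))" "$tmp/input.txt"         # everything after the block
)

printf '%s\n' "$result"
rm -rf "$tmp"
```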

MiniMax · Answer 5 · 2017-05-14T22:07:11.993

A bash script that uses sed. Usage: ./search_and_replace.sh < input.txt; the result is written to a new output.txt file.

#!/bin/bash

begin_str="% BEGIN"
end_str="% END"
pattern="foo bar"
write_to_var_flag=0
output_file=output.txt
foobar_file=foobar.txt
begin_to_end_block_var=""

# truncate the output file if it exists, otherwise create it
> "$output_file"

function read_foobar_file () {
    while read -r line; do
        echo -ne "$begin_to_end_block_var" | sed "s/$pattern/$line/" >> "$output_file"
    done < "$foobar_file"
}

while read -r line; do
    if  [ "$line" == "$begin_str" ]; then
        write_to_var_flag=1
    fi

    if (( $write_to_var_flag )); then
        begin_to_end_block_var+="$line\n"
    else
        echo "$line" >> "$output_file"
    fi

    if [ "$line" == "$end_str" ]; then
        read_foobar_file 
        write_to_var_flag=0
    fi
done

score 0 · Answer 6 · answered May 15 '17 at 04:22

sed -e '
   1{
      :loop
         N
      /\n\.$/!bloop
      s///;h
      N;s/.*\n//
   }

   G
   y/\n_/_\n/
   s/^\([^_]*\)_\(.*_% BEGIN_[^_]*_\)[^_]*/\2\1/
   y/\n_/_\n/
' input.txt foobar.txt

Working

In this method, the order of arguments is input.txt, then foobar.txt.
Since POSIX sed cannot tell where one file ends and the next one begins, we need either an end-of-file marker, say a line containing only ., at the end of input.txt, or some property of the data in the two files that tells us which file we're in. Here I chose the first method, so the script expects input.txt to end with a line consisting of a single dot.
We first store the whole of input.txt, minus the marker line, in the hold space.
Then, for every line read from foobar.txt, we append the hold space to it and replace the 2nd line after the % BEGIN line in the pattern space with the first line. Note: the pattern space is then multiline, of the form ...\n...\n...\n..., and the two y commands temporarily swap newlines and underscores so that [^_]* matches exactly one line.
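
Since sed reads its file operands as one continuous stream, one way to supply the marker without editing input.txt is to pass a one-line file between the two arguments. A sketch with made-up sample data; marker.txt is a hypothetical helper file, and the empty regex in s/// is assumed to reuse the previous regex, as GNU sed does:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'some text' '% BEGIN' 'blabla' 'foo bar' 'blabla' '% END' 'some text' \
  > "$tmp/input.txt"
printf '%s\n' '2 3' '8 9' > "$tmp/foobar.txt"
printf '.\n' > "$tmp/marker.txt"   # hypothetical helper file holding the end-of-file marker

result=$(sed -e '
   1{
      :loop
         N
      /\n\.$/!bloop
      s///;h
      N;s/.*\n//
   }

   G
   y/\n_/_\n/
   s/^\([^_]*\)_\(.*_% BEGIN_[^_]*_\)[^_]*/\2\1/
   y/\n_/_\n/
' "$tmp/input.txt" "$tmp/marker.txt" "$tmp/foobar.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

As with the script above, the whole of input.txt is printed once per line of foobar.txt, so the lines outside the block are repeated as well.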
