4

Using BSD sed, how can I perform the following substitution?

Before:

hello hello hello
hello hello hello

After:

hello world hello
hello hello hello

In other words, how can I replace only the Nth occurrence of a pattern?
(Or, in this case, the 2nd occurrence of a pattern?)

voices

2 Answers

1

With any POSIX sed:

$ sed -e'/hello/{' -e:1 -e'$!N;s/hello/world/2;t2' -eb1 -e\} -e:2 -en\;b2 <file
hello world hello
hello hello hello
  • After the first match of /hello/, we enter a loop.

  • Inside loop :1, we append each Next line to the pattern space and run the substitute command on the 2nd occurrence only. We then test whether the substitution succeeded. If it did, we branch to loop :2; otherwise we repeat the loop with b1.

  • Inside loop :2, we just print the remaining lines until the end of the file.

Note that this approach holds everything between the first and the second hello in the pattern space. That can be a problem with huge files when the two matches are far apart.
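As far as I can tell, the one-liner above also depends on the first input line containing hello and on the input holding at least two matches: a non-matching first line falls straight into loop :2, and with fewer than two matches b1 keeps looping on the last line. Here is a minimal sketch of the same loop with those two cases guarded, assuming the same pattern and replacement. The function name replace_second, the `$!b 1` guard, and the bare `b` after the block are my additions, not part of the original command.

```shell
# Hypothetical helper: replace only the 2nd "hello" in the whole input with "world".
# Inside the /hello/ block:
#   :1     top of the collecting loop
#   $!N    append the next line unless we are already on the last one
#   s///2  try to replace the 2nd match in the pattern space
#   t 2    on success, jump to the print-the-rest loop
#   $!b 1  otherwise keep collecting, but stop at the last line
# After the block:
#   b      lines before the first match end the cycle normally
#   :2 n b 2   print the remaining lines unchanged
replace_second() {
  sed -e '/hello/{' \
      -e ':1' \
      -e '$!N' \
      -e 's/hello/world/2' \
      -e 't 2' \
      -e '$!b 1' \
      -e '}' \
      -e 'b' \
      -e ':2' \
      -e 'n' \
      -e 'b 2'
}

printf 'hello hello hello\nhello hello hello\n' | replace_second
```

Each command sits in its own -e so that no label has to share a line with another command. The memory caveat above still applies, since the loop keeps appending lines until the second match shows up.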

cuonglm
-1

It can be easier if you use two seds. Many things are, and on multicore systems, at least, they are often faster that way as well.

:    infile =;<<"" \
sed -e's/$/ /;s/hello/&\n\n/g' -e'# marks lines with " $" and splits matches' |
sed -e:n   -e's/ $//;t'  -eG   -e'# sets up a test label, branches for " $"'  \
    -e's/o\n\{20\}$/o world/'  -e'# stacks a byte per match, edits nth match' \
    -e'x;N;x;N;s/\n\n*//;tn'   -e'# completes the stacking; recycles to top'  \
>outfile
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello

hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello world hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello

(With a BSD sed, you'll want a literal newline in place of the n in each \n escape in the right-hand side of the substitution.)
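For illustration, here is a sketch of just the first stage written that way, with a backslash followed by a real line break wherever the original has \n. The function name mark_and_split is made up for this example; the two expressions are otherwise the ones from the first sed above.

```shell
# Hypothetical helper: first stage of the pipeline, using literal newlines.
# 1st expression: mark the end of every input line with a trailing space.
# 2nd expression: follow every "hello" with two newlines; each backslash
# followed by a real line break inserts one newline into the output.
mark_and_split() {
  sed -e 's/$/ /' -e 's/hello/&\
\
/g'
}

printf 'hello hello\n' | mark_and_split
```

A backslash-newline in the replacement is the POSIX spelling, so this form should behave the same with GNU sed as well.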

It is usually easier to adapt the stream than to adapt the stream editor. The above sequence does just that: it marks each whole line in input with a trailing space, but otherwise splits output lines for each occurrence of hello. The second sed then needs only to look for a line which does not end in space to know that it should increment its stack count, and then only to explicitly match the 20th.

Of course, it doesn't have to be that strict. You could drop the leading o before \n\{20\}$ and leave it off the replacement. That would replace only from the 20th match through to the last in input. Or else you could do \n\{20,25\} to handle only a range of matches. Or even \n\{20,25\}\(\n\{15\}\)*$ to handle matches 20 through 25 and then the same window again every 15 matches thereafter (35 through 40, 50 through 55, and so on).

Here's an output sample for that last expression, given the same input:


hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello world hello world hello world hello world hello world hello world hello hello
hello hello hello hello hello hello hello hello world hello world
hello world hello world hello world hello world hello hello hello hello hello
hello hello hello hello hello world hello world hello world hello world hello world
hello world hello hello hello hello hello hello hello hello
mikeserv