BSD sed: Replace only the Nth occurrence of a pattern

Question

Using BSD `sed`, how can I perform the following substitution?

Before:

hello hello hello
hello hello hello

After:

hello world hello
hello hello hello

In other words, how can I replace only the Nth occurrence of a pattern?
(Or in this case, the 2nd occurrence of a pattern?)

I played with this for a bit. Conclusion: You're much better off using GNU sed or Perl for this; there's no general way to do this reliably in BSD sed. Even for a fixed string it's hard; for a complex regex it's probably impossible. The following doesn't work: sed 's/^\(hello.*\)hello/\1world/' Can you see why? — Wildcard, Jan 13 '16 at 23:27
Unfortunately it's not always available. Also, I can't see why that would work, although you could just be using some RegEx tricks that I don't yet understand. However, I think you'll want the -E flag, i.e. sed -E. — voices, Jan 14 '16 at 00:35
If it's just a fixed string hello hello hello then you could just do `s/hello hello/hello world/`. — Wildcard, Jan 14 '16 at 02:45
Related: Using sed to replace nth occurrence of a word, How to add text before the Nth occurrence of a text using sed only?, sed or awk: replace only the n-th occurrence of a string, How to delete the n-th word from standard input?, and Print everything after nth delimiter — G-Man Says 'Reinstate Monica', Sep 06 '22 at 00:55

cuonglm · Answer 1 · 2016-01-15T04:16:31.740

With any POSIX sed:

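$ sed -e'/hello/{' -e:1 -e'$!N;s/hello/world/2;t2' -e'$!b1' -e\} -eb -e:2 -en\;b2 <file
hello world hello
hello hello hello

After the first line matching /hello/, we enter a loop.
Inside loop :1, we append each Next line to the pattern space and run the substitute command for the 2nd occurrence only. The t command tests whether the substitution succeeded. If it did, we branch to loop :2; otherwise we repeat the loop with b1. The $! address on that branch stops the loop at the last line, so input with fewer than two matches ends instead of looping forever, and the bare b after the closing brace ends the cycle for lines before the first match so that they don't fall into loop :2.
Inside loop :2, we just print the remaining lines until the end of the file.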

Note that this approach stores everything between the first hello and the replaced one in the pattern space. That will be a problem with huge files when the first and second matches are far apart.
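
For what it's worth, here is a rough sketch of the same loop wrapped in a shell function, so that N, the pattern and the replacement become arguments. The name replace_nth and its argument order are made up for illustration, not part of any sed. The pattern and replacement are pasted straight into the sed script, so this assumes they contain no `/` and no newline. The back-branch is guarded so that input with fewer than N matches ends normally, and lines before the first match are simply printed.

```shell
# Hypothetical helper: replace_nth N PATTERN REPLACEMENT  (filters stdin to stdout)
# Assumes PATTERN and REPLACEMENT contain no "/" and no newline.
replace_nth() {
    # /PATTERN/!b  lines before the first match: print and start the next cycle
    # :1           top of the accumulate loop
    # s/.../.../N  try to replace the Nth match in the pattern space
    # t2           on success, jump to the print-the-rest loop
    # $!{N;b1}     otherwise append the next line and retry, unless at the last line
    # b            last line reached without N matches: print and finish
    # :2 n b2      copy the remaining lines through unchanged
    sed -e "/$2/"'!b' \
        -e ':1' \
        -e "s/$2/$3/$1" \
        -e 't2' \
        -e '$!{' \
        -e 'N' \
        -e 'b1' \
        -e '}' \
        -e 'b' \
        -e ':2' \
        -e 'n' \
        -e 'b2'
}
```

For example, `printf 'hello hello hello\nhello hello hello\n' | replace_nth 2 hello world` should reproduce the After text from the question. Each command sits in its own -e so that labels and braces end where BSD sed expects them to.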

score -1 · Answer 2 · answered Jan 14 '16 at 23:21

It can be easier if you use two seds. Many things are, and on multicore systems, at least, a pipeline of two is often faster as well.

:    infile =;<<"" \
sed -e's/$/ /;s/hello/&\n\n/g' -e'# marks lines with " $" and splits matches' |
sed -e:n   -e's/ $//;t'  -eG   -e'# sets up a test label, branches for " $"'  \
    -e's/o\n\{20\}$/o world/'  -e'# stacks a byte per match, edits nth match' \
    -e'x;N;x;N;s/\n\n*//;tn'   -e'# completes the stacking; recycles to top'  \
>outfile
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello

hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello world hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello

(With a BSD sed you'll want a backslash followed by a literal newline in place of each \n escape in the right-hand side of the substitution.)

It is usually easier to adapt the stream than to adapt the stream editor. The above sequence does just that: it marks each whole line in input with a trailing space, but otherwise splits output lines for each occurrence of hello. The second sed then needs only to look for a line which does not end in space to know that it should increment its stack count, and then only to explicitly match the 20th.
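
To see what the second sed actually receives, here is a small sketch of the first stage on its own. The function name mark_and_split is hypothetical. The right-hand side uses a backslash followed by a literal newline rather than \n, on the assumption that this form behaves the same in BSD and GNU sed.

```shell
# Hypothetical helper: the first stage of the pipeline, isolated.
mark_and_split() {
    # 1st command: mark the end of every input line with a trailing space
    # 2nd command: follow every "hello" with two newlines, which puts each
    #              match at the end of an output line, followed by an empty line
    sed -e 's/$/ /' -e 's/hello/&\
\
/g'
}
```

Fed the single line `hello hello`, this should emit five lines: `hello`, an empty line, ` hello`, another empty line, and finally a lone space, which is the end-of-line marker the second sed tests for.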

Of course it doesn't have to be that strict. You could drop the leading o before \n\{20\}$ and leave it off the replacement. That would replace every match from the 20th through to the last in input. Or else you could do \n\{20,25\} to handle only a range of matches. Or even: \n\{20,25\}\(\n\{15\}\)*$ to handle matches 20 through 25 and the same six-match window every 15 matches thereafter (35 through 40, 50 through 55, and so on).

Here's an output sample for that last expression, given the same input:

hello hello hello hello hello hello hello hello hello
hello hello hello hello hello hello hello hello hello
hello hello world hello world hello world hello world hello world hello world hello hello
hello hello hello hello hello hello hello hello world hello world
hello world hello world hello world hello world hello hello hello hello hello
hello hello hello hello hello world hello world hello world hello world hello world
hello world hello hello hello hello hello hello hello hello
