How to add a line after nth occurrence of a keyword using sed?

Question

Using sed, I want to add a line containing check after the nth occurrence of a keyword.

Input:

DCR
DCR
DCR

Output:

DCR
DCR
check
DCR

Is it possible using sed?

sed is Turing complete, so it is possible. But something else, like awk or perl might be more suited to this task. — muru, Apr 21 '15 at 15:05
There is a duplicate here: http://unix.stackexchange.com/questions/88038/print-matching-line-and-nth-line-from-the-matched-line/88051#88051 — Valentin Bajrami, Apr 21 '15 at 15:07
@val0x00ff it's very similar but I don't think it's a duplicate — Chris Davies, Apr 21 '15 at 16:04

score 5 · Answer 1 · answered Apr 21 '15 at 15:07

With GNU sed, you can replace the nth occurrence of a pattern within a line:

$ echo "foofoofoofoo" | sed 's/foo/&\nbar/2'
foofoo
barfoofoo

But for the nth line that contains the pattern, awk is easier:

awk -v n=2 -v patt=foo '{print} $0 ~ patt && ++count == n {print "bar"}' <<END
foo1
foo2
foo3
foo4
END

foo1
foo2
bar
foo3
foo4
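
For reuse, the same awk logic can be wrapped in a small shell function. This is only a sketch: the function name insert_after_nth and its argument order are made up for illustration. Note that awk -v interprets backslash escapes in the values it is given.

```shell
# Hypothetical helper: copy stdin to stdout, adding TEXT after the
# Nth line that matches PATTERN. Usage: insert_after_nth N PATTERN TEXT
insert_after_nth() {
    awk -v n="$1" -v patt="$2" -v text="$3" '
        { print }                                  # echo every input line
        $0 ~ patt && ++count == n { print text }   # nth matching line: add the text
    '
}

printf 'DCR\nDCR\nDCR\n' | insert_after_nth 2 DCR check
```

With the question's input this should print DCR, DCR, check, DCR.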

answered Apr 21 '15 at 15:07

glenn jackman

85,964

@glenn I want to add line after the line where nth occurrence of pattern occurs. – Menon Apr 22 '15 at 11:26
Be specific: after the nth line that contains the pattern? What happens if the pattern matches n times on the 1st line? – glenn jackman Apr 22 '15 at 13:41
@glenn yes, after the nth line that contains the pattern. – Menon Apr 23 '15 at 14:11
This awk command meets that requirement. – glenn jackman Apr 23 '15 at 14:31

Costas · Accepted Answer · 2015-04-21T19:08:56.400

4

With GNU sed:

sed -z 's/DCR/&\ncheck/2' <input >output
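
A small illustration of what -z changes, assuming GNU sed (the option is a GNU extension) and input without NUL bytes. The function name is hypothetical. Because the whole file is treated as one record, the 2 flag counts occurrences across lines rather than matching lines:

```shell
# GNU sed only: -z splits records on NUL, so an ordinary text file is a
# single record and the numeric flag 2 counts occurrences in the whole file.
count_across_lines() {   # hypothetical name, for illustration only
    sed -z 's/DCR/&\ncheck/2'
}

# Two occurrences on the first line: check lands after that line.
printf 'DCR DCR\nDCR\n' | count_across_lines
```

One consequence: if the 2nd occurrence sits in the middle of a line, the \n in the replacement splits that line at that point.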

For older versions:

sed '/DCR/{p;s/.*/1/;H;g;/^\(\n1\)\{2\}$/s//check/p;d}' <input >output
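
Here is the same one-liner spread over several lines with comments, as a reading aid. The wrapper name insert_check is made up, and the comments are an interpretation of the script rather than the author's own notes. If your sed rejects comments inside a block, drop them.

```shell
# The one-liner above, one command per line; behaviour is meant to be identical.
insert_check() {   # hypothetical wrapper name
    sed '/DCR/{
        # print the matching line straight away
        p
        # turn the pattern space into a single tally mark
        s/.*/1/
        # append the mark to the hold space, then copy the whole tally back
        H
        g
        # exactly two marks so far: replace the tally with "check" and print it
        /^\(\n1\)\{2\}$/s//check/p
        # delete the tally so it is never auto-printed
        d
    }'
}

printf 'DCR\nDCR\nDCR\n' | insert_check
```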

If DCR can occur more than once in a line:

sed '
/DCR/{p
      x                               # test whether the pattern has already been seen
      /^\(\n\a\)\{2\}/!{              #+the appropriate number of times and, if so, skip
        x                             #+the rest of the commands
        s/DCR/\a/g                    # replace each DCR with the \a character
        s/^[^\a]*\|[^\a]*$//g         # delete everything before and after them
        s/[^\a]\+/\n/g                # replace everything in between with \n
        H
        g
        /^\(\n\a\)\{2\}/s/.*/check/p} # add check once the pattern has been seen twice
      d}' <input >output

edited Apr 21 '15 at 19:08

answered Apr 21 '15 at 15:43

Costas

14,916

The first one is OK (+1 for using -z); the second one works OK only if there's one pattern per line (try it with a file where the first two occurrences of pattern are on the same line and the third is on another line). It is unclear though, whether the op wants to count lines matching pattern or just patterns... – don_crissti Apr 21 '15 at 17:46
@don_crissti That is rather different from what the OP asked, but if you want it: '/DCR/{p;s/DCR/\a/g;H;g;s/\n\?[^\n\a]*/\n/g;/^\(\n\a\)\{2\}\n\?$/s/.*/check/p;d}' where \a is the \x07 character (it can be any character that is sure not to occur in the text). – Costas Apr 21 '15 at 18:39
@Costas I am getting the error below for the second command: sed: command garbled: /DCR/{p;s/.*/1/;H;g;/^\(\n1\)\{2\}$/s//check/p;d} – Menon Apr 22 '15 at 11:26
@Menon Try adding ; after the last d. What version of sed do you use? (sed --version) – Costas Apr 22 '15 at 11:36
@Costas Now there is no error, but there is no change in the output. The --version option does not work in the Unix environment I use, but it is a pretty old version. – Menon Apr 22 '15 at 11:48
@Menon Try separating the commands in the script with newlines instead of ;. Another solution is to use a script file, if your version supports the -f option. In any case, try man sed. – Costas Apr 22 '15 at 11:53
"sed: command garbled" is at least a Solaris error message. GNU sed doesn't write any errors like that. – mikeserv Jun 29 '15 at 07:09

mikeserv · Answer 3 · 2015-06-29T19:01:04.847

You can do this with sed on a stack...

sed '/match$/N
     s/\n/&INSERT&/3;t
     $n;N;P;D'

That would insert INSERT following every 3rd non-sequential occurrence of match in input. It is the most efficient way I know to do it with sed because it does not attempt to store all lines that occur between different matches, nor does it necessitate buffer swaps or back-ref comparisons, but instead simply increments sed's only means of counting at all - its line-number via its line-cycle.

There is some added overhead, of course - with each match pattern space gets a little bigger - but it is still the same stream, and there is no back-tracking. It's just first-in,first-out - which, as I think, is a method very well suited to sed. In fact, rather than going back to check for a match, sed can advance further ahead for each match. I'm a little proud of it, and don't know why I never thought of it before.

The version above, though, would squeeze repeats to some extent, because it only works one line behind input. The solution is to advance still further, which requires only a little additional complexity: a branch to a :label that forms a short-circuit loop inside the N;P;D loop to keep it current.

It works like this:

seq 100000| sed -ne':n
            s/\n/&\tCheck&\t/5p;t
            N;/0000$/bn'  -eD

...which, for me, prints...

You see, in order to maintain the count, it increments its line-buffer for each occurrence of match and tacks another line onto its sliding window on pattern space. In that way all that is needed to verify that the match has been found is to attempt to substitute away the s///nth \newline character in pattern space. If it can be done, we've encountered n matches so far, and test can branch us out of the current iteration and clear the increment entirely.

In the example above the buffer is incremented once for every pattern-space which ends with the string 0000. When 5 of those are found, sed prints the current pattern-space - and its whole buffer - and clears the counter.
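
The counting trick described above rests on one small building block: once several lines sit in pattern space, s/\n/.../N addresses the Nth embedded newline. A stripped-down sketch of just that step follows. The function name is hypothetical, and this is not the full sliding-window script.

```shell
# Hypothetical demo: pull three lines into pattern space with N;N,
# then wrap the 2nd embedded newline in the marker text.
mark_second_newline() {
    sed 'N;N;s/\n/&INSERT&/2'
}

printf 'a\nb\nc\n' | mark_second_newline
```

This should print a, b, INSERT, c. If pattern space held fewer than two newlines the substitution would simply fail, which is what lets t serve as the "have we seen n yet?" test.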

For your thing:

printf DCR\\n| tee - - - - - |
sed -e:n -e's/\n/&\tCheck&\t/2;t
     $n;    N;/DCR$/bn' -eP\;D

DCR
DCR
    Check
    DCR
DCR
DCR
    Check
    DCR

Now, if you wanted to mark only the nth occurrence, it's also easy:

printf DCR\\n        |
tee - - - - - - - - -|
sed -e:n -e's/\n/&\tCheck&\t/3;te
     $n;  N;/DCR$/bn' -e'P;D;:e
     n;be'

...if you really look at it, it might occur to you that we only barely scratched the surface here...

DCR
DCR
DCR
    Check
    DCR
DCR
DCR
DCR
DCR
DCR
DCR

Does this work with POSIX sed? The spec says t with no label branches to the end of the script. I tried it with the three seds from the Heirloom Toolchest, and they worked. – cuonglm Jun 29 '15 at 13:34
@cuonglm - yes, it works. I tested it with those as well. But maybe you've misunderstood the statement about branching to the end of the script. That is what it does, of course. The end of the script is not the end of the file: for each line-cycle sed reads its script all the way through (well, normally). It can be done otherwise, like I did in that !! answer where the script is read from start to finish in tandem with the infile. Anyway, when you branch to the end of the script you branch out to the next line-cycle to try the script again. – mikeserv Jun 29 '15 at 14:45
Well, I really did misunderstand the spec, but not in the way you thought. I thought the end of the script meant the end of the -e part associated with the t command. My bad! – cuonglm Jun 29 '15 at 15:50
@cuonglm - oh yeah, that makes sense. But sed concatenates all of its scripts into a single one before ever getting started, so by the time it starts executing there is only ever the one. – mikeserv Jun 29 '15 at 16:13
@cuonglm - you didn't like this answer or something? I only found out I could do this yesterday. This is a cool answer. – mikeserv Jun 29 '15 at 18:04
No, I really like this answer; I was just trying to figure out all of the parts of it myself. Thinking the sed way is cool, and sometimes it's hard for me. I always learn something new from your sed answers. – cuonglm Jun 29 '15 at 18:08
@cuonglm - well, I always do too. That's why I do them. It's why I do any of it. – mikeserv Jun 29 '15 at 18:37
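
The point discussed in these comments can be checked with a tiny experiment (the function name is made up for illustration): a bare t in the first -e fragment also skips commands that come from the second -e fragment, because the fragments are joined into one script.

```shell
# Hypothetical demo: t without a label jumps to the end of the whole
# script, not merely to the end of the -e fragment it appears in.
branch_demo() {
    sed -e 's/a/A/;t' -e 's/$/!/'
}

printf 'a\nb\n' | branch_demo
```

This should print A and then b!. On the first line the substitution succeeds, so t ends the cycle before the second fragment runs; on the second line nothing was substituted, so the ! is appended.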

score 2 · Answer 4 · answered Apr 21 '15 at 15:06

I don't have a direct answer in sed. In awk, on the other hand, it is easy:

echo -e "DCR\nDCR\nDCR" |\
awk 'BEGIN {t=0}; { print }; /DCR/ { t++; if ( t==2) { print "check" } }'

answered Apr 21 '15 at 15:06

steviethecat

226

I used this command in a bash script: awk 'BEGIN {t=0}; { print }; /DCR/ { t++; if ( t==2) { print "check" } }' file > newfile, but I am getting an error: awk: syntax error near line 1, awk: bailing out near line 1 – Menon Apr 22 '15 at 11:31

Thor · Answer 5 · 2015-04-21T20:26:41.743

GNU sed

sed is not well suited for this task, but of course you can still do it. Here is one way that saves a string that is n characters long in the hold space, and uses it to count the number of DCR occurrences:

n=2

((yes | head -n$n | tr -d \\n; echo); cat infile) | 
sed '
  1 {h;d}            # save counting string
  /DCR/ {            #
    x; s/.//; x      # n--
    T chk            # if n=0 goto "chk"
  }
  P;D 
  :chk               # insert check
  i\check
  :a; N; ba          # print rest of file
'
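
To see what the counting string looks like, the first stage of the pipeline can be run on its own. A small sketch (the variable name tally is made up here): the string is just n copies of y, and the sed script removes one character from it per DCR line.

```shell
n=3
# yes prints "y" lines forever; head keeps n of them; tr joins them into one string
tally=$(yes | head -n "$n" | tr -d '\n')
echo "$tally"
```

With n=3 this should print yyy.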

awk

As noted by glenn, awk is much cleaner. Here is a golfed version with similar countdown logic:

<infile awk '1; /DCR/ && !--n { print "check" }' n=2

score 0 · Answer 6 · answered Apr 21 '15 at 15:07

    sed '2 a\
    check
    ' file

This appends a line containing the word "check" after line 2 and prints the whole file to standard output. Note that the 2 here is a line number, not a count of matching lines.
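
If the line number is not known in advance, one possible way to keep this approach is to look it up first. The following is only a sketch under assumptions: the function name and argument order are invented, it reads the file twice, and it assumes a sed that accepts the one-line-per-text form of a\ shown above.

```shell
# Hypothetical helper; arguments: N PATTERN FILE. Makes two passes over FILE.
append_check_after_nth_match() {
    # grep -n prefixes each matching line with "LINENO:"; keep only the Nth number
    line=$(grep -n -- "$2" "$3" | sed -n "$1"'{s/:.*//p;q;}')
    # fewer than N matches: pass the file through unchanged
    [ -n "$line" ] || { cat -- "$3"; return; }
    # append "check" after that line number
    sed "$line"' a\
check' "$3"
}
```

For example, append_check_after_nth_match 2 DCR file would add check after the second line of file that contains DCR.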

answered Apr 21 '15 at 15:07

fd0

1,449

Roger Freeman · Answer 7 · 2015-06-29T04:20:57.890

-1

An awk solution is a lot easier to read for this kind of task. Here is just a correction to steviethecat's solution (the ; won't work for awk; it needs to be replaced with a newline):

echo -e "DCR\nDCR\nDCR" | awk 'BEGIN {t=0}
{ print }
/DCR/ { t++; if ( t==2) { print "check" } }'

edited Jun 29 '15 at 04:20

answered Jun 29 '15 at 00:46

Roger Freeman

31

Welcome to U&L.SE. Please explain why the correction is needed, and you may get an upvote. – eyoung100 Jun 29 '15 at 02:20
The one steviethecat posted has a problem: a user tried it and got an error. – Roger Freeman Jun 29 '15 at 04:19
Update your Answer by Clicking Edit... Don't tell us in a comment, i.e explain what the error the user got, and then tell us how that code fixes it. You've done half of that, but posting an untested code blob is discouraged. – eyoung100 Jun 29 '15 at 04:26
