5

I need to implement a 3-line sliding window with sed in order to change every occurrence of the following 3 grouped lines in a big text file:

Fax: 05.11.22.33.44<LF>
<LF>
<LF>

with this :

Fax: 05.11.22.33.44<LF>
###
<LF>

I tried to do that with the following command line (sed running in an MS-DOS batch file, but it doesn't work under my Linux bash either):

sed -i ":a;$!N;s/\nFax: \([ 0-9\.]*\n\n\);tenough;$!ba;:enough/\nFax: \1###\n/;$!ba;P;D" file.txt

What's wrong ?

Syl33
  • Does <LF> stand for line feed? That is, when fax blahblah is followed by two empty lines, add ### to the beginning of the first one of them? – don_crissti Jun 26 '17 at 22:24
  • Yes, "<LF>" stands for "\n" (I had already removed the "\r" with another sed command line and checked the result with Notepad++). And yes to the rest of your comment as well. – Syl33 Jun 26 '17 at 23:11
  • Would you please revise your question? It is not clear what you are asking for. What is the meaning of a 3-line sliding window? The folks who answered the question must be very experienced to have understood your aims! –  Sep 04 '18 at 16:52

3 Answers

3

You got the P;D part right. The rest is a failed attempt at pulling lines in the pattern space until a substitution is successful, which isn't necessarily a bad thing but definitely not a sliding window.
You should pull in one line when on the first line, then use an N;P;D cycle (that way you always have three lines in the pattern space) and attempt the substitution each time you pull in a new line:

sed '1N;$!N;s/\(PATTERN\n\)\(\n\)$/\1###\2/;P;D' infile
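
As a minimal sketch of that cycle applied to the question's data, PATTERN can be filled in with a Fax-number expression. The function name `fax_mark`, the sample input, and the exact character class are assumptions made for illustration, and the `\n` handling is what GNU sed provides:

```shell
# fax_mark is a hypothetical helper name; it reads text on stdin.
fax_mark() {
    # 1N   : on the first input line, append the second line
    # $!N  : unless already on the last line, append one more line,
    #        so the pattern space normally holds three lines
    # s    : if the window ends with a Fax line followed by two empty
    #        lines, put ### on the first empty line
    # P;D  : print the first line of the window, delete it, and restart
    #        the cycle without reading a new line
    sed '1N;$!N;s/\(Fax: [ 0-9.]*\n\)\(\n\)$/\1###\2/;P;D'
}

# Sample input (an assumption): a Fax line followed by two empty lines.
printf 'Tel: 05.11.22.33.40\nFax: 05.11.22.33.44\n\n\nEnd\n' | fax_mark
```

With GNU sed this should print the Tel line, the Fax line, a ### line, one empty line, and End. The single quotes matter in a POSIX shell: inside double quotes, `$!` would be expanded by the shell before sed sees it.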
don_crissti
  • I understand your point now. My solution pulls in 3 lines, checks them for a match, then pulls in the next three lines, and so on: not a sliding window at all. But there was another problem I was trying to overcome: I couldn't match three consecutive newlines with s/\n\n\n/\n###\n/. Now I have the right way: s/\n\n$/\n###\n/. The last \n does not match in literal form, because the final line in the pattern space has no trailing newline, so the $ anchor is needed. Thanks for the correction. – MiniMax Jun 27 '17 at 14:03
1

I think this is close to your original attempted implementation:

sed ':a; $q; N; s/\(Fax:.*\n\)\n$/\1###\n/; 3,${P;D}; ba'

Ex.

$ sed ':a; $q; N; s/\(Fax:.*\n\)\n$/\1###\n/; 3,${P;D}; ba' input > output
$ diff -y input output
Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44
Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44

Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44
                                                              | ###

Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44
Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44
                                                              | ###

Fax: 05.11.22.33.44                                             Fax: 05.11.22.33.44

The trick is the 3,${P;D}: that is what maintains the 3-line window, by popping one line off the pattern space each time round the loop, but only once the line count has reached 3.
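
One way to watch the window is to drop the substitution and list the pattern space with `l` instead of printing its first line. This is only a sketch for inspection: the helper name `show_windows` and the numbered sample input are assumptions, and the one-line `{l;D}` grouping is written in the GNU sed style used above.

```shell
# show_windows is a hypothetical helper name; it reads text on stdin.
show_windows() {
    # -n       : suppress automatic printing, so only the l output appears
    # :a; $q   : loop label; quit if the last input line is already
    #            in the pattern space
    # N        : append the next input line to the pattern space
    # 3,${l;D} : from input line 3 on, list the window (newlines shown
    #            as \n, end marked with $), then delete its first line;
    #            D restarts the cycle without reading new input
    # ba       : before line 3, loop back to keep filling the window
    sed -n ':a; $q; N; 3,${l;D}; ba'
}

printf '1\n2\n3\n4\n5\n' | show_windows
```

With GNU sed each listed window should hold exactly three lines: `1\n2\n3$`, then `2\n3\n4$`, then `3\n4\n5$`.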

steeldriver
-1

The solution from steeldriver has an advantage: it also works across five lines, and seven or more work as well. In my case, I find the matching line and then, for the two lines before it, the matched line itself, and the two lines after it, keep the first part of each line and replace the existing "List" value with an empty one.

The input is abridged; the original lines are each more than 2000 characters long:

Frame 64 (List 213 [(LM 0 0 836 216 112 0.681952 0.260603)])
Frame 65 (List 236 [(LM 0 0 836 216 112 0.680071 0.187739)])
Frame 66 (List 235 [(LM 0 0 836 216 112 0.678168 0.315848)])
Frame 67 (List 98 [(LM 149 129 1456 216 112 0.525970 11.970105)])
Frame 68 (List 217 [(LM 0 4 1084 216 112 0.837058 0.658243)])
Frame 69 (List 212 [(LM 0 0 1084 216 112 0.829624 0.339764)])
Frame 70 (List 218 [(LM 0 0 1084 216 112 0.829624 0.200893)])

The sed command matching the Frame 67 line (the one with out-of-whack values) is:

sed -re ":a; $q; N; s/(Frame .[0-9] ).*(Frame .[0-9] ).*(Frame .[0-9] ).*LM\ [0-9][0-9][0-9].*(Frame .[0-9] ).*(Frame .[0-9] ).*/\1(List 0 \[\]\)\n\2\(List 0 \[\]\)\n\3\(List 0 \[\]\)\n\4\(List 0 \[\]\)\n\5\(List 0 \[\]\)/; 5,${P;D}; ba" transform1.trf > transform2.trf

The sed command matches Frame 67; the output is:

Frame 64 (List 213 [(LM 0 0 836 216 112 0.681952 0.260603) <cut>])
Frame 65 (List 0 [])
Frame 66 (List 0 [])
Frame 67 (List 0 [])
Frame 68 (List 0 [])
Frame 69 (List 0 [])
Frame 70 (List 218 [(LM 0 0 1084 216 112 0.829624 0.200893) <cut>])
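
The same five-line idea can be sketched on simplified data. Everything here is an assumption made for illustration: the helper name `zero_outlier_window`, the reduced "Frame N value" line format, and the rule that a three-digit value marks the outlier. Single quotes are used so that a POSIX shell does not try to expand `$q` or `${P;D}`, which it would do inside double quotes.

```shell
# zero_outlier_window is a hypothetical helper name; it reads text on stdin.
# Assumed line format: "Frame <number> <value>" (a stand-in for the real data).
zero_outlier_window() {
    # The substitution only matches a full five-line window whose middle
    # line carries a three-digit value; all five values are then set to 0.
    # 5,${P;D} keeps the window at five lines, as 3,${P;D} does for three.
    sed -E ':a; $q; N; s/^(Frame [0-9]+) [0-9]+\n(Frame [0-9]+) [0-9]+\n(Frame [0-9]+) [0-9]{3}\n(Frame [0-9]+) [0-9]+\n(Frame [0-9]+) [0-9]+$/\1 0\n\2 0\n\3 0\n\4 0\n\5 0/; 5,${P;D}; ba'
}

printf 'Frame 64 1\nFrame 65 2\nFrame 66 3\nFrame 67 149\nFrame 68 4\nFrame 69 5\nFrame 70 6\n' |
    zero_outlier_window
```

With GNU sed, Frames 65 to 69 should come out with the value 0 while Frames 64 and 70 stay unchanged. Anchoring the expression with ^ and $ keeps it from matching while the window is still filling up with fewer than five lines.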

See https://trac.ffmpeg.org/ticket/6816 for why I searched for this. I don't have enough reputation to post this as a comment or to upvote a solution, so I am posting it this way. Others might be able to use it. My thanks go to steeldriver.