447

I've noticed that, if I add \n to a pattern for substituting using sed, it does not match. Example:

$ cat > alpha.txt
This is
a test
Please do not
be alarmed

$ sed -i'.original' 's/a test\nPlease do not/not a test\nBe/' alpha.txt

$ diff alpha.txt{,.original}

$ # No differences printed out

How can I get this to work?

AmanicA
  • 205

18 Answers

354

In its simplest invocation, sed holds one line of text in the pattern space, i.e. one line of \n-delimited input with the trailing \n stripped. The single line in the pattern space therefore contains no \n, which is why your regex finds nothing.
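One way to see this is sed's l command, which prints the pattern space in an unambiguous form: an embedded newline shows up as \n and the end is marked with $. A minimal sketch, with the sample text piped in through printf instead of being read from alpha.txt (the variable names are only for illustration):

```shell
# Default cycle: each line is loaded on its own, without its trailing newline.
one_line=$(printf 'a test\nPlease do not\n' | sed -n 'l')
printf '%s\n' "$one_line"
# a test$
# Please do not$

# After N, the pattern space holds two lines joined by an embedded newline.
two_lines=$(printf 'a test\nPlease do not\n' | sed -n 'N;l')
printf '%s\n' "$two_lines"
# a test\nPlease do not$
```

Only in the second case is there a \n for a regex to match.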

You can read multiple lines into the pattern space and manipulate them surprisingly well, but it takes more effort than usual. Sed has a set of commands that allow this kind of thing. Here is a link to a Command Summary for sed. It is the best one I've found, and it got me rolling.

However, forget the "one-liner" idea once you start using sed's micro-commands. It is useful to lay the script out like a structured program until you get the feel of it. It is surprisingly simple, and equally unusual. You could think of it as the "assembler language" of text editing.

Summary: Use sed for simple things, and maybe a bit more, but in general, once a task goes beyond working with a single line, most people prefer something else.
I'll let someone else suggest an alternative; I'm really not sure what the best choice would be (I'd use sed, but that's because I don't know perl well enough).


sed '/^a test$/{
       $!{ N        # append the next line when not on the last line
         s/^a test\nPlease do not$/not a test\nBe/
                    # now test for a successful substitution, otherwise
                    #+  unpaired "a test" lines would be mis-handled
         t sub-yes  # branch_on_substitute (goto label :sub-yes)
         :sub-not   # a label (not essential; here to self document)
                    # if no substitution, print only the first line
         P          # pattern_first_line_print
         D          # pattern_ltrunc(line+nl)_top/cycle
         :sub-yes   # a label (the goto target of the 't' branch)
                    # fall through to final auto-pattern_print (2 lines)
       }    
     }' alpha.txt  

Here is the same script, condensed into a form that is obviously harder to read and work with, but that some would dubiously call a one-liner:

sed '/^a test$/{$!{N;s/^a test\nPlease do not$/not a test\nBe/;ty;P;D;:y}}' alpha.txt
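Both forms above lean on GNU extensions (\n in the replacement, commands and closing braces joined with ;). A sketch of a more portable spelling, assuming only POSIX sed behavior: one command per line, t without a label (it then branches to the end of the script), and a backslash followed by a real newline in the replacement. The sample text is piped in with printf rather than read from alpha.txt:

```shell
# Same logic as the script above, one command per line.
# "\<newline>" in the replacement inserts a newline in any POSIX sed.
portable=$(printf 'This is\na test\nPlease do not\nbe alarmed\n' | sed '/^a test$/{
$!{
N
s/^a test\nPlease do not$/not a test\
Be/
t
P
D
}
}')
printf '%s\n' "$portable"
```

With the sample text this prints This is, not a test, Be, be alarmed on four lines.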

Here is my command "cheat-sheet"

:  # label
=  # line_number
a  # append_text_to_stdout_after_flush
b  # branch_unconditional             
c  # range_change                     
d  # pattern_delete_top/cycle          
D  # pattern_ltrunc(line+nl)_top/cycle 
g  # pattern=hold                      
G  # pattern+=nl+hold                  
h  # hold=pattern                      
H  # hold+=nl+pattern                  
i  # insert_text_to_stdout_now         
l  # pattern_list                       
n  # pattern_flush=nextline_continue   
N  # pattern+=nl+nextline              
p  # pattern_print                     
P  # pattern_first_line_print          
q  # flush_quit                        
r  # append_file_to_stdout_after_flush 
s  # substitute                                          
t  # branch_on_substitute              
w  # append_pattern_to_file_now         
x  # swap_pattern_and_hold             
y  # transform_chars                   
Peter.O
  • 32,916
  • 5
    You don't need a label to use the t command here—when not given a label it defaults to branching to the end of the script. So sed '/^a test$/{$!{N;s/^a test\nPlease do not$/not a test\nBe/;t;P;D}}' alpha.txt does exactly the same as your command in all circumstances. Of course for this particular file, sed '/test/{N;s/.*/not a test\nBe/}' alpha.txt does the same thing also, but my first example is logically equivalent for all possible files. Also note that \n in a replacement string doesn't produce a newline; you need a backslash \ followed by an actual newline to do that. – Wildcard Oct 24 '15 at 13:13
  • 24
    Note that that syntax is GNU specific (# command not separated from the previous one, \n in RHS of s). With GNU sed you can also use -z to use NUL delimited records (and then slurp in the whole input if it's text (which by definition doesn't contain NULs)). – Stéphane Chazelas Aug 11 '16 at 06:33
  • 1
    It should also be noted that several sed implementations (not GNU's) have a (sometimes rather low like on the original Unix implementation) limit on the size of their pattern space, so you can't always assume you can slurp in as many lines as you want. – Stéphane Chazelas Aug 11 '16 at 06:35
  • 4
    @StéphaneChazelas Your comment of -z for GNU sed should actually be an answer, not a footnote to an answer. – Samveen Aug 27 '16 at 05:33
  • 1
    The sed command summary at mik.ua is copyrighted text copied without permission. – Todd Walton May 15 '19 at 15:23
295

Use perl instead of sed:

$ perl -0777 -i.original -pe 's/a test\nPlease do not/not a test\nBe/igs' alpha.txt
$ diff alpha.txt{,.original}
2,3c2,3
< not a test
< Be
---
> a test
> Please do not

-pi -e is your standard "replace in place" command-line sequence, and -0777 causes perl to slurp files whole. See perldoc perlrun to find out more about it.

nnyby
  • 123
codehead
  • 4,960
  • 8
    Thanks! For multiline work, perl wins hands down! I ended up using $ perl -pi -e 's/bar/baz/' fileA to change the file in-place. – Nicholas Tolley Cottrell Feb 04 '13 at 14:36
  • 4
    It is very common that the original poster ask for sed and replies using awk or perl appear. I think it is not on topic, hence, sorry, but I fired a minus one. – Rho Phi Aug 26 '15 at 22:07
  • 111
    +1 & disagree with Roberto. Often questions phrased specifically for ignorance of better methods. When there isn't a substantive contextual difference (like here), optimal solutions should get at least as much profile as the question-specific ones. – geotheory Sep 04 '15 at 15:47
  • 100
    I think the sed answer above proves that a Perl answer is on topic. – reinierpost Nov 24 '15 at 10:46
  • 1
    Note that slurping the whole input in memory before processing it like here is not always practical (or possible like when post-processing the output of tail -f). – Stéphane Chazelas Aug 11 '16 at 06:36
  • 1
    What's more, there are incompatibilities between sed implementations. It seems the sed solution in the accepted answer does not work on my Mac (and probably BSD). – Yongwei Wu Dec 05 '16 at 12:20
  • 16
    A little bit easier: With "-p0e" the "-0777" is not necessary. http://unix.stackexchange.com/a/181215/197502 – Weidenrinde Mar 03 '17 at 11:59
  • This was so much easier. – Raj Jan 17 '19 at 04:16
  • 2
    How can this approach be used for multiline inputs that aren't coming from a file, but rather as part of a pipeline? In my case echo -e 'first \n second' | sed 's/$/kangaroo/' is not working because instead of one kangaroo I am getting two. Can I apply a similar method to fix my problem? – temporary_user_name Jun 03 '19 at 01:34
  • I need to pattern-replace useradd -m $USER -u $USERID\n mkdir -p /home/$USER\n chown -R $USER /home/$USER. This solution won't work directly, because it seems that perl is expanding $USER. working on how to not expand it. – jay Feb 01 '22 at 18:27
163

I think it's better to replace the \n character with some other character and then work as usual.

For example, this command, which does not work:

cat alpha.txt | sed -e 's/a test\nPlease do not/not a test\nBe/'

can be changed to:

cat alpha.txt | tr '\n' '\r' | sed -e 's/a test\rPlease do not/not a test\rBe/'  | tr '\r' '\n'

In case anybody doesn't know: \n is the UNIX line ending, \r\n the Windows one, and \r the classic Mac OS one. Normal UNIX text doesn't contain \r, so it's safe to use it here.

You can also use a more exotic character to temporarily replace \n, for example \f (form feed). You can find more symbols here.

cat alpha.txt | tr '\n' '\f' | sed -e 's/a test\fPlease do not/not a test\fBe/'  | tr '\f' '\n'
user530873
  • 257
  • 1
  • 8
xara
  • 1,647
  • 20
    +1 for this clever hack! Especially useful is the advice about using an exotic symbol to temporarily replace newline unless you're absolutely certain about the content of the file you're editing. – L0j1k Feb 06 '15 at 21:26
  • 1
    This doesn't work as written on OS X. Instead, one needs to replace all instances of \r in the argument to sed with $(printf '\r'). – abeboparebop Feb 11 '16 at 15:33
  • @abeboparebop: great find! alternatively, install GNU sed using homebrew: http://stackoverflow.com/a/30005262 – ssc Jan 24 '17 at 20:05
  • @abeboparebop, On OSX, you just need to add a $ before the sed string to prevent it from converting the \r to an r. Short example: sed $'s/\r/~/'. Full example: cat alpha.txt | tr '\n' '\r' | sed $'s/a test\rPlease do not/not a test\rBe/' | tr '\r' '\n' – wisbucky Aug 05 '19 at 21:49
112

GNU sed has a -z option that allows you to use the syntax the OP attempted to apply. (man page)

Example:

$ cat alpha.txt
This is
a test
Please do not
be alarmed
$ sed -z 's/a test\nPlease do not\nbe/not a test\nBe/' -i alpha.txt
$ cat alpha.txt
This is
not a test
Be alarmed

Be aware: with -z, ^ and $ match the beginning and end of records delimited by a NUL character (not \n), which for ordinary text means the beginning and end of the whole input. And, to make sure every match across your (\n-separated) lines is substituted, don't forget the g flag for global substitution (e.g. s/.../.../g).
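Both caveats can be seen in a small sketch. It requires GNU sed, and the four-line x/y input is made up purely for illustration:

```shell
# With -z and no NUL bytes in the input, the whole input is one record.
# Without g, only the first match in that record is replaced.
first_only=$(printf 'x\ny\nx\ny\n' | sed -z 's/x\ny/Z/')
# With g, every match is replaced.
all=$(printf 'x\ny\nx\ny\n' | sed -z 's/x\ny/Z/g')
# ^ anchors at the start of the record, so only the very first x qualifies,
# even with g.
anchored=$(printf 'x\ny\nx\ny\n' | sed -z 's/^x/Z/g')
printf '%s\n---\n%s\n---\n%s\n' "$first_only" "$all" "$anchored"
```

The first command leaves the second x/y pair untouched, the second turns both pairs into Z, and the third changes only the x on the first line.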


Credits: @stéphane-chazelas first mentioned -z in a comment above.

Peterino
  • 1,221
  • 11
    I'm not sure why this isn't the accepted answer. It's reasonably clean sed with a simple commandline flag. – James G Mar 02 '21 at 08:07
  • 4
    As a side note, sed doesn't accept joined options, for instance sed -iz ... will not work, you need to specify them individually sed -i -z ... as in the answer. – pmiguelpinto90 Jul 08 '21 at 13:54
  • 1
    the '-z' option worked for me. much more simpler. – Bruce Tong Feb 15 '22 at 07:31
  • Good one, sir! =D – Eduardo Lucio Aug 07 '22 at 03:39
  • 3
    “I'm not sure why this isn't the accepted answer.” → Probably because this solution is specific to one implementation of sed. It won't work on systems without GNU sed (e.g. Mac OS X, Busybox, BSD…). – Denilson Sá Maia May 29 '23 at 04:09
  • if you are using "-z" why do we still need "\n" in the search pattern? – PraveenMak Jul 10 '23 at 21:44
  • @PraveenMak When you use the -z flag \n is just like any other character. It loses its special meaning and simply represents a newline character inside a block of text you want your pattern to match. You're free to omit it, use a . character (to match any single character instead) or whatever works for you. – Peterino Jul 11 '23 at 14:30
81

All things considered, gobbling the entire file may be the fastest way to go.

Basic syntax is as follows:

sed -e '1h;2,$H;$!d;g' -e 's/__YOUR_REGEX_GOES_HERE__...'

Mind you, gobbling the entire file may not be an option if the file is tremendously large. For such cases, other answers provided here offer customized solutions that are guaranteed to work on a small memory footprint.

For all other hack and slash situations, merely prepending -e '1h;2,$H;$!d;g' followed by your original sed regex argument pretty much gets the job done.

e.g.

$ echo -e "Dog\nFox\nCat\nSnake\n" | sed -e '1h;2,$H;$!d;g' -re 's/([^\n]*)\n([^\n]*)\n/Quick \2\nLazy \1\n/g'
Quick Fox
Lazy Dog
Quick Snake
Lazy Cat

What does -e '1h;2,$H;$!d;g' do?

The 1, 2,$, $! parts are line specifiers that limit which lines the directly following command runs on.

  • 1: First line only
  • 2,$: All lines starting from the second
  • $!: Every line other than last

So expanded, this is what happens on each line of an N line input.

  1: h, d
  2: H, d
  3: H, d
  .
  .
N-2: H, d
N-1: H, d
  N: H, g

The g command is not given a line specifier, but the preceding d command has a special clause "Start next cycle.", and this prevents g from running on all lines except the last.

As for the meaning of each command:

  • The first h followed by Hs on each line copies said lines of input into sed's hold space. (Think arbitrary text buffer.)
  • Afterwards, d discards each line to prevent it from being written to the output. The hold space, however, is preserved.
  • Finally, on the very last line, g restores the accumulation of every line from the hold space so that sed is able to run its regex on the whole input (rather than in a line-at-a-time fashion), and hence is able to match on \ns.
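To check what the substitution actually gets to see, the l command can be appended to the gobbling prefix. A minimal sketch with a made-up three-line input; -n is used here only so that l is the sole output:

```shell
# On the last line, g copies the accumulated hold space into the pattern
# space, and l prints it with embedded newlines shown as \n.
slurped=$(printf 'Dog\nFox\nCat\n' | sed -n '1h;2,$H;$!d;g;l')
printf '%s\n' "$slurped"
# Dog\nFox\nCat$
```

The whole input arrives as a single pattern space, which is why a regex containing \n can match at that point.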
antak
  • 1,070
  • 1
    this can also modify existing files if the -i switch is added (the perl answer calls this "replace in-place") – textral Feb 03 '20 at 05:49
  • 1
    /!\ do not forget to add g at the end of your substitution command: s/.../.../g -- unless you only want to change the first match of the file of course. – Mathieu CAROFF May 15 '20 at 13:36
  • It's not important to the main point here, but the example used for demonstrating a "real" pattern s/([^\n]*)\n([^\n]*)\n/Quick \2\nLazy \1\n/g greedily over-consumes some--but not all--of what were originally any empty lines \n\n+ from its gobbled input, so they are eliminated from the interleaved output. – Glenn Slayden Sep 30 '21 at 08:40
54

sed has three commands to manage multi-line operations: N, D and P (compare them to normal n, d and p).

In this case, you can match the first line of your pattern, use N to append the second line to pattern space and then use s to do your substitution.

Something like:

/a test$/{
  N
  s/a test\nPlease do not/not a test\nBe/
}
andcoz
  • 17,130
20

You can, but it's difficult. I recommend switching to a different tool. If there's a regular expression that never matches any part of the text you want to replace, you can use it as an awk record separator in GNU awk.

awk -v RS='a' '{gsub(/hello/, "world"); print}'

If there are never two consecutive newlines in your search string, you can use awk's "paragraph mode" (one or more blank lines separate records).

awk -v RS='' '{gsub(/hello/, "world"); print}'
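Applied to the text from the question, paragraph mode could look like the sketch below. The second paragraph in the input is made up, and setting ORS to two newlines is an addition of this example so that the blank line between paragraphs is printed again (with the default ORS it would be dropped):

```shell
# RS='' splits records on blank lines; each record keeps its inner newlines,
# so the regex can match \n. ORS='\n\n' re-inserts a blank line after each record.
paragraphs=$(printf 'This is\na test\nPlease do not\nbe alarmed\n\nsecond paragraph\n' |
  awk -v RS='' -v ORS='\n\n' '{gsub(/a test\nPlease do not/, "not a test\nBe"); print}')
printf '%s\n' "$paragraphs"
```

Note that this also emits a blank line after the last record, which may or may not matter for your file.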

An easy solution is to use Perl and load the file fully into memory.

perl -0777 -pe 's/hello/world/g'
11

I think this is the sed solution for 2 lines matching.

sed -n '$!N;s@a test\nPlease do not@not a test\nBe@;P;D' alpha.txt

If you want 3 lines matching then ...

sed -n '1{$!N};$!N;s@aaa\nbbb\nccc@xxx\nyyy\nzzz@;P;D'

If you want 4 lines matching then ...

sed -n '1{$!N;$!N};$!N;s@ ... @ ... @;P;D'

If the replacement in the "s" command has fewer lines than the match, it gets a bit more complicated, like this:

# aaa\nbbb\nccc shrink to one line "xxx"

sed -n '1{$!N};$!N;/aaa\nbbb\nccc/{s@@xxx@;$!N;$!N};P;D'

If the replacement has more lines than the match, it also gets a bit more complicated, like this:

# aaa\nbbb\nccc grow to five lines vvv\nwww\nxxx\nyyy\nzzz

sed -n '1{$!N};$!N;/aaa\nbbb\nccc/{s@@vvv\nwww\nxxx\nyyy\nzzz@;P;s/.*\n//M;P;s/.*\n//M};P;D'

This second method is a simple verbatim copy & paste substitution for ordinary small text files (it needs a shell script file):

#!/bin/bash

# copy & paste content that you want to substitute

AA=$( cat <<\EOF | sed -z -e 's#\([][^$*\.#]\)#\\\1#g' -e 's#\n#\\n#g'
a test
Please do not
EOF
)

BB=$( cat <<\EOF | sed -z -e 's#\([&\#]\)#\\\1#g' -e 's#\n#\\n#g'
not a test
Be
EOF
)

sed -z -i 's#'"${AA}"'#'"${BB}"'#g' *.txt   # apply to all *.txt files
mug896
  • 965
  • 9
  • 12
  • This should make its way to the top! I just used the "-i" instead of "-n" for the two line substitution, because that's what I need, and incidentally, it's also in the asker's example. – Nagev May 15 '18 at 13:22
6
sed -i'.original' '/a test/,/Please do not/c not a test \nBe' alpha.txt

Here /a test/,/Please do not/ is considered as a block of (multi line) text, c is the change command followed by new text not a test \nBe

If the text to be replaced is very long, I would suggest ex syntax.

gibies
  • 361
  • oops the problem is that sed will replace all eventual text between /a test/ and /Please do not/ as well... :( – noonex Nov 24 '16 at 08:47
5

Apart from Perl, a general and handy approach for multiline editing for streams (and files too) is:

First create some new UNIQUE line separator as you like, for instance

$ S=__ABC__                     # simple
$ S=__${RANDOM}${RANDOM}${RANDOM}__   # better
$ S=$(openssl rand -hex 16)     # ultimate

Then in your sed command (or any other tool) you replace \n by ${S}, like

$ cat file.txt | awk 1 ORS=$S |  sed -e "s/a test${S}Please do not/not a test\nBe/" | awk 1 RS=$S > file_new.txt

( awk replaces ASCII line separator with yours and vice versa. )

guest
  • 61
  • 2
    Don't use random values for processes designed to be deterministic. They will make your solution randomly fail, in the edge cases that match the random values, and you will have a hard time to reproduce the problem (because it was caused at random). Use a command designed for the problem instead. sed -z would be such a solution, since text streams typically don't contain NUL characters. – Peterino Aug 04 '20 at 12:59
4
sed -e'$!N;s/^\(a test\n\)Please do not$/not \1Be/;P;D' <in >out

Just widen your window on input a little bit.

It's pretty easy. Besides the standard substitution; you need only $!N, P, and D here.

mikeserv
  • 58,310
2

This is a small modification of xara's clever answer to make it work on OS X (I'm using 10.10):

cat alpha.txt | tr '\n' '\r' | sed -e "s/a test$(printf '\r')Please do not/not a test$(printf '\r')Be/"  | tr '\r' '\n'

Instead of explicitly using \r, you have to use $(printf '\r'), inside double quotes so that the shell expands it.

  • 2
    While printf '\r' (or echo -e '\r') do work properly, please note that you can just use the shell syntax $'\r' to refer to escaped literals. For example, echo hi$'\n'there will echo a newline between hi and there. Similarly, you can wrap the entire string so that each backslash \ will escape its subsequent character: echo $'hi\nthere' – Dejay Clayton Feb 28 '18 at 01:40
2

Expanding on Peter.O's brilliant accepted answer, if you are like me and need a solution to replace more than 2 lines in 1 go, try this:

#!/bin/bash

pattern_1="<Directory \"\/var\/www\/cgi-bin\">"
pattern_2="[ ]*AllowOverride None\n"
pattern_3="[ ]*Options +ExecCGI\n"
pattern_4="[ ]*AddHandler cgi-script .cgi .pl\n"
pattern_5="[ ]*Require all granted\n"
pattern_6="<\/Directory>"

complete_pattern="$pattern_1\n$pattern_2$pattern_3$pattern_4$pattern_5$pattern_6"

replacement_1="#<Directory \"\/var\/www\/cgi-bin\">\n"
replacement_2=" #AllowOverride None\n"
replacement_3=" #Options +ExecCGI\n"
replacement_4=" #AddHandler cgi-script .cgi .pl\n"
replacement_5=" #Require all granted\n"
replacement_6="#<\/Directory>"

complete_replacement="$replacement_1$replacement_2$replacement_3$replacement_4$replacement_5$replacement_6"

filename="test.txt"

echo ""
echo "SEDding"
# Find the opening line, pull in the next five lines, then substitute the block.
sed -i "/$pattern_1/{
N;N;N;N;N
s/$complete_pattern/$complete_replacement/
}" "$filename"


Let your input file be:

#
#This is some test comments
#    Skip this
#

<Directory "/var/www/cgi-bin">
    AllowOverride None
    Options +ExecCGI
    AddHandler cgi-script .cgi .pl
    Require all granted
</Directory>

After running the sed script, the file will be replaced in-place to:

#
#This is some test comments
#    Skip this
#

#<Directory "/var/www/cgi-bin">
 #AllowOverride None
 #Options +ExecCGI
 #AddHandler cgi-script .cgi .pl
 #Require all granted
#</Directory>


Explanation

  • pattern_1="<Directory \"\/var\/www\/cgi-bin\">" -- Special characters must be escaped with a backslash \.

  • [ ]* -- This matches zero or more spaces. Standard regex notation.

  • sed -i "/$pattern_1/{ -- This will search the file, line-by-line, for pattern_1 [<Directory "/var/www/cgi-bin">]. Note that the search pattern MUST NOT CONTAIN NEWLINE

    If, and only if, sed finds $pattern_1 in the file, then it will proceed with executing the sub-code within the curly braces {}. It will start from the pattern matching line of file

  • N;N;N;N;N -- N tells sed to read the next line after the pattern and attach it to current line. It is important to understand that sed is designed to replace only 1 line at a time, so N will basically force sed to read 2 lines and consider them as a single line with a single newline \n between the two. The newline in the second line will be ignored. By chaining 5 Ns, we are instructing sed to read 6 lines of the file starting with the pattern matching line.

  • s/$complete_pattern/$complete_replacement/ -- Replace $complete_pattern with $complete_replacement. Note the presence of newlines in the variables. Understanding this part will take some trial and error.

muru
  • 72,889
1

I wanted to add a few lines of HTML to a file using sed (and ended up here). Normally I'd just use perl, but I was on a box that had sed, bash and not much else. I found that if I changed the string to a single line and let bash/sed interpolate the \t\n, everything worked out:

HTML_FILE='a.html' #contains an anchor in the form <a name="nchor" />
BASH_STRING_A='apples'
BASH_STRING_B='bananas'
INSERT="\t<li>$BASH_STRING_A<\/li>\n\t<li>$BASH_STRING_B<\/li>\n<a name=\"nchor\"\/>"
sed -i "s/<a name=\"nchor\"\/>/$INSERT/" "$HTML_FILE"

It would be cleaner to have a function to escape the double quotes and forward-slashes, but sometimes abstraction is the thief of time.

1

Sed breaks input on newlines. It keeps only one line per loop.
Therefore there is no way to match a \n (newline) if the pattern space doesn't contain it.

There is a way, though, you can make sed keep two consecutive lines in the pattern space by using the loop:

sed 'N;l;P;D' alpha.txt

Add any processing needed between the N and the P (replacing the l).
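A minimal sketch of that two-line window, using a made-up three-line input. It prints only what l sees (-n, and no P), and it uses $!N rather than plain N: POSIX lets N quit without printing when there is no next line, whereas GNU sed prints the pattern space first, so $!N is the spelling that behaves the same everywhere.

```shell
# Each l shows the current window; D drops its first line and restarts
# the cycle with the remainder, so the window slides down one line at a time.
window=$(printf '1\n2\n3\n' | sed -n '$!N;l;D')
printf '%s\n' "$window"
# 1\n2$
# 2\n3$
# 3$
```

Every pair of adjacent lines passes through the pattern space once, which is what lets a substitution containing \n match at any position in the file.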

In this case (2 lines):

$ sed 'N;s/a test\nPlease do not/not a test\nBe/;P;D' alpha.txt
This is
not a test
Be
be alarmed

Or, for three lines:

$ sed -n '1{$!N};$!N;s@a test\nPlease do not\nbe@not a test\nDo\nBe@;P;D' alpha.txt 
This is
not a test
Do
Be alarmed

That assumes the replacement has the same number of lines as the match.

1

While ripgrep specifically doesn't support in-place replacement, I've found that its current --replace functionality is already useful for this use case and is preferable to using sed, e.g.:

rg --replace $'not a test\nBe' --passthru --no-line-number \
--multiline 'a test\nPlease do not' alpha.txt > output.txt

Explanation:

  • --replace 'string' enables replacement mode and sets the replacement string. Can include captured regex groups by using $1 etc.
  • $'string' is a Bash expansion so that \n becomes a newline for a multiline string.
  • --passthru is needed since ripgrep usually only shows the lines matching the regex pattern. With this option it also shows all lines from the file that don't match.
  • --no-line-number / -N is because by default ripgrep includes the line numbers in the output (useful when only the matching lines are shown).
  • --multiline / -U enables multiline processing which is disabled by default.
  • > output.txt: with the --passthru and --no-line-number options, the standard output matches the desired new file with replacements and can be saved as usual.
  • --multiline-dotall can optionally be added if you want the dot ('.') regex pattern to match newlines (\n).

However, this command isn't as useful for processing multiple files, as it needs to be run separately per file.

Silveri
  • 161
0

Since this already explains most sed operations, I'll add how you can search within a block.

Suppose you want to change x within padding but not offset:

{
  padding: {
    x: 2,
    y: 0
  },
  offset: {
    x: 0,
    y: 1
  }
}

You first select the block from padding: { to }

sed -r '/padding: \{/,/\}/ {
    # and inside the block you replace the value of x:
    s/^( +x:).*/\1 1,/
}'
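For a quick end-to-end check, the sample can be fed in through printf. This sketch uses -E, which as far as I know both GNU and BSD sed accept as the spelling of -r, and puts the substitution directly after the range instead of inside braces:

```shell
# The range runs from the "padding: {" line to the next line containing "}",
# so only the x inside padding is rewritten.
edited=$(printf '%s\n' '{' '  padding: {' '    x: 2,' '    y: 0' '  },' \
  '  offset: {' '    x: 0,' '    y: 1' '  }' '}' |
  sed -E '/padding: \{/,/\}/ s/^( +x:).*/\1 1,/')
printf '%s\n' "$edited"
```

The x under padding becomes 1, while the x under offset keeps its value of 0.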

This also works to answer the question though it is not as elegant as the JSON sample:

echo -e 'This is\na test\nPlease do not\nbe alarmed' | sed -r '
  /a test/,/Please do not/ {
    s/a test/not a test/
    s/Please do not/Be/
  }'
laktak
  • 5,946
0

Another option is sed ranges. I'm not sure it's applicable to this specific question, but since I landed here and found sed ranges useful in my case, I see value in sharing the following answer: Using sed to replace multiline

MaMazav
  • 101
  • 1
    It's best if you please summarize the important information here instead of only linking somewhere else. The external information might change, and the fewer links someone has to click to get to it, the better. – cryptarch Apr 07 '22 at 23:02