
I'd like to replace \n with | between two patterns (the patterns themselves excluded) using sed.

I have a file in which an address block like this one appears several times:

Adress:
1540 Broadway
New York
NY 10036
United-states
###

I would like to get this:

Adress:1540 Broadway|New York|NY 10036|United-states
###

I use the following command:

sed -i "/^Adress:/!b;:a;/###/bb;$!{N;ba};:b;s/\n/\|/g;tb" file.txt

...but it includes the two patterns, and I get this wrong result:

Adress:|1540 Broadway|New York|NY 10036|United-states|###

How can I change it to exclude the patterns from the substitution?

Syl33

1 Answer


Using a loop for this kind of job isn't recommended unless you're dealing with a small number of lines (see note 1). You're better off using a range and the hold space:

sed '/^Adress:/,/###/{
/###/!H;/^Adress:/h;/###/!d;x;s/\n//;s/\n/|/g;G
}' infile

That is, for each line in that range, do the following: if it's not the last line of the range, append it to the hold space (overwriting the hold space if it's the first line of the range) and delete it; otherwise exchange the buffers, remove the first embedded newline and replace the remaining ones with |. Finally, append the hold buffer content (by then the ### line) to the pattern space.
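As a rough sketch of the steps just described, here is the same idea written one command per line with comments and run on the sample record from the question. The pattern is spelled Adress: to match that sample; the function name join_records is made up for illustration, and the sketch assumes a sed that accepts # comment lines inside a script (GNU sed does).

```shell
# join_records: hypothetical wrapper around the range + hold space script
join_records() {
  sed '/^Adress:/,/###/{
    # not the closing line: append it to the hold space
    /###/!H
    # first line of the range: overwrite the hold space instead
    /^Adress:/h
    # not the closing line: print nothing and start the next cycle
    /###/!d
    # closing line: swap, so the collected record is in the pattern space
    x
    # glue the first address line onto the Adress: label
    s/\n//
    # join the remaining lines with |
    s/\n/|/g
    # append the held ### line
    G
  }' "$@"
}

# Sample record from the question, fed on standard input
result=$(printf '%s\n' 'Adress:' '1540 Broadway' 'New York' 'NY 10036' 'United-states' '###' | join_records)
printf '%s\n' "$result"
```

With that input, the record should come out on a single line with the ### line following it untouched.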
This will fail if the last Adress: block is not followed by ###: the range then stays open until the end of input and every line of that record is deleted. To avoid that, use a second condition and delete only if it's not the last line of input; otherwise exchange the buffers, apply the same substitutions and quit:

sed '/^Adress:/,/###/{
/###/!H;/^Adress:/h;/###/!{
$!d;x;s/\n//;s/\n/|/g;q
}
x;s/\n//;s/\n/|/g;G
}' infile
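To check the open-range case, the sketch below feeds sed two copies of the sample record, the second one without a closing ###. Treat it as one possible variant rather than the definitive script: on the last input line it swaps the buffers, applies the same two substitutions and quits. The function name join_open is hypothetical, the pattern is spelled Adress: as in the question's data, and # comment lines inside the script are assumed to be supported (as in GNU sed).

```shell
# join_open: hypothetical wrapper that also handles a final record
# with no closing ### line
join_open() {
  sed '/^Adress:/,/###/{
    /###/!H
    /^Adress:/h
    /###/!{
      # more input follows: keep collecting and print nothing
      $!d
      # last input line, range still open: flush the collected record
      x
      s/\n//
      s/\n/|/g
      q
    }
    # closing ### line: same handling as in the closed-range script
    x
    s/\n//
    s/\n/|/g
    G
  }' "$@"
}

result=$(printf '%s\n' \
  'Adress:' '1540 Broadway' 'New York' 'NY 10036' 'United-states' '###' \
  'Adress:' '1540 Broadway' 'New York' 'NY 10036' 'United-states' | join_open)
printf '%s\n' "$result"
```

The first record should be followed by its ### line as before, and the unterminated second record should still be joined into one line instead of being dropped.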

1: The more lines you have to pull in, the slower it gets, because the growing pattern space has to be checked for a match over and over. See the results here (it's a different requirement, I know, but just to give you an idea...)

don_crissti
  • It works, but the last record of the file is dropped (everything from Adress: to the end of file is replaced by a single \n) because it has no closing "###" tag, just the end of file right after the last address line, like this:

    United-states<end of file>

    I tried this, but it doesn't work:

    sed -i "/Adress:/,/\(###|$\)/{/###/!H;/Adress:/h;/###/!d;x;s/\n//;s/\n/|/g;G}" file.txt

    How can I handle the fact that the last record is not terminated with a "###" line?

    – Syl33 Jun 27 '17 at 12:22
  • @Syl33 - Yeah, I didn't take open ranges into account... Updated. – don_crissti Jun 27 '17 at 13:09
  • In-memory editing, while considerably faster, would not scale to input larger than the available memory. I'm aware this is not a consideration for most use cases. – Dani_l Jun 27 '17 at 13:27