delete specific text based on the next line contents

Question

I need a command to delete specific text based on the contents of the next line. Specifically, I want to delete a trailing comma if the next line is "]", and delete that "]" line as well.

example

987678680,
]
123435434-
]
2345643,
]
2345632-
]
234563,
]
1234567654,
]

output

Please clarify whether the , is always at the end of the line. — agc, Oct 08 '18 at 14:37

score 1 · Answer 1 · edited Oct 10 '18 at 12:25

When dealing with this kind of task (edit/do something if consecutive lines match a certain pattern) the simplest way to do it with sed is probably via the N;P;D cycle aka the "sliding window":

sed -e '$!N;s/,\nPATTERN//;t' -e 'P;D' file

This gets the Next line into the pattern space and unconditionally attempts to substitute per the requirement. It then tests if the replacement was successful: if so, it branches to the end of script (no label) and autoprints the pattern space, otherwise it Prints and Deletes the first line from the pattern space and restarts the cycle.
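As a sketch of how this applies to the question, the following replaces PATTERN with a literal ] and runs the command on a shortened copy of the sample input. The variable names input and result are only for illustration.

```shell
# Shortened copy of the sample input from the question.
input='987678680,
]
123435434-
]
2345643,
]'

# Sliding window with PATTERN replaced by a literal "]".
result=$(printf '%s\n' "$input" | sed -e '$!N;s/,\n]//;t' -e 'P;D')
printf '%s\n' "$result"
```

The two lines ending in a comma lose both the comma and the "]" that follows, while 123435434- and its "]" pass through unchanged, since only a comma directly before the newline triggers the substitution.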

Another GNU sed method:

sed ':x /,$/{N;s/,\n]//;T x}' file

This works correctly even when the trailing comma is on an even-numbered line. Example:

printf '%s\n' 1, 2, ']' | sed ':x /,$/{N;s/,\n]//;T x}'

Output:

1,
2

How it works:

In most programming languages, labels are entirely passive: they mark code but never change what that code does. sed is different. Branching to a label at the beginning of a script changes what happens next, because it skips the implicit read of the next input line that normally starts each sed cycle.

The T x command (branch if the preceding substitution failed) checks whether the prior s command did nothing and, if so, jumps to the :x label at the beginning without printing anything or reading a new line. As a result, any line fetched by N that was not consumed by a substitution is re-scanned for a trailing comma, as it should be.

For non-GNU sed (where the T command isn't available and the syntax is less permissive), this should be more portable:

sed ':x
/,$/{
N
s/,\n]//
t
b x
}' file
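To check the re-scan behaviour described above, this sketch wraps the portable script in a shell function and feeds it input where the comma falls on an even-numbered line. The function name strip_comma_bracket and the variable names are made up for illustration.

```shell
# Hypothetical helper: the portable loop from above, reading stdin.
strip_comma_bracket() {
  sed ':x
/,$/{
N
s/,\n]//
t
b x
}'
}

# Comma on an even-numbered line (line 2) followed by "]".
even=$(printf '%s\n' 1, 2, ']' | strip_comma_bracket)
printf '%s\n' "$even"

# A comma line followed by something other than "]" is left alone.
mixed=$(printf '%s\n' a, b c, ']' | strip_comma_bracket)
printf '%s\n' "$mixed"
```

The first call should print 1, and 2, matching the GNU example earlier in this answer. The second should print a, then b, then c. Input whose very last line ends in a comma is not covered here, because the behaviour of N at end of input differs between sed implementations.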

score 0 · Answer 2 · edited Oct 08 '18 at 14:48


You can use sed as follows:

sed -e '/,$/{N; /\]/s/,[^,]*$//;}' file

Output:

987678680
123435434-
]
2345643
2345632-
]
234563
1234567654

Or, per @steeldriver, if the input always alternates as in the sample, this can be simplified to:

sed '$!N; s/,\n]//'
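As the comments on this answer point out, the shorter form relies on the input strictly alternating as in the sample. A small check, with made-up variable names, illustrates the difference:

```shell
# Alternating input, as in the question: the comma and "]" are removed.
alternating=$(printf '%s\n' 1, ']' 2- ']' | sed '$!N; s/,\n]//')
printf '%s\n' "$alternating"

# Comma on an even-numbered line: N pairs lines 1 and 2, so "2," and "]"
# never share the pattern space and the input comes back unchanged.
shifted=$(printf '%s\n' 1, 2, ']' | sed '$!N; s/,\n]//')
printf '%s\n' "$shifted"
```

The first call should print 1, 2- and ]. The second should print its three input lines untouched.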

edited Oct 08 '18 at 14:48 by agc · answered Oct 08 '18 at 13:45

Does it really need to be more complicated than sed '$!N; s/,\n]//' ? – steeldriver Oct 08 '18 at 13:56
@steeldriver thanks! in bash yes but in csh no – Oct 08 '18 at 14:01

@steeldriver - no, assuming the input will always consist of alternating lines just like the OP sample... Goro, this has nothing to do with bash or any shell, really... – don_crissti Oct 08 '18 at 14:02

... most likely because csh is a "great" shell and you have to escape the !... – don_crissti Oct 08 '18 at 14:07
@agc - how would this "incorrectly delete a ," if the next line wasn't a ] ? You probably mean the opposite: this would not delete a trailing , even if the next line was a ] if the line with the trailing comma was an even line ... – don_crissti Oct 08 '18 at 14:17
@don_crissti, Thanks. Worse -- what happened was my eye skipped the ] in s/,\n]//, or something like that... Now rolled back. (Your opposite note is worth adding however.) – agc Oct 08 '18 at 14:53
Actually you may safely remove the $! portion from N, since if the last line happened to be one with a trailing comma no next line follows it. – Rakesh Sharma Oct 08 '18 at 15:06
@agc - no problem. I'll leave my note as a comment – don_crissti Oct 08 '18 at 15:25
This answer won't work if the input is printf '%s\n' 1, 2, ']'. – agc Oct 09 '18 at 13:16
