3

I have a file containing special color encoding characters:

$ cat zz
aaa.gpg
bbb.gpg
ccc.gpg

$ cat -A zz
^[[38;5;216maaa.gpg^[[00m$
^[[38;5;216mbbb.gpg^[[00m$
^[[38;5;216mccc.gpg^[[00m$

I need to use sed to match the trailing .gpg and remove it. If there were no special characters, I would use:

cat zz | sed 's/\.gpg$//'

So how can I match the .gpg^[[00m$ pattern with sed?

I tried many variations, but none of them works. For example:

cat zz | sed 's/\.gpg\^\[\[00m$//'
Martin Vegter

5 Answers

3

To remove ANSI sequences (colour and cursor movement), we can run something along the lines of:

perl -pe 's/\e\[[0-9;]*[mGKHF]//g'

After that, things become much clearer...

JJoao
3
c=$(printf '\\(\33\\[[0-9;]*m\\)*')

That stores in $c a regexp that matches any number of graphic attribute setting sequences (colouring, bold, reverse video...), also known as SGR (Select Graphic Rendition).

Then:

sed "s/${c}\.${c}g${c}p${c}g\(${c}\)\$/\5/"

That removes a trailing .gpg, including interspersed and preceding SGR sequences, while preserving trailing ones (like your \e[00m (sgr0), which restores the default graphic rendition).
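
A small check of this on one line shaped like the question's data might look as follows. The names sample, result and expected are hypothetical, and the sample is generated with printf rather than read from the zz file.

```shell
# Regexp matching any number of SGR sequences (ESC [ digits/semicolons m), as above.
c=$(printf '\\(\33\\[[0-9;]*m\\)*')

# One coloured line like those in the question.
sample=$(printf '\033[38;5;216maaa.gpg\033[00m')

# Remove the trailing .gpg; \5 re-inserts whatever SGR sequences followed it.
result=$(printf '%s\n' "$sample" | sed "s/${c}\.${c}g${c}p${c}g\(${c}\)\$/\5/")

# What we hope to get: the colour start, the bare name, and the reset sequence.
expected=$(printf '\033[38;5;216maaa\033[00m')

[ "$result" = "$expected" ] && echo "reset sequence preserved"
```

The back-reference is \5 because each of the four ${c} expansions before it contributes one group, so the explicit group around the last ${c} is the fifth.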

  • Is there some advantage in keeping the reset sequence? Say someone has a preference for a special colour (which they set, for example, at the end of $PS1); then with cat file1 | sed "..."; cat file2 the second file will be printed in an undesired colour. – peterph Dec 24 '20 at 20:38
  • @peterph, it makes sense for that ^[[38;5;216maaa.gpg^[[00m, for instance, to be changed to ^[[38;5;216maaa^[[00m instead of ^[[38;5;216maaa, as the ^[[00m was intended to terminate the ^[[38;5;216m which we leave alone here. – Stéphane Chazelas Dec 24 '20 at 20:43
2

What you see on the terminal as ^[ is the escape character. The second [ is a literal [.

You need to include the escape character itself in the pattern, so replace the ^[ with an actual escape character:

esc="$(printf '\033')"   # printf expands \033 portably; not every echo does
sed 's/\.gpg'"${esc}"'\[00m$//'

or, with a sed that understands \x escapes in regexps (such as GNU sed):

esc='\x1b'
...
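
Put together, the first variant could be tried on data shaped like the question's file roughly as below. This is only a sketch: the three lines are generated with printf instead of being read from zz, and input, result and expected are made-up names.

```shell
# A literal escape character, produced portably by printf.
esc=$(printf '\033')

# Three coloured lines like those in the question's file.
input=$(printf '\033[38;5;216m%s.gpg\033[00m\n' aaa bbb ccc)

# Remove ".gpg" together with the reset sequence that follows it.
result=$(printf '%s\n' "$input" | sed 's/\.gpg'"${esc}"'\[00m$//')

# Each line should now be the colour start followed by the bare name.
expected=$(printf '\033[38;5;216m%s\n' aaa bbb ccc)

# Show the result with non-printing characters made visible.
printf '%s\n' "$result" | sed -n l
```

Because the reset sequence is removed along with .gpg, the colour is never switched off again when the result is printed to a terminal.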
1

First read a bit on ANSI escape sequences. SGR (Select Graphic Rendition - colours and similar) ends with the m character - so something like:

sed -r 's/^[\[[0-9;]*m//g'

should do the trick for well-behaved input. By well-behaved I mean input in which the escape sequences are not interrupted by whitespace characters other than a space, such as \n or \r.

Note that ^[ is the escape character, not the characters ^ and [ themselves. As for entering the escape character itself, in a console the easiest way is to press Ctrl+V followed by Esc.
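
If typing the escape character is inconvenient, one possible alternative is to generate it with printf and splice it into the sed script, as sketched below. The variables esc, sample and result are illustrative only, and the second substitution (removing .gpg) is added here to tie it back to the question.

```shell
# A literal escape character, generated instead of typed with Ctrl+V Esc.
esc=$(printf '\033')

# One coloured line like those in the question.
sample=$(printf '\033[38;5;216maaa.gpg\033[00m')

# Strip every SGR sequence, then remove the .gpg that is now at the end of the line.
result=$(printf '%s\n' "$sample" | sed "s/${esc}\[[0-9;]*m//g; s/\.gpg\$//")

printf '%s\n' "$result"   # prints: aaa
```

This version uses only basic regular expressions, so the -r option is not needed.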

peterph
1

I'm assuming you want to leave the escape sequences in place, if they are already there. You can do this:

sed -E 's/\.gpg([[:cntrl:]]|$)/\1/' zz

This will match .gpg followed by either the end of the line or any control character (e.g. the ESC character). If a control character is matched, it is preserved in the substitution by the \1.

If there are no escape sequences, then .gpg at the ends of lines will also be removed.
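
Both cases can be checked with a quick sketch like the one below, which feeds one coloured and one plain line through the same command. The sample strings are generated here rather than taken from zz, and all variable names are hypothetical.

```shell
# One coloured line like those in the question, and one plain line.
colored=$(printf '\033[38;5;216maaa.gpg\033[00m')
plain='bbb.gpg'

# Same substitution for both: drop .gpg, keep a following control character if any.
result_colored=$(printf '%s\n' "$colored" | sed -E 's/\.gpg([[:cntrl:]]|$)/\1/')
result_plain=$(printf '%s\n' "$plain" | sed -E 's/\.gpg([[:cntrl:]]|$)/\1/')

# The coloured line should keep both its colour start and its reset sequence.
expected_colored=$(printf '\033[38;5;216maaa\033[00m')
[ "$result_colored" = "$expected_colored" ] && echo "escape sequences kept"

printf '%s\n' "$result_plain"   # prints: bbb
```

Since the pattern is not anchored, a .gpg in the middle of a line would also be removed if it happened to be followed by a control character.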