I am trying to strip all color control sequences from a log file. I can remove every control sequence except ^[(B. Please help me remove this one as well.

I am using this command to strip the control sequences:

sed -e 's/\x1b\[[0-9;]*m//g' "$LOGFILE" > "$LOGDIR/Temp.txt"
Baba

1 Answer

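Your sed command only matches color control sequences, which have the form ESC [ ... m. ^[(B (ESC ( B) is a different kind of sequence: it sets the font/character set mapping, selecting the default G0 character set (see console_codes(4)).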
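To see which bytes are actually in the log, it can help to make the escape character visible. The following is a minimal sketch with a made-up sample line (the text and the variable name `visible` are assumptions, not taken from your log); `cat -v` prints ESC as ^[.

```shell
# Assumed sample: bold on (ESC [ 1 m), text, then ESC ( B followed by ESC [ m
visible=$(printf '\033[1mhi\033(B\033[m\n' | cat -v)

# cat -v renders the ESC byte as ^[, so both kinds of sequence become readable
printf '%s\n' "$visible"
# prints: ^[[1mhi^[(B^[[m
```

The ^[(B in the middle starts with ESC ( rather than ESC [, which is why a pattern anchored on \x1b\[ never reaches it.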

If the only control sequences in your logfile are color sequences and ^[(B, you can remove them all with a single expression (\x1b and \| are GNU sed extensions):

sed -e 's/\x1b\(\[[0-9;]*m\|(B\)//g'
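As a quick check, the expression above can be tried on a sample line before running it on the real log. This is only a sketch: the helper name `strip_codes` and the sample text are hypothetical, and it assumes GNU sed, since \x1b and \| are GNU extensions.

```shell
# Hypothetical helper wrapping the sed expression above (assumes GNU sed)
strip_codes() {
    sed -e 's/\x1b\(\[[0-9;]*m\|(B\)//g'
}

# Assumed sample line: red text, then ESC ( B and ESC [ m to reset
sample=$(printf '\033[31mERROR\033(B\033[m disk full')

# Filter the sample and show what is left
cleaned=$(printf '%s\n' "$sample" | strip_codes)
printf '%s\n' "$cleaned"
# prints: ERROR disk full
```

The same filter can then read the real file, for example `strip_codes < "$LOGFILE" > "$LOGDIR/Temp.txt"`.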

For an expression that matches all possible control sequences, see e.g. https://stackoverflow.com/a/33925425/4228744 (Python).

JigglyNaga
  • The doco of Sindre Sorhus' ansi-regex is spectacularly wrong. It has the date for ECMA-48:1976 wrong by two decades. This made me look hard at the regular expression itself. The doco is spectacularly wrong there, too. Far from covering more than all ECMA-48 control sequences, as it claims, it covers less. There is quite a range of what ECMA-48 §5.4 defines that it won't correctly match. Interestingly, the Stack Overflow answer is pointing out much the same thing about another regular expression with poor coverage. – JdeBP Jun 05 '16 at 09:29
  • agree - there's no point in replicating poor answers. – Thomas Dickey Jun 05 '16 at 09:47
  • I've removed the reference to the misleading one. I considered just pointing to the SO answer, but the Python syntax is sufficiently different to sed that it wasn't really helpful on its own. – JigglyNaga Jun 05 '16 at 10:11
  • You rock Naga....perfect solution..thanks for help – Amit Singh Jun 05 '16 at 17:46