I'm trying to post-process the output of script into a more readable form, similar to "Removing control chars (including console codes / colours) from script output", but I've noticed that col doesn't always work.

For instance,

$ cat -v uncolored 
foo^H^H^Hbfoo^H^H^Hafoo^H^H^Hr^M
$ col -bp < uncolored
baroo

Why doesn't col -bp output just bar? Where are the two extra o's coming from?

Jeffrey

2 Answers

^H in this case is backspace, also known as decimal/hex 8, octal 10, or \b. All it does is move the cursor back one column; it does not erase anything. Take this example:

$ printf 'bravo\10\10X'
braXo

We have moved the cursor back 2, but we only wrote over one letter, the v. We didn't write over the o, so it remains. If you want to get rid of the rest of the letters, you have to overwrite them with something, usually a space character:

$ printf 'bravo\10\10X '
braX
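
The overwrite behavior described above can be sketched with a small awk filter, which also reproduces the baroo from the question. This is only an illustration, not how col is implemented: overstrike is a hypothetical helper name, and the sketch assumes single-byte characters and handles only backspace and carriage return.

```shell
# overstrike: hypothetical helper that replays each input line onto a
# one-line "screen", keeping only the last character written per column.
overstrike() {
  awk '{
    n = 0; col = 0; split("", cell)          # empty screen, cursor at column 0
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c == "\b") {                       # backspace: move left, erase nothing
        if (col > 0) col--
      } else if (c == "\r") {                # carriage return: back to column 0
        col = 0
      } else {                               # printable: overwrite this column
        cell[col] = c
        col++
        if (col > n) n = col
      }
    }
    out = ""
    for (i = 0; i < n; i++) out = out cell[i]
    print out
  }'
}

# The input from the question: each "foo" is only partly overwritten.
printf 'foo\b\b\bbfoo\b\b\bafoo\b\b\br\r\n' | overstrike
# prints: baroo
```

Each foo is written one column further right than the previous one, and only its first letter gets overwritten, so the last two o's are never replaced.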

http://wikipedia.org/wiki/Backspace#%5eH

Jeff Schaller
Zombo
  • Ah, the uncolor script from the other post removes all the ^[[K controls which would've cleared the rest of the line, except col doesn't understand those either. – Jeffrey Jun 11 '18 at 23:48

Here's a hacky workaround:

sed -re ':b; s,[^\x08]\x08,,g; tb'

  • :b: Label b
  • s,[^\x08]\x08,,g: Pair a non-backspace character with a backspace character and remove both
  • tb: If the previous s directive did something, jump back to label b
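
As a rough usage sketch, the same loop can be wrapped in a function and run on the input from the question. strip_backspaces is a hypothetical name, and this variant passes a literal backspace to sed instead of \x08, on the assumption that this avoids depending on the -r and \x08 extensions. Unlike a terminal, it deletes the character before each backspace, so it yields bar rather than baroo.

```shell
# strip_backspaces: hypothetical wrapper around the sed loop from this answer.
strip_backspaces() {
  bs=$(printf '\b')                          # a literal backspace character
  # :b   label b
  # s    delete one non-backspace followed by one backspace, everywhere
  # tb   if anything was deleted, jump back to label b and try again
  sed -e ':b' -e "s/[^$bs]$bs//g" -e 'tb'
}

# The trailing carriage return is removed separately with tr.
printf 'foo\b\b\bbfoo\b\b\bafoo\b\b\br\r\n' | strip_backspaces | tr -d '\r'
# prints: bar
```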
Jeffrey
  • A better workaround is https://github.com/RadixSeven/typescript2txt, which actually emulates a terminal and processes backspaces and ANSI escape sequences. – Jeffrey Jun 13 '18 at 03:11