That goes back to tele-typewriters (ttys!) in the 70s.
Sending X<backspace>X (<backspace>, also written ^H, being the ASCII BS character) to a tele-typewriter causes it to write X, go back one character, and write X again on top of the first. Because the character is struck twice, it appears bold.
Similarly, for underline, you'd send _<backspace>X, which writes X on top of an underscore: X̲.
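As a rough illustration of the two encodings described above, here is a minimal Python sketch that builds the overstrike sequences for a string. The function names are made up for this example; they are not part of roff or man.

```python
BS = "\b"  # ASCII BS (0x08), often displayed as ^H


def overstrike_bold(text: str) -> str:
    """Encode each character as X<BS>X, i.e. strike it twice."""
    return "".join(ch + BS + ch for ch in text)


def overstrike_underline(text: str) -> str:
    """Encode each character as _<BS>X, i.e. an underscore with X on top."""
    return "".join("_" + BS + ch for ch in text)


if __name__ == "__main__":
    # repr() makes the otherwise invisible backspaces visible.
    print(repr(overstrike_bold("ls")))
    print(repr(overstrike_underline("ls")))
```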
roff, the typesetting system used by man, was one of the first things written for Unix in the 70s, as that's how the Unix authors got their funding.
Running man on a tele-typewriter would then send those sequences to produce bold and underlined text.
Tele-typewriters were soon replaced by cathode-ray tube (CRT) terminals. There, the BS character just moves the cursor back one position, and each new character overwrites the one underneath.
So sending X<backspace>X or _<backspace>X there just displays X. CRTs also have limited screen space (as opposed to paper in tele-typewriters), so pagers like more were born.
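To see why the overstrike trick is lost on a CRT, here is a small Python sketch of a single-line screen where BS moves the cursor back and a printed character replaces whatever is in that cell. It is a toy model for illustration only (the name render_on_crt is hypothetical) and ignores everything a real terminal does beyond that.

```python
def render_on_crt(data: str) -> str:
    """Return what one screen line shows after receiving data."""
    cells = []  # characters currently on the line
    col = 0     # cursor position
    for ch in data:
        if ch == "\b":
            # BS only moves the cursor; it erases nothing.
            col = max(col - 1, 0)
        else:
            if col < len(cells):
                cells[col] = ch   # overwrite the character underneath
            else:
                cells.append(ch)  # write into a fresh cell
            col += 1
    return "".join(cells)


if __name__ == "__main__":
    print(render_on_crt("X\bX"))
    print(render_on_crt("_\bX"))
```

Both calls end up with a single plain X on the line: the bold and underline information is simply gone.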
Pagers were enhanced to understand those X<BS>X and _<BS>X sequences and to emit the corresponding escape sequences telling the terminal to display bold or underline.
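The translation a pager performs can be sketched roughly as follows. This is not how more or less are actually implemented; it is a simplified per-character version that assumes an ANSI-style terminal (SGR codes 1 for bold, 4 for underline, 0 for reset) and does not handle combined bold-and-underline sequences. The name to_ansi is made up for this example.

```python
import re

BOLD = "\x1b[1m"
UNDERLINE = "\x1b[4m"
RESET = "\x1b[0m"

# One character, a backspace, then the character printed on top of it.
_OVERSTRIKE = re.compile(r"(.)\x08(.)")


def to_ansi(data: str) -> str:
    """Replace X<BS>X and _<BS>X with ANSI bold/underline escapes."""
    def repl(match):
        first, second = match.group(1), match.group(2)
        if first == "_":
            # _<BS>_ is ambiguous; this sketch treats it as underline.
            return UNDERLINE + second + RESET
        if first == second:
            return BOLD + second + RESET
        # Any other overstrike: keep only the last character written.
        return second
    return _OVERSTRIKE.sub(repl, data)


if __name__ == "__main__":
    print(to_ansi("N\bNA\bAM\bME\bE  _\bl_\bs"))
```

A real pager would typically group runs of characters under one escape sequence rather than resetting after each character; the per-character form just keeps the sketch short.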
Nowadays pagers, including more, less, most, and w3m, still understand those sequences.
And man still uses them to display bold or underline when the output goes to a pager.
When man (at least in some implementations) detects that its output isn't going to a terminal, it doesn't invoke a pager and doesn't use those sequences, which is why you don't see them when you redirect to a file.
If you want to remove those sequences, you can use the col -b command.
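If col isn't available, the same effect on overstrike sequences can be approximated in a few lines of Python. This is only a rough stand-in: col handles more control characters than this, whereas the sketch below just drops every "character followed by backspace" pair so that only the last character written to each position survives. The function name is hypothetical.

```python
import re

# Any single character immediately followed by a backspace.
_CHAR_THEN_BS = re.compile(r".\x08")


def strip_overstrike(data: str) -> str:
    """Remove X<BS> pairs, keeping the last character of each overstrike."""
    return _CHAR_THEN_BS.sub("", data)


if __name__ == "__main__":
    import sys
    # Use as a filter: reads stdin, writes the cleaned text to stdout.
    sys.stdout.write(strip_overstrike(sys.stdin.read()))
```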