77

How can I 'cat' a man page like I would 'cat' a file to get just a dump of the contents?

Caleb
LanceBaynes

7 Answers

111

To get a plain ASCII copy of a man page, without the backspace/underscore sequences used for underlining or the overstrike sequences used for bolding:

man ksh | col -b > ksh.txt
  • 3
    Hi, why does piped man output contain duplicate characters, and how does col -b remove them? Thanks in advance. – saurabheights Apr 20 '17 at 13:04
  • 6
    @saurabheights - man attempts to do underlines and bold text and maybe some other things with backspaces, duplicate characters, escape sequences, etc. Tricks that might work if you print the man output on a dot matrix or other printer, or show it as text on a terminal. I haven't read col source, but it probably just examines stdin byte by byte and doesn't pass backspaces, etc. to stdout. col's man page reads like someone wrote it specifically to filter man output. –  Apr 21 '17 at 13:00
  • 1
    OK, that makes sense. Such (hidden) characters could cause the duplicate characters. Thank you Bruce. – saurabheights Apr 21 '17 at 13:02
  • You really, really deserve more upvotes. Does "col" stand for column? – Wizard Aug 18 '18 at 04:30
  • 6
    As another answer below mentioned, add x to col to remove the space/tab mix in the output: man ksh | col -bx > ksh.txt – friederbluemle Apr 04 '20 at 20:39
  • Why does this work, while man ksh > ksh.txt produces "strange" output? – Leonardo Maffei Oct 27 '21 at 12:44
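
The overstriking described in the comments above can be reproduced without man at all. The following is a minimal sketch, not man's or col's actual implementation: overstruck and strip_overstrike are hypothetical helper names, and the sed filter is only a rough stand-in for what col -b does.

```shell
# Emit text the way nroff-style output encodes emphasis:
# bold "NAME" is each letter, a backspace, then the same letter again;
# underlined "ksh" is an underscore, a backspace, then the letter.
overstruck() {
    printf 'N\bNA\bAM\bME\bE _\bk_\bs_\bh\n'
}

# Rough stand-in for col -b: delete every "character + backspace" pair,
# so only the last character written to each column survives.
strip_overstrike() {
    bs=$(printf '\b')
    sed "s/.${bs}//g"
}

overstruck | strip_overstrike
```

Redirected to a file, the raw output of overstruck looks like the "duplicate characters" asked about above; after filtering, only NAME ksh remains. If col is installed, overstruck | col -b should give the same result.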
56

First of all, man pages are usually just gzipped text files stored somewhere in your file system. Since finding them varies from system to system, and you probably want the processed and formatted version that man gives you rather than the source, you can just dump them with the man tool. Looking at man man, I see that you can change the program used to view man pages with the -P flag, like this:

man -P cat command_name

It's also worth noting that man automatically detects when you pipe its output instead of viewing it on the screen, so if you are going to process it with something else you can skip straight to that step like so:

man command_name | grep search_string

or, to dump to a file:

man command_name > formatted_man_page.txt
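
The pipe detection mentioned above is not unique to man: any program can check whether its standard output is a terminal. A minimal sketch of the idea in shell, assuming a hypothetical helper named show_page; this illustrates the general technique and is not how man itself is implemented.

```shell
# Hypothetical helper: page the input when writing to a terminal,
# pass it through untouched when writing to a pipe or a file.
show_page() {
    if [ -t 1 ]; then
        # stdout is a terminal: hand the text to a pager
        ${PAGER:-more}
    else
        # stdout is a pipe or file: emit the text unchanged
        cat
    fi
}

# Here stdout of show_page is a pipe, so the cat branch runs.
printf 'formatted text\n' | show_page | grep formatted
```

Run directly in a terminal, printf 'formatted text\n' | show_page would start the pager instead.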
Caleb
  • 4
    Using -P doesn't make the output file neatly readable. It's littered with ctrl-H characters. I used to do man cmd >! man.cmd, open man.cmd, and do '%s/^H.//g' to remove the annoying control characters that represent bold and italics. But this still has problems when there are other special characters. I'm still looking for a good method that avoids manually editing the output. – Chan Kim Apr 25 '16 at 08:34
  • @ChanKim You're doing something wrong or have some non-standard configuration getting in your way, because both of the methods here do in fact produce clean output formatted in plain text with no extra control characters. Are you sure you don't have man aliased to something, or flags forced on in your shell, that are separating you from the normal function of man? – Caleb Apr 25 '16 at 08:39
  • 1
    @Caleb, I confirm OP's problem. CentOS release 6.7 (Final), /usr/bin/man gcc >j, edit 'j', all of the ctrl-H's are in there. Best answer I've found is at http://www.commandlinefu.com/commands/view/2417/convert-man-page-to-text-file – Charles Roth May 19 '16 at 15:42
  • 2
    man command_name > formatted_man_page.txt will cause some words to be duplicated. – Zigii Wong Aug 21 '18 at 08:20
  • man {whatever} | col -b > {whatever}.txt will remove the backspaces – Chris Davies Jun 01 '22 at 22:18
33

Man pages are usually troff source files, and you can render one to plain text with:

groff -t -e -mandoc -Tascii manpage.1 | col -bx > manpage.txt

groff is a wrapper for troff.

More information here.

You might need to use gzip to uncompress the man page files first, and you'll still have plenty of formatting information in the output.
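
The decompress-then-render steps above could be wrapped roughly as follows. This is a sketch under assumptions: page_source and render_page are hypothetical names, and render_page needs groff and col to be installed.

```shell
# Print the roff source of a man page file, decompressing it if needed.
page_source() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac
}

# Render a page file to plain text using the groff pipeline from the answer.
# groff reads standard input when no file name is given.
render_page() {
    page_source "$1" | groff -t -e -mandoc -Tascii | col -bx
}
```

On systems where man supports the -w option to print a page's location, something like render_page "$(man -w ksh)" > ksh.txt should then work.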

EightBitTony
19

I do this all the time. This command line makes me happy:

man man | col -bx > man.txt

col -b removes backspaces.

col -bx also replaces tabs with spaces, which is my strong preference.

If I want the text to be formatted to a width of my preference while reading, then I change the command to this:

MANWIDTH=10000 man man | col -bx > man.txt
sotosoc
6

Just use the man command - you can pipe the output into other things just as you can with cat for a file.

TomH
4

If you just want to cat a manpage, you can simply pipe it to cat:

man ls | cat

If you want to dump its content to a file:

man ls > ls_manpage_dump.txt
Sheharyar
0

A possible helper might look as follows:

#!/usr/bin/env sh

{
    # We can safely export the following variables unless we source this file
    export TERM=dumb
    export MANPAGER=cat
    export MANWIDTH=100

# Here is how it works:
#
# 1. 'col -b' removes backspaces, 'col -x' replaces tabs with spaces
# 2. Keep the first line containing USAGE and up to 100 lines after it
# 3. Drop two lines from the bottom
man "$1"                  \
    | col -bx             \
    | grep -A 100 USAGE   \
    | sed '$d' | sed '$d'

} > "$2"

Usage:

$ mandump ksh ksh.txt
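
The trimming stage of that pipeline can be tried on its own with a made-up page. This sketch uses a hypothetical trim_page function and swaps grep -A 100 for a sed range, which is an assumption on my part rather than the answer's method: grep -A 100 stops 100 lines after each match, while the sed range runs to the end of the input.

```shell
# Keep everything from the first line matching USAGE to the end,
# then drop the last two lines.
trim_page() {
    sed -n '/USAGE/,$p' | sed '$d' | sed '$d'
}

# A five-line stand-in for man output.
printf 'header\nUSAGE\n  foo bar\nblank\nfooter\n' | trim_page
```

Only the USAGE line and the indented line after it should remain.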
serghei