
I'm not sure whether something has changed on my system, but filtering Unix manual pages through grep no longer works. Do you know what's wrong?

For example, consider the jq manual page, which is rendered from the following roff source.

.
.IP "\(bu" 4
\fB\-\-slurp\fR/\fB\-s\fR:
.
.IP
Instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once\.
.

If I want to quickly look up the --slurp switch, I usually filter the jq manual page through grep, as the following terminal session shows.

$ man jq | grep -- --slurp
$

Notice that nothing is returned. That's odd; I expected something like the following.

$ man jq | grep -- --slurp
       •   --slurp/-s:

I expect that output because if I run man jq and search for --slurp with the / keystroke, the search works.

# ...
       •   --slurp/-s:
       Instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once.

I'm guessing there is a problem with special characters, but I'm not sure. I tried piping the manual page through the cat command, but that didn't work either. I also tried removing all special characters (see How to remove all special characters in Linux text), but that didn't work either.

$ man jq | cat | grep -- --slurp
$
$ man jq | sed $'s/[^[:print:]\t]//g' | grep -- --slurp
$

In case this is relevant, please see the following details about my system.

$ neofetch --off
OS: macOS 13.0.1 22A400 x86_64
Host: MacBookPro16,1
Kernel: 22.1.0
Uptime: 6 days, 1 hour, 3 mins
Packages: 145 (brew)
Shell: bash 3.2.57
Resolution: 1792x1120@2x
DE: Aqua
WM: Quartz Compositor
WM Theme: Blue (Light)
Terminal: iTerm2
Terminal Font: Monaco 12
CPU: Intel i7-9750H (12) @ 2.60GHz
GPU: Intel UHD Graphics 630, AMD Radeon Pro 5300M
Memory: 10425MiB / 16384MiB
mbigras

1 Answer


I saw the same thing as you on my Mac laptop. I used man -P cat to set the pager to cat rather than setting the PAGER or MANPAGER environment variables. As I looked through the man page for man, I found this near the end:

To get a plain text version of a man page, without backspaces and underscores, try
     # man foo | col -b > foo.mantxt

So I tried man jq | col -b | grep slurp and I got the lines with slurp in them, including the --slurp argument lines.

I think the best clue is the phrase "without backspaces and underscores" in the man page. Sure enough, if I allow for two characters between each letter of slurp, I see the lines I expect:

man -P cat jq | grep s..l..u..r..p

Using od -a to examine one of the lines with --slurp gives me this:

0000420   sp   U   s   e  sp   -  bs   -   -  bs   -   s  bs   s   l  bs
0000440    l   u  bs   u   r  bs   r   p  bs   p   f  bs   f   i  bs   i
0000460    l  bs   l   e  bs   e  sp   i   n   s   t   e   a   d   .  nl

This is the portion of a line that says "Use --slurpfile instead." The renderer apparently represents bold text by printing each character, then a backspace, then the same character again, so the string --slurpfile is stored with a backspace and a duplicate after every character. Those backspaces and extra letters are what defeated your simple grep commands. (Mine too.)
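To make the effect easier to see without a man page at hand, here is a small sketch that builds an overstruck string by hand. The sample line and the overstruck function are made up for illustration; only the "character, backspace, character" pattern is taken from the od output above. It also suggests why stripping non-printing characters, as in the question, did not help: removing only the backspaces leaves every letter doubled.

```shell
# Sample line (made up for illustration): "--slurp" in overstruck bold,
# i.e. each character is sent as: char, backspace, char.
overstruck() { printf '%b\n' '-\b--\b-s\bsl\blu\bur\brp\bp'; }

bs=$(printf '\b')   # a literal backspace character

# 1. A plain search fails because backspaces sit between the letters.
overstruck | grep -q -e '--slurp' || echo "plain grep: no match"

# 2. Deleting only the backspaces leaves every letter doubled.
overstruck | tr -d '\b'        # prints ----sslluurrpp

# 3. Deleting each "character + backspace" pair restores the text.
overstruck | sed "s/.$bs//g"   # prints --slurp
```

The sed form in step 3 is only an approximation of what col -b does, and it assumes single-byte characters are the ones being overstruck.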

So it looks like piping the man page through col -b before grep will let you search the output the way you want.
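If you do this often, the pipeline could be wrapped in a pair of shell functions. This is a minimal sketch: the names debold and mangrep are hypothetical, and the sed fallback is my own assumption for systems where col is not installed.

```shell
# Hypothetical helper: strip overstrike sequences from stdin.
debold() {
  if command -v col >/dev/null 2>&1; then
    col -b
  else
    # Fallback (assumption): delete each "character + backspace" pair.
    sed "s/.$(printf '\b')//g"
  fi
}

# Hypothetical wrapper. Usage: mangrep PAGE PATTERN
mangrep() {
  # -e keeps grep from treating a pattern such as --slurp as an option.
  man "$1" 2>/dev/null | debold | grep -e "$2"
}
```

With these defined, mangrep jq --slurp should print the lines that mention the switch, without needing the extra -- separator.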

Sotto Voce