
Why is pgrep needed? If we can use ps and grep together, why do we need pgrep? It would be weird to have an lsgrep or curlgrep command.

But one difference I noticed was, if we first start a tmux session with

tmux new -s foo

then

ps aux | grep tmux

won't be able to find the tmux server process, but

pgrep -l tmux

can. But still, why don't we have a flag with ps so that we can grep like pgrep does to be able to see the tmux server process? What are the differences between ps with grep and pgrep?

Kusalananda
  • 333,661
nonopolarity
  • 3,069
  • I've already told you why ps aux | grep won't find your process. You have to adjust ps's output format with -o ....comm,... if you want it to show the process name. –  Apr 08 '20 at 16:56
  • And no, you can't use ps | grep together because the output format of ps is NOT standardized. In addition to the fact that the grep may find itself. It's the same thing as with ls | grep: you don't use ls | grep, you use find (call it lsgrep if you want). –  Apr 08 '20 at 16:59
  • how can you make ps -o show the tmux server? How about curlgrep? In a way ls is more like showing directory content and find is to find something deep down (and report the exact path)... – nonopolarity Apr 08 '20 at 17:13
  • ps -eo pid,comm | grep 'tmux.*server' –  Apr 08 '20 at 17:14
  • thank you! I found ps -e | grep tmux works well too – nonopolarity Apr 08 '20 at 17:16

1 Answer

The ps command has two fields that, generally, one searches in this way, the args and the comm. The first is the program argument string, NUL-delimited. The second is a "name" for the program. These are stored separately, and (on various operating systems) can both be altered by the program itself, at runtime. Programs such as tmux do indeed do that.

The output of ps is not machine parseable. Several fields can contain unencoded whitespace which makes it impossible to determine field boundaries reliably, because arbitrary length whitespace is also the field separator. args and comm are indeed two such fields. The output of ps is only human readable.

When you grep the output of ps you therefore are pattern matching entire lines, with no reliable way to anchor that pattern to the specific field concerned, except by eliminating pretty much everything else that is of any use, and that you might be trying to find by this method in the first place.

For example:

% ps -a -x -e -o sid,comm,args |
  grep dbus-daemon |
  head -n 4
   25 nosh                cyclog dbus-daemon/ (nosh)
   25 dbus-daemon         dbus-daemon --config-file ./system-wide.conf --nofork --address=unix:path=/run/dbus/system_bus_socket
  989 dbus-daemon         dbus-daemon --config-file ./per-user.conf --nofork --address=unix:path=/run/user/JdeBP/bus
15107 grep                grep dbus-daemon
% 
% clearenv --keep-path \
  setenv WIBBLE tmux \
  ps -a -x -e -o sid,comm,command |
  grep tmux
15107 ps                  PATH=/usr/local/bin:/usr/bin:/bin WIBBLE=tmux ps -a -x -e -o sid,pid,comm,command
%
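The same ambiguity can be reproduced without a live process table. The following is only a sketch: the two sample lines are made up for illustration (they are not captured ps output), and the function names are hypothetical. It shows that a whitespace split truncates a comm that contains a space, and that a line-oriented grep cannot tell a match in comm from a match in args.

```shell
#!/bin/sh
# Two made-up lines in the style of "ps -o pid,comm,args" (illustrative
# data, not real output). In the first, comm itself contains a space.
sample_ps_output() {
    printf '%s\n' \
        '  101 tmux: server     tmux new -s foo' \
        '  102 vim              vim tmux: server notes.txt'
}

# Naive field split: awk treats any run of blanks as a separator, so $2
# is only the first word of comm whenever comm contains a space.
naive_comm() {
    sample_ps_output | awk '{ print $2 }'
}

# Line-oriented match: counts lines containing the text anywhere, so it
# cannot distinguish a hit in comm from a hit in args.
naive_grep() {
    sample_ps_output | grep -c 'tmux: server'
}

naive_comm    # prints "tmux:" then "vim"
naive_grep    # prints 2, although only one process is named "tmux: server"
```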

Put another way: grep is for operating upon text files comprising lines. The process table is not a text file, and treating it as if it were a text file (by translating it with the ps command) loses information about fields.

The way to perform such a search is to look at the process table with something other than ps. On Linux, one can look directly at /proc/${PID}/comm and the similar pseudo-files for the argument strings, environment strings, and so forth.
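As a rough illustration of reading those pseudo-files, here is a minimal sketch for Linux. The function name show_proc and its optional second parameter (an alternative root directory, so that it can be tried on a fake tree) are assumptions made for this example, not part of any existing tool.

```shell
#!/bin/sh
# show_proc PID [PROCROOT]
# Print the comm and the argument string of one process on labelled
# lines. PROCROOT is a hypothetical parameter that defaults to /proc.
show_proc() {
    pid=$1
    procroot=${2:-/proc}
    # comm is a single line holding the process "name".
    printf 'comm: %s\n' "$(cat "$procroot/$pid/comm")"
    # cmdline holds the arguments terminated by NUL bytes. They are
    # turned into spaces here for display only; that display form has
    # the same ambiguity as ps output.
    printf 'args: %s\n' \
        "$(tr '\000' ' ' < "$procroot/$pid/cmdline" | sed 's/ $//')"
}
```

On a Linux system, show_proc "$$" would show the two fields for the current shell. Because the fields come from separate files, a pattern can be applied to exactly one of them.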

Or one can write a tool that fishes out the specific data to be matched from the process table, and that runs pattern matching on just that field alone. This tool is not for text files, but is for process tables. One can call it pgrep.
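To make the idea concrete, here is a toy version of such a tool as a shell function. It is a sketch under stated assumptions, not how pgrep is actually implemented: it assumes a Linux-style /proc, the name match_comm and its optional root-directory parameter are hypothetical, and unlike the real pgrep it makes no attempt to exclude its own helper processes.

```shell
#!/bin/sh
# match_comm PATTERN [PROCROOT]
# Print "PID NAME" for every process whose comm matches PATTERN (an
# extended regular expression). Only the comm field is ever matched.
# PROCROOT is a hypothetical parameter that defaults to /proc.
match_comm() {
    pattern=$1
    procroot=${2:-/proc}
    for dir in "$procroot"/[0-9]*; do
        # Processes can exit while we iterate; skip vanished entries.
        [ -r "$dir/comm" ] || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        # The pattern sees the name alone, so it can be anchored.
        if printf '%s\n' "$name" | grep -Eq -- "$pattern"; then
            printf '%s %s\n' "${dir##*/}" "$name"
        fi
    done
}
```

Because the pattern is applied to the name alone, an anchored expression such as '^tmux' means what it says, which is not possible when matching whole lines of ps output.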

Of course, on the gripping hand one could write a ps whose output one can process with (say) awk, because it is machine readable, encoding whitespace with vis() and thus providing fields that awk can properly recognize. The downside is that then it is less human-readable and not quite what a conformant ps is supposed to be. I pass its output through console-flat-table-viewer to read it. ☺

% system-control ps -p 740 -o sid,comm,args
SID COMMAND COMMAND
25  dbus-daemon dbus-daemon\040--config-file\040./system-wide.conf\040--nofork\040'--address=unix:path=/run/dbus/system_bus_socket'
% 
% system-control ps -A -o sid,comm,args,envs,tree |
  awk '{ if ("dbus-daemon"==$2) print $3; }'
dbus-daemon\040--config-file\040./system-wide.conf\040--nofork\040'--address=unix:path=/run/dbus/system_bus_socket'
dbus-daemon\040--config-file\040./per-user.conf\040--nofork\040'--address=unix:path=/run/user/JdeBP/bus'
/usr/local/bin/dbus-daemon\040--fork\040--print-pid\0405\040--print-address\0407\040--session
% 
% system-control ps -A -o sid,comm,args,envs,tree |
  awk '{ if ("dbus-daemon"==$2) print $3; }' |
  unvis
dbus-daemon --config-file ./system-wide.conf --nofork '--address=unix:path=/run/dbus/system_bus_socket'
dbus-daemon --config-file ./per-user.conf --nofork '--address=unix:path=/run/user/JdeBP/bus'
/usr/local/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
%
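The encoding idea behind that output can be sketched in plain shell. This is a simplified stand-in and not the real vis() encoding: it escapes only backslashes and spaces, and the function names are hypothetical. It shows why escaping whitespace inside fields lets awk recover field boundaries, and how a field is decoded afterwards.

```shell
#!/bin/sh
# encode_field STRING: escape backslashes, then spaces (as \040), so
# that the result contains no blanks. Simplified; tabs and newlines are
# not handled.
encode_field() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/ /\\040/g'
}

# decode_field STRING: reverse encode_field using printf's %b, which
# expands \\ and \0ddd octal escapes.
decode_field() {
    printf '%b' "$1"
}

# row PID COMM ARGS: emit one machine-readable table row.
row() {
    printf '%s %s %s\n' "$1" "$(encode_field "$2")" "$(encode_field "$3")"
}

# awk now sees exactly three fields, whatever the names contain.
row 101 'tmux: server' 'tmux new -s foo' | awk '{ print $2 }'
# prints: tmux:\040server
```

Passing that second field to decode_field gives back "tmux: server".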

Further reading

JdeBP
  • 68,745
  • There are no system-control and clearenv commands on any of my systems, what are those? Of course the argv+envp block is simple memory that the process can write to (I can run a program like int main(int ac, char **av){ memcpy(av[0], "FOO=bar", 8); putenv(av[0]); pause(); } and have ps show its "environment"), but it needs more explanation why this is poignant and how it could happen with regular programs. –  Apr 08 '20 at 22:30
  • It's not that that is poignant. There's an explanation that tmux is one such program in the first paragraph. And you need to read the further reading, although I didn't originally put the manual page for clearenv there as it is fairly tangential. – JdeBP Apr 09 '20 at 08:14