How to grep a specific line _and_ the first line of a file?

Question

Assuming a simple grep such as:

$ ps aux | grep someApp
1000     11634 51.2  0.1  32824  9112 pts/1    SN+  13:24   7:49 someApp

This provides much information, but as the first line of the ps command is missing there is no context for the info. I would prefer that the first line of ps be shown as well:

$ ps aux | someMagic someApp
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1000     11634 51.2  0.1  32824  9112 pts/1    SN+  13:24   7:49 someApp

Of course, I could add a regex to grep specifically for ps:

$ ps aux | grep -E "COMMAND|someApp"

However, I would prefer a more general solution as there are other cases in which I would like to have the first line as well.

Seems like this would be a good use case for a "stdmeta" file descriptor.

The complexity required by these answers shows how the Unix philosophy of "do one thing and do it well" sometimes fails us when measured by the yard stick of usability: knowing all these commands well enough to apply them to this common problem (filtering process info and still seeing the column labels) shows the downside of the approach: sometimes things don't fit together very cleanly. This is why tools like ack are so useful, and why perl rocketed past sed,awk, etc. in popularity: it's important for the parts to sum up into a coherent whole. — iconoclast, Sep 12 '12 at 19:56
of course, for this particular example, you could use the -C argument to ps and you wouldn't need to pipe it into grep. e.g. ps u -C someApp or even ps u -C app1 -C app2 -C app3 — cas, Sep 13 '12 at 06:06
@iconoclast: of course the Unixy solution would be a tool that can multiplex multiple lines each to be filtered through different set of filters. Kinda a generalized version of ps aux | { head -1; grep foo; } mentioned by @Nahuel Fouilleul below (his is probably the only solution that I'd be able to recall on the spot if needed) — Lie Ryan, Sep 13 '12 at 17:13
@iconoclast: Lacking experience with, and knowledge of the tools, what the tools really do well will always seem entirely useless. Knowing a command well is no where on the yard stick of usability, it's on the yard stick of read the fine manual and practice. These tools have been around for decades. They work and fit together very nicely (and cleanly). — Ярослав Рахматуллин, Sep 15 '12 at 11:19
@ЯрославРахматуллин: I think you may have completely misunderstood what I said. (Perhaps because English is not your first language?) "Usability" is related to UX ("user experience") not utility (or "usefulness"). Pointing out that when a simple operation is this complex it hurts usability is *NOT* the same as saying the tools are useless. Quite obviously they are not useless. No one in their right mind would say they are useless. — iconoclast, Sep 17 '12 at 02:47
Thank you for clarifying. I'll do the same. The command line interface is not particularly user friendly. It's not supposed to be. If you (read: anyone) want it to be useful or usable or "user experience" it, you have to know the tools in the shell and their capabilities, which arguments to give them and in which syntax to give them. The philosophy of "do one thing and do it well" is solid, it's not supposed to be user friendly. The fact that it fails "on the yard stick of usability" is irrelevant. — Ярослав Рахматуллин, Sep 19 '12 at 06:59

Krzysztof Adamski · Accepted Answer · 2015-01-12T18:58:08.013

Good way

Normally you can't do this with grep but you can use other tools. AWK was already mentioned but you can also use sed, like this:

sed -e '1p' -e '/yourpattern/!d'

How it works:

The sed utility works on each line individually, running the specified commands on each of them. You can supply multiple commands by giving several -e options. Each command can be prefixed with an address (a line number, a range, or a pattern) that determines which lines the command applies to.
"1p" is the first command. It uses the p command, which normally prints every line. Here it is prefixed with a numerical address that restricts it to the first line. If you want to print more lines, you can use x,yp where x is the first line to print and y is the last. For example, to print the first 3 lines, you would use 1,3p.
The next command is d, which normally deletes every line from the buffer. Before this command we put yourpattern between two / characters. This is the other way of addressing lines (the first was by line number, as we did with the p command). It means the command only applies to lines that match yourpattern. However, we also put a ! character before the d command, which inverts the address. So now it removes all the lines that do not match the specified pattern.
At the end of each cycle, sed prints whatever is left in the buffer. Since we removed the lines that do not match, only matching lines are printed.

To sum up: we print 1st line, then we delete all the lines that do not match our pattern from input. Rest of the lines are printed (so only lines that do match the pattern).
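The steps above can be tried without ps by feeding sed a small fixed input. This is only a sketch: the sample function, its made-up process list, and the pattern someApp are assumptions for illustration.

```shell
# Hypothetical sample data standing in for `ps aux` output.
sample() {
    printf '%s\n' \
        'USER PID COMMAND' \
        'root 1 init' \
        'dotan 42 someApp' \
        'root 77 syslogd'
}

# 1p prints the header line explicitly;
# /someApp/!d deletes every line that does not match,
# so only matching lines reach sed's automatic print.
sample | sed -e '1p' -e '/someApp/!d'
```

With this input, the header and the single someApp line should be the only output.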

First line problem

As mentioned in the comments, there is a problem with this approach. If the specified pattern also matches the first line, that line will be printed twice (once by the p command and once because of the match). We can avoid this in two ways:

Adding a 1d command after 1p. As I already mentioned, the d command deletes lines from the buffer, and we restrict it with the address 1, which means it will only delete the 1st line. So the command would be sed -e '1p' -e '1d' -e '/yourpattern/!d'
Using the 1b command instead of 1p. It's a trick. The b command allows us to jump to another command specified by a label (this way some commands can be skipped). But if no label is given (as in our example), it just jumps to the end of the commands, ignoring the rest of the commands for that line. So in our case, the last d command won't remove this line from the buffer.

Full example:

ps aux | sed -e '1b' -e '/syslog/!d'
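To see the difference between 1p and 1b, here is a small sketch that uses a pattern which deliberately matches the header. The sample function and its data are made up for illustration.

```shell
# Hypothetical sample data; the header contains the word COMMAND.
sample() {
    printf '%s\n' 'PID COMMAND' '1 init' '42 someApp'
}

# With 1p the header is printed by p and then again by the automatic
# print, because it also matches the pattern.
sample | sed -e '1p' -e '/COMMAND/!d'

# With 1b, line 1 branches to the end of the script and is printed once.
sample | sed -e '1b' -e '/COMMAND/!d'
```

The first command should print the header twice, the second one only once.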

Using semicolon

Some sed implementations can save you some typing by using semicolon to separate commands instead of using multiple -e options. So if you don't care about being portable the command would be ps aux | sed '1b;/syslog/!d'. It works at least in GNU sed and busybox implementations.

Crazy way

There is, however, a rather crazy way to do this with grep. It's definitely not optimal and I'm posting it just for learning purposes, but you may use it, for example, if you don't have any other tool on your system:

ps aux | grep -n '.*' | grep -e '\(^1:\)\|syslog'

How it works

First, we use the -n option to add line numbers before each line. We want to number all the lines, so we match .* - anything, even an empty line. As suggested in the comments, we can also match '^'; the result is the same.
Then we use the \| operator, which works as OR (it is a GNU extension to basic regular expressions, so no extended-regex option is involved here). So we match if the line starts with 1: (the first line) or contains our pattern (in this case it's syslog).

Line numbers problem

Now the problem is that we are getting these ugly line numbers in our output. If this is a problem, we can remove them with cut, like this:

ps aux | grep -n '.*' | grep -e '\(^1:\)\|syslog' | cut -d ':' -f2-

The -d option specifies the delimiter, and -f specifies the fields (or columns) we want to print. So we cut each line on every : character and print only the 2nd and all subsequent columns. This effectively removes the first column together with its delimiter, and this is exactly what we need.
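Here is a sketch of the same pipeline on fixed input. Note that it swaps the \| alternation for two -e patterns, which grep also treats as OR and which should not depend on the GNU extension. The sample function and its data are made up for illustration.

```shell
# Hypothetical sample data standing in for `ps aux` output.
sample() {
    printf '%s\n' 'PID COMMAND' '1 init' '19 syslogd' '12 cron'
}

# 1. grep -n '^' matches every line and prefixes it with "N:".
# 2. The second grep keeps line 1 ("1:" prefix) or lines containing syslog.
# 3. cut drops the first :-separated field, i.e. the line number.
sample | grep -n '^' | grep -e '^1:' -e 'syslog' | cut -d ':' -f2-
```

Because cut keeps everything from the second field on, colons inside the original line (such as those in a TIME column) are preserved.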

Line numbering can be done with cat -n as well and would look clearer as with a grep abused for this. — Alfe, Sep 12 '12 at 13:34
Thanks! As a VIM user I love the "Good way". The crazy way is quite the learning experience! — dotancohen, Sep 12 '12 at 14:34
@Alfe: I think nl is much nicer than cat -n for numbering lines. — Nabb, Sep 12 '12 at 14:45
nl does not count empty lines (but prints them without line number), cat -n formats the numbering with preceding spaces, grep -n . strips empty lines at all and adds a colon. All have their ... er ... features ;-) — Alfe, Sep 12 '12 at 15:19
cut -n is a good tip but I wanted to prove that it's possible (ab)using only grep :) — Krzysztof Adamski, Sep 12 '12 at 16:12
Very educational well-written answer. I tried to replace "Pretend" (Near the beginning) with "Prepend" for you but it wanted more changes and I didn't feel like changing random crap in your post, so you might want to fix that. — Bill K, Sep 12 '12 at 16:57
ps aux | sed '1p;/pattern/!d' will print the first line twice if it matches pattern. Best is to use the b command: ps aux | sed -e 1b -e '/pattern/!d'. cat -n is not POSIX. grep -n '^' would number every line (not an issue for ps output which doesn't have empty lines). nl -ba -d $'\n' numbers every line. — Stéphane Chazelas, Sep 13 '12 at 09:46
It is also possible to use '.*' to match all lines with grep. I've mentioned your method in my answer too. Thanks for this. And yes, it will print first line twice if it matches. Another solution is to add 1d after 1p command. Your suggestion is a nicer trick however. — Krzysztof Adamski, Sep 13 '12 at 09:54
There were lots of great, informative answers here. I choose this answer as the selected answer as it helped me learn the most. Thank you! — dotancohen, Sep 13 '12 at 10:08
Note that 1b;... is not portable nor POSIX, there can't be any other command after "b", so you need a newline or another -e expression. — Stéphane Chazelas, Sep 18 '12 at 21:03
@sch: I couldn't find information about this in gnu sed manual. ps aux | sed --posix '1b;/syslog/!d' works as expected. Do you mean that I can't use semicolon to separate commands in sed at all if I want to be portable or it's somehow related only to b command? — Krzysztof Adamski, Sep 19 '12 at 05:47
The --posix option tells "sed" to be POSIX conformant, not that it would stop working if you give it arguments not conforming to the POSIX specification. The POSIX and Unix specifications are now merged and the latest version is at http://pubs.opengroup.org/onlinepubs/9699919799/, see http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html for sed. — Stéphane Chazelas, Sep 19 '12 at 06:44
Adding a function in your shell configuration makes this handily available. Eg (on OSX): function psgrep(){ ps -ef | sed -e '1p' -e "/$1/!d" } — Nathan Long, Oct 23 '14 at 14:57
Your grep -n .* in first example of crazy way will expand to grep -n . .. .file1 .file2 .otherfiles because you've left .* unquoted. — Ruslan, Jan 12 '15 at 10:39
You're right, thank you. How was such an error not spotted until now? :) — Krzysztof Adamski, Jan 12 '15 at 19:00
You can do this portably w/ line numbers like sed -e 1b -e '/pattern/!d;=' | paste -sd\\n: -. — mikeserv, Jan 12 '15 at 19:37
grep -n '.*' can be simplified to grep -n '' or are there some weird grep implementations in which empty pattern does not match anything? — pabouk - Ukraine stay strong, May 06 '22 at 21:32

score 60 · Answer 2 · answered Sep 12 '12 at 11:08

How do you feel about using awk instead of grep?

chopper:~> ps aux | awk 'NR == 1 || /syslogd/'
USER              PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
root               19   0.0  0.0  2518684   1160   ??  Ss   26Aug12   1:00.22 /usr/sbin/syslogd
mrb               574   0.0  0.0  2432852    696 s006  R+    8:04am   0:00.00 awk NR == 1 || /syslogd/

NR == 1: Number of record == 1; ie. the first line
||: or:
/syslogd/: Pattern to search for
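If you want the pattern as an argument rather than hard-coded in the awk program, something along these lines might work. The function name hgrep and the sample data are assumptions for illustration, not part of awk.

```shell
# hgrep PATTERN: print the first line of stdin plus every line that
# matches PATTERN (an awk regular expression). Hypothetical helper.
hgrep() {
    # -v passes the shell argument into awk as the variable pat;
    # $0 ~ pat is a dynamic regex match against the whole line.
    awk -v pat="$1" 'NR == 1 || $0 ~ pat'
}

printf '%s\n' 'PID COMMAND' '1 init' '19 syslogd' | hgrep syslogd
```

One caveat: awk processes backslash escapes in -v assignments, so patterns containing backslashes may need extra escaping.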

It might also be worth looking at pgrep, although this is more for scripts than for user-facing output. It does keep the grep command itself from appearing in the output, though.

chopper:~> pgrep -l syslogd
19 syslogd

Very nice, thanks. This is also nicely scriptable for future expansion. — dotancohen, Sep 12 '12 at 14:30

score 33 · Answer 3 · edited May 06 '22 at 07:19

ps aux | { IFS= read -r line; printf '%s\n' "$line";grep someApp;}

With some head implementations such as the builtin head of ksh93 (enabled with builtin head, though beware not all builds of ksh93 include it):

ps aux | { head -n1;grep someApp;}

With most head implementations however, that doesn't work when the input is not seekable (such as the pipe it is here) as they read their input by block.

With:

{ head -1;grep ok;} <<END
this is a test
this line should be ok
not this one
END

with most head implementations, you get the following output only in shells that implement here documents with temporary files (which are seekable) instead of pipes:

this is a test
this line should be ok

The line command, where available (it used to be a standard command, but was removed from the standard on the ground that the functionality was available via IFS= read -r) would work for that as it's guaranteed not to read more than one line of input.

With zsh, you can also use IFS= read -re (-e for echo, not to be confused with bash's -e for edit). It's also the only shell whose read won't choke on NUL bytes.

answered Sep 12 '12 at 12:46 by Nahuel Fouilleul; edited May 06 '22 at 07:19 by Stéphane Chazelas

That's the idea spelled out directly in bash. I'd like to give more than one thumbs-up for this. I'd just maybe use { IFS='' read line; ... } in case the header starts with spaces. – Alfe Sep 12 '12 at 13:31
This does exactly attack the problem directly. Nice! – dotancohen Sep 12 '12 at 14:36
I'd just use head -1 instead of the read/echo combo. – chepner Sep 12 '12 at 16:07
@chepner head -1 will not work here because it reads all the input and the grep will find nothing – Nahuel Fouilleul Sep 13 '12 at 07:37
Well, it works with head -n1 on my bash. This can probably be implementation specific. My head is not reading whole input in this case, only first line, leaving rest of them in the input buffer. – Krzysztof Adamski Sep 13 '12 at 10:11
head -n1 is shorter, but it appears even the POSIX spec is silent as to how much of its input it is allowed to read, so perhaps read line; echo $line is more portable after all. – chepner Sep 13 '12 at 11:59
Interestingly, for ls -la the ls -la | { read line;echo "$line";grep someApp;} command works, but not ls -la | { head -1;grep someApp;}. – dotancohen Oct 21 '13 at 07:47
@chepner, POSIX guarantees head -n1 reads only one line of input (well leaves the input position just after the line it has read) if its input is seekable, not for pipes. – Stéphane Chazelas May 06 '22 at 07:05

score 14 · Answer 4 · edited May 06 '22 at 07:08

The HP/UX and procps implementations of ps support built-in filtering with the -C option.

Suppose you're looking for bash processes:

ps -fC bash

Will list all processes whose name is bash.

answered Sep 12 '12 at 11:14 by daisy; edited May 06 '22 at 07:08 by Stéphane Chazelas

Thanks, that is nice to know. However, it won't find scripts started from python, among others. – dotancohen Sep 12 '12 at 14:32

score 8 · Answer 5 · answered Sep 13 '12 at 05:21

I tend to send the header to stderr:

ps | (IFS= read -r HEADER; echo "$HEADER" >&2; cat) | grep ps

This is usually sufficient for human reading purposes. e.g.:

  PID TTY          TIME CMD
 4738 pts/0    00:00:00 ps

The bracketed part could go into its own script for general use.

There's an added convenience in that the output can be further piped (to sort etc.) and the header will remain on top.
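As a sketch of how the bracketed part might be packaged for reuse, the function below sends the first line to stderr and passes the rest through. The name keep_header and the sample data are assumptions for illustration.

```shell
# keep_header: write the first line of stdin to stderr, copy the rest
# to stdout unchanged. Hypothetical helper name.
keep_header() {
    IFS= read -r header && printf '%s\n' "$header" >&2
    cat
}

# The header bypasses sort because it travels on stderr,
# so only the body lines are sorted.
printf '%s\n' 'PID CMD' '30 zsh' '10 bash' '20 ps' | keep_header | sort -n
```

On a terminal the header normally shows up first, since sort cannot print anything until it has read all of its input.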

answered Sep 13 '12 at 05:21 by antak

Does that descriptor exist? It looks like the page you're linking is a proposal, but your comment, worded the way it is, could confuse people who're genuinely looking for a usable answer to this (which happens to also be yours) How to grep a specific line and the first line of a file? question. It certainly confused me. – antak Jul 25 '21 at 02:22
No, the stdmeta descriptor does not exist currently, but I think that it would be a very helpful addition. Excellent answers that [re|ab]use stderr, such as this answer, demonstrate why. – dotancohen Jul 25 '21 at 08:16

score 5 · Answer 6 · edited Apr 13 '17 at 12:36

You could also use tee and head:

ps aux | tee >(head -n1) | grep syslog

Note, however, that as long as tee is unable to ignore SIGPIPE signals (see e.g. the discussion here), this approach needs a workaround to be reliable. The workaround is to ignore SIGPIPE signals; in bash-like shells this can be done, for example, like this:

trap '' PIPE    # ignore SIGPIPE
ps aux | tee >(head -n1) 2> /dev/null | grep syslog
trap - PIPE     # restore SIGPIPE handling

Also note that the output order is not guaranteed.

answered Sep 12 '12 at 11:19 by Thor; edited Apr 13 '17 at 12:36 by Community

I would not rely on this to work, the first time I ran it (zsh) it produced column headers below grep results. Second time it was fine. – Rqomey Sep 12 '12 at 12:16
I haven't seen this yet, but one way to increase reliability is to insert a small delay in the pipeline before the grep: | { sleep .5; cat }. – Thor Sep 12 '12 at 12:42
Adding sleeps to avoid concurrency problems is always a hack. Though this might work, it's a step towards the dark side. -1 for this. – Alfe Sep 12 '12 at 13:36
I have had a few other strange issues while trying this answer, I set up a question to check – Rqomey Sep 12 '12 at 13:47
This is an interesting use of tee, but I find it unreliable and often only prints the output line, but not the header line. – dotancohen Sep 12 '12 at 14:32

score 4 · Answer 7 · answered Sep 12 '12 at 11:08

Perhaps two ps commands would be easiest.

$ ps aux | head -1 && ps aux | grep someApp
USER             PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
100         3304   0.0  0.2  2466308   6476   ??  Ss    2Sep12   0:01.75 /usr/bin/someApp

answered Sep 12 '12 at 11:08 by emcconville

I don't like this solution, primarily because the situation could change between the first and second ps aux call... And if you just want that static first line, why not echo it manually? – Shadur-don't-feed-the-AI Sep 12 '12 at 11:19
Changes between the two calls aren't to be bothered in this situation. The first will only provide the headline which will always fit to the output of the second. – Alfe Sep 12 '12 at 13:30
I don't see why this was downvoted, it certainly is a viable option. Upvoting. – dotancohen Sep 12 '12 at 14:37

score 4 · Answer 8 · answered Sep 12 '12 at 11:42

You could use pidstat with:

pidstat -C someApp
or
pidstat -p <PID>

Example:

# pidstat -C java
Linux 3.0.26-0.7-default (hostname)    09/12/12        _x86_64_

13:41:21          PID    %usr %system  %guest    %CPU   CPU  Command
13:41:21         3671    0.07    0.02    0.00    0.09     1  java

Further Info: http://linux.die.net/man/1/pidstat

answered Sep 12 '12 at 11:42 by harp

Thanks, that is nice to know. However, it won't find scripts started from python, among others. – dotancohen Sep 12 '12 at 14:35

taco · Answer 9 · 2012-09-13T04:31:39.623

4

Put the following in your .bashrc file or copy/paste in shell first, for testing.

function psls { 
ps aux|head -1 && ps aux|grep "$1"|grep -v grep;
}

Usage: psls [grep pattern]

$ psls someApp
USER             PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
root              21   0.0  0.0  2467312   1116   ??  Ss   Tue07PM   0:00.17 /sbin/someApp

Make sure to source your .bashrc (or .bash_profile if you put it there instead):

source ~/.bashrc

The function will even auto-complete at the shell command line. As you stated in another answer, you can pipe the first line to a file to save one call to ps.

answered Sep 13 '12 at 04:05 by taco; edited Sep 13 '12 at 04:31

Nice, I've been using that kind of function for years. I call my version psl, which only call ps and grep once each (and doesn't need head). – Adam Katz Jan 14 '15 at 21:08

score 4 · Answer 10 · edited May 06 '22 at 07:15

sort but keep header line at the top

# print the header (the first line of input)
# and then run the specified command on the body (the rest of the input)
# use it in a pipeline, e.g. ps | body grep somepattern
body() {
    if IFS= read -r header; then
      printf '%s\n' "$header"
    else # no first line or unterminated first line
      printf %s "$header"
      # return # you may want to return in that case
    fi
    "$@"
}

And use it like this

$ ps aux | body grep someApp
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1000     11634 51.2  0.1  32824  9112 pts/1    SN+  13:24   7:49 someApp

Thanks, some of those answers discuss the general case of this question. Perfect! — dotancohen, Sep 13 '12 at 10:07

bdesham · Answer 11 · 2023-04-29T18:11:57.507

Thanks mostly to Janis Papanagnou in comp.unix.shell, I use the following function:

grep1() {
    local header
    IFS= read -r header && printf "%s\n" "$header"; grep "$@"
}

This has a number of advantages:

Should work with bash, dash, ksh, and zsh
It’s a drop-in replacement for grep, so you can continue to use whichever flags you want: -i for case-insensitive matching, -E for extended regexes, etc.
Always yields the same exit code as grep, in case you want to programmatically determine whether any lines actually matched
Prints nothing if the input was empty
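The exit-code and empty-input points can be checked with a quick sketch. The function is repeated here without the local declaration purely to keep the snippet self-contained, and the input data is made up.

```shell
# Same idea as grep1 above, minus `local` (an assumption made only to
# keep this sketch minimal).
grep1() {
    IFS= read -r header && printf '%s\n' "$header"; grep "$@"
}

# No body line matches: the header is still printed, and the status
# reported is grep's (1 when nothing matched).
printf '%s\n' 'PID CMD' '1 init' | grep1 nosuchprocess
echo "exit status: $?"

# Empty input: read fails, so nothing at all is printed.
printf '' | grep1 anything
```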

Usage example:

$ ps -rcA | grep1 databases
  PID TTY           TIME CMD
$ ps -rcA | grep1 -i databases
  PID TTY           TIME CMD
62891 ??         0:00.33 com.apple.WebKit.Databases

don_crissti · Answer 12 · 2015-08-02T18:53:57.820

Another way with gnu ed:

ed -s '!ps aux' <<< $'2,$v/PATTERN/d\n,p\nq\n'

or, if the shell supports process substitution:

printf '%s\n' '2,$v/PATTERN/d' ,p q | ed -s <(ps aux)

that is:

2,$v/PATTERN/d  - remove all lines not matching pattern (ignore the header)
,p              - print the remaining lines
q               - quit

A more portable variant, without the GNU '!' file-name syntax or process substitution, uses only ed's built-in r command to read the output of ps aux into the buffer, then deletes non-matching lines in the 2,$ range and prints the result:

printf '%s\n' 'r !ps aux' '2,$v/PATTERN/d' ,p q | ed -s

And since the sed commands in the accepted answer also output the line of the sed process itself (its own command line matches the pattern), with a sed that supports -f- and a shell that supports process substitution I would run:

printf '%s\n' '2,${' '/PATTERN/!d' '}' | sed -f - <(ps aux)

which pretty much does the same thing as the previous ed commands.

emazep · Answer 13 · 2016-09-29T15:13:05.047

1

The Perl way:

ps aux | perl -ne 'print if /pattern/ || $.==1'

Way easier to read than sed, faster, no risk to pick undesired lines.

answered Sep 23 '16 at 02:51 by emazep; edited Sep 29 '16 at 15:13

Perl?!? – dotancohen Sep 23 '16 at 08:01

score 0 · Answer 14 · 2015-01-01T14:00:50.060

If that's only for grepping processes with full headers, I'd expand @mrb's suggestion:

$ ps -f -p $(pgrep bash)
UID        PID  PPID  C STIME TTY      STAT   TIME CMD
nasha     2810  2771  0  2014 pts/6    Ss+    0:00 bash
...

pgrep bash | xargs ps -fp will get the same result but without a subshell. If other formatting is required:

$ pgrep bash | xargs ps fo uid,pid,stime,cmd -p
  UID   PID STIME CMD
    0  3599  2014 -bash
 1000  3286  2014 /bin/bash
 ...

pabouk - Ukraine stay strong · Answer 15 · 2022-05-05T22:34:36.197

This solution lets you use the real grep or any other filtering command.

ps aux | { awk 'NR<=1 {print >"/dev/fd/3"; next}; {print}' | grep someApp ; } 3>&1

It should not have these problems which are present in some other solutions:

doubling of the first line
early exit before processing all the input
skipping part of the file after the first line
changing the order of the lines
using unreliable hacks instead of proper process synchronization (to keep the line order)

It could be a base for small utility script or function. Note that it is possible that the redirection using /dev/fd/3 is not available on some systems.

How does it work?

The awk process reads the standard input and splits it to two outputs based on a given condition - here line number: NR<=1.

The first line is sent to the file descriptor 3 and in awk it is not processed further (accomplished by the next command).
The other lines are sent to the grep process through the pipe. The grepped result continues to the standard output.

The parent shell opened the file descriptor 3 for the child processes as a copy of the standard output (3>&1). This way both outputs go to the standard output in the parent shell.