How can I time a pipe?

Question

I want to time a command which consists of two separate commands with one piping output to another. For example, consider the two scripts below:

$ cat foo.sh
#!/bin/sh
sleep 4

$ cat bar.sh
#!/bin/sh
sleep 2

Now, how can I get time to report the time taken by foo.sh | bar.sh (and yes, I know the pipe makes no sense here, but this is just an example)? It does work as expected if I run them sequentially in a subshell without piping:

$ time ( foo.sh; bar.sh )

real    0m6.020s
user    0m0.010s
sys     0m0.003s

But I can't get it to work when piping:

$ time ( foo.sh | bar.sh )

real    0m4.009s
user    0m0.007s
sys     0m0.003s

$ time ( { foo.sh | bar.sh; } )

real    0m4.008s
user    0m0.007s
sys     0m0.000s

$ time sh -c "foo.sh | bar.sh "

real    0m4.006s
user    0m0.000s
sys     0m0.000s

I've read through a similar question (How to run time on multiple commands AND write the time output to file?) and also tried the standalone time executable:

$ /usr/bin/time -p sh -c "foo.sh | bar.sh"
real 4.01
user 0.00
sys 0.00

It doesn't even work if I create a third script which only runs the pipe:

$ cat baz.sh
#!/bin/sh
foo.sh | bar.sh

And then time that:

$ time baz.sh

real    0m4.009s
user    0m0.003s
sys     0m0.000s

Interestingly, it doesn't appear as though time exits as soon as the first command is done. If I change bar.sh to:

#!/bin/sh
sleep 2
seq 1 5

And then time again, I was expecting the time output to be printed before the seq but it isn't:

$ time ( { foo.sh | bar.sh; } )
1
2
3
4
5

real    0m4.005s
user    0m0.003s
sys     0m0.000s

Looks like time doesn't count the time it took to execute bar.sh despite waiting for it to finish before printing its report¹.

All tests were run on an Arch system and using bash 4.4.12(1)-release. I can only use bash for the project this is a part of so even if zsh or some other powerful shell can get around it, that won't be a viable solution for me.

So, how can I get the time a set of piped commands took to run? And, while we're at it, why doesn't it work? It looks like time immediately exits as soon as the first command has finished. Why?

I know I can get the individual times with something like this:

( time foo.sh ) 2>foo.time | ( time bar.sh ) 2> bar.time

But I still would like to know if it's possible to time the whole thing as a single operation.

¹ _{This doesn't seem to be a buffer issue, I tried running the scripts with unbuffered and stdbuf -i0 -o0 -e0 and the numbers were still printed before the time output.}

@pericynthion yep eventually I did. And that also showed what the answers explain: time is actually working but (obviously enough and as I should have realized) the commands in the pipeline run concurrently so the time taken is essentially the time of the slowest. — terdon, May 11 '17 at 08:28
see also https://unix.stackexchange.com/questions/46051/how-to-run-time-on-multiple-commands-and-write-the-time-output-to-file — weberjn, May 28 '20 at 09:09

Kusalananda · Accepted Answer · 2017-05-10T15:27:31.687

48

It is working.

The different parts of a pipeline are executed concurrently. The only thing that synchronises/serialises the processes in the pipeline is IO, i.e. one process writing to the next process in the pipeline and the next process reading what the first one writes. Apart from that, they are executing independently of each other.

Since there is no reading or writing happening between the processes in your pipeline, the time take to execute the pipeline is that of the longest sleep call.

You might as well have written

time ( foo.sh & bar.sh &; wait )

Terdon posted a couple of slightly modified example scripts in the chat:

#!/bin/sh
# This is "foo.sh"
echo 1; sleep 1
echo 2; sleep 1
echo 3; sleep 1
echo 4

and

#!/bin/sh
# This is "bar.sh"
sleep 2
while read line; do
  echo "LL $line"
done
sleep 1

The query was "why does time ( sh foo.sh | sh bar.sh ) return 4 seconds rather than 3+3 = 6 seconds?"

To see what's happening, including the approximate time each command is executed, one may do this (the output contains my annotations):

$ time ( env PS4='$SECONDS foo: ' sh -x foo.sh | PS4='$SECONDS bar: ' sh -x bar.sh )
0 bar: sleep 2
0 foo: echo 1     ; The output is buffered
0 foo: sleep 1
1 foo: echo 2     ; The output is buffered
1 foo: sleep 1
2 bar: read line  ; "bar" wakes up and reads the two first echoes
2 bar: echo LL 1
LL 1
2 bar: read line
2 bar: echo LL 2
LL 2
2 bar: read line  ; "bar" waits for more
2 foo: echo 3     ; "foo" wakes up from its second sleep
2 bar: echo LL 3
LL 3
2 bar: read line
2 foo: sleep 1
3 foo: echo 4     ; "foo" does the last echo and exits
3 bar: echo LL 4
LL 4
3 bar: read line  ; "bar" fails to read more
3 bar: sleep 1    ; ... and goes to sleep for one second

real    0m4.14s
user    0m0.00s
sys     0m0.10s

So, to conclude, the pipeline takes 4 seconds, not 6, due to the buffering of the output of the first two calls to echo in foo.sh.

edited May 10 '17 at 15:27

answered May 10 '17 at 13:11

Kusalananda

333,661

But if that were the case, wouldn't the user and sys values for the pipe be the sum of timing each script alone? Also, I did try with a first script that was running seq and a second that had a while read loop and that made no difference either. – terdon May 10 '17 at 13:15
1

@terdon the values are the sums, but the scripts take very little user and system time — they just wait, which doesn’t count (except in wall-clock time). – Stephen Kitt May 10 '17 at 13:16
3

Note that some shells like the Bourne shell or ksh93 only wait for the last component of the pipeline (sleep 3 | sleep 1 would last 1 second). the Bourne shell has no time keyword, but in ksh93, when run with time, all the components are waited for. – Stéphane Chazelas May 10 '17 at 15:18
@StéphaneChazelas The above was timed with the built-in time in ksh93, Version AJM 93u+ 2012-08-01. – Kusalananda May 10 '17 at 15:25
4

I'm just saying that one may be surprised to find that sleep 10 | sleep 1 takes one second while time sleep 10 | sleep 1 takes 10 seconds in ksh93. In the Bourne shell time sleep 10 | sleep 1 would take one second, but you'd get the time output (for sleep 10 only and from /usr/bin/time) out of the blue 9 seconds later. – Stéphane Chazelas May 10 '17 at 15:31
@StéphaneChazelas You would guard against this by timing a subshell, yes? As was done? – Kusalananda May 10 '17 at 15:33
1

That's not about guarding anything. time correctly times the pipeline, but changes the behaviour of the shell in ksh93. (sleep 10 | sleep 1) takes 1 second, time (sleep 10 | sleep 1) takes 10 seconds. { (sleep 10 | sleep 1); echo x; } outputs x after 1 second, time { (sleep 10 | sleep 1); echo x; } outputs x after 10 seconds. Same if you put that code in a function and time the function. – Stéphane Chazelas May 10 '17 at 15:39
@StéphaneChazelas Aaaah! I understand now. Thanks! I didn't fully understand what you were getting at at first, but now I do. – Kusalananda May 10 '17 at 15:40
1

Note that in ksh93 like in zsh (-o promptsubst here), you can do typeset -F SECONDS to get a less approximate number of seconds (POSIX sh has no SECONDS) – Stéphane Chazelas May 10 '17 at 15:49

score 13 · Answer 2 · answered May 10 '17 at 13:44

Would this be a better example?

$ time perl -e 'alarm(3); 1 while 1;' | perl -e 'alarm(4); 1 while 1;'
Alarm clock

real    0m4.004s
user    0m6.992s
sys     0m0.004s

The scripts busyloop for 3 and 4 seconds (resp.), taking a total of 4 seconds in real time because of parallel execution, and 7 seconds of CPU time. (at least approximately.)

Or this:

$ time ( sleep 2; echo) | ( read x; sleep 3 )

real    0m5.004s
user    0m0.000s
sys     0m0.000s

These don't run in parallel, so the total time taken is 5 seconds. It's all spent sleeping, so no CPU time used.

score 3 · Answer 3 · answered May 10 '17 at 13:47

If you have sysdig you can insert tracers at arbitrary points, assuming you can modify the code to add the necessary writes to /dev/null

echo '>::blah::' >/dev/null
foo.sh | bar.sh
echo '<::blah::' >/dev/null

(but that fails your "single operation" requirement) and then record things via

$ sudo sysdig -w blalog "span.tags contains blah"

and then you'll probably need a sysdig chisel to export just the durations

description = "Exports sysdig span tag durations";
short_description = "Export span tag durations.";
category = "Tracers";

args = {}

function on_init()
    ftags = chisel.request_field("span.tags")
    flatency = chisel.request_field("span.duration")
    chisel.set_filter("evt.type=tracer and evt.dir=<")
    return true
end

function on_event()
    local tags = evt.field(ftags)
    local latency = evt.field(flatency)
    if latency then
        print(tostring(tags) .. "\t" .. tonumber(latency) / 1e9)
    end
    return true
end

which once saved to your sysdig/chisels directory as the file spantagduration.lua can be used as

$ sysdig -r blalog -c spantagduration
...

Or you can play around with csysdig or the JSON output.

score 1 · Answer 4 · answered Jan 27 '20 at 16:24

Too fun I know this is an old thread but I found myself in the same situ. I found out aliases are a simple workaround. At least this works for bash and fish. Not sure about all shells.

Instead of: time ( foo.sh | bar.sh )

Try:

alias foobar="foo.sh | bar.sh" time foobar

How can I time a pipe?

4 Answers4

Linked

Related