
I read this answer about the difference between | and ;: https://unix.stackexchange.com/a/159492/318084

Consider two commands A and B. When you write

A | B
A and B are executed in parallel, and the standard output of A is sent as the standard input of B.

I am confused by the word "parallel".

I can understand the description from Pipeline (Unix) - Wikipedia:

In Unix-like computer operating systems, a pipeline is a sequence of processes chained together by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one.

So a pipeline passes the output of one process to the next one as its input.

Nonetheless, the answer says "parallel", meaning the commands execute simultaneously instead of in sequence.

How does this mechanism work?

My guess is that | spawns a subshell, which gets a variable from A in the parent shell (by exporting the variable), and that the subshells are closed automatically when the jobs finish.

Wizard
  • 2,503

1 Answer


Pipelines are an example of stream processing. Once a pipeline is constructed, the processing takes place in several processes at the same time, each one working as soon as data reaches it. Picture this: there are three pieces of data, a, b and c, and two processes, A and B. Now look at these steps:

  1. a@A B - a enters A, there's nothing in B yet

  2. b@A a@B - a is passed on to B and b enters A

  3. c@A b@B - c gets to A while b reaches B

  4. A c@B - nothing more at A and c at B
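
The four steps above can be watched in a shell. This is only a sketch: producer and consumer are made-up function names standing in for A and B, and the sleep 1 is there just to slow A down enough for the ordering to be visible. Both functions log to stderr so that the log lines do not travel through the pipe themselves.

```shell
# Stand-in for A: sends a, b, c one at a time, pausing between items.
producer() {
  for item in a b c; do
    echo "A sends $item" >&2   # log to stderr, not into the pipe
    echo "$item"               # this line goes through the pipe to B
    sleep 1
  done
}

# Stand-in for B: handles each item as soon as it arrives.
consumer() {
  while read -r item; do
    echo "B got $item" >&2
  done
}

# Run the pipeline and collect both logs in arrival order.
log=$( { producer | consumer; } 2>&1 )
echo "$log"
```

If B only started after A had finished, all three "A sends" lines would come first. With the one-second pauses the lines should instead alternate: A sends a, B got a, A sends b, and so on.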

Imagine a, b and c are very big, so big that only two of them fit on the machine at one time. Sequential processing would mean running all of them through A first and only then through B, which requires storing all of A's output at once. Parallel processing not only lowers the storage demands but can also engage multiple processors, though that is not required, since parallelism can be simulated on a single processor by time slicing.
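
One way to see that running all of A before B is not always even possible is to give the pipeline a producer that never finishes. A small sketch, assuming the standard yes and head utilities are available; out is just an illustrative variable name.

```shell
# yes prints "y" forever, so there is no "complete output of A" to store.
# head reads three lines and exits; after that, yes is stopped as well,
# because its next write to the closed pipe fails.
out=$(yes | head -n 3)
echo "$out"
```

The command returns almost immediately, which could not happen if head had to wait for yes to finish.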

Each step of a pipeline is a separate process running in a subshell. Usually the individual processes buffer their output, which means it is passed on in larger chunks. This is an optimisation; it can be turned off, in which case output goes out as soon as it is ready. Even with bigger chunks, the processing is still parallel.
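
The "separate process in a subshell" part can be checked with a variable, which also shows that nothing is handed over by exporting variables: the only connection between the stages is the pipe. A minimal sketch; x is an arbitrary name, and the assignment is placed on the left-hand side because some shells (ksh, zsh) run the last stage in the current shell.

```shell
x=outer

# The left-hand stage runs in its own subshell, so this assignment
# changes only that subshell's copy of x.
{ x=inner; echo "left stage sees: $x"; } | cat

echo "parent still sees: $x"
```

The first line printed reports inner and the second reports outer: the subshell got a copy of the parent's variables when it started, and its changes disappeared with it.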