On bash v4.1.2(2), the following simple statement, chosen merely as a minimal example demonstrating the problem, gives seemingly random output:
$ for n in {0..1000}; do echo "$n"; done |
tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null
whereas the following gives consistent output:
$ seq 0 10000 | tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null
Specifically, for the first statement, the sort command chooses apparently random consecutive triplets (e.g., 226,225,224; 52,51,50; 174,173,172; etc.). To get a sense of the heterogeneity of the output, one can run the problematic command many times, and then list the number of distinct possibilities:
$ seq -w 0 2000 | while read x; do for n in {0..1000}; do echo "$n"; done |
tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null | cat > "file_${x}"; done
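The same experiment can be run on a smaller scale without temporary files. The following is only a sketch: the variable names (runs, ok, bad, top3) and the run count of 50 are illustrative choices, not part of the original test, and the tally will differ from run to run and from system to system.

```bash
#!/usr/bin/env bash
# Run the pipeline from the question `runs` times and tally how often the
# three largest values come out as 998,999,1000.
runs=50   # illustrative; the original experiment used 2001 runs
ok=0
bad=0
for ((i = 0; i < runs; i++)); do
    # Capture the combined output of both process substitutions.
    out=$(for n in {0..1000}; do echo "$n"; done |
        tee >(head -n2) >(sort -grk1,1 | head -n3) >/dev/null | cat)
    # Same reduction as applied to the file_* outputs: keep the top three values.
    top3=$(printf '%s\n' "$out" | sort -g | tail -n3 | paste -sd, -)
    if [ "$top3" = "998,999,1000" ]; then
        ok=$((ok + 1))
    else
        bad=$((bad + 1))
    fi
done
echo "correct: $ok / $runs, incorrect: $bad"
```

Increasing runs should make the less frequent outcomes easier to observe.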
Counting the occurrences of the various outputs:
$ for f in file_*; do sort -g "$f" | tail -n3 | paste -sd, ; done |
sort | uniq -c | sort -gk1,1 -k2,2
1 7,8,9
1 17,18,19
1 40,41,42
1 43,44,45
1 47,48,49
1 50,51,52
1 54,55,56
1 58,59,60
1 59,60,61
1 66,67,68
1 71,72,73
1 78,79,80
1 103,104,105
1 104,105,106
1 106,107,108
1 110,111,112
1 111,112,113
1 121,122,123
1 125,126,127
1 129,130,131
1 134,135,136
1 136,137,138
1 142,143,144
1 143,144,145
1 148,149,150
1 150,151,152
1 156,157,158
1 157,158,159
1 165,166,167
1 171,172,173
1 173,174,175
1 174,175,176
1 177,178,179
1 179,180,181
1 181,182,183
1 183,184,185
1 185,186,187
1 186,187,188
1 191,192,193
1 194,195,196
1 198,199,200
1 200,201,202
1 206,207,208
1 208,209,210
1 209,210,211
1 210,211,212
1 216,217,218
1 217,218,219
1 233,234,235
1 236,237,238
1 237,238,239
1 238,239,240
1 242,243,244
1 245,246,247
1 246,247,248
1 254,255,256
1 256,257,258
1 267,268,269
1 270,271,272
1 273,274,275
1 277,278,279
1 279,280,281
1 287,288,289
1 288,289,290
1 305,306,307
1 306,307,308
1 307,308,309
1 326,327,328
1 337,338,339
1 339,340,341
1 340,341,342
1 351,352,353
1 357,358,359
1 359,360,361
1 365,366,367
1 368,369,370
1 370,371,372
1 376,377,378
1 377,378,379
1 383,384,385
1 386,387,388
1 388,389,390
1 401,402,403
1 408,409,410
1 409,410,411
1 415,416,417
1 419,420,421
1 424,425,426
1 426,427,428
1 432,433,434
1 454,455,456
1 462,463,464
1 466,467,468
1 475,476,477
1 482,483,484
1 487,488,489
1 504,505,506
1 508,509,510
1 511,512,513
1 532,533,534
1 538,539,540
1 544,545,546
1 548,549,550
1 558,559,560
1 603,604,605
1 604,605,606
1 608,609,610
1 659,660,661
1 660,661,662
1 663,664,665
1 668,669,670
1 692,693,694
1 699,700,701
1 717,718,719
1 738,739,740
1 740,741,742
1 750,751,752
1 771,772,773
1 784,785,786
1 796,797,798
1 799,800,801
1 806,807,808
1 814,815,816
1 832,833,834
1 848,849,850
1 858,859,860
1 869,870,871
1 922,923,924
1 952,953,954
1 961,962,963
1 985,986,987
2 64,65,66
2 127,128,129
2 141,142,143
2 169,170,171
2 170,171,172
2 172,173,174
2 187,188,189
2 221,222,223
2 234,235,236
2 252,253,254
2 292,293,294
2 350,351,352
2 364,365,366
2 375,376,377
2 622,623,624
2 666,667,668
3 70,71,72
3 102,103,104
3 137,138,139
3 155,156,157
1826 998,999,1000
shows that the result is correct ~91% of the time (1826 of 2001 runs). Omitting the >(head -n2) process substitution from the tee statement results in the output being correct 100% of the time.
I don't see why a race condition would be relevant in explaining the problem, since that should only affect the relative ordering of the output of each of the process substitutions in the tee statement (i.e., >(head -n2) may complete first or >(sort -grk1,1 | head -n3) may do so, but this should only affect the output order, not the result itself; it would even be understandable if the output of the two commands were randomly interleaved). Since tee should distribute identical copies of the stdout of the loop to the stdin of each >() and since both process substitutions are run in separate sub-shells (https://unix.stackexchange.com/a/331199/14960), neither one should affect the other, yet they clearly interact.
How can the interaction be explained? Also, how can the output of a for/while loop in bash be distributed to multiple, independent processes by tee?
tee to multiple processes as long as each one reads to the end of the stream? – user001 Jan 07 '18 at 13:02
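The idea in the comment can be tried directly. This is a sketch of one way to test it, not a confirmed explanation. It assumes that the early exit of head -n2 is what cuts the stream short, for instance because tee receives SIGPIPE on its next write to the closed pipe and stops feeding sort. In the variant below, the first consumer prints its two lines and then drains the rest of its input with cat, so it reads to the end of the stream; the `; cat >/dev/null` addition is the only change to the original statement.

```bash
#!/usr/bin/env bash
# Hypothetical variant of the pipeline from the question: the first consumer
# prints two lines and then discards the rest of its input with cat, so its
# pipe stays open until tee has finished writing.
out=$(for n in {0..1000}; do echo "$n"; done |
    tee >(head -n2; cat >/dev/null) >(sort -grk1,1 | head -n3) >/dev/null | cat)

# Reduce the captured output to its three largest values, as in the question.
top3=$(printf '%s\n' "$out" | sort -g | tail -n3 | paste -sd, -)
echo "$top3"
```

If the assumption holds, repeating this variant many times should always yield 998,999,1000, which would be consistent with the observation that dropping >(head -n2) makes the result correct every time.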