I have a series of commands I run like this:
cmd1 < input > foo
cmd2 < foo > bar
cmd3 foo bar > output
Is there a way to do this without the intermediate files foo and bar?
I'd also like to avoid running cmd1 twice, as this does:
cmd3 <(cmd1 < input) <(cmd1 < input | cmd2) > output
All three commands can take hours to run, and file sizes range from 1 GB to 100 GB (bioinformatics).
Here's a contrived but runnable example:
function cmd1 { sed -r 's/[246]/x/g'; }
function cmd2 { sed -r 's/[135]/-/g'; }
function cmd3 { paste "$1" "$2"; }
seq 10 > input
cmd3 <(cmd1 < input) <(cmd1 < input | cmd2) # cmd1 runs twice
Output:
1 -
x x
3 -
x x
5 -
x x
7 7
8 8
9 9
10 -0
Not sure if this is helpful, but I want data to flow like this:
input --> cmd1 ---> cmd2 -->|
               |            |--> cmd3 --> output
               ------------>|
https://unix.stackexchange.com/a/43536 came close, but not quite.
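For illustration, here is a minimal sketch of the data flow in the diagram, using tee and a named pipe. It assumes bash, GNU coreutils, and that cmd3 reads its two inputs roughly in step (as paste does). The scratch directory and the result variable are hypothetical names, not part of the real pipeline. If cmd3 consumed one input entirely before touching the other, this layout could deadlock once the pipe buffers fill.

```bash
#!/usr/bin/env bash
# Sketch only: the same toy commands as in the contrived example.
cmd1() { sed -r 's/[246]/x/g'; }
cmd2() { sed -r 's/[135]/-/g'; }
cmd3() { paste "$1" "$2"; }

workdir=$(mktemp -d)      # hypothetical scratch directory
mkfifo "$workdir/foo"     # named pipe standing in for the file foo
seq 10 > "$workdir/input"

# cmd1 runs once. tee copies its output to the FIFO and passes it on to cmd2.
# cmd3 reads cmd1's output from the FIFO and cmd2's output from stdin ("-").
cmd1 < "$workdir/input" \
  | tee "$workdir/foo" \
  | cmd2 \
  | cmd3 "$workdir/foo" - > "$workdir/output"

result=$(cat "$workdir/output")
printf '%s\n' "$result"
rm -r "$workdir"
```

With the toy commands this prints the same ten lines as the version that runs cmd1 twice, and no regular intermediate file is written.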