5

I've seen Fish shell implement process substitution as a function:

# === Fish shell lang:
function psub
   mkfifo $the_pipe
   cat >$the_pipe &
   echo $the_pipe
   # remove pipe when bg job is done
end  

# Example:
diff (echo abc | psub)  (echo def | psub)

Full code: https://github.com/fish-shell/fish-shell/blob/master/share/functions/psub.fish

I've tried for several hours to re-implement it for a non-Fish shell (mksh), but could not do it:

# Executable file:  my.psub.sh
the_pipe="$(mktemp /tmp/pipe.XXXXXXXXXX)"
mkfifo $the_pipe
cat >$the_pipe &
echo $the_pipe

# Example:
diff $(echo abc | my.psub.sh)  $(echo def | my.psub.sh)

The command blocks. I've tried everything I could think of, but I have no idea where to go next.

dgo.a
  • are you looking for diff <(echo abc) <(echo def ) in bash ? (and fish example is not working for me) – Archemar Mar 29 '16 at 11:59
  • Yes. Bash implements '<()' as proc. sub. which is non-standard for POSIX. For the fish example: this is what I get: https://pbs.twimg.com/media/Cett6VoXIAAdpcN.jpg:large (function psub is already defined by Fish in each Fish shell you start.) – dgo.a Mar 29 '16 at 12:07
  • OK, I didn't get you want to avoid bash. (yet you use $( )'s substitution for temporary file, is that posix ? ) – Archemar Mar 29 '16 at 12:13
  • It's called "command substitution": $() . I'm not sure if it's standard POSIX. However, the shell I'm using (mksh) implements it. <() is the "process substitution" – dgo.a Mar 29 '16 at 12:16
  • @dgo.a: Read http://unix.stackexchange.com/a/218505/38906 – cuonglm Mar 29 '16 at 12:44
  • Apparently, this question was also asked by someone else on the Android StackExchange: http://android.stackexchange.com/questions/101839/work-around-for-process-substitution-in-mksh – dgo.a Mar 29 '16 at 12:37
  • @Archemar yes, $(…) is POSIX and, in fact, highly recommended over the older form that used an accent gravis for substitution, as it defines a full recursive shell and quoting environment. – mirabilos Mar 30 '16 at 19:03
  • @cuonglm Thanks for that link. For output it works great: cmd >$(log_it ..) 2>$(log it ...). But, it doesn't seem to work for my example w/diff: diff $( proc | proc ) $( proc | proc). The diff won't run until the subshells exit. In my case, diff needs the output of the subshells. But, I also need to cleanup named pipes. The problem is the subshells need to exit, preventing the cleanup of the named pipes. – dgo.a Mar 30 '16 at 19:24
  • In my experiment (code slightly changed from yours), it’s diff which blocks. It reads abc from the FIFO, then wants to read more from it. At this time, it should be signalled EOF but isn’t. I’m currently investigating why. -- mksh author here – mirabilos Mar 30 '16 at 19:28

2 Answers

3

It’s a bit difficult but doable:

function die {
        print -ru2 -- "E: $*"
        exit 1
}

function psubin {
        local stdin=$(cat; echo .) pipe

        pipe=$(mktemp /tmp/psub.XXXXXXXXXX) || die mktemp

        # this is racy
        rm -f "$pipe"
        mkfifo "$pipe" || die mkfifo

        (
                # don’t block parent
                exec <&- >&- 2>&-
                # write content to FIFO
                print -nr -- "${stdin%.}" >"$pipe"
                # signal EOF to reader, ensure it’s not merged
                sleep 0.1
                :>"$pipe"
                # clean up
                (sleep 1; rm -f "$pipe") &
        ) &
        print -nr -- "$pipe"
}

diff -u $(echo abc | psubin) $(echo def | psubin)

The problems you and I encountered here are:

  • mkfifo complains unless you rm mktemp’s output first
  • mksh blocks for the background process if it still shares any file descriptors (stdin, stdout, stderr) with the parent (Note: This is probably the only valid use case for using >&- instead of >/dev/null, and only because we can guarantee these fds to no longer be used, nor replaced by any new fds)
  • as we don’t have stdin in the background process we’ll need to cache its content, byte-exact
  • EOF with FIFOs (named pipes) is nontrivial. Let’s just leave it at that… (we could do some tricks with trying to open the FIFO non-blocking to check if there’s a reader, and signal EOF until the reader died, but this works well enough for now)
  • it’s nicer if the FIFO is removed after use…
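
One way to sidestep the first point, and the "this is racy" rm/mkfifo sequence in psubin, might be to create the FIFO inside a private directory instead. This is only a minimal sketch: make_fifo is a hypothetical helper, not part of psubin, and it assumes a mktemp that supports -d (GNU, BSD and busybox do, POSIX does not specify it).

```shell
# Hypothetical helper (an assumption, not part of psubin above):
# create a FIFO inside a freshly made private directory.
make_fifo() {
        dir=$(mktemp -d /tmp/psub.XXXXXXXXXX) || return 1
        # the directory is new and owned by us, so the name "$dir/fifo"
        # cannot already exist when mkfifo runs
        mkfifo "$dir/fifo" || { rmdir "$dir"; return 1; }
        printf '%s\n' "$dir/fifo"
}

# For comparison: mkfifo refuses a path that mktemp already created
plain=$(mktemp /tmp/psub.XXXXXXXXXX)
if mkfifo "$plain" 2>/dev/null; then
        mkfifo_on_existing=succeeded
else
        mkfifo_on_existing=failed
fi
rm -f "$plain"

fifo=$(make_fifo)
echo "mkfifo on existing file: $mkfifo_on_existing, new FIFO: $fifo"
```

Cleanup then has to remove both the FIFO and its directory, e.g. rm -f "$fifo" followed by rmdir "${fifo%/*}".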

On the other hand: good thing we’re doing this now, because I will eventually want to implement <(…) in mksh itself, and then I need to know what to watch out for. We can’t use /dev/fd/ as GNU bash does, because that’s not portable enough.
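
For the byte-exact caching point in the list above: the `$(cat; echo .)` line in psubin is there because command substitution strips all trailing newlines. A small sketch of the effect, using printf as stand-in input (the variable names are just for illustration, and as far as I know NUL bytes still cannot be stored this way):

```shell
# Plain command substitution drops every trailing newline
plain=$(printf 'abc\n\n')

# Sentinel trick: append a "." inside the substitution, strip it afterwards
kept=$(printf 'abc\n\n'; echo .)
kept=${kept%.}

# Count the bytes in each result; $(( )) normalises wc's padding
plain_len=$(( $(printf '%s' "$plain" | wc -c) ))
kept_len=$(( $(printf '%s' "$kept" | wc -c) ))
echo "plain: $plain_len bytes, kept: $kept_len bytes"
```

The input is 5 bytes long; only the sentinel variant keeps all of them.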

mirabilos
  • Thank you so much. This has answered other questions I had about mksh and processes. (I asked the question for esoteric reasons. mksh has been more reliable and predictable than other shells... and a joy to use even w/o process substitution.) – dgo.a Apr 02 '16 at 05:37
-1

This doesn't seem possible:

 diff   $(echo abc | ./my.psub.sh)    $(echo def | ./my.psub.sh)

It blocks as the 2 sub-shells need to exit before diff is run.
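
The blocking can be reproduced without any FIFO. A rough sketch, assuming date +%s is available (it is not strictly POSIX) and using sleep as a stand-in for the background cat:

```shell
# The background job inherits the substitution's stdout pipe, so $(...)
# cannot return until that pipe reaches EOF, i.e. until sleep exits.
t0=$(date +%s)
out=$(sleep 2 & echo started)
slow=$(( $(date +%s) - t0 ))

# Same job, detached from stdin, stdout and stderr: $(...) returns at once.
t0=$(date +%s)
out=$(sleep 2 </dev/null >/dev/null 2>&1 & echo started)
fast=$(( $(date +%s) - t0 ))

echo "inherited fds: ${slow}s, detached fds: ${fast}s"
```

The first substitution should take about as long as the sleep, the second should return almost immediately.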

It's doable in the Fish shell, most likely because Fish handles command and process substitution very differently, which comes with its own set of problems.

An alternative would be to use busybox as suggested over in "Work-around for process substitution in mksh":

busybox diff =(sort ./a) =(sort ./b)
dgo.a