
I'm trying to read from two fifos (read from one; if there's no content, read from the other; and if neither has content, try again later), but it keeps blocking the process (even with the timeout option).

I've followed some other questions that read from files (How to read from two input files using while loop, and Reading two files into an IFS while loop) and right now I have this code:

while true; do
    while
            IFS= read -r msg; statusA=$?
            IFS= read -u 3 -r socket; statusB=$?
            [ $statusA -eq 0 ] || [ $statusB -eq 0 ]; do
        if [ ! -z "$msg" ]; then
            echo "> $msg"
        fi
        if [ ! -z "$socket" ]; then
            echo ">> $socket"
        fi
    done <"$msg_fifo" 3<"$socket_fifo"
done

Is there something I'm doing wrong? Also, I can't use paste/cat piped or it blocks the process completely.

  • Related: https://unix.stackexchange.com/q/522877/315749 – fra-san Jun 13 '22 at 16:56
  • @fra-san I've seen this post before. As far as I know, their read was blocked by absence of any writer. In my case, there's a writer, but no data. – Roger Miranda Perez Jun 13 '22 at 16:58
  • @RogerMirandaPerez, do you have a writer? i.e. something like cat > fifo running? Because, well, it works for me with the timeout option as long as both the pipes are open on the other end for the whole time the reader runs. (They have to be open on the write end when the reader starts, since otherwise the open blocks; and if the writers disappear, all reads start to return zero-length reads, basically turning the reader into a busyloop.) – ilkkachu Jun 13 '22 at 17:08
  • But yes, the proper way to do that would be with an actual programming language. If you can't convert the whole script but e.g. Perl is an option, you might be able to just do the "read from either pipe" part in Perl and keep the rest of the shell script intact. – ilkkachu Jun 13 '22 at 17:09
  • @ilkkachu the full code is here. I've noticed that the first pipe has a writer, but the second one doesn't (it just outputs data every few minutes with an echo > pipe) – Roger Miranda Perez Jun 13 '22 at 17:12
  • @RogerMirandaPerez what is the problem you are solving? You want to build some orchestrator to start minecraft servers at will? – etosan Jun 13 '22 at 17:21
  • @etosan yes, it's a bit more complex than that but the idea is having a server that allows you to configure and start a minecraft server using some protocol. I chose Bash because I had to start a Docker container, and the Docker API for Java (that was my original idea) was very poorly documented – Roger Miranda Perez Jun 13 '22 at 17:25
  • bash is not really safe for reading unsanitized network input, even if you can of course do it. I would suggest you approach the problem from a different side: pick a transport you want the orchestrator to use (http seems the safest bet), use some simple http handler, and write it in some other language: java/php/python. Then combine this handler server with some service manager to start containers on demand. All languages can invoke system binaries, usually through a system() call, but the other problem you will now have is service management. – etosan Jun 13 '22 at 17:30
  • But a discussion of the optimal design for your circumstances is now well beyond the capabilities of this communication space. – etosan Jun 13 '22 at 17:32

2 Answers


This is by default and by design, and I am not sure you can solve this with bash. You can certainly solve it with zsh (its zsh/zselect module exposes the select() syscall), but maybe shell is not the right language for the job at this point.

The issue is simple, as most things in unix are, but actually understanding and solving it requires some deeper thinking and practical knowledge.

The cause of the effect is the blocking flag set on the file descriptor, which the kernel sets by default when you open any VFS object.

In unix, process cooperation is synchronized by blocking at IO boundaries, which is very intuitive and elegant, and which makes the naive, simplistic programs of beginner and intermediate application programmers "just work".

When you open the object in question (a file, fifo or whatever), the blocking flag set on it ensures that any read from the descriptor immediately blocks the whole process when there is no data to be read. The block is lifted only after some data is fed into the object from the other side (in the case of a pipe).
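A minimal sketch of that blocking behaviour (the fifo path is a throwaway one made up for the demo): the read below does not return until the background writer delivers a line, roughly one second later.

```shell
# Demo: a read from a fifo blocks the whole process until data arrives.
fifo=$(mktemp -u)                       # throwaway path, just for this demo
mkfifo "$fifo"

( sleep 1; echo "hello" > "$fifo" ) &   # a writer that shows up after ~1s

start=$SECONDS
read -r line < "$fifo"                  # process sits blocked here until "hello" arrives
elapsed=$(( SECONDS - start ))
echo "got '$line' after ~${elapsed}s"

rm -f "$fifo"
```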

Regular files are an "exception", at least when compared to pipes, as from the point of view of the IO subsystem they "never block" (even though they actually do - i.e. unsetting the blocking flag on a file fd has no effect, as the process block happens deeper, in the kernel's storage read routine). From the process's POV a disk read is always instant, completing in zero time without blocking (even though the system clock actually jumps forward during such a read).

This design has two effects you are observing:

First, you never really observe the effects of IO blocking when juggling just regular files from the shell (as they never block "for real", as explained above).

Second, once you get blocked in the shell on a read() from a pipe, like you have here, your process is essentially stuck in that block forever, at least until more data is filled in from the other side of the pipe. This kind of blocking is "for real": your process does not even run or consume CPU time in that state; it is the kernel that holds it blocked from the outside until more data arrives, and thus the process's timeout routines cannot run either (as that would require the process to be running and consuming CPU time).

Your process will remain blocked at least until you fill the pipe with a sufficient amount of data to be read; then it will be unblocked briefly, until all the data is consumed and the process blocks again.

If you ponder it carefully, this is actually what makes pipes in shell work.

Have you ever wondered how a complex shell pipeline adapts to fast or slow programs on its own? This is the mechanism that makes it work. A fast generator spewing output will make the next program in the pipeline read it faster, and a slow data generator will make every subsequent program in the pipeline read/run slower - everything is rate-limited by the pipes blocking on data, synchronizing the whole pipeline as if by magic.
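A toy pipeline (nothing from the question) makes this visible: the consumer below can only run as fast as the producer emits lines, because each read blocks on the pipe until the next line arrives.

```shell
# The slow producer rate-limits the whole pipeline: each read by the
# consumer blocks until the next line is written, so the pipeline as a
# whole takes roughly as long as the producer's three 1-second sleeps.
start=$SECONDS
result=$( (for i in 1 2 3; do echo "line $i"; sleep 1; done) \
          | while IFS= read -r l; do echo "consumed: $l"; done )
elapsed=$(( SECONDS - start ))
printf '%s\n' "$result"
echo "pipeline took ~${elapsed}s"
```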

EDIT: further clarification

How to get out of it?

There is no easy way. Not in bash as far as I know.

The easiest one is to ponder the problem some more and redesign it a different way.

Due to the nature of blocking explained above, the simplest approach is to accept the design constraint it imposes on shell programs: only have one main input stream.

This will make a shell program robust enough to deal with both file input (not a problem) and pipe input.

Reading from multiple pipes (i.e. even two) will make your program block naturally until both of them have data, so if you can ensure that both pipes are full of data at all times, this will work. Unfortunately that rarely holds: the moment reading from the pipes becomes intertwined and interleaved, you have the problem of the pipes filling in random order - especially if the reads are dependent, where the first pipe to become empty will stall your whole processing. We call such a situation a deadlock.
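A sketch of that ordering constraint (fifo paths made up for the demo): the two reads below only get through because both writers are already lined up. Remove the writer on A and the first read blocks forever, no matter how much data B holds - that is the deadlock scenario.

```shell
# Two fifos read in a fixed order. This works only because both
# writers already exist; without the writer on A, the first read
# would block forever even though B has data waiting.
fifo_a=$(mktemp -u); fifo_b=$(mktemp -u)   # throwaway demo paths
mkfifo "$fifo_a" "$fifo_b"

echo "from A" > "$fifo_a" &   # writers block on open() until the
echo "from B" > "$fifo_b" &   # reader opens its end of each fifo

read -r a < "$fifo_a"         # blocks until A delivers a line
read -r b < "$fifo_b"         # only then do we even look at B
echo "$a / $b"

rm -f "$fifo_a" "$fifo_b"
```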

You can solve the problem of reading from multiple pipes by removing the blocking flag from the file descriptors in question, but now you have an IO scheduling and data multiplexing problem, which requires a properly equipped language to deal with.

I am afraid bash is not well enough equipped for that, and even if it is, you now need to learn more about how this stuff works.

etosan
  • Very informative. Thank you. :) – paul garrett Jun 13 '22 at 16:56
  • I don't think the blocking in itself is the problem, since Bash's read has the timeout option, which does work even though the read blocks. (It basically uses alarm() to have a signal interrupt the system call). You can try something like mkfifo p; while true; do if read -t .5 ret; then echo "read: '$ret'"; else echo .; fi; done < p, it works fine as long as you're running cat > p in another window. The problem here is that with a pipe, even opening it blocks if the write end isn't open and the timeout on read doesn't apply to opening the fifo. – ilkkachu Jun 13 '22 at 16:57
  • (if you have read -t 1 < file, the shell opens file just before starting read, same as with any redirection. putting the redirection e.g. around the whole block just moves the problem to a different place.) – ilkkachu Jun 13 '22 at 16:59
  • Thanks for the response, but it's too late to change the language at this point. I'll wait if anyone knows if there's any other way, and if not I'll just combine the two pipes in one (but that may involve security problems...) – Roger Miranda Perez Jun 13 '22 at 17:01
  • Yes @ilkkachu, on linux (I think only, but maybe I am mistaken) a pipe also requires both sides to be connected so as not to block on open(). In my extempores and experiments I usually solve it by opening the pipe for both read and write at the same time, ie 'rw', in the consumer process (I think exec $fd<>fifo), but that is way more advanced than this simple case. – etosan Jun 13 '22 at 17:14
  • @etosan, right, opening for read-write circumvents the blocking issue nicely (on Linux). I'm not sure I can see a reason why it wouldn't work here, too. – ilkkachu Jun 13 '22 at 17:20
  • As I said, zsh is a more potent language and could deal with it more easily, but I checked @RogerMirandaPerez's repository and am not entirely sure what problem they are trying to solve. With some clever hacks you could certainly make it work somehow even in bash, but I am ever more convinced the approach taken is wrong. – etosan Jun 13 '22 at 17:24

I've seen the conversation between @etosan and @ilkkachu and I've tested your proposal of using exec fd<>fifo. I'm not sure whether that involves some kind of problem of its own (as @etosan has said), but at least it works now.

exec 3<>"$socket_fifo" # to not block the read
while true; do
    while
            IFS= read -t 0.1 -r msg; statusA=$?
            IFS= read -t 0.1 -u 3 -r socket; statusB=$?
            [ $statusA -eq 0 ] || [ $statusB -eq 0 ]; do
        if [ ! -z "$msg" ]; then
            echo "> $msg"
        fi
        if [ ! -z "$socket" ]; then
            echo ">> $socket"
        fi
    done <"$msg_fifo"
done

I'll also consider your warnings about using Bash in this case.

  • You'd plausibly have the same problem with $msg_fifo, so you might want ... done 1<>"$msg_fifo" there, too. Or just have exec 3<>onefifo 4<>otherfifo and use -u 3 and -u 4 on the reads for symmetry. But yeah, if both writers close, that might fall into a busyloop. If that turns out to be an issue, I guess you could add a sleep .1 inside the loop, though of course it'd make the reaction time longer. – ilkkachu Jun 13 '22 at 17:40
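The symmetric variant ilkkachu describes can be sketched as below (fifo paths and the bounded polling loop are made up for the demo; a real reader would loop forever). Opening both fifos read-write means neither the open() nor the reads can block indefinitely, and read -t gives each fifo a short polling window.

```shell
# Sketch of the symmetric suggestion: open both fifos read-write so
# neither open() nor read() can block forever, then poll with -t.
msg_fifo=$(mktemp -u); socket_fifo=$(mktemp -u)   # throwaway demo paths
mkfifo "$msg_fifo" "$socket_fifo"
exec 3<>"$msg_fifo" 4<>"$socket_fifo"   # read-write: open never blocks

echo "hello" >&3                        # feed one line into each fifo
echo "world" >&4

out=""
for _ in 1 2 3 4; do                    # bounded here just for the demo
    if IFS= read -t 0.1 -u 3 -r msg;    then out+="> $msg"$'\n';     fi
    if IFS= read -t 0.1 -u 4 -r socket; then out+=">> $socket"$'\n'; fi
done
printf '%s' "$out"

exec 3<&- 4<&-
rm -f "$msg_fifo" "$socket_fifo"
```

Each fifo is drained on the first round; every later round just times out after 0.1 s per read, which is exactly the busyloop trade-off the comment above warns about.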