
(I give context for my question first, the question itself is at the bottom where it says QUESTION in bold).

Take two processes A and B. A checks a condition, sees that it isn't satisfied, and goes to sleep/blocks. B satisfies the condition and wakes A up. If everything happens in that order, then we have no problems.

Now if the scheduler goes:

  1. A checks condition, it's not satisfied
  2. B satisfies condition, wake A up
  3. A goes to sleep/blocks

then we lose the wake-up that B performs for A.

I've come across this problem in the context of implementing a blocking semaphore (i.e. one that puts the wait()ing thread to sleep/blocks it instead of letting it spin-wait). Several sources give solutions to this, among them:

Andrew Tanenbaum, Modern Operating Systems, 4th edition, p. 130:

The essence of the problem here is that a wakeup sent to a process that is not (yet) sleeping is lost. If it were not lost, everything would work. A quick fix is to modify the rules to add a wakeup waiting bit to the picture. When a wakeup is sent to a process that is still awake, this bit is set. Later, when the process tries to go to sleep, if the wakeup waiting bit is on, it will be turned off, but the process will stay awake. The wakeup waiting bit is a piggy bank for storing wakeup signals. The consumer clears the wakeup waiting bit in every iteration of the loop.

This article in the Linux journal ("Kernel Korner - Sleeping in the Kernel", Linux Journal #137) mentions something similar:

This code avoids the lost wake-up problem. How? We have changed our current state to TASK_INTERRUPTIBLE, before we test the condition. So, what has changed? The change is that whenever a wake_up_process is called for a process whose state is TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, and the process has not yet called schedule(), the state of the process is changed back to TASK_RUNNING.

Thus, in the above example, even if a wake-up is delivered by process B at any point after the check for list_empty is made, the state of A automatically is changed to TASK_RUNNING. Hence, the call to schedule() does not put process A to sleep; it merely schedules it out for a while, as discussed earlier. Thus, the wake-up no longer is lost.
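The pattern the article describes is the classic Linux kernel wait idiom, which looks roughly like this (kernel-side pseudocode, not compilable userland code):

```c
/* A's side: announce intent to sleep BEFORE testing the condition */
set_current_state(TASK_INTERRUPTIBLE);
while (list_empty(&work_list)) {
        schedule();                        /* block -- unless already woken */
        set_current_state(TASK_INTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);

/* B's side: wake_up_process(A) flips A back to TASK_RUNNING, so a
 * wakeup arriving between the list_empty() check and schedule()
 * merely turns that schedule() into a brief reschedule, not a sleep. */
```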

As I understand it, this basically says "you can mark a process as intending to sleep/block, so that a wakeup arriving in the meantime cancels the subsequent sleep/block call".

Finally, these lecture notes give code for a blocking semaphore in the last couple of paragraphs (starting at "The pseudo-code below shows the implementation of such a semaphore, called a blocking semaphore:") and use an atomic operation "Release_mutex_and_block (csem.mutex);". They claim that:

Please notice that the P()ing process must atomically become unrunnable and release the mutex. This is because of the risk of a lost wakeup. Imagine the case where these were two different operations: release_mutex(xsem.mutex) and sleep(). If a context-switch would occur in between the release_mutex() and the sleep(), it would be possible for another process to perform a V() operation and attempt to dequeue_and_wakeup() the first process. Unfortunately, the first process isn't yet asleep, so it missed the wake-up -- instead, when it again runs, it immediately goes to sleep with no one left to wake it up.

Operating systems generally provide this support in the form of a sleep() system call that takes the mutex as a parameter. The kernel can then release the mutex and put the process to sleep in an environment free of interruptions (or otherwise protected).

QUESTION: Do processes in UNIX have some way of marking them as "I'm planning on going to sleep", or a "wakeup waiting bit" as Tanenbaum calls it? Is there a system call sleep(mutex) that atomically releases a mutex and then puts the process to sleep/blocks it?

It's probably somewhat apparent that I'm not familiar with system calls and generally OS internals; if there are any false assumptions apparent in my question or misuses of terminology, I'd be happy to have them pointed out to me.

G. Bach
  • Since I came across this while reading up on blocking semaphores, if someone knows any good writeups how they are implemented in detail, I'd be happy to get a reference for that. All textbooks and other publications I could find just say "we do wait() and signal() atomically" without explaining how blocking wait() can be implemented atomically. – G. Bach Jul 18 '17 at 16:39
  • sched_yield is one way for a process to (maybe) get off the CPU – thrig Jul 18 '17 at 18:23
  • This question is far too broad (and has an overlong preamble). Any decent book on Unices will tell you that different Unices have significantly different internals. How such stuff works on Solaris, MacOS, FreeBSD, AT&T System V, and suchlike is quite different at this level. And whilst a full answer would not fill a book, it would fill one or more chapters of such a book. I possess several internals books where it does, not least one written by your namesake, where it is chapter 6. – JdeBP Jul 18 '17 at 18:24
  • @JdeBP Could you point me to that book/some of those books? I don't know who you mean by my namesake; my username was not chosen after a prominent person. Do you mean Maurice Bach and his Design of UNIX? – G. Bach Jul 18 '17 at 18:28
  • I do on my own WWW site. See https://unix.meta.stackexchange.com/questions/2642/ for what people on this WWW site think of requests for pointers to books, though. – JdeBP Jul 18 '17 at 21:12

1 Answer


The classical Unixy solution is to let the "wakeup" signal consist of writing a byte to a pipe that the waiting process will try to read from.

(If the processes don't have a common ancestor that can help set this up, a named pipe can be used instead.)

The waiting process can then use the select() syscall (or one of its more recent redesigned alternatives such as pselect or poll/epoll) to atomically perform "wait until something is ready to read from such-and-such file descriptor". When it wakes up, it will read and discard everything there is to read, and then check what work is ready for it to do.

  • I'm trying to understand how blocking synchronization tools get around spin-waiting, so I wonder how select() or its alternatives actually put the process to sleep and somehow wake it back up, or whether it still gets scheduled and keeps checking the pipe until there's something in it. What I've got in my mind is a three-state process model with running/blocked/ready, and I understand blocked as "the scheduler will never schedule this process". I can't fit that with your answer, could you comment on that? A cursory reading of the select() manpage didn't answer it, maybe I missed it though. – G. Bach Jul 18 '17 at 17:19
  • @G.Bach: I thought you were asking how it's done from userland. I have no specific knowledge how it's implemented in the kernel, sorry. On single-core systems (which is more of an ideal abstraction than actual reality these days), select() is implemented as part of the scheduler, so there's no risk of its being preempted at a time it doesn't want to. For multi-core, the scheduler instances running on each core need to coordinate things among themselves, which ultimately comes down to mutexes with spinlocks and perhaps specific (extremely architecture-dependent) hardware support. – hmakholm left over Monica Jul 18 '17 at 17:25
  • I'm pretty sure the OP wants to know how the kernel avoids lost wakeups. What's more, a pipe for wakeup signals is too heavyweight. – DannyNiu Feb 16 '23 at 06:19