Avoiding races with Unix signals and select()

Emile van Bergen

Some thinking out loud about signals

The problem with signals: you never know when they're delivered. They are completely asynchronous, so delivery may happen before the next system call is made as well as while inside the system call.

This means that a setup where you test for a signal flag that's incremented by a handler, just before going into a blocking call such as select(), has a nasty race: the handler may have been called and the flag incremented just after the test, but before the select.

If that happens, select() won't be interrupted, because the signal has already been handled as far as the kernel is concerned. Moving the test closer to the select() doesn't solve the race condition either; it only narrows the window.

Using blocking masks is no solution either: although signal delivery is deferred until the unmasking call, the signal can still be delivered and handled after the unmasking, but before the select().

As far as I can see, there are three obvious solutions to this problem:

1. A technique I used earlier: make the signal handler write to a pipe that's included in the select set. This is OK as long as you make sure the pipe is nonblocking (otherwise the handler can block if it is called often enough before the pipe is emptied). The implication, however, is that one or more handler invocations may go undetected by the main program, because a write to a full pipe is dropped.
   This doesn't have to be much of a problem, as long as the condition doesn't last too long and the exact number of handler invocations doesn't matter. In most cases it won't, e.g. when a SIGCHLD is delivered, any number of children may actually have exited, so you already have to loop through wait4() with WNOHANG anyway until no more dead children are found.
   However, if we don't use a separate pipe for every signal but want to use the data in the pipe to tell which signal occurred, we may remain unaware of a particular pending signal when we're flooded by others. That's not very attractive.

2. Don't do any work in signal handlers at all; set all non-IGN/DFL handlers to an empty handler that does nothing but reinstall itself, and keep their delivery blocked using sigprocmask(). At the point in the main program where you want to test for delivered signals, you use sigpending() to find out which ones are pending.
   By briefly unblocking and re-blocking delivery, you effectively clear all pending signals (whether they had been noticed or not).
   This does create a race that can cause an event to be missed: a signal that arrives between the sigpending() test and the re-blocking of delivery is cleared without ever having been seen. No good.

3. Do all the work within the signal handler itself, and protect all sections in the main program in which this work may not be carried out, using sigprocmask().
   This is possible, but quite risky, and I'd rather do the opposite: signals are delivered nowhere but at the exact location in the main program where I want to handle them. I also don't want to have to expose the program's full state through global variables just so that a signal handler can do its work.

It seems that there is a tradeoff: if you want to control signal delivery so that you can handle signals in a particular place in your main program, it is possible to miss a signal or delay its handling for a long time, which isn't very desirable. (First two options.)

The other option is the last one, but as it limits control and requires the work associated with signals to happen in an isolated function (the signal handler), it isn't very attractive either.

Requirements

It seems that what we want from a real solution is this:

1. It must be possible to interrupt blocking syscalls and do what's needed before restarting them. In other words, handling signalled events should not be deferred until some other event happens that unblocks the syscall.
   This implies that the signal must either be unmasked during those syscalls, or that the pipe solution is used, although that only works when select() is the only blocking syscall.

2. The exact number of invocations of a handler that happened between the last test and the next one isn't important, but it must always be possible to distinguish between zero and more calls.

3. It must be possible to test and clear this difference atomically at a particular point in the main program.
   This implies that any masking/unmasking trick won't work: there's always the period after the test, before the masking that effectively queues the signal in the pending set.

As said, using a pipe as the queue is attractive as it also solves the problem of interrupting select even though the signal is already delivered when we enter it, by including the reading end in the select. However, there's still the risk of overflowing and missing signals.

Still, the pipe must always have something in it as long as the event hasn't been completely handled and cleared yet, or select will block.

Solution

What if we do it like this: the signal handler empties the pipe first and then writes a byte to it. To all parties concerned, this appears to happen atomically; a signal handler won't interrupt itself (well, we can prevent it on systems that have sigaction()), and surely the main program can never interrupt the signal handler.

That means that the first (and only) byte in the pipe is the actual 'signalled' flag. Once the handler has been called, this flag is always set.

As long as this flag is set, select() will always exit immediately if you include the reading end of the pipe. And, testing and resetting 'atomically' (as far as the handler is concerned) can be done by reading the single byte from it in non-blocking mode.

If the read returns no data, there's nothing to be done. If the read returns the byte, we just tested and cleared the flag (which is the existence of the data byte in the pipe) atomically.

The atomicity comes from the fact that if a reading end of a pipe has two readers (effectively, the signal handler that empties it and puts it back, and the main program), a byte present in it has to go either to one or the other, but it can never disappear, or go to both.

Ok!

Now we want to refine the technique a bit, as we want to know which signals are in the set. The idea is this:

Instead of a byte, we use a 32-bit value or even a real sigset_t that holds the 'pending since last test' set in the pipe.

The signal handler, when called, 'empties' the pipe by reading the word (using zero if the pipe didn't have any data), OR'ing the received signal into it and writing it back.

The main program needs to ensure that the read in the signal handler doesn't interfere with the read that it uses to test-and-clear the set. This used to be easy because of the single byte, but if two readers each attempt to read four bytes from a pipe, you can't guarantee that one doesn't get a short read of two, causing the other two bytes to go to the other.

But if the main process temporarily blocks signal delivery using sigprocmask() around the nonblocking read, we're done.

This way, unhandled signals will cause select() to exit immediately, while the main program can do an atomic test-and-clear to handle them at any desired place, without any races!

Generated on Sun Feb 23 17:20:55 2014 by decorate.pl / menuize.pl