Signals and Threads

Tim · October 7, 2005, 5:20pm

I have a question regarding signals and a threaded application.

I just inherited some very old DOS code that runs under a quasi-real time O/S called AMX (something from the late 80’s - early 90’s). In this quasi-real time O/S the tasks were basically functions linked into the ‘kernel’ running in unprotected memory (similar to what VxWorks is now).

We are totally re-writing everything to run under QNX but I was first asked to port the existing stuff to QNX as a proof of concept to get a baseline test with the hardware. This is a very quick/dirty port not intended to ever be deployed or used for anything other than a test.

So what I decided is to create one multi-threaded process because the old code passes pointers around, accesses global variables and other such stuff.

Now at some point the user can tell the system from the keyboard to shut down. All the threads should then shut down in an orderly way and leave the hardware in a nice state, save data to files etc.

In all the former ‘functions’ that ran as threads I simply added the following code:

extern int runFlag;  // This is a global

while(runFlag)
{
    // former code that is reply blocked for a timer/message/hardware etc.
}

Then I added a signal handler for SIGUSR1 and inside there I simply set the flag to false when the signal is raised (which happens when the user requests shutdown of the system).

Now, there are about 15 threads running in total, most of which are blocked on hardware/message queues/timers etc.

I expected that when the signal was raised it would be delivered to every thread in my app, they would break out of their receive loops, see the flag was false and run their exit routines. But this doesn’t appear to be the case, because when I send the signal only one thread actually exits (whichever thread happens to be active at that moment) and the rest stay blocked on their conditions until they unblock and check the while loop.

So obviously the whole app doesn’t shut down gracefully. Yet if I send an unhandled signal to the app (such as SIGINT) everything immediately shuts down.

So my question is, how do multi-threaded apps such as resource managers shut down gracefully if a signal is delivered that requires them to shutdown and inform every thread of the shutdown? Something really simple would be ideal because this isn’t really code that’s going to be kept when the final re-design is done under QNX.

TIA,

Tim

P.S. One more strange thing. When all the threads are running and I am getting keyboard input from the user, if I enter a CTRL-C all the threads shut down except one that shows it’s reply blocked on the serial port (where I read data). For some reason this thread never dies. Even doing a kill -9 # doesn’t delete it from the list of tasks running. I can run my app again with no problems, so it’s not interfering with the reading from the serial port, and if I CTRL-C again I get another instance of a task reply blocked on the serial port. Anyone know why this might be?

albrecht · October 8, 2005, 3:47pm

Hello Tim,

You want to use pthread_cancel() to properly shut down your threads. I think the thread functions do not return; instead, the thread’s clean-up routines are called in the reverse of the order in which you attached them with pthread_cleanup_push(), and this is where you do the thread-specific clean-ups.
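The cancel-and-clean-up sequence described above can be sketched roughly as follows. This is only a minimal illustration using standard POSIX calls, not the code from the port: the names worker, record_cleanup and cancel_demo are hypothetical, and sleep() stands in for whatever blocking call the real thread makes.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical log of the order in which the clean-up handlers ran. */
static int cleanup_order[2];
static int cleanup_count;

static void record_cleanup(void *arg)
{
    cleanup_order[cleanup_count++] = *(int *)arg;
}

static void *worker(void *unused)
{
    static int first = 1, second = 2;
    (void)unused;

    pthread_cleanup_push(record_cleanup, &first);
    pthread_cleanup_push(record_cleanup, &second);

    for (;;) {
        sleep(1);               /* sleep() is a cancellation point */
    }

    /* Never reached, but push and pop must be paired lexically. */
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 0 if the worker was cancelled and the handlers ran as 2, then 1. */
int cancel_demo(void)
{
    pthread_t tid;
    void *result;
    int rc;

    cleanup_count = 0;
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);

    /* Cancellation is deferred by default, so it takes effect in sleep(). */
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;

    if (result != PTHREAD_CANCELED || cleanup_count != 2)
        return -1;
    return (cleanup_order[0] == 2 && cleanup_order[1] == 1) ? 0 : -1;
}
```

Calling cancel_demo() from main() should return 0; the handler pushed last is the first one to run.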

In the signal handler for SIGINT (which is delivered if you press CTRL-C on the keyboard) you can call pthread_cancel(). You should also install a handler for SIGTERM; it can use the same handler routine.
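A rough sketch of that arrangement, with one handler shared by SIGINT and SIGTERM, might look like the following. All names here (worker_tid, shutdown_handler, blocked_worker, install_and_trigger) are made up for illustration, and raise() merely stands in for the user's CTRL-C. One assumption worth stating: POSIX does not list pthread_cancel() among the async-signal-safe functions, so calling it from a handler relies on the platform tolerating it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static pthread_t worker_tid;    /* hypothetical: the one thread to cancel */

/* Shared by SIGINT and SIGTERM. Assumes pthread_cancel() is safe to call
 * from a handler on the target platform; POSIX does not guarantee it. */
static void shutdown_handler(int signo)
{
    (void)signo;
    pthread_cancel(worker_tid);
}

static void *blocked_worker(void *unused)
{
    (void)unused;
    for (;;)
        sleep(1);               /* cancellation point */
    return NULL;
}

/* Returns 0 if the handler cancelled the worker. */
int install_and_trigger(void)
{
    struct sigaction sa;
    void *result;
    int rc;

    rc = pthread_create(&worker_tid, NULL, blocked_worker, NULL);
    assert(rc == 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = shutdown_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0 || sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;

    /* Stand-in for the real shutdown request; the handler runs in this thread. */
    raise(SIGTERM);

    if (pthread_join(worker_tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

With about 15 threads, the handler would loop over a table of thread IDs instead of cancelling a single one.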

Remember that the threads are still around after they are cancelled; their state is DEAD. Only when you call pthread_join() on that thread ID will the system reclaim all the resources it allocated for that particular thread. This is a nice way of synchronizing: one thread continues only after another thread has exited (and has therefore run its clean-up routines). However, pthread_join() BLOCKS until that thread has actually exited, so make sure it does. An alternative is to create the thread in the detached state (using pthread_attr_setdetachstate()). If you do this, QNX reclaims the thread’s resources immediately after it returns from the entry routine (the clean-up handlers are run also). Not so safe, but may be more useful in your case.
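For the detached alternative, thread creation could look something like this. It is a sketch using the standard POSIX attribute calls; start_detached and detached_worker are hypothetical names, and the assert on the attribute is only a sanity check for the example.

```c
#include <assert.h>
#include <pthread.h>

static void *detached_worker(void *unused)
{
    (void)unused;
    return NULL;                /* resources are reclaimed on return */
}

/* Creates one detached thread; returns 0 on success, an errno value otherwise. */
int start_detached(pthread_t *tid)
{
    pthread_attr_t attr;
    int state = -1;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) {
        /* Sanity check: the attribute really is set to detached. */
        pthread_attr_getdetachstate(&attr, &state);
        assert(state == PTHREAD_CREATE_DETACHED);

        rc = pthread_create(tid, &attr, detached_worker, NULL);
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

A thread started this way must not be passed to pthread_join(), so any ordering between threads at shutdown has to be handled some other way.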

To the P.S.: A thread that is REPLY blocked cannot be killed until the server replies. That is a strange property of QNX. What happens is that the server has already received the message, so it must reply to it before the thread can continue. The resource manager that handles the message usually gets a special pulse that informs it that a client with a connection to it is about to die, and it is then expected to reply to the message. However, many resource managers do not implement this. An alternative is to use select() to wait for input; this can always be unblocked by pthread_cancel(). You can see the difference in pidin output: the thread is shown as SIGWAITINFO blocked rather than REPLY blocked.
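The select() approach could be sketched as below. This is an illustration only: a pipe stands in for the serial port descriptor, and wait_for_input and select_cancel_demo are hypothetical names. It relies on select() being a cancellation point, which POSIX requires.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

/* Waits in select() for data, then reads it. arg points to the descriptor. */
static void *wait_for_input(void *arg)
{
    int fd = *(int *)arg;
    fd_set readfds;
    char buf[64];

    for (;;) {
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        /* The thread blocks here; select() is a cancellation point. */
        if (select(fd + 1, &readfds, NULL, NULL, NULL) > 0) {
            ssize_t n = read(fd, buf, sizeof buf);
            if (n <= 0)
                break;          /* EOF or error: stop waiting */
        }
    }
    return NULL;
}

/* Returns 0 if a thread blocked in select() could be cancelled and joined. */
int select_cancel_demo(void)
{
    int fds[2];                 /* pipe standing in for the serial port */
    pthread_t tid;
    void *result;
    int rc;

    rc = pipe(fds);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, wait_for_input, &fds[0]);
    assert(rc == 0);

    /* Nothing is ever written, so the worker stays blocked until cancelled. */
    pthread_cancel(tid);
    rc = pthread_join(tid, &result);

    close(fds[0]);
    close(fds[1]);
    return (rc == 0 && result == PTHREAD_CANCELED) ? 0 : -1;
}
```

In the real port the descriptor would come from opening the serial device rather than from pipe().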

Hope that helps.

Albrecht

Tim · October 20, 2005, 7:24pm

Albrecht,

This did indeed help!

I simply added a bunch of pthread_cancel() calls in my signal handler, and for those threads that needed to do cleanup I used the pthread_cleanup_push() call to get their exit routines executed. I didn’t need the pthread_join() call because once a thread reaches the dead state I don’t care about it anymore, nor does any thread need to shut down in a particular order relative to another thread.

Now it works like a champ. There was no need to use detached threads and now even those threads that before never exited due to being blocked on hardware are exiting properly because the thread is being canceled.

Thanks,

Tim

albrecht · October 23, 2005, 5:22pm

Hello Tim,

I’m glad that I could help you. One more word about the DEAD state: if your program repeatedly creates threads that later exit without being detached or joined, you might end up with dozens of dead threads which still consume system resources! Also, for reasons of good design I recommend that you explicitly create the threads in the detached state, because that also tells the reader of your code (possibly yourself in six months’ time!): “I know what I’m doing, I don’t care about the threads once they are cancelled, it’s intentional!”

Best regards,
Albrecht