How to implement a blocked interrupt handler in QNX?

yicheng · January 12, 2008, 2:29am

Hi Folks,

I’m writing a driver that has a lengthy interrupt handler. The ISR has to use a mutex/condvar to access a critical section, so it may even sleep. “InterruptAttachEvent()” seems to fit, but the ISR may block on a mutex/condvar that it would itself release when the next interrupt occurs, so a single ISR thread waiting in “InterruptWait()” will deadlock. How do I implement an ISR that may block on a synchronization primitive?

Linux uses a “tasklet” or “workqueue” to do the bottom half of interrupt service. Is there something similar in QNX?

Thanks!
Yicheng

mario · January 12, 2008, 6:01pm

You cannot use a mutex/condvar inside an ISR. An ISR thread set up by InterruptAttachEvent() is like a tasklet/workqueue. If you need to add another layer, set up another thread.

Note that you could set up the ISR thread to receive not just SIGEV_INTR but other types of events as well.

maschoen · January 12, 2008, 7:14pm

I think there may be some terminology confusion here. ISR (Interrupt Service Routine) usually refers to the instructions executed immediately after an interrupt causes the CPU to transfer control. In QNX the OS provides a thin layer above this so that application programmers don’t have to deal with the complication of multiple devices on an interrupt, as well as the ability to unblock a thread. When using InterruptAttachEvent(), the application does nothing during the ISR, but a thread is unblocked afterwards. I think yicheng is referring to this code as an ISR because it services the hardware.

An ISR cannot be designed to have an issue entering critical code. It cannot wait for an application that is already in the critical area to complete. There is no pre-empting or “rescheduling” of an ISR except possibly by a higher priority interrupt. On the other hand, an application may mask the interrupt while it enters critical (and hopefully short) code to prevent a conflict. With multiple CPUs the issues get even more complex, as an application might want to enter “critical” code while another CPU is simultaneously in the ISR.

By using InterruptAttachEvent(), all code, hardware related or not, is executed as a thread, and so the normal synchronization methods, mutexes and condvars work quite well.

There are a number of important latency issues/differences between code in the ISR and code that is executed when an ISR unblocks a thread; however, they are idiosyncratic to the specific hardware being used.

yicheng · January 13, 2008, 2:04am

Yes, you are right, I have to add another thread/layer on top of the current interrupt handler thread to do the lengthy work. The ISR I referred to is set up by “InterruptAttachEvent()” and uses “InterruptWait()”, so it works as a normal thread. I was just wondering if there’s a function similar to “create_singlethread_workqueue()” to make things easier.

yicheng · January 13, 2008, 3:08am

My interrupt handler works as a normal thread. It simply calls “InterruptWait()” to block on the interrupt event.

My code runs on SMP, but since I’m using a normal thread to do interrupt service, this interrupt thread won’t be preempted by another interrupt, so there won’t be a race condition with the ISR, but rather with other application threads.

My problem is that some of my interrupt handlers have to do lengthy work. So in my case I have to use multiple layers of threads: the “bottom” thread that blocks on “InterruptWait()” receives interrupt events and passes them to the thread above it, which handles the lengthy work. The bottom thread has to be given a high priority to avoid being preempted by the thread above. In Linux there’s a workqueue to use, while in QNX I guess I have to do it myself.

mario · January 13, 2008, 4:11am

Yes, the interrupt thread can be preempted by the interrupt thread of a higher priority interrupt, or by another higher priority interrupt.

There isn’t much to do. Just set up two threads. How much simpler can it be? I don’t know about Linux’s workqueue, but I don’t see the difference.
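
The two-thread arrangement described above can be sketched with portable POSIX threads, which QNX also provides. This is only a rough sketch under stated assumptions, not a QNX API: `workqueue_t`, `workqueue_start`, `workqueue_submit`, `workqueue_stop` and `workqueue_demo` are hypothetical names, and in a real driver the caller of `workqueue_submit` would be the interrupt thread that has just returned from InterruptWait() or MsgReceive(). It also assumes that calling malloc() from the interrupt thread is acceptable; a real driver might preallocate its work items.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One unit of deferred work: a function plus its argument. */
typedef struct work_item {
    void (*fn)(void *);
    void *arg;
    struct work_item *next;
} work_item_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    work_item_t    *head, *tail;
    int             stopping;
    pthread_t       worker;
} workqueue_t;

/* Worker thread: runs queued items one at a time, in order. */
static void *worker_main(void *p)
{
    workqueue_t *wq = p;
    pthread_mutex_lock(&wq->lock);
    for (;;) {
        while (wq->head == NULL && !wq->stopping)
            pthread_cond_wait(&wq->nonempty, &wq->lock);
        if (wq->head == NULL)          /* stopping and fully drained */
            break;
        work_item_t *it = wq->head;
        wq->head = it->next;
        if (wq->head == NULL)
            wq->tail = NULL;
        pthread_mutex_unlock(&wq->lock);
        it->fn(it->arg);               /* lengthy work runs without the lock */
        free(it);
        pthread_mutex_lock(&wq->lock);
    }
    pthread_mutex_unlock(&wq->lock);
    return NULL;
}

int workqueue_start(workqueue_t *wq)
{
    pthread_mutex_init(&wq->lock, NULL);
    pthread_cond_init(&wq->nonempty, NULL);
    wq->head = wq->tail = NULL;
    wq->stopping = 0;
    return pthread_create(&wq->worker, NULL, worker_main, wq);
}

/* Called by the (higher priority) interrupt thread; holds the lock briefly. */
int workqueue_submit(workqueue_t *wq, void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    work_item_t *it = malloc(sizeof *it);
    if (it == NULL)
        return -1;
    it->fn = fn;
    it->arg = arg;
    it->next = NULL;
    pthread_mutex_lock(&wq->lock);
    if (wq->tail != NULL)
        wq->tail->next = it;
    else
        wq->head = it;
    wq->tail = it;
    pthread_cond_signal(&wq->nonempty);
    pthread_mutex_unlock(&wq->lock);
    return 0;
}

/* Lets the worker finish everything already queued, then joins it. */
void workqueue_stop(workqueue_t *wq)
{
    pthread_mutex_lock(&wq->lock);
    wq->stopping = 1;
    pthread_cond_signal(&wq->nonempty);
    pthread_mutex_unlock(&wq->lock);
    pthread_join(wq->worker, NULL);
    pthread_cond_destroy(&wq->nonempty);
    pthread_mutex_destroy(&wq->lock);
}

/* Hypothetical demo: queue five additions and return the total. */
static int demo_total;
static void demo_add(void *arg) { demo_total += *(int *)arg; }

int workqueue_demo(void)
{
    static int values[] = { 1, 2, 3, 4, 5 };
    workqueue_t wq;
    demo_total = 0;
    if (workqueue_start(&wq) != 0)
        return -1;
    for (int i = 0; i < 5; i++)
        workqueue_submit(&wq, demo_add, &values[i]);
    workqueue_stop(&wq);
    return demo_total;
}
```

The submitter only holds the mutex long enough to link one item, so a high priority interrupt thread is never held up by the lengthy work itself.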

maschoen · January 13, 2008, 4:37am

yicheng:

My interrupt handler works as a normal thread. It simply calls “InterruptWait()” to block on the interrupt event.

Of course you are free to call it that, but it will only confuse people here.
The same for the term ISR. Let’s say you wrote a serial port driver. Doesn’t most of the code handle interrupts? No interrupts, not much to do.

yicheng:

My code runs on SMP but since I’m using normal thread to do interrupt service, this interrupt thread won’t be preempted by another interrupt, so there won’t be such race condition with the ISR, but rather with other application threads.

I don’t know what you mean here. Unless you turn off interrupts entirely, your thread can always be pre-empted by another interrupt. Maybe once your hardware interrupts, it is disabled and can’t interrupt again until you have completed the servicing of it in your thread. Is that what you mean?

yicheng:

My problem is some of my interrupt handler has to do some lengthy work. So for my case, I have to use multiple layer of threads, the “bottom” thread that blocks on “InterruptWait()” receives interrupt events and passes them over to the above thread that handles the lengthy work, and the bottom thread has to be set with high priority to avoid preempted by the above thread. In Linux, there’s workqueue to use, while in QNX, I guess I have to do it by myself.

How do you pass data from the higher priority thread, to the lower one, or is that the question you wanted us to answer? Your idea about using a Condvar is a good one.

The usual design issue for this type of structure is whether or not an overrun is possible, and what to do about it. If data comes in faster than you can process it for extended periods of time, eventually you will fill up all available memory in your computer. If the data comes in bursts, you merely have to figure out the worst case and make your buffer big enough to handle it. Then, you can use a condvar to have the thread that handles your interrupt add data to the buffer, and your lower priority thread consume it. I think I recall the manuals having a good example of how to do this. The condvar makes synchronization pretty straightforward. The design of the buffer is up to you, be it FIFO, LIFO, circular, a linked list or whatever.
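
One way to picture the buffer described above is a fixed-size circular buffer guarded by a mutex and condvar. The sketch below is an illustration only: `ring_t`, `ring_init`, `ring_put`, `ring_get` and the size `RING_SLOTS` are hypothetical, and dropping data while counting overruns is just one assumed policy for what to do when the consumer falls behind.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define RING_SLOTS 8            /* assumed worst-case burst size */

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             slot[RING_SLOTS];
    size_t          head;       /* index of the oldest item */
    size_t          count;      /* items currently stored */
    unsigned long   overruns;   /* items dropped because the ring was full */
} ring_t;

void ring_init(ring_t *r)
{
    assert(r != NULL);
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->nonempty, NULL);
    r->head = 0;
    r->count = 0;
    r->overruns = 0;
}

/* Producer side (interrupt thread): never waits for space.
 * Returns 0 if stored, -1 if the item was dropped as an overrun. */
int ring_put(ring_t *r, int value)
{
    int rc = 0;
    pthread_mutex_lock(&r->lock);
    if (r->count == RING_SLOTS) {
        r->overruns++;
        rc = -1;
    } else {
        r->slot[(r->head + r->count) % RING_SLOTS] = value;
        r->count++;
        pthread_cond_signal(&r->nonempty);
    }
    pthread_mutex_unlock(&r->lock);
    return rc;
}

/* Consumer side (lower priority thread): blocks until an item exists. */
int ring_get(ring_t *r)
{
    pthread_mutex_lock(&r->lock);
    while (r->count == 0)
        pthread_cond_wait(&r->nonempty, &r->lock);
    int value = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;
    r->count--;
    pthread_mutex_unlock(&r->lock);
    return value;
}
```

Watching `overruns` during testing gives a direct indication of whether the chosen buffer size really covers the worst-case burst.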

yicheng · January 13, 2008, 7:01am

What I mean is that since my interrupt handler works as a normal thread, it calls “InterruptWait()” to block until the next interrupt event, so it won’t serve the next interrupt until it finishes the current one. I call “InterruptUnmask()” to enable the interrupt as soon as the thread unblocks from “InterruptWait()”, and the interrupt handler is called after that. Will this cause a problem?

maschoen:

How do you pass data from the higher priority thread, to the lower one, or is that the question you wanted us to answer? Your idea about using a Condvar is a good one.

The usual design issue for this type of structure is whether or not an overrun is possible, and what to do about it. If data comes in faster than you can process it for extended periods of time, eventually you will fill up all available memory in your computer. If the data comes in bursts, you merely have to figure out the worst case and make your buffer big enough to handle it. Then, you can use a condvar to have the thread that handles your interrupt add data to the buffer, and your lower priority thread consume it. I think I recall the manuals having a good example of how to do this. The condvar makes synchronization pretty straightforward. The design of the buffer is up to you, be it FIFO, LIFO, circular, a linked list or whatever.

I will use a shared work list as the communication mechanism between the two threads: the producer (higher priority thread) puts each new task into the work list, and the consumer (lower priority thread) takes tasks from the list. A condvar is a good method. My concern is whether a condvar signal will be lost if the producer signals the same condvar multiple times while the consumer is still processing the first task, because the producer runs much faster than the consumer.

maschoen · January 13, 2008, 6:15pm

It might, but it depends on how your hardware works, and why you are getting an interrupt. The most likely problem would occur if the hardware required service in a shorter period than the latency to schedule your thread. If your thread is the highest priority in the system, this is unlikely.

Another problem could occur if you are dealing with a level sensitive interrupt that must be reset before the ISR returns. In that case, using AttachInterrupt() would not work at all. You would get the first interrupt, and the OS ISR would return, unmasking the interrupt, which would immediately fire again. Unless you had more than one processor, your system would hang, and otherwise one processor would be busy responding to the interrupt until you reset the hardware in your code.

I’m making this next example up, but I’ve seen hardware even more evil. What if you had an input device that reads data into a FIFO. If the FIFO is empty when a byte comes in, an interrupt is fired so the hardware expects you to drain the FIFO each time you get an interrupt. Now an interrupt occurs and your thread is scheduled. You read a count register that tells you how many bytes are in the FIFO, and then you read them out. What happens if a byte comes in after you read the count register, but before the FIFO is drained? No interrupt. Since the FIFO is not empty, you will never be interrupted again, and your driver is hung. You could fix this by always making sure the count register is zero before quitting, but it is possible there is some kind of hardware that will not provide such a solution.
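
The race in this made-up FIFO example, and the suggested fix of re-checking the count register until it reads zero, can be simulated in plain C. Everything here is hypothetical: `fake_dev_t` stands in for the hardware, `late_bytes` injects a byte that arrives in the middle of a drain, and `drain_once` / `drain_until_empty` are illustrative names rather than any real driver API.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_DEPTH 16

/* Simulated device: a small FIFO plus a count "register". */
typedef struct {
    unsigned char fifo[FIFO_DEPTH];
    size_t head, count;
    int late_bytes;     /* bytes that will arrive while we are draining */
} fake_dev_t;

void dev_push(fake_dev_t *d, unsigned char b)
{
    assert(d->count < FIFO_DEPTH);
    d->fifo[(d->head + d->count) % FIFO_DEPTH] = b;
    d->count++;
}

size_t dev_read_count(fake_dev_t *d)
{
    return d->count;
}

unsigned char dev_read_byte(fake_dev_t *d)
{
    assert(d->count > 0);
    unsigned char b = d->fifo[d->head];
    d->head = (d->head + 1) % FIFO_DEPTH;
    d->count--;
    if (d->late_bytes > 0) {        /* simulate a byte landing mid-drain */
        d->late_bytes--;
        dev_push(d, 0xEE);
    }
    return b;
}

/* Racy version: trusts a single snapshot of the count register. */
size_t drain_once(fake_dev_t *d, unsigned char *out)
{
    size_t n = dev_read_count(d);
    for (size_t i = 0; i < n; i++)
        out[i] = dev_read_byte(d);
    return n;
}

/* Safer version: keeps re-reading the count until it reports zero
 * (or until the caller's buffer of 'cap' bytes is full). */
size_t drain_until_empty(fake_dev_t *d, unsigned char *out, size_t cap)
{
    size_t total = 0, n;
    while (total < cap && (n = dev_read_count(d)) > 0) {
        for (size_t i = 0; i < n && total < cap; i++)
            out[total++] = dev_read_byte(d);
    }
    return total;
}
```

With two bytes queued and one late arrival, `drain_once` leaves a byte stranded in the FIFO, which on the hardware described would mean no further interrupt, while `drain_until_empty` picks it up on its second pass.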

The answer is no, the condvar cannot be lost if you code correctly. You may have noticed that a condvar is protected by a Mutex which prevents such a race condition.
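
A small POSIX sketch of what “coding correctly” can look like here: the information lives in a counter protected by the mutex, not in the signal, and the consumer re-checks that counter in a loop around pthread_cond_wait(). The names `shared_t`, `pending`, `consumer` and `run_burst` are hypothetical, and the producer loop merely stands in for an interrupt thread that outruns its consumer.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int pending;        /* tasks queued but not yet taken by the consumer */
    int done;           /* tasks the consumer has finished */
    int expected;       /* total tasks the producer will queue */
} shared_t;

static void *consumer(void *p)
{
    shared_t *s = p;
    pthread_mutex_lock(&s->lock);
    while (s->done < s->expected) {
        /* Wait on the predicate, not on the signal: if the producer
         * signalled while we were busy, pending is already non-zero
         * and we do not sleep at all. */
        while (s->pending == 0)
            pthread_cond_wait(&s->cond, &s->lock);
        s->pending--;
        pthread_mutex_unlock(&s->lock);
        /* slow processing of one task would happen here */
        pthread_mutex_lock(&s->lock);
        s->done++;
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Queue n tasks as fast as possible; return how many were processed. */
int run_burst(int n)
{
    shared_t s;
    pthread_t t;

    assert(n >= 0);
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.pending = 0;
    s.done = 0;
    s.expected = n;

    if (pthread_create(&t, NULL, consumer, &s) != 0)
        return -1;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&s.lock);
        s.pending++;                    /* record the work under the mutex */
        pthread_cond_signal(&s.cond);   /* may find no waiter; that is fine */
        pthread_mutex_unlock(&s.lock);
    }
    pthread_join(t, NULL);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return s.done;
}
```

A signal sent while nobody is waiting is simply discarded, but because `pending` was incremented under the same mutex, the consumer still sees every task the next time it checks.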

yicheng · January 14, 2008, 1:05am

maschoen:

It might, but it depends on how your hardware works, and why you are getting an interrupt. The most likely problem would occur if the hardware required service in a shorter period than the latency to schedule your thread. If your thread is the highest priority in the system, this is unlikely.

Another problem could occur if you are dealing with a level sensitive interrupt that must be reset before the ISR returns. In that case, using AttachInterrupt() would not work at all. You would get the first interrupt, and the OS ISR would return, unmasking the interrupt, which would immediately fire again. Unless you had more than one processor, your system would hang, and otherwise one processor would be busy responding to the interrupt until you reset the hardware in your code.

If “InterruptAttachEvent()” is used, the OS ISR will disable and mask the interrupt line every time an interrupt fires, and it’s up to the user interrupt thread to enable and unmask the interrupt. Is that correct? So in your case, the user interrupt thread will be kept busy responding to interrupt events, but why would the system hang? Higher priority threads can still preempt the user interrupt thread. What is your solution to this problem if the “InterruptAttachEvent()” method doesn’t work? You still have to face the fact that the interrupt fires much faster than your interrupt service routine can handle it if you use “InterruptAttach()” instead.

maschoen:

I’m making this next example up, but I’ve seen hardware even more evil. What if you had an input device that reads data into a FIFO. If the FIFO is empty when a byte comes in, an interrupt is fired so the hardware expects you to drain the FIFO each time you get an interrupt. Now an interrupt occurs and your thread is scheduled. You read a count register that tells you how many bytes are in the FIFO, and then you read them out. What happens if a byte comes in after you read the count register, but before the FIFO is drained? No interrupt. Since the FIFO is not empty, you will never be interrupted again, and your driver is hung. You could fix this by always making sure the count register is zero before quitting, but it is possible there is some kind of hardware that will not provide such a solution.

Thanks! It’s a very good example, so the interrupt service design really depends on the hardware!

maschoen · January 14, 2008, 6:32pm

I hate to discuss things when the terminology is in question, so let me go over it a bit. Masking an interrupt with the CPU can occur in three ways.
A program can set the interrupt bit in the program status word (PSW) off, which will mask all interrupts. This is useful in (small) sections of code where timing is critical. The second way is to access the interrupt controller itself and update its mask to include and/or exclude specific interrupts. The final way is what happens when an interrupt occurs. The CPU pushes the current PSW and current (return) address onto the stack, turns the PSW interrupt bit off, and transfers control via the vector table to the ISR. This procedure is followed so that all the ISR has to do to return is execute the IRET instruction, which pops the PSW and return address, effectively turning interrupts back on in an atomic instruction and thereby returning to where the CPU left off before the interrupt occurred. If another interrupt is pending when this happens, control is immediately transferred again in the same way.

The way things work with QNX is that interrupts are transferred to an OS routine that masks off lower priority interrupts and then turns on the interrupt bit in the status word. This allows an ISR to be interrupted by a higher priority interrupt. When your ISR is entered, only your own interrupt is guaranteed to be masked off. After your ISR returns to the OS code, your interrupt is automatically unmasked. I don’t think that there is a way to prevent this from happening, especially when you don’t have an ISR, but are just using InterruptAttach() instead. Hmmm, am I confusing InterruptAttach() and InterruptWait()? Maybe.

In addition, hardware that causes interrupts usually has an interrupt status bit in a register as well as an interrupt enable bit. The interrupt status bit shows whether the device itself wants to interrupt the cpu, but without the enable bit set, no signal is sent to the cpu, or rather the interrupt controller chip. This is useful during initial testing as you can see whether your device has interrupted by reading the status bit without setting up an ISR. It is also possible to have a device run in polled mode this way.

Typically, after the hardware has set its interrupt bit, you need to reset it before it will fire again. Forgetting to do this with an edge sensitive interrupt would just prevent another interrupt. With a level sensitive interrupt, which you will find on the PCI bus, forgetting to reset the interrupt will leave the interrupt at the trigger level. So as soon as the ISR is done, and its interrupt is unmasked, it will immediately fire again. Unless the OS has some kind of fail-safe feature, this will continue and the system will be hung in a loop where the ISR executes, followed by another interrupt.

“Hang” is just from the user’s perspective. The CPU is busy executing the ISR code over and over. Whenever the ISR returns, unmasking the interrupt, the interrupt occurs again because the interrupt signal hasn’t been reset. So nothing else gets done.

You say that like it is a bad thing. If for some odd reason, you have a critical timing constraint in ISR code, you can briefly turn the PSW interrupt bit off and be sure that no other interrupt can occur.

I mentioned using the cli/sti instructions to turn all interrupts off, but this is rarely needed. You can do this in either the ISR or the thread.

Yes, well if this is the case, you need to put code in your ISR to service the hardware. (My use of the word ISR, not yours). That code can put any data received in a buffer and fire a pulse off to wake up your thread.
As before, you have to be careful about simultaneous access to your buffer structure.

rgallen · January 14, 2008, 7:10pm

Stop with the InterruptWait() thing!

Use MsgReceive() with InterruptAttachEvent().

It should all fall together for you with that approach.

yicheng · January 15, 2008, 6:36am

First, let me confirm that “your ISR” refers to the user interrupt handler thread that uses “InterruptAttachEvent()”. Is the interrupt really automatically unmasked? Here is a quote from the QNX system documentation: “By using the InterruptAttachEvent() call, no user ISR is run… The interrupt is automatically masked when the event is generated and then EXPLICITLY unmasked by the thread that handles the device at the appropriate time.” It looks like we have to explicitly unmask the interrupt in our user interrupt handler thread.

Is “InterruptAttach()” the only way to have my own ISR in QNX?

yicheng · January 15, 2008, 6:44am

Could you elaborate on the exact reason? I don’t see any difference between these two. Is InterruptWait() a wrapper around a pulse and MsgReceive()?

rgallen · January 16, 2008, 1:07am

No, InterruptWait is a different kernel call. InterruptWait does not require you to set up a channel, and as a result (since channels are the means of priority conveyance) there is no priority inheritance with InterruptWait. This is one reason it is bad™.

There are other reasons why it is bad, such as:

You can’t receive messages through an InterruptWait
You can’t attach different sigevent structs for different interrupts.

We all know there are 10 types of people in the world; those who like Neil Diamond and those that don't.

InterruptWait is for people who like Neil Diamond

Actually, InterruptWait is only useful for a very limited set of applications, and yours is not one of them. You clearly want a thread pool on the channel (with a minimum of 2 threads I am guessing).

Read up on thread pools and imagine your code with a MsgReceive instead of an InterruptWait, and I think everything will make sense…

mario · January 16, 2008, 12:42pm

Doesn’t InterruptWait() have a speed advantage?

rgallen · January 17, 2008, 12:19am

Don’t see any reason why it would. Internally it is still a sigevent, the kernel still needs to reschedule, and move the thread from the blocked list to the ready list. If there were any difference (maybe some advantage in lookup of priority???) it would seem to be negligible.

You can verify this by looking at the kernel source now…

rgallen · January 17, 2008, 1:20am

OK, there won’t be the lookup for the channel; so it will be slightly faster.

Small price to pay to have control of schedulability, not to mention how complicated the design in this thread was getting trying to avoid this small penalty…

xtang · January 17, 2008, 6:16pm

InterruptWait() has a speed advantage over, say, pulse_attach() (i.e. the resmgr framework). But if you MsgReceive() on a specific channel, there shouldn’t be much difference.