hey David…
Here are some things I have noticed:
o You should be passing _NTO_INTR_FLAGS_TRK_MSK to InterruptAttachEvent() so the kernel tracks the mask/unmask count on the interrupt.
o The call to InterruptDetach() should be passing in the iid (from the Attach) and not the interrupt number.
o Is Timer_IRQ the system timer or some other bit of hardware? If it is some other bit of hardware (which it appears to be), where is your interrupt ack? I see it in your timer_isr() code but not in your thread-based handler. You will want to do the out16(hDeviceHandle+IRQ_CLR, 0x0001) before you unmask the interrupt.
o Your problems with timer_isr() come from referencing global data directly. You are meant to pass everything in via the “void *arg” pointer. Normally you would set up a structure with the bits you need and pass it (and its size) into InterruptAttach(), which would then get passed into your handler. In your case this would be something like:
struct myarea {
    struct sigevent event;
    int hDeviceHandle;
};
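Fleshed out a little, a minimal sketch of the idea (not your exact driver; the IRQ number, register offset, and I/O base below are placeholders standing in for your hardware, and I've made the handle a uintptr_t since out16() takes a port handle):

    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/neutrino.h>
    #include <sys/siginfo.h>
    #include <hw/inout.h>

    #define Timer_IRQ 5          /* placeholder: your real IRQ line */
    #define IRQ_CLR   0x02       /* placeholder: your ack register offset */

    struct myarea {
        struct sigevent event;
        uintptr_t hDeviceHandle;
    };

    /* The ISR touches nothing but what it can reach through "arg". */
    static const struct sigevent *timer_isr(void *arg, int id)
    {
        struct myarea *a = arg;

        out16(a->hDeviceHandle + IRQ_CLR, 0x0001);  /* ack the device */
        return &a->event;                           /* wake the waiting thread */
    }

    int main(void)
    {
        static struct myarea area;
        int iid;

        ThreadCtl(_NTO_TCTL_IO, 0);                 /* I/O privilege for out16() */
        SIGEV_INTR_INIT(&area.event);
        area.hDeviceHandle = mmap_device_io(4, 0x300);  /* placeholder base */

        iid = InterruptAttach(Timer_IRQ, timer_isr, &area, sizeof(area), 0);
        for (;;) {
            InterruptWait(0, NULL);   /* blocked until the ISR returns the event */
            /* ... thread-level work goes here ... */
        }
        /* on shutdown you would call InterruptDetach(iid) -- the iid, not the IRQ */
    }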
Based on your comments I think you are a little confused about the difference between InterruptAttach() and InterruptAttachEvent(). In the end, you have to do the same work in both cases. It turns out, however, that in most cases all one needs is to have the sigevent delivered on interrupt, and the rest of the work can be done at the thread level. So, for ease of use, InterruptAttachEvent() exists to mask the interrupt and deliver the specified sigevent (and nothing more).
The only reason you would use InterruptAttach() over the Event() version is if you have timing-sensitive hardware (i.e. you have to hit the hardware within X amount of time and can't wait for a thread to run) OR if you can avoid having a thread run on every interrupt. For example, on some devices not every interrupt requires you to do anything besides ack the interrupt. In those cases you don't need to return the sigevent structure and can avoid a context switch to the handler thread. If you start having really high interrupt rates (50 kHz and beyond), being able to control context switches in this manner becomes critical.
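For reference, the event-based version with the points above folded in (ack before unmask, the TRK_MSK flag, a pulse at a decent priority) would look roughly like this; the channel setup, priority, and register names are placeholders, not your code:

    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/neutrino.h>
    #include <sys/siginfo.h>
    #include <hw/inout.h>

    #define Timer_IRQ   5                /* placeholder IRQ line */
    #define IRQ_CLR     0x02             /* placeholder ack register offset */
    #define TIMER_PULSE (_PULSE_CODE_MINAVAIL + 0)

    int main(void)
    {
        struct sigevent event;
        struct _pulse pulse;
        uintptr_t hDeviceHandle;
        int chid, coid, iid;

        ThreadCtl(_NTO_TCTL_IO, 0);
        hDeviceHandle = mmap_device_io(4, 0x300);    /* placeholder I/O base */

        chid = ChannelCreate(0);
        coid = ConnectAttach(0, 0, chid, _NTO_SIDE_CHANNEL, 0);

        /* pick a pulse priority above everything the handler must preempt */
        SIGEV_PULSE_INIT(&event, coid, 21, TIMER_PULSE, 0);

        iid = InterruptAttachEvent(Timer_IRQ, &event, _NTO_INTR_FLAGS_TRK_MSK);
        for (;;) {
            MsgReceivePulse(chid, &pulse, sizeof(pulse), NULL);
            out16(hDeviceHandle + IRQ_CLR, 0x0001);  /* ack the device first... */
            InterruptUnmask(Timer_IRQ, iid);         /* ...then re-enable the line */
            /* ... thread-level work goes here ... */
        }
    }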
Hope this helps a little…
chris
David Kuechenmeister <david.kuechenmeister@viasat.com> wrote:
Thanks for your interest in this problem. Let me back up a little, as I have
already done some of the things that you suggest.
I was initially using the implementation:

    struct sigevent timer_isr_event;
    SIGEV_INTR_INIT( &timer_isr_event );
    InterruptAttach(Timer_IRQ, timer_isr, NULL, 0, 0);
    setprio(0, sched_get_priority_max(SCHED_FIFO) - PROC_TIMER_PRI);

where PROC_TIMER_PRI is 10 and was the highest application priority,
to connect to an ISR that only cleared an interrupt. Later, I added a call
to toggle a bit on an IO line. I could detect the interrupt pulse and I
displayed it on a logic analyzer along with the toggled bit. The logic
analyzer was set to trigger on “long” pulses from the ISR, i.e. the toggled
bit wasn’t toggled when it should have been. Initially, the transitions of
the toggled bit followed the interrupt pulse. After some large amount of
time, usually overnight, the pulse generated by the toggled bit would span 2
or 3 interrupts in every 10 or 15 interrupts.
I understand, perhaps incorrectly, that the ISR isn't scheduled; the interrupt just needs to be detected. It's the returned event from the ISR that makes the handler ready to run. I think hardware interrupts are the lowest priority on an x86 processor, so if the kernel was busy and had disabled interrupts, I would never see a hardware interrupt that arrived during that busy time, would I?
I didn't look at the scheduling latency because I have an FPGA that handles
the I/O at 600 Hz. The fine time differences would be lost because all the
data is output at the same time during a 600 Hz frame. I might look at
writes on the address bus, though.
I tried the InterruptAttachEvent(Timer_IRQ, &timer_isr_event, 0) call just to
try a different route to the handler. I figured if the context switches to
the ISR were removed, the timing might be changed enough to see some change
in the overall behavior. It didn’t.
The problem seems to still be missed interrupts, rather than handler scheduling. I say this because, with InterruptAttachEvent() used as defined above, isn't the kernel solely responsible for detecting the interrupt and scheduling the handler? That is, doesn't the kernel perform the additional function of the ISR that is used with the InterruptAttach() call?
Incidentally, there is quite a bit more processing taking place on this
board. The interrupt handler is just there to kick off a data collection
process. The data is processed pretty extensively after that. Spin puts the average CPU load at 80 to 90 percent. I guess if the spin contribution to that figure is removed, the CPU is at about 75 to 85 percent.
Thanks again for your interest and any suggestions.
Sincerely,
David Kuechenmeister
"Rennie Allen" <rallen@csical.com> wrote in message
news:3DEDA95A.4050604@csical.com...
David Kuechenmeister wrote:
My conclusion is that the kernel is somehow too busy to service the hardware interrupt and so my timing suffers.
Hmmm, since the kernel is preemptable, and (presumably) your handler is at a higher priority than the kernel, I can't see how the kernel could be involved. The kernel should only make a scheduling decision upon exit from the handler, and, if you're using InterruptAttachEvent(), and if your event is a pulse, and if the pulse priority is the highest priority, then the scheduler will context switch (and the context switch is, as Chris states, very small) directly into your handler.
In order to determine if scheduling latency is playing a role, you should replace InterruptAttachEvent() with InterruptAttach() of a handler that does nothing more than toggle an I/O line high and return the same event that you would have registered with InterruptAttachEvent(). Then, as the first instruction after the receive in your handler, toggle the I/O line low. When the problem situation occurs, compare the scheduling latency to that before the situation occurred (obtained by hooking a scope to the I/O line); if it hasn't changed, then the kernel is not involved.
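Something like this, to make the test concrete (a sketch only; the IRQ number, GPIO register, and I/O base are hypothetical stand-ins for your hardware):

    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/neutrino.h>
    #include <sys/siginfo.h>
    #include <hw/inout.h>

    #define Timer_IRQ 5                  /* hypothetical IRQ line */
    #define GPIO_REG  0x04               /* hypothetical I/O-line register */
    #define LAT_PULSE (_PULSE_CODE_MINAVAIL + 0)

    struct latarea {
        struct sigevent event;
        uintptr_t base;
    };

    /* ISR: raise the line, then return the same pulse event you would
     * otherwise have registered with InterruptAttachEvent(). */
    static const struct sigevent *lat_isr(void *arg, int id)
    {
        struct latarea *a = arg;

        out16(a->base + GPIO_REG, 1);    /* line high: interrupt fired */
        return &a->event;
    }

    int main(void)
    {
        static struct latarea a;
        struct _pulse pulse;
        int chid, coid, iid;

        ThreadCtl(_NTO_TCTL_IO, 0);
        a.base = mmap_device_io(8, 0x300);           /* hypothetical I/O base */
        chid = ChannelCreate(0);
        coid = ConnectAttach(0, 0, chid, _NTO_SIDE_CHANNEL, 0);
        SIGEV_PULSE_INIT(&a.event, coid, 21, LAT_PULSE, 0);

        iid = InterruptAttach(Timer_IRQ, lat_isr, &a, sizeof(a), 0);
        for (;;) {
            MsgReceivePulse(chid, &pulse, sizeof(pulse), NULL);
            out16(a.base + GPIO_REG, 0); /* line low: handler is running */
            /* the line's high time on the scope is the scheduling latency */
        }
    }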
As a data point, a project I worked on in a previous life ran a 5 kHz control loop, not simple data acquisition (on 16 channels simultaneously), under QNX4 on a 50 MHz 486, so a 166 MHz processor doing only a single channel at only 600 Hz should be using only a microscopically small fraction of the available CPU.
My question is what causes the kernel to get in this state. If the processor was overloaded, wouldn't the problem show up immediately? When I use "spin", thanks Igor, the only difference in CPU consumption I can see is that my interrupt handler takes a little more CPU time when it is missing interrupts. All I do is send a message from the handler to schedule my periodic process. Again, why does this work well for half a day, then go south?
You have checked the priority of the pulse, right?
Rennie
--
Chris McKillop <cdm@qnx.com>      "The faster I go, the behinder I get."
Software Engineer, QSSL                  -- Lewis Carroll --
http://qnx.wox.org/