Long interrupt latency

Art_Hays1 · March 8, 2002, 7:26am

Under 4.25 the max interrupt latency I see is 17usec (5usec typical) on a
450 MHz P3.

Tested under the same conditions, on the same machine, on 6.1 the max
interrupt latency I see is 120usec (4usec typical). This happens when I
scroll a window using the rage driver. Is this just a problem with the rage
driver locking out interrupts? Would other display drivers give me the 4.25
performance?

More info about measurement:

Machine: Dell Optiplex GX1+ 450MHz P3. ISA a/d converter interrupting every
msec on interrupt 10. Measuring latency with Tek 3054 by length of
assertion on bus of IRQ10. Downloaded latest 6.1 patches two days ago.

–
Art Hays
National Institutes of Health
avhays@nih.gov

Rennie_Allen2 · March 8, 2002, 4:28pm

Art Hays wrote:

Under 4.25 the max interrupt latency I see is 17usec (5usec typical) on a
450 MHz P3.

Tested under the same conditions, on the same machine, on 6.1 the max
interrupt latency I see is 120usec (4usec typical). This happens when I
scroll a window using the rage driver. Is this just a problem with the rage
driver locking out interrupts? Would other display drivers give me the 4.25
performance?

I don’t believe you will ever achieve QNX4 worst-case performance (wrt
interrupt latencies) on QNX6. However, it seems to be fairly common for
video drivers to negatively impact interrupt latency. I don’t know much
about video cards, but I think they can be fairly piggish on the bus
(and the extent of piggishness does vary between hardware
types/drivers). Of course, since you state that the hardware is the same
in both the QNX4 and QNX6 cases, the discrepancy points to the rage
driver (or at least the way it is configuring the hardware). 120usec
worst case definitely sucks.

Art_Hays1 · March 9, 2002, 2:43am

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

I don’t have a problem with some drivers locking out interrupts if some that
do not are available. After all, I chose QNX over things like Linux with
underlying real-time executives because I need bounded scheduling latencies
of user mode processes. However, I don’t want to give up good interrupt
latencies either.

Rennie_Allen2 · March 9, 2002, 5:55am

Art Hays wrote:

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic. Overall, QNX6 is a much more capable RTOS
than QNX4 (try to put 8 SMP processors to work under QNX4).

Stephen_Thomas · March 9, 2002, 4:59pm

Art Hays <avhays@nih.gov> wrote in message news:a6bse2$1c5$1@inn.qnx.com…

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

The reports by Dedicated Systems have some real numbers.
http://www.dedicated-systems.com/encyc/buyersguide/rtos/QNX61Report.htm

Regards,
Stephen

Armin_Steinhoff1 · March 9, 2002, 6:15pm

Rennie Allen wrote:

Art Hays wrote:

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

The QNX6 kernel is a fully preemptive kernel … so you should expect
better context switching.

The average interrupt latency (200 MHz CPU!) is 1.7 us

Overall, QNX6 is a much more capable RTOS than QNX4 (try and put 8 SMP processors
to work under QNX4).

Thread support … dynamic libraries/shared objects are also very
important.

Armin

Armin_Steinhoff1 · March 9, 2002, 6:40pm

Art Hays wrote:

Under 4.25 the max interrupt latency I see is 17usec (5usec typical) on a
450 MHz P3.

Tested under the same conditions, on the same machine, on 6.1 the max
interrupt latency I see is 120usec (4usec typical). This happens when I
scroll a window using the rage driver. Is this just a problem with the rage
driver locking out interrupts? Would other display drivers give me the 4.25
performance?

More info about measurement:

Machine: Dell Optiplex GX1+ 450MHz P3. ISA a/d converter interrupting every
msec on interrupt 10. Measuring latency with Tek 3054 by length of
assertion on bus of IRQ10.

The rage driver (io-graphics) runs at priority 12 … that means that if the
interrupt thread of the a/d converter runs at 10, you will see delayed
processing.

Solution → set the priority of the a/d converter driver to, e.g., 30.
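
To make the priority argument concrete, here is a minimal sketch in Python.
It is a toy model, not QNX code: among the ready threads, the one with the
highest priority runs first. The thread names and the `pick_next` helper are
hypothetical; the priorities 12, 10 and 30 are the ones mentioned above.

```python
# Toy model of fixed-priority scheduling: the ready thread with the
# highest priority number runs first (a higher number means a higher
# priority, as in QNX Neutrino). Names and helper are illustrative only.

def pick_next(ready):
    """Return the name of the highest-priority thread in `ready`.

    `ready` maps thread name -> priority. Ties go to the thread that
    was inserted first, which is a simplification of a real scheduler.
    """
    if not ready:
        return None
    return max(ready, key=ready.get)


# The a/d thread at priority 10 loses to io-graphics at 12 ...
before = {"io-graphics": 12, "adc-thread": 10}
# ... but wins once it is raised to 30.
after = {"io-graphics": 12, "adc-thread": 30}

print(pick_next(before))
print(pick_next(after))
```

Note that this only models which thread gets the CPU. It says nothing about
time during which interrupts themselves are masked, so it would not explain
a delay seen on the IRQ line before any thread is scheduled.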

Regards

Armin

Downloaded latest 6.1 patches two days ago.

–
Art Hays
National Institutes of Health
avhays@nih.gov

Igor_Kovalenko2 · March 10, 2002, 12:44am

True. Which is why the recommended way to handle interrupts is to not use
‘handlers’ at all and deliver events instead so all the interrupt processing
code would be schedulable. Too bad QNX does not follow their own advice -
that approach only makes sense when everyone is playing by the rules.

I believe also that Solaris kernel actually schedules interrupt handlers.
Curious if QNX could do that… (that probably would not make things faster
though).

– igor

“Alex Timofeev” <vchc@mail.ru> wrote in message
news:a6fnrt$jh3$1@inn.qnx.com…

Clear the interrupt line inside the ISR. Your thread could be preempted by
another ISR, whose priority is always higher than that of regular processes.

The interrupt thread is running at pri 25. This is the highest on my
system.

Igor_Kovalenko2 · March 10, 2002, 3:57am

The latest, most definitive answer I got was that QNX6 is indeed a little
slower in some places in exchange for more robustness. The x86 kernel tries
to change protection levels on the fly in such a way that it can trap faulty
interrupt handlers, rather than let them crash the kernel.

That applies only to the x86 version, since no other architecture provides
hardware means to do it. I am not sure if that has any relation to the
observed spikes in interrupt latency.

igor

Art_Hays1 · March 10, 2002, 4:47am

The interrupt thread is running at pri 25. This is the highest on my
system.

Chris_McKillop1 · March 10, 2002, 8:13am

Art Hays <avhays@nih.gov> wrote:

The interrupt thread is running at pri 25. This is the highest on my
system.

And what priority do you have the pulse set to in the sigevent structure?

chris

–
Chris McKillop <cdm@qnx.com> “The faster I go, the behinder I get.”
Software Engineer, QSSL – Lewis Carroll –
http://qnx.wox.org/

Alex_Timofeev · March 10, 2002, 4:56pm

Clear the interrupt line inside the ISR. Your thread could be preempted by
another ISR, whose priority is always higher than that of regular processes.

The interrupt thread is running at pri 25. This is the highest on my
system.

Mario_Charest1 · March 11, 2002, 2:27am

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C89A3EC.2080202@csical.com…

Art Hays wrote:

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

Why?

Overall, QNX6 is a much more capable RTOS
than QNX4 (try and put 8 SMP processors to work under QNX4).

Please don’t compare this with QNX4, as QNX6 has a special version
for SMP. So, implementation-wise, the single-CPU kernel of QNX6 should
be compared with the single-CPU kernel of QNX4, no?

Kris_Warkentin1 · March 11, 2002, 1:57pm

“Armin Steinhoff” <a-steinhoff@web_.de> wrote in message
news:3C8A514B.FED4549F@web_.de…

Rennie Allen wrote:

Art Hays wrote:

Your post has somewhat worried me… I’ve used QNX4 for years and assumed
RTP would be equal or better. However, I did a 10 minute search of
www.qnx.com looking for a published maximum or worst-case interrupt latency
spec and couldn’t find one for RTP (I only found ‘typical’ specs).

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

The QNX6 kernel is a fully preemptive kernel … so you should expect
better context switching.

The average interrupt latency (200 MHz CPU!) is 1.7 us

I think that someone doing hard realtime doesn’t care about average latency.
They want to know MAXIMUM latency.
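
A small sketch of why the two figures answer different questions, written in
Python as a plain illustration. The sample values are made up (they are not
measurements from this thread), and `summarize` and `meets_deadline` are
hypothetical helper names.

```python
# Summarise interrupt-latency samples given in microseconds. The sample
# values below are invented for illustration; they are not measurements.

def summarize(samples_us):
    """Return (mean, worst_case) for a non-empty list of latencies."""
    if not samples_us:
        raise ValueError("need at least one sample")
    mean = sum(samples_us) / len(samples_us)
    return mean, max(samples_us)


def meets_deadline(samples_us, budget_us):
    """A hard deadline holds only if every sample fits the budget.

    Expects a non-empty list, like summarize().
    """
    return max(samples_us) <= budget_us


# 999 quick responses and a single slow outlier.
samples = [4.0] * 999 + [120.0]
mean, worst = summarize(samples)
print(f"mean={mean:.3f} us, worst={worst:.1f} us")
```

With data shaped like this, the mean barely moves because of the one outlier,
yet a 20 usec budget is still missed. That is the reason a hard realtime
design has to be checked against the maximum.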

Kris

Overall, QNX6 is a much more capable RTOS than QNX4 (try and put 8 SMP
processors to work under QNX4).

Thread support … dynamic libraries/shared objects are also very
important.

Armin

Rennie_Allen2 · March 11, 2002, 4:38pm

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

Why?

Simply, if the kernel does more work, the longest path through the
kernel is certainly going to be longer.

Rennie

Mario_Charest1 · March 11, 2002, 5:23pm

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C8CDD6E.6080300@csical.com…

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

Why?

Simply, if the kernel does more work, the longest path through the
kernel is certainly going to be longer.

Igor gave information about the extra work the QNX6 kernel is
doing over QNX4. Can you be more specific about your claim, Rennie?
I don’t buy the “QNX4 designed for x86” thing.

Rennie

Rennie_Allen2 · March 11, 2002, 7:33pm

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C8CDD6E.6080300@csical.com…

Igor gave information about the extra work the QNX6 kernel is
doing over QNX4. Can you be more specific about your claim, Rennie?
I don’t buy the “QNX4 designed for x86” thing.

I am merely observing that it is unreasonable to expect a more
sophisticated (i.e. does more stuff) kernel to do “more stuff” without
any associated costs (there’s a basic engineering principle in there
somewhere).

What specific features of the new kernel affect latency (besides the
fact that almost all code paths in the kernel affect latency to some
degree), and precisely what impact they have on latencies, I don’t know;
but I do know that, as smart as the kernel gurus at QSSL are, they cannot
make the new instructions that they add execute without consuming any
clock cycles.

Mario_Charest1 · March 11, 2002, 8:16pm

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C8D0688.609@csical.com…

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C8CDD6E.6080300@csical.com…

Igor gave information about the extra work the QNX6 kernel is
doing over QNX4. Can you be more specific about your claim, Rennie?
I don’t buy the “QNX4 designed for x86” thing.

I am merely observing that it is unreasonable to expect a more
sophisticated (i.e. does more stuff) kernel to do “more stuff” without
any associated costs (there’s a basic engineering principle in there
somewhere).

What stuff??? I’m no kernel programmer but I don’t see how the
new stuff explains the slowdown (aside from what Igor has pointed out).
Without details I’m not ready to make that leap of faith.

What specific features of the new kernel affect latency (besides the
fact that almost all code paths in the kernel affect latency to some
degree), and precisely what impact they have on latencies, I don’t know;
but I do know that as smart as the kernel gurus at QSSL are, they cannot
make the new instructions that they add, execute without consuming any
clock cycles

I see your point, but I don’t agree. If I have a calculator that only
supports + and - and I want to add support for * and /, I don’t see why
supporting * and / would slow down + and -.

Igor_Kovalenko2 · March 11, 2002, 8:52pm

“Armin Steinhoff” <a-steinhoff@web_.de> wrote in message
news:3C8E6FCA.9486DAC8@web_.de…

Rennie Allen wrote:

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message

I am not trying to worry you. I think you can definitely do much better
than 120usec, but QNX4 is a simpler kernel, designed for x86 only; to
expect the much more sophisticated Neutrino kernel to perform as well as
QNX4 is a little unrealistic.

Why?

Simply, if the kernel does more work, the longest path through the
kernel is certainly going to be longer.

Any execution of a longer path can be preempted …

Almost ‘any’, but still a good point; I was curious whether someone would
say that.

However long kernel paths might be, the kernel is preemptible and disables
interrupts only during certain short windows, and the docs actually say how
long those windows are (in opcodes). The extra work needed to set up
protection probably does contribute to longer latency, since it has to be
done before the interrupt handler is invoked. However, I don’t understand
why latency could have such high jitter (i.e., the worst case being SO MUCH
worse than average), assuming we rule out evil things like SMM.

– igor

Rennie_Allen2 · March 11, 2002, 9:00pm

Mario Charest wrote:

“Rennie Allen” <rallen@csical.com> wrote in message
news:3C8D0688.609@csical.com…

I see your point, but I don’t agree. If I have a calculator that only
supports + and - and I want to add support for * and /, I don’t see why
supporting * and / would slow down + and -.

This is a poor analogy.

The scheduler is entered via an event of some sort (kernel call,
interrupt) and then does some processing to figure out what is going to
be scheduled. If you make those decisions more complex, then you
lengthen the path (this is not akin to creating a totally new and
completely independent path, as in your analogy).

One obvious mechanism that the QNX6 kernel supports, that the QNX4
kernel didn’t, is a more sophisticated “proxy” mechanism (i.e. events).
It doesn’t take much imagination to see how the scheduling work
that follows an interrupt which dispatches a QNX6 pulse would take more
time than the scheduling work that follows an interrupt dispatching a
QNX4 proxy (e.g. the fact that a pulse payload can change every time
means that, at the very least, the compression algorithm must be more
complex, or not exist at all in favor of queueing everything).
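
To illustrate the difference in bookkeeping, here is a rough sketch in
Python. It is a toy model and not how either kernel is implemented:
`ProxyModel` and `PulseModel` are hypothetical names, and the assumption is
simply that a proxy always delivers the same fixed message while a pulse
carries a code and a value that may differ each time.

```python
from collections import deque

# Toy model, not QNX code. A proxy-like notification always delivers the
# same fixed message, so pending triggers can be collapsed into a counter.
# A pulse-like notification carries a payload that may differ each time,
# so in this model every one is queued individually.

class ProxyModel:
    def __init__(self):
        self.pending = 0            # number of undelivered triggers

    def trigger(self):
        self.pending += 1           # constant work, no storage per trigger

    def receive(self):
        """Consume one pending trigger; return False if none is pending."""
        if self.pending == 0:
            return False
        self.pending -= 1
        return True


class PulseModel:
    def __init__(self):
        self.queue = deque()        # one entry per undelivered pulse

    def send(self, code, value):
        self.queue.append((code, value))

    def receive(self):
        """Return the oldest (code, value) pair, or None if empty."""
        return self.queue.popleft() if self.queue else None


proxy = ProxyModel()
for _ in range(3):
    proxy.trigger()

pulse = PulseModel()
for value in (10, 20, 30):
    pulse.send(1, value)
```

Three proxy triggers leave a single integer behind, whereas three pulses
leave three queue entries that must each be stored and later dequeued. How
much of this cost shows up in a real worst-case path is not something this
model can say.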

With good design (which I am sure exists in the QNX6 kernel) you can do
a lot to mitigate the effect of this new code, and you may even be able
to make your average latency as good as the old scheduler, but the worst
case latency is going to be impacted since (by definition) it involves
executing all of the new code.

Rennie