Interrupt latency measurements, anyone?

Hi!

We have seen the following interesting behaviour on a machine that we
are integrating with 4.25. We have a timer that goes off every 20
microseconds. (Yes, micro-, not milli-). It generates an interrupt
that comes in on IRQ 4. We have a scope on the interrupt line and
another on a chip select that we can toggle programmatically. We also
have a scope probe on a line that is asserted when the interrupt
routine takes the sample (this causes the IRQ to go low.)

The hardware is from Octagon Systems and has an A/D card and also a
133 MHz Pentium. We have checked that the IRQ coming out of the A/D
matches the IRQ going into the CPU (thus, not a bus contention type of
problem).

Here’s the interesting thing… We watch this on the scope. IRQ 4
goes high. About five microseconds later, the interrupt routine reads
the A/D and IRQ 4 goes low. Hm… that’s about in line with QNX’s
“typical interrupt latency” specification for a chip like
this. But… Every now and then, IRQ 4 goes high, and nothing
happens. 20 microseconds go by, the timer goes off again, and we
get hosed, having missed a sample from the A/D. Is there hidden
meaning to the word “typically” in the QNX literature?
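
As a rough sanity check on these numbers, the sketch below converts the 20 microsecond period and the observed 5 microsecond latency into CPU cycles at 133 MHz, and works out how much slack is left before the next timer tick. The function names are made up for illustration, and the arithmetic assumes the clock rate is the only thing that matters (no cache misses, no bus stalls).

```python
CPU_HZ = 133_000_000   # 133 MHz Pentium, from the post
PERIOD_US = 20.0       # timer period, from the post
LATENCY_US = 5.0       # observed IRQ-high to A/D-read time, from the post


def cycles(us, cpu_hz=CPU_HZ):
    """Number of CPU clock cycles that fit in `us` microseconds."""
    return round(us * cpu_hz / 1_000_000)


def slack_us(period_us=PERIOD_US, latency_us=LATENCY_US):
    """Time left in one period after a typical interrupt has been serviced."""
    return period_us - latency_us


if __name__ == "__main__":
    print("cycles per period:   ", cycles(PERIOD_US))
    print("cycles until A/D read:", cycles(LATENCY_US))
    print("slack per period (us):", slack_us())
```

Under these assumptions there are only about 15 microseconds of slack, so anything that holds off the interrupt for roughly that long would be enough to lose a sample.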

We have added the appropriate flags to proc32 so that IRQ 4 is the
highest priority interrupt, and we have added the flag to disallow
interrupt nesting; no effect. We have turned off every other device
driver in the system and raised the priority of the process to 29. No
effect. We have changed the timer from 20 microseconds to 30, 40, 50,
… and all this does is make the problem less frequent.
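
One way to read the "less frequent at longer periods" observation is that something occasionally holds off the interrupt for a variable length of time, and a longer period simply tolerates more of those hold-offs. The sketch below models that idea. The stall lengths are invented purely for illustration (they are not measurements from this system), and the 5 microsecond service time is the typical figure quoted above.

```python
def missed(period_us, stall_us, service_us=5.0):
    """True if the sample is not read before the next timer tick.

    Simplified model: the IRQ is held off for `stall_us`, then the usual
    `service_us` passes before the interrupt routine reads the A/D.
    """
    return stall_us + service_us >= period_us


def miss_count(period_us, stalls_us, service_us=5.0):
    """How many of the given stalls would cost a sample at this period."""
    return sum(missed(period_us, s, service_us) for s in stalls_us)


# Hypothetical interrupt hold-off times in microseconds, NOT measured data.
STALLS_US = [2, 4, 8, 12, 18, 27, 33, 41]

if __name__ == "__main__":
    for period in (20, 30, 40, 50):
        print(period, "us period:", miss_count(period, STALLS_US), "missed")
```

With a spread of stall lengths like this, the miss count falls as the period grows but only reaches zero once the period exceeds the longest stall plus the service time, which is why the worst-case figure matters more than the typical one.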

This reminds me of a posting that I read in this group oh, say, six
months ago, where someone had noted that the QNX kernel was
occasionally “going away for a long period of time with interrupts
apparently disabled”. Anyone care to comment on that? Or remember the
posting? Anyone know the “real” (as opposed to typical) maximum
latency?

Thanks

Bill Mahoney
Technical Support, Inc.
Omaha, Nebraska

Dr. Bill Mahoney <bill@techsi.com> wrote:

We have added the appropriate flags to proc32 so that IRQ 4 is the
highest priority interrupt, and we have added the flag to disallow
interrupt nesting; no effect. We have turned off every other device
driver in the system and raised the priority of the process to 29. No
effect. We have changed the timer from 20 microseconds to 30, 40, 50,
… and all this does is make the problem less frequent.

Could you give a little more detail? It will help in answering your
question/problem.

o What other processes are running on the machine?
o What other drivers are running?
o What priority are these processes running at?

To answer these can you post the output from…

o sin
o sin irqs
o sin rtimers
o sin info

thanks,
chris

cdm@qnx.com
Chris McKillop
Embedded Software Design Engineer
“The faster I go, the behinder I get.” – Lewis Carroll

Well, having low interrupt latency is not the
same as having no interrupt latency. I’d
guess that the OS is occasionally in some
state where interrupts must be turned off
for a short time. Neutrino is supposed
to be even better about this, but again,
not perfect.


Mitchell Schoenbrun --------- maschoen@pobox.com

On 18 Aug 2000 18:52:25 GMT, Chris McKillop <cdm@qnx.com> wrote:

Dr. Bill Mahoney <bill@techsi.com> wrote:

We have added the appropriate flags to proc32 so that IRQ 4 is the
highest priority interrupt, and we have added the flag to disallow
interrupt nesting; no effect. We have turned off every other device
driver in the system and raised the priority of the process to 29. No
effect. We have changed the timer from 20 microseconds to 30, 40, 50,
… and all this does is make the problem less frequent.



Could you give a little more detail? It will help in answering your
question/problem.

o What other processes are running on the machine?

There are two processes, but one is blocked and at a low priority. There
are no other processes running on the box.

o What other drivers are running?

That is what is interesting. We kill everything except the app and
proc32 + dev.par and efsys.eide.

o What priority are these processes running at?

To answer these can you post the output from…

o sin
o sin irqs
o sin rtimers
o sin info

Monday we can do that for you. How about if I email you the info
offline?

Bill



In article <399d6001$0$97820$45beb828@newscene.com>, "Dr. says…

We have added the appropriate flags to proc32 so that IRQ 4 is the
highest priority interrupt, and we have added the flag to disallow
interrupt nesting; no effect. We have turned off every other device
driver in the system and raised the priority of the process to 29.

What is the priority of Proc32? It should be lowered to, e.g., 26 in your
case, and the app should have a priority > 26.

No
effect. We have changed the timer from 20 microseconds to 30, 40, 50,
… and all this does is make the problem less frequent.

This reminds me of a posting that I read in this group oh, say, six
months ago, where someone had noted that the QNX kernel was
occasionally “going away for a long period of time with interrupts
apparently disabled”.

The QNX4 kernel is single threaded … so any lengthy kernel operation, such as
copying a message buffer, could create the delay.

Anyone care to comment on that? Or remember the
posting? Anyone know the “real” (as opposed to typical) maximum
latency?

For kernel traces, use the system analysis tools for a detailed look:
(QNX2000/QIUCS: /usr/free/qnx4/os/development/sysdbg.tgz )

from sysdbg.tgz.readme:

Begin
File: sysdbg.tgz
Description: QNX4 System Debugging Info… from a seminar given at QNX2000
Keywords: debug kernel monitor watcom wd system
Version:
Entered-date: 00/05/07
Author: Randy Martin
Ported-by:
Original-site:
Copying-policy:
Supplemental:

QNX4 Debugging Seminar Randy Martin FAE Group

Three aspects to debugging will be covered in this seminar:

  1. process level debugging
  2. kernel level debugging (Debugger32)
  3. system analysis

Introduction 5 minutes

  • describe kernel functions in brief. what the kernel does, what Proc32
    does etc.

  • describe when to look at process level debugging, kernel level debug, and
    system analysis.
    How do you decide where to look for your problem.


process level debugging 10 minutes

  • wd tips and tricks

  • trace calls

  • profiler example (if time)

kernel level debugging (Debugger32) 10 minutes

  • tips and tricks

  • debugging isr’s

  • commands available from debugger

  • invoking the debugger

  • setup (serial / console)

  • breakpoints, fault traps, ready queues etc.

system analysis 30 minutes

  • what the kernel can monitor for you
  • how to use ‘monitor’
  • how to query for specific data using some simple API calls
  • how to analyse output
End

Regards

Armin

“Mitchell Schoenbrun” <maschoen@tsoft.com> wrote in message
news:spr7l78887v61@corp.supernews.com

Well, having low interrupt latency is not the
same as having no interrupt latency. I’d
guess that the OS is occasionally in some
state where interrupts must be turned off
for a short time. Neutrino is supposed
to be even better about this, but again,
not perfect.


Dr. Bill Mahoney <bill@techsi.com> wrote:
We have added the appropriate flags to proc32 so that IRQ 4 is the
highest priority interrupt, and we have added the flag to disallow
interrupt nesting; no effect. We have turned off every other device
driver in the system and raised the priority of the process to 29. No
effect. We have changed the timer from 20 microseconds to 30, 40, 50,
… and all this does is make the problem less frequent.

Have you lowered the Proc32 priority to 26?

But my guess is that you are asking for too much. 20 us between interrupts
is a very high frequency, even for a P133. Aside from being caused by the OS,
this could also be caused by the PC hardware: DMA, video, bus contention,
etc.

I’m assuming your device is ISA?

Have you measured how long your ISR takes?
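
If the scope can export the edge times of the IRQ line, one possible way to answer the "how long does it take" question offline is sketched below. It pairs each rising edge of IRQ 4 with the next falling edge and reports the typical and worst rise-to-read times. The function names and the sample edge lists are hypothetical, and the sketch assumes times in microseconds with both lists sorted.

```python
from bisect import bisect_right


def service_times(rises_us, falls_us):
    """Pair each rising edge with the first falling edge after it."""
    times = []
    for rise in rises_us:
        i = bisect_right(falls_us, rise)
        if i < len(falls_us):
            times.append(falls_us[i] - rise)
    return times


def summarize(rises_us, falls_us, period_us=20):
    """Median and worst service time, plus how many exceeded one period."""
    times = service_times(rises_us, falls_us)
    ordered = sorted(times)
    return {
        "typical_us": ordered[len(ordered) // 2],
        "worst_us": ordered[-1],
        "overruns": sum(t >= period_us for t in times),
    }


if __name__ == "__main__":
    # Made-up edge times: the line stays high from 40 to 66, so the timer
    # tick at 60 produces no new rising edge and that sample is lost.
    rises = [0, 20, 40, 80]
    falls = [5, 25, 66, 85]
    print(summarize(rises, falls))
```

Each overrun in the result corresponds to at least one lost sample, because the IRQ line was still high when the next timer tick arrived.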

“Mario Charest” <mcharest@zinformatic.com> writes:

“Mitchell Schoenbrun” <maschoen@tsoft.com> wrote in message
news:spr7l78887v61@corp.supernews.com …
Dr. Bill Mahoney <bill@techsi.com> wrote:
Here’s the interesting thing… We watch this on the scope. IRQ 4
goes high. About five microseconds later, the interrupt routine reads
the A/D and IRQ 4 goes low. Hm… that’s about in line with QNX’s
“typical interrupt latency” specification for a chip like
this. But… Every now and then, IRQ 4 goes high, and nothing
happens. 20 microseconds go by, the timer goes off again, and we
get hosed, having missed a sample from the A/D. Is there hidden
meaning to the word “typically” in the QNX literature?

We have seen problems with timing determinism when there is an
Ethernet card in the machine, even if the driver is not running. We
had to physically remove the card to reduce jitter. Could you be
seeing something like this?


Andrew Thomas, President, Cogent Real-Time Systems Inc.
2430 Meadowpine Boulevard, Suite 105, Mississauga, Ontario, Canada L5N 6S2
Email: andrew@cogent.ca WWW: http://www.cogent.ca