Neutrino Kernel not executing in real time

I’m having an issue with my QNX kernel execution. It appears that
the timer services of the RTOS are sporadically blocked for about
10 ms per second. I am attempting to set up a 12.5 millisecond timer
loop. The code is very straightforward: bind the timer to a signal
event and block until the event occurs. The missed time cannot be
measured with the Neutrino clock_gettime function, which does not
report the lost time, but if time is recorded by reading the
processor clock directly, the loss can be observed. The loss is not
linear: multiple 12.5 millisecond increments come in with very good
accuracy, with only the occasional 21st event delayed by 2-3
milliseconds.
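
The measurement described above can be sketched as a small helper that scans raw cycle-counter timestamps, one per timer wakeup, and flags the intervals that overran the nominal period. This is only an illustration, not the code from the original post: `find_late_wakeups` and `late_report_t` are hypothetical names, and the timestamps are assumed to come from the processor clock (on Neutrino, for example, `ClockCycles()`), not from `clock_gettime`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of the late intervals found in a run (hypothetical type). */
typedef struct {
    size_t   late_count;     /* intervals longer than period + tolerance */
    size_t   worst_index;    /* i such that stamps[i-1]..stamps[i] was worst */
    uint64_t worst_overrun;  /* cycles beyond the nominal period */
} late_report_t;

/* Scan n cycle-counter timestamps, one taken at each timer wakeup, and
 * report every interval that exceeds period_cycles by more than
 * tol_cycles. All values are in processor clock cycles. */
late_report_t find_late_wakeups(const uint64_t *stamps, size_t n,
                                uint64_t period_cycles, uint64_t tol_cycles)
{
    late_report_t r = {0, 0, 0};
    assert(period_cycles > 0);

    for (size_t i = 1; i < n; i++) {
        /* Unsigned subtraction stays correct if the counter wraps once. */
        uint64_t delta = stamps[i] - stamps[i - 1];
        if (delta > period_cycles + tol_cycles) {
            uint64_t overrun = delta - period_cycles;
            r.late_count++;
            if (overrun > r.worst_overrun) {
                r.worst_overrun = overrun;
                r.worst_index = i;
            }
        }
    }
    return r;
}
```

At roughly 1 GHz, a 12.5 ms period is about 12,500,000 cycles, so a 2-3 ms delay would show up as an overrun of 2 to 3 million cycles.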

I’ve read a lot of the threads which point out Neutrino’s
methodology of keeping track of time (via software) and am aware of
the quantization effects that can occur. What I am seeing is not a
linear loss of time, so please don’t respond with a link to
either of these two articles:

http://www.qnx.com/developers/articles/article_834_1.html
http://www.qnx.com/developers/articles/article_826_2.html


It appears that the kernel is being completely blocked by some other
event, maybe a hardware issue, but I was curious whether anyone has
ever experienced such a problem. I am running on an ETX module with
an Intel Pentium-M running at about 1 GHz.

mastadon_75 wrote:

It appears that the kernel is being completely blocked by some other
event, maybe a hardware issue, but I was curious whether anyone has
ever experienced such a problem. I am running on an ETX module with
an Intel Pentium-M running at about 1 GHz.

There can be some nasty little BIOS hooks that you may not be able to get rid of without changing to another motherboard chipset. Look for and disable all power-save and compatibility features in the BIOS.

If that doesn’t help and no one else has any better ideas, then I recommend assembling a duplicate CPU setup that uses a desktop processor and desktop chipset and putting it in place of the existing P-M setup. This may not be acceptable for production, but at least you will be able to narrow down where the problem is coming from: hardware or software.


Evan

Is this a system with flash memory? And is the system doing
something in system management mode below the OS? There are
some embedded boards which emulate an IDE disk in flash memory
using code in system management mode. When that code is running,
interrupts are not processed. This causes latency problems.

John Nagle

I think some boards also try to handle USB for you

Thanks for the helpful advice; we are still trying to determine if
there are any unmaskable interrupts in the system management mode but
have not gotten very far with this issue. We do have a Flash disk but
we are not emulating IDE and I removed it from the system and still
observed the issues.


You aren’t going to detect any directly either. That’s the purpose of the SM hooks: to provide full emulation, so the only differences between them and the real hardware are speed and the chunk of CPU time that vanishes every time one of them is triggered. Chances are you won’t be able to remove them. Changing to a different CPU card may be the only solution, but you do need to confirm that this is your problem first.

Write a simple program that hooks IRQ #0 with InterruptAttach() and just counts the interrupts. If there are any regular losses, the count will slowly fall behind the expected 1 kHz rate. Do this while performing some common operations, e.g. text-mode listings, desktop trivia, file copying on the flash drive and HDD, and LAN …
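
The bookkeeping for that test might look like the sketch below. It is an illustration under assumptions, not a complete program: `missed_ticks` is a hypothetical helper, the tick count is assumed to come from a handler attached to IRQ 0, and the elapsed time is assumed to come from the processor cycle counter (on Neutrino, `ClockCycles()` with the cycles-per-second value from the system page). The interrupt hookup itself is left out so the arithmetic stays portable.

```c
#include <assert.h>
#include <stdint.h>

/* Compare the number of timer interrupts actually counted against the
 * number expected from the elapsed processor cycles.
 * Returns expected minus counted: a positive value that keeps growing
 * suggests ticks are being lost while interrupts are blocked. */
int64_t missed_ticks(uint64_t counted_ticks, uint64_t elapsed_cycles,
                     uint64_t cycles_per_sec, uint64_t tick_hz)
{
    assert(cycles_per_sec > 0);

    /* Split into whole seconds plus a remainder so the multiplication
     * cannot overflow 64 bits for realistic clock rates. */
    uint64_t whole_sec  = elapsed_cycles / cycles_per_sec;
    uint64_t rem_cycles = elapsed_cycles % cycles_per_sec;
    uint64_t expected   = whole_sec * tick_hz
                        + (rem_cycles * tick_hz) / cycles_per_sec;

    return (int64_t)expected - (int64_t)counted_ticks;
}
```

For example, after 10 seconds at 1 GHz with a nominal 1 kHz tick, 9990 counted interrupts would give a deficit of 10. The nominal rate is itself an assumption and the real tick period may differ slightly, so sudden steps in the deficit are probably more telling than its absolute value.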


Evan

On 14 Jan 2006 21:53:59 GMT, johnny_rng@yahoo-dot-com.no-spam.invalid
(mastadon_75) wrote:

It appears that the kernel is being completely blocked by some other
event, maybe a hardware issue, but I was curious whether anyone has
ever experienced such a problem. I am running on an ETX module with
an Intel Pentium-M running at about 1 GHz.

I saw non-real-time latency when unblocking a waiting thread on two
different dual-CPU SMP systems:

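/// procnto-smp kernel
setprio(0, 63);                             // run the calling process at priority 63
ThreadCtl(_NTO_TCTL_RUNMASK, (void*)001);   // runmask 0x1: run on CPU 0 only

for(…) {
    InterruptWait(0, NULL);                 // block until the interrupt event arrives
    cntr_latency = hwboard.read(CNTR);      // …1 us resolution
    hwboard.irq_reset();
    InterruptUnmask(info.Irq, hintid);      // re-enable the interrupt for the next pass
}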

In the normal case the maximum latency is under 7 us.
In my case it was over 60 us.

For the dual Xeon, this problem was solved by disabling
LPT, COM, and USB in the BIOS.

The issue was finally resolved by disabling a BIOS feature called
“Legacy USB Support”, so thanks for that suggestion and thanks to
everyone for all the help. We no longer see the 3 ms complete
blockage, and our QNX system now delivers the real-time performance
that we were hoping for.