Timer disappearing for 50ms

We are running QNX 6.1 on a 100 MHz x86 processor.

System clock period set to 0.5ms using:
clockperiod.nsec = 500000;
ClockPeriod( CLOCK_REALTIME, &clockperiod, &old_clockperiod, 0);

Every 10 minutes the system gets the time update from an external source and
sets the QNX clock using:
timeStruct.tv_nsec = 0;
timeStruct.tv_sec = time_data->time;
clock_settime( CLOCK_REALTIME, &timeStruct );


The system has a task running at priority 60 that runs off a system timer:
void high_priority_task( void )
{
    for (;;)
    {
        timeout.tv_nsec = 8000000; // 8ms
        timer_timeout( CLOCK_REALTIME, _NTO_TIMEOUT_INTR, &timeout_event,
                       &timeout, NULL );
        InterruptWait_r( 0, NULL );

        do_processing();
    }
}


Infrequently (1 to 5 times per day) and seemingly randomly the TimerTimeout
returns after about 52ms instead of 8ms. There are no interrupts that can
hold the system for >50ms and no applications of higher priority. Can
setting the QNX clock sometimes glitch the operation of the OS timer
functions? What else could make the timer take 50ms?

Trying CLOCK_MONOTONIC instead of CLOCK_REALTIME for the timer made no
difference.

Thanks for any help,
Jon

Jon Wyatt wrote:

We are running QNX 6.1 on a 100 MHz x86 processor.

Infrequently (1 to 5 times per day) and seemingly randomly the TimerTimeout
returns after about 52ms instead of 8ms. There are no interrupts that can
hold the system for >50ms and no applications of higher priority. Can
setting the QNX clock sometimes glitch the operation of the OS timer
functions? What else could make the timer take 50ms?

What PC chipset are you using? If it is a Geode then you are screwed, go get something else. In fact, I would avoid all laptop/lowpower targeted chipsets. These beasties have some real nasty BIOS hooks that can get in underneath any OS.

I guess that bus faults/conflicts can cause large timeouts too.

To investigate, you could easily InterruptAttachEvent() to IRQ#0. That way, you avoid possible bugs in the kernel’s timing compensation code. I would expect to see less than 100 usec jitter on a 100 MHz x86 CPU and, of course, about 2000 events/sec (one per 0.5 ms tick) with no gaps.
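
A minimal sketch of the bookkeeping for that experiment, in portable C. It assumes the thread woken by the IRQ 0 event can read some monotonic nanosecond timestamp on each wake-up (how that timestamp is obtained on the target is left out). The names tick_monitor, tick_monitor_init and tick_monitor_update are hypothetical helpers, not part of any QNX API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: tracks the spacing between successive tick events. */
typedef struct {
    uint64_t expected_ns;   /* nominal tick period, e.g. 500000 for 0.5 ms */
    uint64_t tolerance_ns;  /* allowed jitter, e.g. 100000 for 100 usec */
    uint64_t last_ns;       /* timestamp of the previous tick */
    uint64_t max_gap_ns;    /* largest interval seen so far */
    unsigned late_count;    /* intervals longer than expected + tolerance */
    int      primed;        /* 0 until the first tick has been recorded */
} tick_monitor;

void tick_monitor_init(tick_monitor *m, uint64_t expected_ns,
                       uint64_t tolerance_ns)
{
    assert(m != 0 && expected_ns > 0);
    m->expected_ns = expected_ns;
    m->tolerance_ns = tolerance_ns;
    m->last_ns = 0;
    m->max_gap_ns = 0;
    m->late_count = 0;
    m->primed = 0;
}

/* Call once per tick event. Returns 1 if this tick arrived late, else 0. */
int tick_monitor_update(tick_monitor *m, uint64_t now_ns)
{
    uint64_t gap;
    int late;

    if (!m->primed) {           /* first tick: nothing to compare against */
        m->last_ns = now_ns;
        m->primed = 1;
        return 0;
    }
    gap = now_ns - m->last_ns;
    m->last_ns = now_ns;
    if (gap > m->max_gap_ns)
        m->max_gap_ns = gap;
    late = gap > m->expected_ns + m->tolerance_ns;
    if (late)
        m->late_count++;
    return late;
}
```

With expected_ns = 500000 and tolerance_ns = 100000, a healthy system should never see tick_monitor_update() return 1, while a 50 ms hole in the tick stream would show up as a single late interval and a max_gap_ns of roughly 50000000.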


Evan

It’s not a new product so switching CPUs isn’t possible; can’t buy them
anymore anyway. It’s a ZFx86, a pretty cool PC-on-a-chip with more failsafe
boot options than a Mars rover. ZFmicro disappeared for about two years but
may make a comeback after winning a $20 million settlement from National
last December. They never said but I suspected it was based on a Cyrix
core.

We have the BIOS source and have been through it to make changes, so I’m
pretty sure it’s not any lowpower or sleep stuff.

The problem seems to be resolved by rewriting the code to use an external
timer. The rewrite also eliminated some nasty 64 bit math with timestamps
to calculate the TimerTimeout value, and reduced overall CPU usage by more
than 5%. But I really don’t like not knowing where the 50ms glitch came
from because that means it could show up somewhere else.

Bus faults is a good suggestion, but the only thing on the PCI bus is
Ethernet and the glitch happened the same whether or not the Ethernet is
active.

My main suspect at the moment is the devc-ser8250 driver. I found a mention
of a 50ms problem involving tcdrain in the forums. There is a task that
occasionally writes to a serial port to check for an external device. But
it does not use tcdrain and nothing is connected so there are no characters
coming in.

The IRQ0 ISR keeps right on trucking and I was able to use it to detect when
TimerTimeout runs long. Is there a way to get any information out of the
kernel from within the ISR, such as which task was running just before the
ISR?

Thanks,
Jon




Jon Wyatt wrote:

We have the BIOS source and have been through it to make changes, so I’m
pretty sure it’s not any lowpower or sleep stuff.

Neat.


The problem seems to be resolved by rewriting the code to use an external
timer. The rewrite also eliminated some nasty 64 bit math with timestamps

That’s always a good move anyway. The 2 kHz system tick you were using was prolly getting a bit hungry on the CPU. Your setup almost qualifies for the 100 Hz default.


The IRQ0 ISR keeps right on trucking and I was able to use it to detect when
TimerTimeout runs long. Is there a way to get any information out of the
kernel from within the ISR, such as which task was running just before the
ISR?

Nothing legal, methinks. If you use InterruptAttachEvent() then you are not in the ISR, and therefore you are the running task; if you use InterruptAttach() then you aren’t allowed to make calls that could stall. Either way, I’ve never delved.


Evan


Jon Wyatt <postmaster@127.0.0.1> wrote:

We are running QNX 6.1 on a 100 MHz x86 processor.

System clock period set to 0.5ms using:
clockperiod.nsec = 500000;
ClockPeriod( CLOCK_REALTIME, &clockperiod, &old_clockperiod, 0);

Every 10 minutes the system gets the time update from an external source and
sets the QNX clock using:
timeStruct.tv_nsec = 0;
timeStruct.tv_sec = time_data->time;
clock_settime( CLOCK_REALTIME, &timeStruct );

clock_settime() will JUMP the value of the system clock.

You may wish to consider using ClockAdjust() to speed up or slow down
the clock a bit on each tick, rather than jumping it in one step.
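
As a rough illustration of that kind of gradual correction, the sketch below works out how many ticks to spread an offset over and how much to add per tick. It is portable C, not QNX code: plan_slew and slew_plan are hypothetical names, and the two fields are only assumed to mirror the tick_count and tick_nsec_inc members of the structure that ClockAdjust() takes, so check the QNX documentation before relying on that. The per-tick limit max_step_ns is also an assumption chosen by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plan: apply tick_nsec_inc on each of tick_count ticks. */
typedef struct {
    unsigned long tick_count;    /* number of ticks to spread the fix over */
    long          tick_nsec_inc; /* signed nanoseconds added on each tick */
} slew_plan;

/*
 * offset_ns   : how far the local clock is from the reference
 *               (positive means the local clock must move forward).
 * max_step_ns : largest adjustment allowed on a single tick.
 *
 * Integer division truncates, so the total applied may fall short of
 * offset_ns by fewer than tick_count nanoseconds.
 */
slew_plan plan_slew(int64_t offset_ns, long max_step_ns)
{
    slew_plan p = { 0, 0 };
    int64_t magnitude, ticks;

    assert(max_step_ns > 0);
    if (offset_ns == 0)
        return p;

    magnitude = offset_ns < 0 ? -offset_ns : offset_ns;
    ticks = (magnitude + max_step_ns - 1) / max_step_ns; /* round up */
    p.tick_count = (unsigned long)ticks;
    p.tick_nsec_inc = (long)(offset_ns / ticks);
    return p;
}
```

For example, with the 0.5 ms tick above and a cap of 50000 ns per tick (10% of the period), a 3 ms error would be absorbed over 60 ticks, i.e. about 30 ms, without the clock ever jumping.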


The system has a task running at priority 60 that runs off a system timer:
void high_priority_task( void )
{
    for (;;)
    {
        timeout.tv_nsec = 8000000; // 8ms
        timer_timeout( CLOCK_REALTIME, _NTO_TIMEOUT_INTR, &timeout_event,
                       &timeout, NULL );
        InterruptWait_r( 0, NULL );

        do_processing();
    }
}



Infrequently (1 to 5 times per day) and seemingly randomly the TimerTimeout
returns after about 52ms instead of 8ms. There are no interrupts that can
hold the system for >50ms and no applications of higher priority. Can
setting the QNX clock sometimes glitch the operation of the OS timer
functions? What else could make the timer take 50ms?

Trying CLOCK_MONOTONIC instead of CLOCK_REALTIME for the timer made no
difference.

Was gonna suggest that in case the clock jumps were causing a problem.

I assume you can detect this happening in software. Can you run
the instrumented kernel, with tracelogger in circular buffer mode, then
trigger a dump of the log history when you notice this has happened?
It may give you a view into exactly what was happening at the time.
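
The same "keep a circular history, dump it on a trigger" idea can be sketched at application level. This is not the kernel trace facility, just a miniature of the technique in portable C. All names here (history_ring, history_record, history_snapshot, HISTORY_LEN) are hypothetical, and the event code is whatever the application chooses to log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HISTORY_LEN 8   /* number of most recent events retained */

typedef struct {
    uint64_t t_ns;      /* timestamp supplied by the caller */
    int      code;      /* application-defined event code */
} history_event;

typedef struct {
    history_event slots[HISTORY_LEN];
    size_t next;        /* slot the next event will overwrite */
    size_t count;       /* number of valid events, at most HISTORY_LEN */
} history_ring;

void history_init(history_ring *r)
{
    assert(r != 0);
    r->next = 0;
    r->count = 0;
}

/* Record one event, overwriting the oldest once the ring is full. */
void history_record(history_ring *r, uint64_t t_ns, int code)
{
    r->slots[r->next].t_ns = t_ns;
    r->slots[r->next].code = code;
    r->next = (r->next + 1) % HISTORY_LEN;
    if (r->count < HISTORY_LEN)
        r->count++;
}

/*
 * Copy the retained events into out, oldest first, and return how many
 * were copied. out must have room for HISTORY_LEN entries. Call this
 * when the anomaly (for example a late timeout) is detected.
 */
size_t history_snapshot(const history_ring *r, history_event *out)
{
    size_t start = (r->next + HISTORY_LEN - r->count) % HISTORY_LEN;
    size_t i;

    assert(out != 0);
    for (i = 0; i < r->count; i++)
        out[i] = r->slots[(start + i) % HISTORY_LEN];
    return r->count;
}
```

Recording is cheap enough to leave enabled all the time; only the snapshot, taken after the glitch has been noticed, needs to do any copying or output.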

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com