Multiple Timers & Debugging.

I’ve got an application that creates multiple repetitive pulse-based timers.
When debugging the application (with ddd), during pauses in the program’s
execution, the system continues to send pulses (which are then queued).
Eventually, the system locks up solid (no keyboard response, no ping
response). We consider this to be NOT A GOOD THING :-)

I’ve built a simple test program that simulates the ddd pause with sleep().
I think I also see some sluggish output from the program just prior to the
lockup. The number of timer ticks that causes the lockup seems to depend
on the timer interval (faster ticking locks up with fewer queued ticks).
Using a single timer seems to eliminate the problem. The lockup occurs
according to the following table on a Celeron 566. All times are in seconds.

Timer1 Interval   Timer2 Interval   Time to Lockup
.001              .0015             4
.002              .003              11
.003              .0045             25
.004              .006              39
.001              not used          >600 or never
.002              not used          >600 or never

In QNX 4, I believe the timer ticks would just increment a count in a proxy
with no other ill effects. QNX 6’s pulses evidently have some more
significant overhead.

  1. Is this a known issue?
  2. Please explain the mechanism that causes the lockup.
  3. Is there a recommendation against using multiple timers in a single
    application?
  4. Is there a recommended approach for debugging applications with multiple
    pulse-based timers?

Thanks,
Marty Doane

Marty Doane <marty.doane@rapistan.com> wrote:

I’ve got an application that creates multiple repetitive pulse-based timers.
When debugging the application (with ddd), during pauses in the program’s
execution, the system continues to send pulses (which are then queued).
Eventually, the system locks up solid (no keyboard response, no ping
response). We consider this to be NOT A GOOD THING :-)

Can you supply a test case (i.e., sample code that illustrates this)? Does
this only occur when using DDD (i.e., what if you just use gdb)?


  1. Is this a known issue?

No.

  2. Please explain the mechanism that causes the lockup.

It could be that you’re running your application at too high a priority,
since pulses come in at priority 15.

Again, if you could supply some code to illustrate this, we can take a look
into it for you.

-Adam

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9oq1of$svr$2@nntp.qnx.com

Can you supply a test case (i.e., sample code that illustrates this)? Does
this only occur when using DDD (i.e., what if you just use gdb)?

The debugger isn’t necessary to cause the problem - but it’s the only
reasonable real-world scenario I can think of to trigger it.

The following program causes my Celeron 566 w/ 192 MB RAM to lock up after
10 seconds. Most of the variations I listed in my original message can be
controlled from the command line.

Thanks,
Marty

QCC TimerLock.cpp -o TimerLock

TimerLock

TimerLock.cpp



#include <time.h>
#include <errno.h>
#include <stdio.h>          // sscanf()
#include <iostream>
#include <signal.h>
#include <sys/neutrino.h>
#include <unistd.h>

using namespace std;


class Timer
{
public:
    Timer();
    ~Timer();

    // Static operation that waits for any Timer to expire, then returns the
    // Timer Reference for the one that did expire.
    //
    // On error, a -1 is returned, with the error indicated in errno.
    // Signals will interrupt and generate an EINTR error.
    static int waitForTimer();

    // Begins a timing cycle, which could be one-shot or periodic depending
    // on the contents of aTimeSpec.
    int startTimer(itimerspec aTimeSpec);

    // Stops a running timer.
    int stopTimer();

    // Returns the TimerReference of this Timer. This is used to locate the
    // expired timer.
    int getTimerReference();

private:
    // Static attributes require initialization. This flag orchestrates a
    // single initialization cycle.
    static bool cInitialized;

    // The QNX 6 channel ID that this object will listen on for timer
    // timeouts.
    static int cChid;

    // The QNX 6 connection ID that this object will send timer timeouts on.
    // The connection ID must be attached to the channel ID for the messages
    // to arrive here.
    static int cCoid;

    // An arbitrarily assigned number is associated with an instance of a
    // timer. This variable counts them.
    static int cTimerNumber;

    // A reference value that uniquely signifies a particular Timer.
    int mTimerReference;

    // The system-assigned ID of the system timer.
    timer_t mTimerID;
};

bool Timer::cInitialized = false;
int Timer::cChid = 0;
int Timer::cCoid = 0;
int Timer::cTimerNumber = 0;

Timer::Timer()
{
    // First initialize the static class variables on the first invocation.
    if (!cInitialized)
    {
        // Add channel and connection setup for QNX6
        cChid = ChannelCreate(0);
        if (cChid == -1)
        {
            // Throw an exception here
            cout << "Timer::Timer() ChannelCreate() failed" << endl;
        }
        cCoid = ConnectAttach(0, 0, cChid, _NTO_SIDE_CHANNEL, 0);
        if (cCoid == -1)
        {
            // Throw an exception here
            cout << "Timer::Timer() ConnectAttach failed" << endl;
        }
        // OK, the global initialization is complete
        cInitialized = true;
    }

    // System timers need a sigevent to signal the timer expiration
    struct sigevent lEvent;

    // QNX6 timer notification uses pulses
    // Set up the pulse that will be used to notify that the timer has expired
#define TIMER_PULSE_CODE (_PULSE_CODE_MINAVAIL + 0)

    // The timer reference is an arbitrary sequential number assigned here
    mTimerReference = ++cTimerNumber;

    // Put the timer reference in the pulse message
    SIGEV_PULSE_INIT(&lEvent,
                     cCoid,
                     SIGEV_PULSE_PRIO_INHERIT,
                     TIMER_PULSE_CODE,
                     mTimerReference);

    // Create a system timer
    if (timer_create(CLOCK_REALTIME, &lEvent, &mTimerID) == -1)
    {
        // Throw an exception here
        cout << "Timer::Timer() Unable to create a timer" << endl;
    }

    // cout << "Timer::Timer() created timer ID: " << mTimerID
    //      << " and reference: " << mTimerReference << endl;
}


Timer::~Timer()
{
    // Delete the system timer
    timer_delete(mTimerID);
}



int Timer::waitForTimer()
{
    // cout << "Timer::waitForTimer() waiting" << endl;

    // If we haven't initialized the static class variables, there can't be
    // any timers to wait for.
    if (!cInitialized)
    {
        // Throw an exception here
        cout << "Timer::waitForTimer() called with no timers created" << endl;
        errno = EBADE;
        return(-1);
    }

    // In QNX6, we're waiting for a pulse to be sent, indicating the
    // timer has timed out.
    struct _pulse lPulse;

    int lRcvid = MsgReceive(cChid, &lPulse, sizeof(lPulse), NULL);

    // cout << "Timer::waitForTimer() received ID: " << lRcvid << endl;

    if (lRcvid != 0)
    {
        // Something unusual has happened.
        // This process should only receive messages from the timer trigger
        // (pulse). However, if a signal is sent to this process, the
        // 'receive' function will return a -1 and the errno will be set to
        // EINTR.
        if (lRcvid != -1)
        {
            // *** This is where handling non-pulse messages would have to go ***

            // We expected to get a pulse, but got some other kind of message,
            // set errno to an appropriate value
            errno = ENOMSG;
        }
        // Let the caller examine errno and decide what he wants to do
        return(-1);
    }

    // Got a pulse
    if (lPulse.code != TIMER_PULSE_CODE)
    {
        // Not the pulse we expected
        // set errno to an appropriate value
        errno = ENOMSG;
        return(-1);
    }

    // We stored the timer reference in the pulse message
    return(lPulse.value.sival_int);
}

int Timer::startTimer(itimerspec aTimeSpec)
{
    int lTimerFlag = 0;
    return timer_settime(mTimerID, lTimerFlag, &aTimeSpec, NULL);
}

int Timer::stopTimer()
{
    // A zero it_value disarms the timer. Clear it_interval as well so that no
    // field of the structure is passed uninitialized.
    itimerspec lHaltTimeIncrement;
    lHaltTimeIncrement.it_value.tv_sec = 0;
    lHaltTimeIncrement.it_value.tv_nsec = 0;
    lHaltTimeIncrement.it_interval.tv_sec = 0;
    lHaltTimeIncrement.it_interval.tv_nsec = 0;
    // cout << "Timer::stopTimer() SystemTimer has been halted" << endl;
    return timer_settime(mTimerID, 0, &lHaltTimeIncrement, NULL);
}

int Timer::getTimerReference()
{
    return(mTimerReference);
}

//*********************************** START OF MAIN


int main(int argc, char *argv[])
{
    Timer *lTimer1, *lTimer2;
    struct itimerspec lTimerSpec;
    long lIntervalNsec;
    long lIntervalMS;
    long lIntervalSec;
    long lSleepSecs;

    cout << "Starting TimerLock Test Program" << endl;

    lIntervalSec = 0;
    lIntervalMS = 2;        // Default timer1 interval is 2 ms
    if (argc > 1)
    {
        // Set the timer1 interval from the first argument
        sscanf(argv[1], "%ld", &lIntervalMS);
    }
    if (lIntervalMS > 999)
    {
        lIntervalSec = lIntervalMS / 1000;
        lIntervalMS -= lIntervalSec * 1000;
    }
    lIntervalNsec = lIntervalMS * 1000000;

    lSleepSecs = 100;       // Default time to not service timers is 100 sec
    if (argc > 2)
    {
        // Set the time to not service timers from the second argument
        sscanf(argv[2], "%ld", &lSleepSecs);
    }

    lTimerSpec.it_value.tv_sec = lIntervalSec;
    lTimerSpec.it_value.tv_nsec = lIntervalNsec;
    lTimerSpec.it_interval.tv_sec = lIntervalSec;
    lTimerSpec.it_interval.tv_nsec = lIntervalNsec;

    cout << "Creating Timer(s)" << endl;
    lTimer1 = new Timer();
    lTimer2 = new Timer();
    cout << "Starting Timer1 with "
         << (float)lTimerSpec.it_interval.tv_sec
            + ((float)lTimerSpec.it_interval.tv_nsec / 1000000000)
         << " second interval" << endl;
    lTimer1->startTimer(lTimerSpec);

    // Timer2 interval is 3/2 of Timer1 interval
    lTimerSpec.it_interval.tv_nsec = lIntervalNsec/2*3;
    cout << "Starting Timer2 with "
         << (float)lTimerSpec.it_interval.tv_sec
            + ((float)lTimerSpec.it_interval.tv_nsec / 1000000000)
         << " second interval" << endl;
    lTimer2->startTimer(lTimerSpec);

    // The timers are running, now don't service them to simulate a ddd
    // breakpoint
    cout << "Sleeping for " << lSleepSecs << " seconds" << endl;
    for (int i=0; i<lSleepSecs; i++)
    {
        sleep(1);
        cout << "Slept for " << i+1 << " seconds\r";
        cout.flush();
    }
    cout << endl;

    // Done not servicing timers, now service some of the queued timer ticks
    for (int i=0; i<20; i++)
    {
        int lTimerNum = Timer::waitForTimer();
        cout << "Got Timer tick #" << i+1 << " from Timer " << lTimerNum
             << endl;
    }
    // That's enough, skip the rest

    cout << "Deleting Timer(s)" << endl;

    lTimer1->stopTimer();
    lTimer2->stopTimer();
    delete lTimer1;
    delete lTimer2;

    cout << "------- Exiting TimerLock Test Program -------" << endl;

    return 0;
}

Marty Doane <marty.doane@rapistan.com> wrote:

The following program causes my Celeron 566 w/ 192 MB RAM to lock up after
10 seconds. Most of the variations I listed in my original message can be
controlled from the command line.

It’s quite possibly the debugger, since running your program through it would
cause changes in timing/response. In my previous post, I queried whether the
same behaviour occurs under gdb (not ddd). Your test case is quite big, so it
might take a little time to go through it all.

-Adam

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9osmqs$kn4$2@nntp.qnx.com

It’s quite possibly the debugger, since running your program through it would
cause changes in timing/response. In my previous post, I queried whether the
same behaviour occurs under gdb (not ddd).

I don’t know how it could be the debugger if my test program exhibits the
problem. Since I’m giving you a program that causes the problem without
involving any debugger, and since we never use gdb without ddd, I didn’t
take the time to gather that information.

Your test case is quite big, so it might take a little time to go through
it all.

I’m sorry for the size of the test case, but I think most of the bulk comes
from the comments and the segregation of the timer access into the Timer
class. If it would be a big help to you, I could strip all that out and give
you a sequential flowing program. It would entail setting up and starting
two repetitive pulse-based timers, then not servicing them (don’t
MsgReceive() the pulses).

Did you build and run the test program? That should only take a few seconds.
Did it lock up your system?

Thanks,
Marty

Marty Doane <marty.doane@rapistan.com> wrote:

I don’t know how it could be the debugger if my test program exhibits the
problem. Since I’m giving you a program that causes the problem without
involving any debugger, and since we never use gdb without ddd, I didn’t
take the time to gather that information.

Sorry, I mis-read your post, and believed it was the debugger combination only.

I’m sorry for the size of the test case, but I think most of the bulk comes
from the comments and the segregation of the timer access into the Timer
class. If it would be a big help to you, I could strip all that out and give
you a sequential flowing program. It would entail setting up and starting
two repetitive pulse-based timers, then not servicing them (don’t
MsgReceive() the pulses).

A smaller test case would be nice, but it is not required.

Did you build and run the test program? That should only take a few seconds.
Did it lock up your system?

After fixing a few line-wrap issues (many of your lines run past 80 columns),
I did build and run it, and recreated the problem. I’ll take a look into it
for you.

-Adam

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9p0205$oab$2@nntp.qnx.com

After fixing a few line-wrap issues (many of your lines run past 80 columns),
I did build and run it, and recreated the problem. I’ll take a look into it
for you.

-Adam

Thanks,
Marty

Marty Doane <marty.doane@rapistan.com> wrote:

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9p0205$oab$2@nntp.qnx.com
After fixing a few line-wrap issues (many of your lines run past 80 columns),
I did build and run it, and recreated the problem. I’ll take a look into it
for you.

In QNX6, the pulses are stored in time order, so when you do get around to
servicing them, you’ll receive them in the order they came. Since the pulses
have the same priority but come from two different timers, they are queued
up in alternating time order.

T1->T2->T1->T2…

Traversing the queue to add to the end is expensive after many have been
queued. This work is being done by the kernel (at kernel priority),
so it grinds the machine to a halt as the queue gets more and more expensive
to traverse.

If you change the priority of the pulses to be different, then they are ordered
by priority first, then time order. This effectively short circuits the need
to traverse the queue.

I would recommend writing two little functions that you call in your debugger
to stop the timers and/or restart them, once you’ve hit a break point or have
halted execution.

-Adam

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9p2dfa$9hu$1@nntp.qnx.com

Thank you for the explanation. That makes sense. Now I’ve got two follow-on
questions:

  1. If two (or more) processes are paused, each causing a single timer pulse
    of the same priority to queue, would we see the same problem? Or do the
    separate processes also resolve the need to traverse the queue?

  2. Can ddd/gdb be set up to automatically invoke these functions? Or must
    they be manually invoked when the execution stops and before it resumes?

Thanks,
Marty

  2. Can ddd/gdb be set up to automatically invoke these functions? Or must
    they be manually invoked when the execution stops and before it resumes?

You can do it with user-defined command hooks. Attach a set of commands
to the stop, continue and next functions.

(gdb) hook-stop
call disable_func()
end

(gdb) hook-continue
call enable_func()
end

(gdb) hook-next
call enable_func()
end


cburgess@qnx.com

Marty Doane <marty.doane@rapistan.com> wrote:

Thank you for the explanation. That makes sense. Now I’ve got two follow-on
questions:

No problem.

  1. If two (or more) processes are paused, each causing a single timer pulse
    of the same priority to queue, would we see the same problem? Or do the
    separate processes also resolve the need to traverse the queue?

Yes, you’ll see the same problem if you have two processes producing large
amounts of events (that are queued) and not serviced. The idea is if you
consume resources without cleaning up, then you’ll eventually exhaust them.


  2. Can ddd/gdb be set up to automatically invoke these functions? Or must
    they be manually invoked when the execution stops and before it resumes?

Check out the documentation at http://www.gnu.org/manual/gdb-4.17/html_chapter/gdb_6.html#SEC35
as it describes exactly what you’ll need.

-Adam

“Operating System for Tech Supp” <os@qnx.com> wrote in message
news:9p9vrj$si6$1@nntp.qnx.com

I think that completes the issue for me. Thanks to you and Colin for the
help.

Marty

I think that completes the issue for me. Thanks to you and Colin for the
help.

No problem.

-Adam

If I make that:

(gdb) define hook-next
call enable_func()
end

it works, for both hook-next and hook-continue.
However, defining hook-stop results in an infinite loop of disable_func()
calls at the first stopping point.
Any idea what’s going wrong?

Thanks,
Marty

Marty Doane <marty.doane@rapistan.com> wrote:

If I make that:

(gdb) define hook-next
call enable_func()
end

it works, for both hook-next and hook-continue.
However, defining hook-stop results in an infinite loop of disable_func()
calls at the first stopping point.
Any idea what’s going wrong?

Hmmm, I guess that when it runs disable_func(), then it must stop, and
when it stops it runs disable_func() again, and so on ad nauseam.

I’ll take a look.

cburgess@qnx.com