Abnormal CPU Use

I have a set of applications that use about 50% of the cpu time, according
to the spin program. Once in a great while, the task that kicks the hardware
watchdog is pre-empted and the processor reboots. So I would guess that
there is something in my applications that is causing a spike in the cpu use
and pre-empting the other tasks.

This takes from a day to several days before it happens and nothing crashes.
In fact, when I disable the watchdog with a jumper, the processor continues
as if nothing has happened.

Does anyone have some suggestions regarding a method to find out what the
offending task is?

I’m using 6.2.0 SE.

Thanks,
David Kuechenmeister

I would suggest using the instrumented kernel to see what is going on, and
replacing the hardware WD stroke with emitting a user-defined event (yes, I know
you have SE, so visualization isn’t happening - might be a good investment
for exactly this kind of hard-to-find problem).

Another idea would be to replace the hardware WD with an ISR that decrements
a value on each tick and, on zero, emits an event to a high-priority thread - this
thread can look at what the state of the system is (whatever that entails
for you). Obviously your original ‘WD stroker’ needs to be modified to
constantly reset the value that is being decremented.

Just my $0.02

-Adam
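
The countdown Adam describes could be sketched roughly as below. This is only an illustration in portable C11, not QNX-specific code: the names soft_wd_t, soft_wd_init, soft_wd_kick and soft_wd_tick are hypothetical, and the use of C11 atomics is an assumption made to keep the example self-contained. On a real QNX target the tick function would presumably be called from a handler attached to the timer interrupt, and a true return value would be the point at which the handler wakes the high-priority diagnostic thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software watchdog: a countdown that the stroker task reloads
 * and that is decremented once per timer tick. */
typedef struct {
    atomic_int remaining;   /* ticks left before the watchdog fires */
    int        reload;      /* value restored on every kick */
} soft_wd_t;

void soft_wd_init(soft_wd_t *wd, int reload_ticks)
{
    assert(wd != NULL && reload_ticks > 0);
    wd->reload = reload_ticks;
    atomic_init(&wd->remaining, reload_ticks);
}

/* Called by the task that used to stroke the hardware watchdog. */
void soft_wd_kick(soft_wd_t *wd)
{
    atomic_store(&wd->remaining, wd->reload);
}

/* Called once per tick. Returns true only on the tick where the count
 * goes from 1 to 0; later ticks return false until the next kick. */
bool soft_wd_tick(soft_wd_t *wd)
{
    int old = atomic_load(&wd->remaining);
    while (old > 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&wd->remaining, &old, old - 1)) {
            return old == 1;
        }
    }
    return false;
}
```

Because the expiry is reported only once, the diagnostic thread gets a single wake-up per missed stroke and can take its snapshot of the running tasks without being flooded on every following tick.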


David Kuechenmeister <david.kuechenmeister@viasat.com> wrote:

I have a set of applications that use about 50% of the cpu time, according
to the spin program. Once in a great while, the task that kicks the hardware
watchdog is pre-empted and the processor reboots. So I would guess that
there is something in my applications that is causing a spike in the cpu use
and pre-empting the other tasks.

Are you creating new processes? Allocating large amounts of memory?

As Adam said, the instrumented kernel, if you can catch the event,
can be useful – but the problem is you really need the ring-mode
that I don’t think was in there before 6.2.1. (With the ring-mode,
you have it always logging to the kernel’s buffer, but it doesn’t
dump to file until you trigger it – so, when you notice something
has gone wrong, you trigger a dump of the event log to a file.) But,
with SE, I don’t think you get the instrumented kernel or any of
the other system analysis/profiling tools.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.
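
The ring-mode idea (log continuously, keep only the most recent events, dump on a trigger) can be illustrated with a plain ring buffer. This is a generic sketch and not the instrumented kernel's actual implementation or API; ring_log_t, ring_init, ring_log, ring_dump and RING_CAPACITY are hypothetical names, and events are reduced to bare integers.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 4   /* tiny on purpose; a real log would be far larger */

typedef struct {
    unsigned long events[RING_CAPACITY];
    size_t next;    /* slot the next event will be written to */
    size_t count;   /* number of valid events, at most RING_CAPACITY */
} ring_log_t;

void ring_init(ring_log_t *r)
{
    assert(r != NULL);
    r->next = 0;
    r->count = 0;
}

/* Record an event, overwriting the oldest one once the buffer is full. */
void ring_log(ring_log_t *r, unsigned long event)
{
    r->events[r->next] = event;
    r->next = (r->next + 1) % RING_CAPACITY;
    if (r->count < RING_CAPACITY)
        r->count++;
}

/* On a trigger, copy the retained events out, oldest first.
 * Returns the number of events copied (at most out_len). */
size_t ring_dump(const ring_log_t *r, unsigned long *out, size_t out_len)
{
    assert(r != NULL && out != NULL);
    size_t start = (r->next + RING_CAPACITY - r->count) % RING_CAPACITY;
    size_t n = r->count < out_len ? r->count : out_len;
    for (size_t i = 0; i < n; i++)
        out[i] = r->events[(start + i) % RING_CAPACITY];
    return n;
}
```

The same pattern would work at application level on SE: each task logs cheap markers into such a buffer, and the software watchdog thread dumps it when a stroke is missed.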

We had thought about using the PE version at one point. The upgrade path
seems to make all previous versions obsolete and it was just too complicated
to support 6.1, 6.2 and 6.2 PE builds. So we decided to stick with 6.2.0,
for better or worse. Probably should have stayed with 6.1, but we had
already upgraded some systems in the field and didn’t want to go back to
them.

Since the instrumented kernel is out of the question, I think I’ll take
what’s behind door #2. It’s a sound idea and I think, with some variation on
a pidin function, I can look at the tasks that are running.

Thanks,

It would not be too hard to modify spin to log whatever is interesting to
you (who’s currently the biggest hog). Perhaps even send it over the network to
another computer, where it can be logged. That is, before I do that modification
myself :wink:

– igor
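
The "biggest hog" bookkeeping for such a logger might look something like the sketch below. It says nothing about how spin itself collects its numbers; cpu_sample_t and biggest_hog are hypothetical names, and the code assumes you already have two snapshots of cumulative per-process CPU time, taken one sampling interval apart and listed in the same order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      pid;      /* process id */
    uint64_t cpu_ns;   /* cumulative CPU time at sample time, in ns */
} cpu_sample_t;

/* Returns the index of the entry whose CPU time grew the most between the
 * two snapshots, or -1 if no entry could be compared. The growth of the
 * winning entry is stored in *delta_out when delta_out is not NULL. */
int biggest_hog(const cpu_sample_t *prev, const cpu_sample_t *now,
                size_t n, uint64_t *delta_out)
{
    assert(n == 0 || (prev != NULL && now != NULL));
    int best = -1;
    uint64_t best_delta = 0;
    for (size_t i = 0; i < n; i++) {
        /* Skip slots where the process changed or the counter went backwards. */
        if (prev[i].pid != now[i].pid || now[i].cpu_ns < prev[i].cpu_ns)
            continue;
        uint64_t delta = now[i].cpu_ns - prev[i].cpu_ns;
        if (best < 0 || delta > best_delta) {
            best = (int)i;
            best_delta = delta;
        }
    }
    if (delta_out != NULL)
        *delta_out = best_delta;
    return best;
}
```

Logging the winner of each interval, locally or to another machine, leaves a trail that survives the reboot and shows who was busy just before the watchdog fired.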

What IRQ does the QNX kernel use for its timing? Does it chain the hardware
interrupt at IRQ 8?

Thanks,

David Kuechenmeister <david.kuechenmeister@viasat.com> wrote:

What IRQ does the QNX kernel use for its timing? Does it chain the hardware
interrupt at IRQ 8?

I assume you’re in x86 land.

The system clock is driven by IRQ 0.

ClockCycles() uses the rdtsc opcode to get a free-running 64-bit
value for accurate timestamps.
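
For timestamping with a free-running counter like this, the usual pattern is to take two readings and convert the difference to time. A minimal sketch follows; cycles_to_ns is a hypothetical helper, and the counter frequency is passed in as a parameter. On QNX the readings would come from ClockCycles() and the frequency, as far as I know, from SYSPAGE_ENTRY(qtime)->cycles_per_sec, but treat that as an assumption to check against your headers.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the difference between two readings of a free-running 64-bit
 * cycle counter into nanoseconds. Assumes cycles_per_sec is below roughly
 * 18 GHz so that the intermediate product fits in 64 bits. */
uint64_t cycles_to_ns(uint64_t start, uint64_t end, uint64_t cycles_per_sec)
{
    assert(cycles_per_sec > 0);
    /* Unsigned subtraction stays correct across a single counter wrap. */
    uint64_t cycles = end - start;
    uint64_t secs = cycles / cycles_per_sec;
    uint64_t rem  = cycles % cycles_per_sec;
    return secs * 1000000000ULL + (rem * 1000000000ULL) / cycles_per_sec;
}
```

Stamping the watchdog stroke this way would show how late the stroker actually runs when the problem occurs.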

If neither of those is what you meant, what do you mean by the
kernel using something for timing?

-David

Sorry, that question didn’t come out too well. The answer was what I needed,
though.

Thanks,


How can I dump the stack for a process?

Thanks,
David

David Kuechenmeister <david.kuechenmeister@viasat.com> wrote in message
news:bhg5u1$16d$1@inn.qnx.com

How can I dump the stack for a process?

Hit the process with a signal and have dumper running. Then use gdb to get
the stack information you’re looking for.

-Adam
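
The "hit the process with a signal" step can also be done from C with the POSIX kill() call. The sketch below assumes SIGABRT is one of the signals whose default action produces a core dump and that a dump facility (dumper, in this thread) is running; request_core_dump and demo_abort_child are hypothetical names. Note that the target process is terminated, so this gives a post-mortem snapshot rather than a live one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send SIGABRT to pid. Returns 0 on success, -1 on failure (errno is set
 * by kill()). */
int request_core_dump(pid_t pid)
{
    assert(pid > 0);
    return kill(pid, SIGABRT);
}

/* Demonstration: start a child that only sleeps, abort it, and return the
 * number of the signal that ended it, or -1 on any error. */
int demo_abort_child(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause();            /* child: wait until a signal arrives */
    }
    if (request_core_dump(child) != 0)
        return -1;
    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Whether a core file is actually written depends on the system configuration; the code only delivers the signal and confirms how the child ended.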

Adam Mallory <amallory@qnx.com> wrote:

David Kuechenmeister <david.kuechenmeister@viasat.com> wrote in message
news:bhg5u1$16d$1@inn.qnx.com
How can I dump the stack for a process?

Hit the process with a signal and have dumper running. Then use gdb to get
the stack information you’re looking for.

Alternatively, dumper -p <pid> -d <dumper_dir> will do the trick.


cburgess@qnx.com

Thanks, but I’m cross compiling and the best I can do is to look at the
core with coreinfo.

David Kuechenmeister <david.kuechenmeister@viasat.com> wrote:

Thanks, but I’m cross compiling and the best I can do is to look at the
core with coreinfo.

Can’t you load the core into gdb?

ntox86-gdb <executable.core>

cburgess@qnx.com

I didn’t think about trying that. In fact, I didn’t know it was there.

Thanks,

