Suggestions for debugging an ISR?

Hello All:

I am having an intermittent problem with an ISR that (after 300,000 -
400,000 interrupts) will lock up QNX. I am looking for suggestions on
debugging this issue. I am running NC6.2.0.

Background:

I have two versions of the code: one that utilizes
InterruptAttachEvent() and a thread to handle the interrupt, and a
second that utilizes InterruptAttach() and does the work in the actual
ISR. The thread version will run forever, but the ISR version will run
around 2-3 hours before locking up the machine. Except for “cout” statements
for status messages, the ISR shares the same code base as the thread
version.

The work consists of checking two hardware registers and, based on their
contents, changing a buffer pointer. In both methods I have minimized
function calls to limit stack use, but I do not have a quantitative
measure of the amount actually used.

I have tried using dumper to trap the error but that has not worked. I do
not have access to an ICE. Any suggestions on a method that will allow
me to capture information on the state of the machine just before the
event occurs?

If anyone is interested in actual code then I can email that to them
directly.

Thanks in advance for the help.

Regards,
–Jeff Strickrott

“Jeff Strickrott” <jstric01@cs.fiu.edu> wrote in message
news:c9q6s2$nbe$1@inn.qnx.com

> I have two versions of the code: one that utilizes
> InterruptAttachEvent() and a thread to handle the interrupt, and a
> second that utilizes InterruptAttach() and does the work in the actual
> ISR. The thread version will run forever, but the ISR version will run
> around 2-3 hours before locking up the machine. Except for “cout” statements
> for status messages, the ISR shares the same code base as the thread
> version.

Are you sure EACH function called by the ISR is interrupt safe?

Jeff Strickrott <jstric01@cs.fiu.edu> wrote:
JS > I have two versions of the code: one that utilizes
JS > InterruptAttachEvent() and a thread to handle the interrupt, and a
JS > second that utilizes InterruptAttach() and does the work in the actual
JS > ISR. The thread version will run forever, but the ISR version will run
JS > around 2-3 hours before locking up the machine. Except for “cout”
JS > statements for status messages, the ISR shares the same code base as
JS > the thread version.

I don’t think you’re allowed to have cout or printf() statements in an ISR.

What I do is to put messages into a buffer in the ISR and display them
outside of the ISR.
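
For anyone who wants to try the buffer approach described above, here is a minimal sketch. It assumes a C11 compiler with <stdatomic.h> (an older toolchain would need volatile indices and its own memory barriers), lock-free atomic integers on the target, and exactly one ISR writing while one thread reads. The names log_put, log_get and struct log_rec are made up for illustration and do not come from anyone's code in this thread.

```c
#include <stdint.h>
#include <stdatomic.h>

#define LOG_SLOTS 64u   /* must be a power of two so wraparound stays consistent */

/* One fixed-size record: no strings, no heap pointers. */
struct log_rec {
    uint32_t code;      /* hypothetical event code */
    uint32_t value;     /* e.g. a snapshot of a hardware register */
};

static struct log_rec log_buf[LOG_SLOTS];
static atomic_uint log_head;    /* advanced only by the ISR */
static atomic_uint log_tail;    /* advanced only by the reader thread */

/* Call from the ISR: no locks, no allocation, no I/O.
 * Returns 0 on success, -1 if the buffer is full (the record is dropped). */
int log_put(uint32_t code, uint32_t value)
{
    unsigned head = atomic_load_explicit(&log_head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&log_tail, memory_order_acquire);

    if (head - tail == LOG_SLOTS)
        return -1;
    log_buf[head % LOG_SLOTS].code = code;
    log_buf[head % LOG_SLOTS].value = value;
    /* Publish the record only after its fields are written. */
    atomic_store_explicit(&log_head, head + 1, memory_order_release);
    return 0;
}

/* Call from an ordinary thread, which is then free to printf() the record.
 * Returns 1 if a record was copied into *out, 0 if the buffer is empty. */
int log_get(struct log_rec *out)
{
    unsigned tail = atomic_load_explicit(&log_tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&log_head, memory_order_acquire);

    if (tail == head)
        return 0;
    *out = log_buf[tail % LOG_SLOTS];
    /* Hand the slot back to the ISR only after the copy is complete. */
    atomic_store_explicit(&log_tail, tail + 1, memory_order_release);
    return 1;
}
```

The same buffer doubles as a crude trace: if the ISR records a code at each step, whatever the reader thread last printed narrows down where things stopped.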

Yes, I had a call to a function that would sometimes allocate/free memory
from the heap. I should have known that it was not ISR safe, but just
assumed it was. Anyway, this call was what would periodically lock up the kernel.

Regards,
–Jeff Strickrott

Mario Charest wrote:

> Are you sure EACH function called by the ISR is interrupt safe?

Thanks Bill:

Yes, that is essentially what my ISRs do. I pass status information to
a helper thread that then prints out whatever is important.

My problem this time was that I called an ISR-unsafe function that would on
occasion allocate/free memory from the heap. I would have thought that it
would have failed almost immediately, not 2-3 hours later, but that is the
nature of this business.

Just wish there was a better (read smarter) way to find this type of
error than the process of elimination.

Regards,
–Jeff Strickrott


Bill Caroselli wrote:

> I don’t think you’re allowed to have cout or printf() statements in an ISR.
>
> What I do is to put messages into a buffer in the ISR and display them
> outside of the ISR.

“Jeff Strickrott” <jstric01@cs.fiu.edu> wrote in message
news:ca79pa$s64$1@inn.qnx.com

> My problem this time was I called an ISR unsafe function that would on
> occasion allocate/free memory from the heap. I would think that it would
> have failed almost immediately, not 2-3 hours later, but that is the
> nature of this business.

As long as the ISR has not interrupted a call to a memory API
(malloc/free/etc.), it is likely to run. It crashes when it does interrupt
one, or when the heap has to be grown. It could take a while for either
condition to happen.
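
One way to keep heap calls out of the ISR altogether is a fixed pool that is set aside before the interrupt is attached. The following is only a sketch under assumptions: a C11 compiler, lock-free atomic_int on the target, and made-up names and sizes (pool_alloc, pool_free, POOL_BLOCKS, BLOCK_BYTES) that are not from the code discussed here.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS 8       /* hypothetical sizes; tune for the real workload */
#define BLOCK_BYTES 256     /* multiple of 16, so every block stays aligned */

/* All storage is static: nothing here ever touches the heap. */
static _Alignas(16) unsigned char pool_mem[POOL_BLOCKS][BLOCK_BYTES];
static atomic_int pool_used[POOL_BLOCKS];   /* static storage starts at 0 = free */

/* Claim a free block. Each slot is claimed with one atomic exchange, so there
 * is no lock for an interrupt to land inside. Returns NULL when exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (atomic_exchange(&pool_used[i], 1) == 0)
            return pool_mem[i];
    }
    return NULL;
}

/* Release a block. Returns 0 on success, -1 if p is not the start of a block
 * that belongs to this pool. */
int pool_free(void *p)
{
    uintptr_t base = (uintptr_t)pool_mem;
    uintptr_t addr = (uintptr_t)p;

    if (addr < base || addr >= base + sizeof pool_mem)
        return -1;
    if ((addr - base) % BLOCK_BYTES != 0)
        return -1;
    atomic_store(&pool_used[(addr - base) / BLOCK_BYTES], 0);
    return 0;
}
```

Because the pool never grows, running out shows up as a NULL return that the ISR can count and report, rather than as a call into the allocator at the wrong moment.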

> Just wish there was a better (read smarter) way to find this type of
> error than the process of elimination.

I use a tool called Understand C++ (under Windows); it can draw a graphical
tree of the functions called. Hence I can check, down to the lowest level,
whether all the functions are interrupt safe. Maybe cscope can do the same?

Jeff Strickrott <jstric01@cs.fiu.edu> wrote:
JS > My problem this time was I called an ISR unsafe function that would on
JS > occasion allocate/free memory from the heap. I would think that it would
JS > have failed almost immediately, not 2-3 hours later, but that is the
JS > nature of this business.

JS > Just wish there was a better (read smarter) way to find this type of
JS > error than the process of elimination.

Thankfully it wasn’t an error that only manifests itself after 2-3
weeks/months, i.e. after you’ve released the software to your customers.

The best bet on all accounts is to do absolutely as little as possible
in an ISR. Do all of the real work in process/thread level code.
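
A rough sketch of that split, for what it is worth. The register bits and the names next_buffer and irq_thread are hypothetical, since the real hardware is not described in this thread; the QNX part is only compiled when building for Neutrino and assumes the caller supplies functions that read the two registers.

```c
#include <stdint.h>

/* Hypothetical register bits, for illustration only. */
#define STAT_BUF_DONE 0x01u     /* status: current buffer is complete */
#define CTRL_USE_B    0x02u     /* control: hardware wants buffer B next */

/* Pure decision logic: given the two register values, pick the next buffer
 * index (0 = A, 1 = B). No kernel calls, so it is easy to test off-target. */
int next_buffer(uint8_t status, uint8_t control, int current)
{
    if (!(status & STAT_BUF_DONE))
        return current;         /* nothing finished: keep the current buffer */
    return (control & CTRL_USE_B) ? 1 : 0;
}

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/siginfo.h>

/* Thread body: no handler code runs at interrupt level at all. The kernel
 * masks the interrupt and wakes this thread, which does the work and then
 * unmasks. Returns -1 if setup fails; otherwise it loops forever. */
int irq_thread(int irq, uint8_t (*read_status)(void),
               uint8_t (*read_control)(void))
{
    struct sigevent ev;
    int id, current = 0;

    if (ThreadCtl(_NTO_TCTL_IO, 0) == -1)   /* I/O privileges to attach */
        return -1;
    SIGEV_INTR_INIT(&ev);
    id = InterruptAttachEvent(irq, &ev, _NTO_INTR_FLAGS_TRK_MSK);
    if (id == -1)
        return -1;

    for (;;) {
        InterruptWait(0, NULL);             /* block until the event arrives */
        current = next_buffer(read_status(), read_control(), current);
        InterruptUnmask(irq, id);           /* re-enable the interrupt source */
    }
}
#endif
```

Everything in the loop runs as ordinary thread code, so calls that are unsafe in an ISR (printing, allocating) are allowed there, at the cost of scheduling latency.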