Interrupt monitor

By logging information about system interrupts, I am hoping to glean
some information that may help me find out why machines occasionally
lock-up (even ctrl-shift-alt-del appears to have no effect). My
application attaches to interrupts 1 through 15 using
qnx_hint_attach() and then keeps a count of how many IRQs appear on
each line. Every second or so it logs this information to a server
running on another machine.
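
The counting scheme described above can be sketched in portable C. This is only an illustration under stated assumptions, not the poster's code: the names `irq_count`, `irq_sample` and `IRQ_SLOTS` are hypothetical, and `irq_count()` stands in for whatever handler body would be attached to each line with qnx_hint_attach().

```c
#include <stdint.h>

#define IRQ_FIRST 1
#define IRQ_LAST  15
#define IRQ_SLOTS (IRQ_LAST + 1)   /* slot 0 is unused */

/* Running totals, one per IRQ line. In a real monitor these would be
 * incremented from interrupt handlers, hence volatile. */
static volatile uint32_t irq_total[IRQ_SLOTS];

/* Totals as seen at the previous sample. */
static uint32_t irq_prev[IRQ_SLOTS];

/* Hypothetical stand-in for the per-line handler body: count one
 * interrupt on the given line, ignoring out-of-range lines. */
void irq_count(unsigned irq)
{
    if (irq >= IRQ_FIRST && irq <= IRQ_LAST)
        irq_total[irq]++;
}

/* Called roughly once a second. Fills rate[] with the number of
 * interrupts seen on each line since the previous call. Unsigned
 * subtraction keeps the delta correct if a counter wraps. */
void irq_sample(uint32_t rate[IRQ_SLOTS])
{
    unsigned irq;

    rate[0] = 0;
    for (irq = IRQ_FIRST; irq <= IRQ_LAST; irq++) {
        uint32_t now = irq_total[irq];
        rate[irq] = now - irq_prev[irq];
        irq_prev[irq] = now;
    }
}
```

Calling `irq_sample()` from a one-second timer and sending `rate[1]` through `rate[15]` to the server would give the per-line figures mentioned here. Keeping only running totals in the handler and computing deltas in the sampling loop means the handler itself never has to reset anything.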

This works fine. I can see 100 interrupts per second from a set of
serial devices, a few from the network, some from the hard disk, 24
frame pulse interrupts a second, etc. I have also noticed a few
strange things:

  1. The process seems to interfere with shutdown. Ctrl-shift-alt-del
    normally freezes the display (X11 is running so no countdown is
    expected) and then after 10 seconds reboots. With my process
    running no reboot occurs - the system is just frozen. It
    does seem to have shut down properly because when I push the reset
    button the file system comes up clean.

  2. To identify the logs to the server I send the MAC address of the
    computer when the process connects initially (MAC address is unique
    whereas IP address may not be). I get the MAC address by running
    “netmap -r” as I cannot find a clean way to read it from QNX.
    Running netmap seems to cause the system to freeze. Note that the
    process starts (and therefore runs netmap) not long after the
    system has come up (but after all the essential services have
    started).

Can anyone explain this behaviour? Is there any problem with
attaching to all interrupts in this way (apart from loading the
system)?

When I drag a slider on the touch screen (X11 user-interface) I see
500-1000 interrupts a second (handled by Dev32.ser). This is not
extreme, but I mention it because the lockups we encounter often occur
when someone is adjusting something by dragging a slider. Is there
likely to be any direct connection (e.g. overflows in driver buffers)?

The computer in use is a DigitalLogic MSM-P5 module (PC104) containing
a Pentium 166MHz (under-clocked I think to 75MHz). The module is
attached to a custom motherboard.

Thanks in advance for any ideas.

William Morris
wrm@innovation-tk.com

William Morris <wrm@innovation-tk.com> wrote:

By logging information about system interrupts, I am hoping to glean
some information that may help me find out why machines occasionally
lock-up (even ctrl-shift-alt-del appears to have no effect). My
application attaches to interrupts 1 through 15 using
qnx_hint_attach() and then keeps a count of how many IRQs appear on
each line. Every second or so it logs this information to a server
running on another machine.

There’s a better/easier/faster/less-intrusive way… :slight_smile:

Check out the source for sysmon at www.parse.com, it digs deep into the
kernel and fetches the counts directly from the kernel. No fuss, no muss.


Robert Krten, PARSE Software Devices +1 613 599 8316.
Realtime Systems Architecture, Consulting and Training at www.parse.com
Email my initials at parse dot com.

“William Morris” <wrm@innovation-tk.com> wrote in message
news:9vam2c$slc$1@inn.qnx.com

By logging information about system interrupts, I am hoping to glean
some information that may help me find out why machines occasionally
lock-up (even ctrl-shift-alt-del appears to have no effect). My
application attaches to interrupts 1 through 15 using
qnx_hint_attach() and then keeps a count of how many IRQs appear on
each line. Every second or so it logs this information to a server
running on another machine.

To add to nospam93's suggestion, you can use monitor to log not
only interrupts but all kernel events.

This works fine. I can see 100 interrupts per second from a set of
serial devices, a few from the network, some from the hard disk, 24
frame pulse interrupts a second, etc. I have also noticed a few
strange things:

  1. The process seems to interfere with shutdown. Ctrl-shift-alt-del
    normally freezes the display (X11 is running so no countdown is
    expected) and then after 10 seconds reboots. With my process
    running no reboot occurs - the system is just frozen. It
    does seem to have shut down properly because when I push the reset
    button the file system comes up clean.

Can’t explain that one.

The file system may not be a reliable indication, because in order to
have a “non-clean” file system, a write must have occurred at the same
time as the reset.

  2. To identify the logs to the server I send the MAC address of the
    computer when the process connects initially (MAC address is unique
    whereas IP address may not be). I get the MAC address by running
    “netmap -r” as I cannot find a clean way to read it from QNX.

http://qdn.qnx.com/support/bok/solution.qnx?9830

-r ??? Is that a typo?

Running netmap seems to cause the system to freeze. Note that the
process starts (and therefore runs netmap) not long after the
system has come up (but after all the essential services have
started).

That doesn’t really make any kind of sense. I can’t see how
netmap would freeze the system.

Can anyone explain this behaviour?

Is there any problem with
attaching to all interrupts in this way (apart from loading the
system)?

It should be OK; however, nospam93's suggestion is definitely
better.


When I drag a slider on the touch screen (X11 user-interface) I see
500-1000 interrupts a second (handled by Dev32.ser). This is not
extreme, but I mention it because the lockups we encounter often occur
when someone is adjusting something by dragging a slider. Is there
likely to be any direct connection (e.g. overflows in driver buffers)?

Mice run at 1200 baud, I believe, so 1000 would not make sense,
unless you are using a touch screen or the mouse is running at
9600 (check the serial port settings).
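
The arithmetic behind this point can be made explicit. The sketch below assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits per character) and one receive interrupt per character, i.e. no UART FIFO batching; neither assumption is confirmed in the thread, and the function name `max_chars_per_second` is hypothetical.

```c
/* Upper bound on characters per second for an asynchronous serial
 * line, given the baud rate and the number of bits on the wire per
 * character (10 for the assumed 8N1 framing). */
unsigned long max_chars_per_second(unsigned long baud,
                                   unsigned bits_per_frame)
{
    if (bits_per_frame == 0)
        return 0;   /* avoid division by zero on bad input */
    return baud / bits_per_frame;
}

/* max_chars_per_second(1200, 10) returns 120
 * max_chars_per_second(9600, 10) returns 960 */
```

Under these assumptions a 1200 baud device tops out at about 120 receive interrupts a second, while a 9600 baud device can reach about 960, which is in line with the 500-1000 per second reported while dragging a slider.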

nospam93@parse.com wrote:

There’s a better/easier/faster/less-intrusive way… :slight_smile:

Check out the source for sysmon at www.parse.com, it digs deep into the
kernel and fetches the counts directly from the kernel. No fuss, no muss.

Mmmm. Looks like I just wasted some time. Thanks, I’ll take a read.

By the way, your website indicated that you had XFree86 for QNX4 but
I failed to find it. Where is it?

Many thanks

William Morris
wrm@innovation-tk.com

Mario Charest <mcharest@clipzinformatic.com> wrote:

The file system may not be a reliable indication, because in order to
have a “non-clean” file system, a write must have occurred at the same
time as the reset.

I thought it just required that something hadn’t been flushed to disk,
not necessarily that a write was coincident. Must it really be so?

http://qdn.qnx.com/support/bok/solution.qnx?9830

Great!



-r ??? Is that a typo?

Sorry, error in posting. The code used just “netmap”.



Mice run at 1200 baud, I believe, so 1000 would not make sense,
unless you are using a touch screen or that

Using a touch screen running at 9600.

Thanks

William Morris
wrm@innovation-tk.com

William Morris <wrm@innovation-tk.com> wrote:

nospam93@parse.com wrote:

There’s a better/easier/faster/less-intrusive way… :slight_smile:

Check out the source for sysmon at www.parse.com, it digs deep into the
kernel and fetches the counts directly from the kernel. No fuss, no muss.

Mmmm. Looks like I just wasted some time. Thanks, I’ll take a read.

By the way, your website indicated that you had XFree86 for QNX4 but
I failed to find it. Where is it?

Fixed (removed). I had a copy a long time ago, but decided not to mirror
it when too many people tried to download it and it used up all my bandwidth! :frowning:

Cheers,
-RK


Robert Krten, PARSE Software Devices +1 613 599 8316.
Realtime Systems Architecture, Consulting and Training at www.parse.com
Email my initials at parse dot com.