umount and segment violation

We have a lot of trouble with io-net unpredictably getting a segment
violation when we try to umount a module. The latest case resulted in
the backtrace below. Anyone got any ideas what the problem might be?


GNU gdb 5.0 (UI_OUT)

Program terminated with signal 11, segmentation violation.
Reading symbols from /x86/lib/libc.so.2...(no debugging symbols found)...done.
Reading symbols from /x86/lib/dll/npm-tcpip-v6.so...
(no debugging symbols found)...done.
Reading symbols from /x86/lib/dll/devn-ns83815.so...
(no debugging symbols found)...done.
#0 0xb032ebce in _resmgr_detach_id () from /x86/lib/libc.so.2
(gdb) bt
#0 0xb032ebce in _resmgr_detach_id () from /x86/lib/libc.so.2
#1 0xb0322d44 in resmgr_detach () from /x86/lib/libc.so.2
#2 0x0804af00 in _btext ()
#3 0x0805024e in _btext ()
#4 0xb033027b in _resmgr_mount_handler () from /x86/lib/libc.so.2
#5 0xb032ea41 in _resmgr_connect_handler () from /x86/lib/libc.so.2
#6 0xb032f279 in _resmgr_handler () from /x86/lib/libc.so.2
#7 0xb0322d82 in _resmgr_msg_handler () from /x86/lib/libc.so.2
#8 0xb03224e3 in _message_handler () from /x86/lib/libc.so.2
#9 0xb0321cf9 in dispatch_handler () from /x86/lib/libc.so.2
#10 0xb032103d in _thread_pool_thread () from /x86/lib/libc.so.2

Murf

What module did you unmount? Did you have qnet running?

John A. Murphy <murf@perftech.com> wrote:

We have a lot of trouble with io-net unpredictably getting a segment
violation when we try to umount a module. The latest case resulted in
the backtrace below. Anyone got any ideas what the problem might be?



Kirk Russell Bridlewood Software Testers Guild

I think on this particular occasion I umount’ed a filter (one that I wrote), but I’ve
seen the same things happen when unloading any of several Ethernet drivers - and yes,
Qnet was running. When we first started using QNX we blamed a lot of the apparent
instability of io-net on Qnet and stopped loading it, but it’s so convenient that we
usually load it on our development machines, and had kind of forgotten about blaming
it for problems, largely because the problems had seemed to go away. These crashes
seem completely unpredictable, and, as evidenced by the backtrace, happen so deep
inside the io-net resource manager that it’s hard to discover the cause.

Murf

kirk wrote:

What module did you unmount? Did you have qnet running?


There is a bug with the qnet dll – if you umount() anything within the
io-net process, when qnet is or was running, then you have a good chance
of faulting the whole io-net process. This bug was fixed with qnet dll
in the 6.2.1 release.

John A. Murphy <murf@perftech.com> wrote:

I think on this particular occasion I umount’ed a filter (one that I wrote), but I’ve
seen the same things happen when unloading any of several Ethernet drivers - and yes,
Qnet was running. When we first started using QNX we blamed a lot of the apparent
instability of io-net on Qnet and stopped loading it, but it’s so convenient that we
usually load it on our development machines, and had kind of forgotten about blaming
it for problems, largely because the problems had seemed to go away. These crashes
seem completely unpredictable, and, as evidenced by the backtrace, happen so deep
inside the io-net resource manager that it’s hard to discover the cause.

Murf

Interesting! While the backtrace I enclosed was from 6.2.0, we’ve seen much the same
behavior with the 6.2.1 beta; in fact, if anything, it’s gotten worse. We’ve also seen
several instances of the machine (under 6.2.1) suddenly being completely out of memory, or
of file handles. Unfortunately, we started noticing this behavior at just about the time
the beta ended…

Murf

kirk wrote:

There is a bug with the qnet dll – if you umount() anything within the
io-net process, when qnet is or was running, then you have a good chance
of faulting the whole io-net process. This bug was fixed with qnet dll
in the 6.2.1 release.


John A. Murphy <murf@perftech.com> wrote:

Interesting! While the backtrace I enclosed was from 6.2.0, we’ve seen much the same
behavior with the 6.2.1 beta; in fact, if anything, it’s gotten worse. We’ve also seen
several instances of the machine (under 6.2.1) suddenly being completely out of memory, or
of file handles. Unfortunately, we started noticing this behavior at just about the time
the beta ended…

That is not good news :( Our tests show everything is okay. If you
get more info about how to reproduce this issue, please let us know. TIA.


This has nothing to do with networking, but on the subject of system stability, I’m interested
in learning more about memory usage. On several occasions we’ve had a machine start failing to
execute programs due to lack of memory. I’ve dumped the output of “pidin mem” when things were
bad (pidin was one of the few commands that would still execute), right after a reboot, and from
another machine, and spent some time comparing them, looking for a memory hog — but I find no
significant differences. Anybody got any hints on tracking down this sort of thing?

Murf
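
One way to make the snapshot comparison described above less manual is to reduce each dump to per-process totals and diff them. The sketch below is only an illustration in Python: it assumes the "pidin mem" output has already been parsed into a name-to-bytes mapping (the parsing is not shown, since the column layout is not given in this thread), and the function name and sample numbers are hypothetical.

```python
def memory_growth(before, after, min_delta=0):
    """Return (name, delta_bytes) pairs for entries that grew, largest first.

    `before` and `after` are hypothetical pre-parsed snapshots mapping a
    process name to its memory usage in bytes. Names missing from `before`
    are treated as having started at zero.
    """
    growth = []
    for name, used_after in after.items():
        delta = used_after - before.get(name, 0)
        if delta > min_delta:
            growth.append((name, delta))
    # Largest growth first, so a hog would appear at the top.
    growth.sort(key=lambda item: item[1], reverse=True)
    return growth


# Made-up numbers for illustration only; they are not from the thread.
baseline = {"io-net": 1200000, "devb-eide": 900000, "myfilter": 300000}
degraded = {"io-net": 5200000, "devb-eide": 910000, "myfilter": 300000}

for name, delta in memory_growth(baseline, degraded):
    print(f"{name}: +{delta} bytes")
```

If no process shows meaningful growth, which matches what was reported here, that at least suggests the missing memory is not visible in per-process totals.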

kirk wrote:

That is not good news :( Our tests show everything is okay. If you
get more info about how to reproduce this issue, please let us know. TIA.


Does your module have any C++ component? I used to face some issues while
using dinkum libraries, especially while trying to do dlclose on some
particular .SOs which happened to use C++ components…

Sreekanth

“John A. Murphy” <murf@perftech.com> wrote in message
news:3E4A4CED.8C664E03@perftech.com

This has nothing to do with networking, but on the subject of system stability, I’m interested
in learning more about memory usage. On several occasions we’ve had a machine start failing to
execute programs due to lack of memory. I’ve dumped the output of “pidin mem” when things were
bad (pidin was one of the few commands that would still execute), right after a reboot, and from
another machine, and spent some time comparing them, looking for a memory hog — but I find no
significant differences. Anybody got any hints on tracking down this sort of thing?

Murf


Nope, no C++. Been there, done that, didn’t much enjoy it.

Murf

Sreekanth wrote:

Does your module have any C++ component? I used to face some issues while
using dinkum libraries, especially while trying to do dlclose on some
particular .SOs which happened to use C++ components…

Sreekanth


Run spin early (before it is too late) and make it sort by memory usage.
Better run it with high priority. The hogs should be on top of the list.

– igor
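
The advice to start monitoring early can also be approximated without spin by logging snapshots from boot onward, so that data already exists by the time the machine runs out of memory. A rough Python sketch follows. The function name, log path, and interval are hypothetical, and it assumes a Python interpreter is available on the target, which this thread does not establish. It does not set the priority; that would depend on how the script is launched.

```python
import subprocess
import time
from datetime import datetime


def log_snapshots(command, log_path, interval_s, count):
    """Run `command` `count` times, appending timestamped output to a log.

    `command` is an argument list such as ["pidin", "mem"]. If the command
    cannot be started (for example because memory is already exhausted),
    the failure is recorded instead of aborting the loop.
    """
    with open(log_path, "a") as log:
        for i in range(count):
            try:
                output = subprocess.run(
                    command, capture_output=True, text=True
                ).stdout
            except OSError as exc:
                output = f"(could not run command: {exc})\n"
            log.write(f"=== {datetime.now().isoformat()} ===\n")
            log.write(output)
            log.flush()  # keep the log usable even if the machine degrades
            if i < count - 1:
                time.sleep(interval_s)


# Example call (not executed here), sampling once a minute for a day:
# log_snapshots(["pidin", "mem"], "/tmp/pidin-mem.log", 60, 1440)
```

Comparing the first and last entries in such a log is then the same exercise as comparing a fresh-boot dump with a degraded one, but with the intermediate history preserved.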

John A. Murphy wrote:

This has nothing to do with networking, but on the subject of system stability, I’m interested
in learning more about memory usage. On several occasions we’ve had a machine start failing to
execute programs due to lack of memory. I’ve dumped the output of “pidin mem” when things were
bad (pidin was one of the few commands that would still execute), right after a reboot, and from
another machine, and spent some time comparing them, looking for a memory hog — but I find no
significant differences. Anybody got any hints on tracking down this sort of thing?

Murf


The crash happens if you umount anything in io-net while there is active
traffic on QNET (some message passing across QNET).

I seem to have forgotten when we put the fix into 6.2.1; maybe the early
6.2.1 betas don’t have the fix…

-xtang

kirk <kirussel@NOSPAMrogers.com> wrote in message
news:b2dg7k$srd$1@inn.qnx.com

That is not good news :( Our tests show everything is okay. If you
get more info about how to reproduce this issue, please let us know. TIA.


I downloaded spin (version 1.10), but I get just a blank screen both using qtalk and from another
machine with the -H option. And either I can’t figure out where to look, or the source wasn’t actually
in the archive…

Murf

Igor Kovalenko wrote:

Run spin early (before it is too late) and make it sort by memory usage.
Better run it with high priority. The hogs should be on top of the list.

– igor


OK, that could explain a lot! Thanks, Xiaodan!

Murf

Xiaodan Tang wrote:

The crash happens if you umount anything in io-net while there is active
traffic on QNET (some message passing across QNET).

I seem to have forgotten when we put the fix into 6.2.1; maybe the early
6.2.1 betas don’t have the fix…

-xtang


Are you using qnet with bind=ip? You might want to try the socket
library that was posted to developers.qnx.com. It has a fix for an
issue that could cause you to run out of file handles in certain
situations.

Dave
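
For the file-handle side, a leak of the kind mentioned above can sometimes be narrowed down by counting a process's open descriptors before and after a suspect operation. The following is a minimal sketch in Python using only POSIX-style calls. The function name and the max_fd cap are hypothetical, and it inspects only the calling process, not the whole machine.

```python
import os
import resource


def count_open_fds(max_fd=4096):
    """Count descriptors currently open in this process.

    Probes every descriptor number below the soft RLIMIT_NOFILE limit
    (capped at `max_fd` to keep the scan short) with fstat(); a number
    that fstat() accepts is an open descriptor.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY or soft_limit > max_fd:
        soft_limit = max_fd
    open_count = 0
    for fd in range(soft_limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # not an open descriptor
        open_count += 1
    return open_count


# Demonstration: a pipe adds two descriptors until both ends are closed.
start = count_open_fds()
read_end, write_end = os.pipe()
print("after opening a pipe:", count_open_fds() - start)
os.close(read_end)
os.close(write_end)
print("after closing it:", count_open_fds() - start)
```

Wrapping a suspect call in a before-and-after count like this shows whether descriptors are being left open; a count that keeps climbing across repeated calls would point at a leak.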


John A. Murphy wrote:

Interesting! While the backtrace I enclosed was from 6.2.0, we’ve seen much the same
behavior with the 6.2.1 beta; in fact, if anything, it’s gotten worse. We’ve also seen
several instances of the machine (under 6.2.1) suddenly being completely out of memory, or
of file handles. Unfortunately, we started noticing this behavior at just about the time
the beta ended…

We’ve never been able to get qnet to work with bind=ip, so no, we’re not using it.
I’ll check into the socket library fix; thanks!

Murf

Dave Brown wrote:

Are you using qnet with bind=ip? You might want to try the socket
library that was posted to developers.qnx.com. It has a fix for an
issue that could cause you to run out of file handles in certain
situations.

Dave
