After working with QNX4 for more than 10 years, I am currently experiencing a really weird problem:
A certain process (nothing complicated, just a server that answers some questions, no hardware I/O) simply disappears after running for a few days, without leaving any dumps or trace info. If this process is started in a ‘while’ loop of a shell script, the shell disappears as well. The server has some modems attached, where some BMS dial in, and it is attached to a network with slinger and phrelay as interfaces.
We changed the hardware already, so this is probably not a hardware-related problem. Unfortunately this happens only at a customer’s site and not in the lab.
The only remarkable thing: the PIDs that are used for new processes wrap around from 32xxx
to the low numbers just before this mysterious crash happens.
You can use tracectrl to raise the level of what is reported, to the point of showing all process creation and death. It is a little verbose, but you might glean something from that. There is nothing special about recycled pids. The only pids that “may” be special are pids 1-5. I suspect that really only pid 1 is special.
More than likely something killed your shell. You may be able to add shell commands to increase the verbosity there and even go as far as catching signals in the shell.
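If the shell loop itself keeps dying, one alternative is a small C supervisor that restarts the server and records how each run ended. The following is only a sketch under assumptions: the names `run_once`, `describe_status` and `supervise` are made up for illustration, and it assumes the usual POSIX `fork`/`waitpid` calls plus `snprintf` are available with your compiler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv once, store the raw wait status.
 * Returns 0 on success, -1 if fork or waitpid failed. */
static int run_once(char *const argv[], int *status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* An ignored SIGHUP would be inherited across exec, so put the
         * default back and let the server behave as it normally does. */
        signal(SIGHUP, SIG_DFL);
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }
    return waitpid(pid, status, 0) < 0 ? -1 : 0;
}

/* Turn a wait status into a short, loggable reason. */
static void describe_status(int status, char *buf, size_t len)
{
    if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else
        snprintf(buf, len, "unknown status 0x%x", (unsigned)status);
}

/* Restart loop; the supervisor ignores SIGHUP so it can outlive the server. */
static int supervise(char *const argv[], int max_restarts)
{
    char why[64];
    int i, status;

    signal(SIGHUP, SIG_IGN);
    for (i = 0; i < max_restarts; i++) {
        if (run_once(argv, &status) < 0) {
            perror("run_once");
            return -1;
        }
        describe_status(status, why, sizeof why);
        fprintf(stderr, "%s: %s\n", argv[0], why);
        sleep(1);                       /* avoid a tight restart loop */
    }
    return 0;
}
```

Calling `supervise(argv, n)` from `main` with the server's command line would then log a line such as "killed by signal N" each time the server goes away, which at least tells you which signal to chase.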
That’s what I did already; that’s how I found the coincidence with the ‘low’ PIDs.
I guess that’s what I am going to do. The ‘dying process’ is almost the only one which doesn’t have any signal handler installed, nor does it ignore SIGHUP. So I hope that will
prevent the process termination.
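For reference, a minimal sketch of what that could look like in the server, assuming plain POSIX `sigaction`. The names `on_hup`, `hup_count` and `install_hup_handling` are hypothetical. The handler only bumps a counter, so the main loop can log the event later instead of the process dying silently.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Number of SIGHUPs seen so far; the main loop can poll and log this. */
static volatile sig_atomic_t hup_count = 0;

/* Handler: do as little as possible, just count the signal. */
static void on_hup(int signo)
{
    (void)signo;
    hup_count++;
}

/* ignore != 0: discard SIGHUP completely.
 * ignore == 0: catch it and count it, so it can be reported.
 * Returns 0 on success, -1 on failure (as sigaction does). */
static int install_hup_handling(int ignore)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    if (ignore)
        sa.sa_handler = SIG_IGN;
    else
        sa.sa_handler = on_hup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGHUP, &sa, NULL);
}
```

Catching and counting is probably more useful than ignoring while you are still hunting the cause, because a non-zero `hup_count` right after the PID wrap-around would confirm that SIGHUP really is what kills the process.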
I still wonder, WHO could send the signals?!?
I find lots of ‘Generate hangup ()’ entries in the tracelog every time a hangup on a modem
occurs, which is normal. And it obviously never has a ‘negative effect’ on other processes.
The next ‘jump back’ to low PIDs will happen in a week or so; I hope I will have some more diagnostics installed by then …
Is there any chance to find out who has sent the signal to a process?
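On systems that implement the POSIX `SA_SIGINFO` extension, a handler can read the sender's PID from `si_pid`. Whether QNX4 provides this is an assumption that would need checking against its headers, so treat the following as a sketch of the general technique rather than a confirmed QNX4 solution. The names `record_sender`, `watch_signal`, `last_signo` and `last_sender` are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Filled in by the handler, read (and logged) by the main loop.
 * The PID is stored as sig_atomic_t, assuming it fits (PIDs here
 * stay below 32768). */
static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_sender = 0;

/* Three-argument handler: info->si_pid is the sending process
 * for signals generated by kill(). */
static void record_sender(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    last_signo = signo;
    last_sender = (sig_atomic_t)info->si_pid;
}

/* Install the recording handler for one signal; returns 0 on success. */
static int watch_signal(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sender;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;       /* request the siginfo_t argument */
    return sigaction(signo, &sa, NULL);
}
```

After `watch_signal(SIGHUP)`, the main loop could log `last_sender` whenever `last_signo` becomes non-zero and then match that PID against the process list or the trace log. Signals raised by the kernel or a driver rather than by `kill()` may not carry a meaningful `si_pid`.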