Watchdog shutdown in C

Hello,

I am developing a watchdog program in ‘C’ under QNX4.25 for a
mission-critical application. In order to survive a memory leak, the
program must be able to perform a system shutdown without using ‘exec’ or
‘system’ calls that invoke a shell. (Shell invocations fail due to
insufficient memory.)

I’ve tried to find approaches in the documentation with no luck. I also
tried (naively) sending a _PROC_SHUTDOWN message to Proc32. This, as it
turns out, is an excellent way to lock up the system, if you’re so
inclined.

If anyone knows the secret handshake, won’t you please let me know?

Thanks for your time.

-Jim

If you put Shutdown in your .boot, will it work even without free memory as
it won’t have to load?


If you put shutdown in your .boot, it will be executed at startup… :wink:

Does your machine have a hardware watchdog? Software watchdogs are
not reliable enough for mission-critical applications, IMHO.

What if something prevents your watchdog from running?
What if there is an endless loop in an ISR?
What if your watchdog application crashes?

Only a hardware watchdog can protect you from these. If
you do have a hardware watchdog, then there is no need
to perform a shutdown in software. You would let the hardware do
it.

That being said, there is a way to perform a shutdown
via software. I don’t recall what it is since I’ve never used it :wink:
Someone else will surely drop in and offer a piece of code.




Mario,

You’re right, of course. Unfortunately, we have no hardware watchdog. Maybe
somebody will have mercy and send me a code snippet. :slight_smile:

Thanks,

-Jim




Jim Parnell
WorldGate Communications, Inc.
(215) 354-5147 (Fax:1048)

Jim Parnell <jparnell@wgate.com> wrote:

I’ve tried to find approaches in the documentation with no luck. I also
tried (naively) sending a _PROC_SHUTDOWN message to Proc32. This, as it
turns out, is an excellent way to lock up the system, if you’re so
inclined.

If anyone knows the secret handshake, won’t you please let me know?

Sending _PROC_SHUTDOWN to Proc32 is the proper way to shut down, but
you have to build the message properly.

Basic code is:

/* block all signals */
bits = ~0L;
sigprocmask(SIG_BLOCK, &bits, 0);

/* tell Proc to send a shutdown (SIGPWR) signal to everyone */
msg.s.type = _PROC_SHUTDOWN;
msg.s.signum = SIGPWR;
Send(PROC_PID, &msg.s, &msg.r, sizeof(msg.s), sizeof(msg.r));

/* wait for things to shut down */
sleep(whatever);

/* send SIGTERM to Fsys */
pid = qnx_name_locate( node, "qnx/fsys32", 1024, NULL );
kill( pid, SIGTERM );
/* if you want, wait for Fsys to shut down */
/* either loop checking to see if it is still there, or sleep() */

/* shutdown the system */
msg.s.type = _PROC_SHUTDOWN;
msg.s.signum = -1;
Send(PROC_PID, &msg.s, &msg.r, sizeof(msg.s), sizeof(msg.r));

Oh ya, you need to be root to do this. :slight_smile:

-David

Many thanks! I’ll give this a shot.

-Jim


David,

I’ve implemented your suggestion and it does reboot the system. Only one thing bothers me, and that’s a message from Dev32 saying
it’s had a SIGSEGV at: 006d:00017d2

I’d like to know if I need to sync() the filesystem myself or does Fsys do that for me? Also, how long do I need to wait at each
stage?

Any help would be greatly appreciated.

Here’s my code:

void
reboot_system(void)
{
    int countdown = 5;
    sigset_t bits;
    union _shutdown_msg_tag
    {
        struct _proc_shutdown s;
        struct _proc_shutdown_reply r;
    } msg;
    pid_t pid;

    syslog(LOG_CRIT, "rebooting system");
    sync();
    sleep(5);

    /* block all signals */
    bits = ~0L;
    sigprocmask(SIG_BLOCK, &bits, 0);

    /* tell Proc to send a shutdown (SIGPWR) signal to everyone */
    msg.s.type = _PROC_SHUTDOWN;
    msg.s.signum = SIGPWR;
    Send(PROC_PID, &msg.s, &msg.r, sizeof(msg.s), sizeof(msg.r));

    /* wait for things to shut down */
    sleep(4);

    /* send SIGTERM to Fsys, waiting for it to die */
    while (--countdown)
    {
        pid = qnx_name_locate( 0, "qnx/fsys32", 1024, NULL );
        if (pid == -1) break;

        kill( pid, SIGTERM );
        sleep(1);
    }

    /* shut down the system */
    msg.s.type = _PROC_SHUTDOWN;
    msg.s.signum = -1;
    Send(PROC_PID, &msg.s, &msg.r, sizeof(msg.s), sizeof(msg.r));
}


Jim Parnell wrote:

I’d like to know if I need to sync() the filesystem myself or does Fsys do that for me?

Sending the SIGTERM to Fsys (you need a new-ish version) will perform a
“super-sync”, flushing all dirty blocks and closing down all Fsys drivers
as well. It is better to do this than “sync()”, which is not synchronous
in operation, or a simple “shutdown”, where the 10-second or 1-second
countdown may not be enough time to fully write back any dirty blocks.

/* send SIGTERM to Fsys, waiting for it to die */
while (--countdown)
{
    pid = qnx_name_locate( 0, "qnx/fsys32", 1024, NULL );
    if (pid == -1) break;

    kill( pid, SIGTERM );
    sleep(1);
}

Sending multiple SIGTERMs is probably not a good idea, although it
shouldn’t hurt. So perhaps something more like …

if ((pid = qnx_name_locate(0, "qnx/fsys32", 0, NULL)) != -1 &&
    !kill(pid, SIGTERM)) {
    do {
        sleep(1);
    } while (!kill(pid, 0));
}
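
The do/while above never gives up if Fsys fails to exit. Since the question was how long to wait at each stage, a bounded poll may be a safer fit for a watchdog. What follows is a minimal sketch using only POSIX calls; `wait_until_gone` and its `max_polls` parameter are hypothetical names, and it checks `errno` for `ESRCH` rather than treating any `kill()` failure as "gone".

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Poll about once per second until `pid` no longer exists or `max_polls`
 * sleeps have elapsed. Returns 0 if the process is gone, -1 on timeout.
 * kill(pid, 0) delivers no signal; it only performs the existence check. */
int wait_until_gone(pid_t pid, int max_polls)
{
    int i;

    for (i = 0; i < max_polls; i++) {
        if (kill(pid, 0) == -1 && errno == ESRCH)
            return 0;               /* no such process: it has exited */
        sleep(1);
    }

    /* one last check after the final sleep */
    if (kill(pid, 0) == -1 && errno == ESRCH)
        return 0;

    return -1;                      /* still there (or not permitted to ask) */
}
```

In the shutdown path this would follow a single `kill(pid, SIGTERM)`, for example `wait_until_gone(pid, 10)`, with the final _PROC_SHUTDOWN sent whether or not the wait timed out.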

“Jim Parnell” <jparnell@wgate.com> wrote in message
news:39D4AE06.B9D18411@wgate.com

David,

I’ve implemented your suggestion and it does reboot the system. Only one
thing bothers me, and that’s a message from Dev32 saying
it’s had a SIGSEGV at: 006d:00017d2

I think this was solved. Do you have the latest 4.25D?


No, here’s my sin :slight_smile:

sin ver
PROGRAM                  NAME          VERSION  DATE

/boot/sys/Proc32         Proc          4.25J    Sep 09 1999
/boot/sys/Proc32         Slib16        4.23G    Oct 04 1996
/boot/sys/Slib32         Slib32        4.24B    Aug 12 1997
/bin/Fsys                Fsys32        4.24T    Feb 26 1999
/bin/Fsys.eide           eide          4.24Q    Jun 28 1999
//1/bin/Dev32            Dev32         4.23G    Oct 04 1996
//1/bin/Dev32.ser        Dev32.ser     4.23I    Jun 27 1997
//1/bin/Dev32.ansi       Dev32.ansi    4.23H    Nov 21 1996
//1/bin/Dev32.pty        Dev32.pty     4.23G    Oct 04 1996
//1/bin/Pipe             Pipe          4.23A    Feb 26 1996
//1/bin/Mqueue           mqueue        4.24A    Aug 30 1999
//1/bin/Net              Net           4.25C    Aug 30 1999
//1/bin/Net.ether82557   Net.ether825  4.25G    Mar 09 2000


Will the Dev32 fault cause any real problems or is it just a cosmetic thing?

-Jim

Hum
“Jim Parnell” <jparnell@wgate.com> wrote in message
news:39D4BA6E.A218A2AC@wgate.com


Will the Dev32 fault cause any real problems or is it just a cosmetic
thing?

It will not cause a problem since you are rebooting. At worst you may
lose a character on the serial port…

Still, it should be fixed IMHO; QSSL people should jump in.


Jim Parnell <jparnell@wgate.com> wrote:

David,

I’ve implemented your suggestion and it does reboot the system. Only
one thing bothers me, and that’s a message from Dev32 saying
it’s had a SIGSEGV at: 006d:00017d2

I’ll bet it is because a Dev.driver is dying from the SIGPWR before
Dev32 gets to die, and Dev32 during cleanup tries to call a function
in that driver’s address space.

You’re killing everything off anyway, and rebooting the system – so
it shouldn’t be a problem.

-David

Roger that. David, Mario, et al, thanks for all the help. I’ll be ridin’
off into the sunset now…

-Jim
