Popen: Process hangs waiting for zombie

Hi,

I have written a program under QNX 4.24 that has the following function for
executing commands and sending the output to an ncurses window.
When executing some scripts, the shell spawned by popen never sends the
SIGCHLD signal to my process, which causes the fgetc to wait forever and the
spawned shell to become a zombie. Seemingly frivolous changes to the scripts,
such as adding comments, can cause this problem. The script is completing,
and it runs fine from the command line.

Can anyone shed some light on this for me?

Thanks for the help

John Love
Tolltex, Inc.

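#include <stdio.h>

int
exec_sh_cmd(char *cmd)
{
    FILE *fp;
    char buf[128];
    int c;              /* int, not char, so EOF can be told apart */
    int cmd_ret_val;

    /* run cmd through the shell and read its standard output */
    fp = popen(cmd, "r");

    if( fp == NULL )
        return( -1 );

    while( (c = fgetc(fp)) != EOF ){
        // send output to ncurses window
    }

    /* pclose() waits for the shell and returns its exit status */
    cmd_ret_val = pclose(fp);
    return( cmd_ret_val );
}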

Here is the output of the sin, sin in, and ps commands. The parent program is
vinstall; the child is the zombie.
sin
SID  PID PROGRAM                PRI STATE BLK   CODE    DATA
 --   -- Microkernel            --- ----- ---  10448       0
  0    1 sys/Proc32             30f READY ---   118k    471k
  0    2 sys/Slib32             10r RECV    0    53k    4096
  0    4 /bin/Fsys              10r RECV    0    77k  46247k
  0    5 /bin/Fsys.floppy       10o RECV    0    20k     36k
  0    8 idle                    0r READY ---      0     81k
  0   14 //1/bin/Dev            24f RECV    0    32k     90k
  0   17 //1/bin/Dev.con        20r RECV    0    40k     49k
  1   29 //1/bin/ksh            10o WAIT   -1    47k     36k
  2   30 //1/bin/ksh            10o WAIT   -1    47k     36k
  1   67 //1/ram/vinstall       10o REPLY   4   126k   2646k
  1   78 //1/ram/Net            23r RECV    0    32k     73k
  1   80 //1/ram/Net.ether905   20r RECV    0    45k     86k
  1   83 //1/ram/Socklet        22r RECV    0   114k    139k
  1  104 //1/ram/vdir           10o RECV    0    20k   3248k
  1  115 (zombie)               30f DEAD   67      0       0
  1  119 //1/ram/bin32/Fsys.ata 22o RECV    0    24k     20k
  2  132 //1/ram/sin            10o REPLY   1    45k     40k


sin in
Node  CPU      Machine  Speed  Memory     Ticksize  Display    Flags
   1  686/687  PCI      33848  200M/253M  10.0ms    VGA Color  -3P±---------8P

Heapp Heapf Heapl Heapn Hands Names Sessions Procs Timers Nodes Virtual
    0     0 23392     0    64   100       64   500    125     1 203M/268M

Boot from Flop at Dec 05 16:15 Locators:

ps
 PID PGRP SID PRI STATE BLK     SIZE COMMAND
   1    1   0 30f READY      262066K Proc32 -l 1
   2    2   0 10r RECV    0     108K Slib32
   4    4   0 10r RECV    0   90520K Fsys -r 10000
   5    4   0 10o RECV    0   45332K Fsys.floppy
   8    8   0  0r READY          80K (idle)
  14    7   0 24f RECV    0     252K Dev
  17    7   0 20r RECV    0     392K Dev.con -n 2
  29   29   1 10o WAIT   -1      36K /bin/sh
  30   30   2 10o WAIT   -1      44K /bin/sh
  67   29   1 10o REPLY   4    2584K vinstall
  78   29   1 23r RECV    0     192K Net
  80   29   1 20r RECV    0     172K Net.ether905
  83   83   1 22r RECV    0     136K Socklet lane
 104   29   1 10o RECV    0    3172K vdir -n /version/tmp
 115   29   1 30f DEAD   67       0K <defunct>
 119    4   1 22o RECV    0   45320K /ram/bin32/Fsys
 134   30   2 10o REPLY   1      24K ps

I have encountered the same problem using spawnl to create a new process: the
parent never receives a SIGCHLD (strange behaviour … I would be glad if
someone could explain it to me).

You should try using waitpid (instead of fgetc) to detect the death of the
child process created with popen.
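
popen() does not hand back the child's process ID, so waitpid() cannot target that child directly. Here is a minimal sketch of the waitpid() idea, assuming a POSIX fork()/execl() environment rather than QNX's spawnl(). run_and_wait() is a hypothetical helper, and capturing the command's output is left out:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs cmd through /bin/sh and returns its exit
 * status (0-255), or -1 on error or abnormal termination. */
int run_and_wait(const char *cmd)
{
    pid_t pid;
    int status;

    assert(cmd != NULL);

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* child: replace this process with the shell */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                 /* only reached if execl() failed */
    }

    /* parent: block until this particular child terminates, then reap it
     * so that it does not linger as a zombie */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent waits on a known PID, it does not depend on SIGCHLD being delivered at all.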



“Sebastien Cantos” <scantos@technodiva.com> wrote in message
news:at75mm$7sd$1@inn.qnx.com

I have encountered the same problem using spawnl to create a new process: the
parent never receives a SIGCHLD (strange behaviour … I would be glad if
someone could explain it to me).

As far as I understand, SIGCHLD is blocked by default. So in order to
receive the SIGCHLD signal you must unblock it within the parent's code like this:

#include <signal.h>

sigset_t ss;

sigemptyset(&ss);
sigaddset(&ss, SIGCHLD);
sigprocmask(SIG_UNBLOCK, &ss, NULL);

BTW, the same applies to the SIGPWR signal.
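
One way to test this on a given system is to read the current signal mask before changing it. This is a minimal sketch, assuming a POSIX-style sigprocmask(). The helper names sigchld_is_blocked() and unblock_sigchld() are made up for illustration, and whether SIGCHLD really starts out blocked on QNX 4 is not verified here:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if SIGCHLD is currently blocked,
 * 0 if it is not, and -1 on error. */
int sigchld_is_blocked(void)
{
    sigset_t cur;

    /* With a NULL new set, sigprocmask() only reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == -1)
        return -1;
    return sigismember(&cur, SIGCHLD);
}

/* Hypothetical helper: unblocks SIGCHLD; returns 0 on success, -1 on error. */
int unblock_sigchld(void)
{
    sigset_t ss;

    sigemptyset(&ss);
    sigaddset(&ss, SIGCHLD);
    if (sigprocmask(SIG_UNBLOCK, &ss, NULL) == -1)
        return -1;
    assert(sigchld_is_blocked() == 0);  /* sanity check on the new mask */
    return 0;
}
```

Calling sigchld_is_blocked() at program start shows whether the unblock step is needed at all on the target system.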

// wbr

I found the problem; details below.

It seems that the SIGCHLD signal would arrive while the code was updating
the ncurses screen.
The top of the while loop would then call fgetc(), which would wait for a
SIGCHLD that had already arrived.
So I installed a signal handler that sets a global child_died flag, which my
while loop checks before calling fgetc().

Best Regards
John Love

The original code…
int
exec_sh_cmd(char *cmd)
{
    FILE *fp;
    char buf[128];
    int c;
    int cmd_ret_val;

    fp = popen(cmd, "r");

    if( fp == NULL )
        return( -1 );

    while( (c = fgetc(fp)) != EOF ){
        // send output to ncurses window
    }

    cmd_ret_val = pclose(fp);
    return( cmd_ret_val );
}

The new code…

#include <signal.h>
#include <stdio.h>

// global; volatile sig_atomic_t so the handler can set it safely
volatile sig_atomic_t child_died;

void
sig_handler(int sig)
{
    if( sig == SIGCHLD ){
        child_died = 1;
    }
}

int
exec_sh_cmd(char *cmd)
{
    FILE *fp;
    char buf[128];
    int c;
    int cmd_ret_val;

    // open the pipe
    fp = popen(cmd, "r");
    if( fp == NULL )
        return( -1 );

    // clear the flag, then setup the signal handler
    child_died = 0;
    signal(SIGCHLD, sig_handler );

    // read from the pipe; && is evaluated left to right, so the flag
    // is tested before fgetc() is called
    while( !child_died && (c = fgetc(fp)) != EOF ){
        // send output to ncurses window
    }

    // reset the signal to default processing
    signal(SIGCHLD, SIG_DFL);

    // close the pipe
    cmd_ret_val = pclose(fp);
    return( cmd_ret_val );
}
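
A possible refinement of the flag approach, offered as a sketch rather than a drop-in replacement. Stopping as soon as child_died is set can discard output still sitting in the pipe, and a SIGCHLD that lands between the flag test and fgetc() is still missed. The version below installs the handler before popen(), waits with poll() using a timeout, and only gives up once the child has died and the pipe is idle. It assumes a POSIX system that provides poll() and sigaction(); this was not checked against QNX 4.24, where select() may be the nearer equivalent. run_and_capture() and its buffer parameters are hypothetical stand-ins for the ncurses output:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t child_died;

static void on_sigchld(int sig)
{
    (void)sig;
    child_died = 1;
}

/* Hypothetical helper: runs cmd via popen(), stores up to cap-1 bytes of
 * its output in out (NUL-terminated), and returns the pclose() status,
 * or -1 on a setup error. */
int run_and_capture(const char *cmd, char *out, size_t cap)
{
    struct sigaction sa, old;
    struct pollfd pfd;
    char buf[128];
    size_t used = 0;
    FILE *fp;
    int status;

    assert(cmd != NULL && out != NULL && cap > 0);

    /* Install the handler before the child exists so no SIGCHLD is lost. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    child_died = 0;
    if (sigaction(SIGCHLD, &sa, &old) == -1)
        return -1;

    fp = popen(cmd, "r");
    if (fp == NULL) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }

    pfd.fd = fileno(fp);
    pfd.events = POLLIN;

    for (;;) {
        ssize_t got, i;
        int n = poll(&pfd, 1, 100);     /* wake at least every 100 ms */

        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted, e.g. by SIGCHLD */
            break;
        }
        if (n == 0) {                   /* timeout: nothing to read */
            if (child_died)
                break;                  /* child gone and pipe idle */
            continue;
        }
        got = read(pfd.fd, buf, sizeof buf);
        if (got == -1 && errno == EINTR)
            continue;
        if (got <= 0)
            break;                      /* EOF or read error */
        for (i = 0; i < got; i++) {
            if (used + 1 < cap)
                out[used++] = buf[i];   /* stand-in for the ncurses update */
        }
    }
    out[used] = '\0';

    status = pclose(fp);                /* reaps the shell */
    sigaction(SIGCHLD, &old, NULL);     /* restore the previous handler */
    return status;
}
```

The 100 ms timeout only bounds how long a missed signal can delay the loop; it is a tuning choice, not a requirement.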


“Ian Zagorskih” <ianzag@megasignal.com> wrote in message
news:at782c$aij$1@inn.qnx.com
