PtSpawn problem

I am having a problem with PtSpawn in a PhAB app. The app manages a
bunch of servers and can start and stop them via buttons. It also
monitors the status of the servers and shows information about their
state (running, stopped, etc.).

The PtSpawn call works the first time, starting the executable and
returning the PID of the spawned process. But the second (or any
later) time PtSpawn is called, it returns -1 with an errno of 3, which
is ESRCH (error message “no such process”). This is not listed as a
valid error code for the spawn* functions. What does this errno mean
in this context?

Some things I tried, to no avail:

Tried using spawnp, with the same results.
Passed the full pathname.
Varied the defaults and the presence of the notification callback.

The above code works fine (i.e. multiple invocations) on a simple
“hello world” type program but fails on a non-trivial app… (Is there
maybe a stack problem here – why does it work the first time…)

I CAN start the server executable with a shell at any time and it runs
fine.

I then made a toy PhAB app with one button which spawns just one server,
using the code cut and pasted from the larger app, and it works
correctly!?!?!?! (expletive!)

Any guesses? – link order? shared vs. static…

Of course, this code works flawlessly in QNX4. What am I missing?

-Bill

BTW, the PtSpawn docs should mention that the error codes can be found
in the spawn documentation (assuming this is true); all they say is
that -1 is an error.

On Thu, 22 Feb 2001 21:40:50 GMT, derbyw@derbtronics.com (William M.
Derby Jr.) wrote:

> I am having a problem with PtSpawn in a PhAB app. The app manages a
> bunch of servers and can start and stop them via buttons. It also
> monitors the status of the servers and shows information about their
> state (running, stopped, etc.).
>
> The PtSpawn call works the first time, starting the executable and
> returning the PID of the spawned process. But the second (or any
> later) time PtSpawn is called, it returns -1 with an errno of 3, which
> is ESRCH (error message “no such process”). This is not listed as a
> valid error code for the spawn* functions. What does this errno mean
> in this context?

I have found the cause: having a connection open to a dead process.
If you do a ConnectAttach to the server process and it then dies in any
way, you cannot call PtSpawn successfully until you call ConnectDetach
on the old coid. This apparently is what the ESRCH return means, in a
roundabout way…

The ConnectDetach can be done in the death notification routine.
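
A rough sketch of that cleanup follows. It is not the code from the app: the `server_slot` struct, its fields, and `server_slot_on_death` are made-up names for illustration, and the stand-in `ConnectDetach` exists only so the sketch compiles off-target. On Neutrino the real `ConnectDetach()` from `<sys/neutrino.h>` is used, and the function is assumed to be called from whatever death notification callback was given to PtSpawn.

```c
#include <stddef.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* real ConnectDetach() on QNX Neutrino */
#else
/* Stand-in so the sketch compiles off-target; it only pretends to succeed. */
static int ConnectDetach(int coid) { (void)coid; return 0; }
#endif

/* Hypothetical per-server bookkeeping; the names are illustrative only. */
typedef struct {
    int pid;   /* PID returned by PtSpawn(), -1 when not running */
    int coid;  /* coid from ConnectAttach(), -1 when not attached */
} server_slot;

/*
 * Intended to be called from the death notification routine.
 * Detaches the stale coid (if any) and marks the slot as stopped.
 * Returns -1 if s is NULL or ConnectDetach() failed, 0 otherwise.
 */
int server_slot_on_death(server_slot *s)
{
    int rc = 0;

    if (s == NULL)
        return -1;

    if (s->coid != -1) {
        if (ConnectDetach(s->coid) == -1)
            rc = -1;
        s->coid = -1;   /* forget the old number even if the detach failed */
    }
    s->pid = -1;
    return rc;
}
```

Keeping the coid next to the PID means the death routine always knows which connection belongs to the process that just died, so the next PtSpawn starts with no stale connection left open.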

I was previously assuming that the coid became invalid with the death of
the process, like an old pid. Obviously I was wrong… This might be
mentioned as a warning somewhere in the docs (spawn and
ConnectAttach come to mind), so others avoid what was a fairly
lengthy debug process…

-Bill

William M. Derby Jr. <derbyw@derbtronics.com> wrote:

> I was previously assuming that the coid became invalid with the death of
> the process, like an old pid. Obviously I was wrong… This might be

Even after a process dies, its pid doesn’t disappear until
the parent gets the dead child’s exit status.

> mentioned as a warning somewhere in the docs (spawn and
> ConnectAttach come to mind), so others avoid what was a fairly
> lengthy debug process…

If the kernel just removed this kind of broken connection for you, it
would create a chance that the coid gets reused when you open a new
connection (possibly in an unrelated part of your code), without giving
you a chance to detect that the original one was broken. As a result,
any attempt to use the old (and now dead) connection would end up
sending to the new connection. Imagine debugging that!!!
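
The same hazard can be seen with ordinary POSIX file descriptors, where the lowest free number is always handed out next. The sketch below is only an analogy on plain fds, not a test of Neutrino coids, and `stale_number_is_reused` is a made-up helper name.

```c
#include <unistd.h>

/*
 * Returns 1 if a descriptor number released by close() is handed out
 * again for an unrelated object, 0 if not, -1 if setup failed.
 */
int stale_number_is_reused(void)
{
    int p[2];
    int stale, fresh, reused;

    if (pipe(p) == -1)
        return -1;

    stale = dup(p[0]);      /* plays the role of the old connection */
    if (stale == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(stale);           /* the number is released behind the caller's back */

    fresh = dup(p[1]);      /* unrelated new descriptor: the pipe's write end */
    if (fresh == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* Code still holding `stale` would now talk to `fresh` with no error. */
    reused = (fresh == stale);

    close(fresh);
    close(p[0]);
    close(p[1]);
    return reused;
}
```

In a single-threaded program the function is expected to return 1: the old number comes straight back, now referring to a different object.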


Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.

On 23 Feb 2001 15:40:34 GMT, Wojtek Lerch <wojtek@qnx.com> wrote:

> William M. Derby Jr. <derbyw@derbtronics.com> wrote:
> > I was previously assuming that the coid became invalid with the death of
> > the process, like an old pid. Obviously I was wrong… This might be
>
> Even after a process dies, its pid doesn’t disappear until
> the parent gets the dead child’s exit status.
>
> > mentioned as a warning somewhere in the docs (spawn and
> > ConnectAttach come to mind), so others avoid what was a fairly
> > lengthy debug process…
>
> If the kernel just removed this kind of broken connection for you, it
> would create a chance that the coid gets reused when you open a new
> connection (possibly in an unrelated part of your code), without giving
> you a chance to detect that the original one was broken. As a result,
> any attempt to use the old (and now dead) connection would end up
> sending to the new connection. Imagine debugging that!!!

OK, I can see that, but this line of reasoning doesn’t make it clear
why this should prevent one from spawning another task, and with an
undocumented error code…

I didn’t check: does ConnectAttach also fail with an error after a
“stale” coid is created?

It seems to me that the stale coid should remain unusable until the
ConnectDetach, but the proper place for returning an error would be on
an attempted use of the coid, i.e. MsgSend, etc. That would be a
place which is prepared to deal with a stale coid…

So why is it that spawn should fail in this situation?

-Bill

William M. Derby Jr. <derbyw@derbtronics.com> wrote:

> On 23 Feb 2001 15:40:34 GMT, Wojtek Lerch <wojtek@qnx.com> wrote:
> > If the kernel just removed this kind of broken connection for you, it
> > would create a chance that the coid gets reused when you open a new
> > connection (possibly in an unrelated part of your code), without giving
> > you a chance to detect that the original one was broken. As a result,
> > any attempt to use the old (and now dead) connection would end up
> > sending to the new connection. Imagine debugging that!!!
>
> OK, I can see that, but this line of reasoning doesn’t make it clear
> why this should prevent one from spawning another task, and with an
> undocumented error code…

I think spawn() is trying to dup() all the fds for the new process.
Does your connection have the _NTO_SIDE_CHANNEL flag?

> I didn’t check: does ConnectAttach also fail with an error after a
> “stale” coid is created?

ConnectAttach() doesn’t take a coid. Did you mean some other function?

> It seems to me that the stale coid should remain unusable until the
> ConnectDetach, but the proper place for returning an error would be on
> an attempted use of the coid, i.e. MsgSend, etc. That would be a
> place which is prepared to deal with a stale coid…

I believe they all do fail as well.



Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.