fork()

I am using QNX 4.23A. We are having a problem where all of a sudden the
fork() function (and it would seem, all exec and spawn calls) return error
11 which means “Resource unavailable”. It will continue to do this until I
reboot. I have run a hack of osinfo and logged to file periodically in the
background just before the system fails, but there doesn’t seem to be any
resources that are either increasing or close to the limit. Does anyone
know what resource is unavailable? Has anyone else seen this type of
behaviour? Unfortunately it takes at least three days to recreate the
problem and when it does happen, I can’t do anything because the system is
so crippled.


Thanks in advance,

Marc.

The normal osinfo shows everything I have found to matter in this
situation except for ldt/gdt availability. Our solution, which has
always worked for us, is to increase the values for the “-S”
option to Proc.

Richard


Marc Desjardine <MDesjardine@aptpower.com> wrote:

I am using QNX 4.23A. We are having a problem where all of a sudden the
fork() function (and it would seem, all exec and spawn calls) return error
11 which means “Resource unavailable”.

Maybe this helps:

We sometimes faced the following problem: the number of processes exceeded
the limit and no other process could be started. However, I do not know what
the errno code for the fork() call was in that case (nothing else could be
launched on the machine). This problem was usually caused by “zombie”
processes: processes that had finished executing but had not been removed
from the process table (or whatever the structure administered by Proc32 is
called).

The maximum number of processes can be increased with the Proc32 option -p
(this, however, requires making a new boot image).

To get rid of zombies, one can use the following construction:


#include <sys/qnx_glob.h>

qnx_spawn_options.flags |= _SPAWN_NOZOMBIE;
spawn( P_NOWAIT, …);





Mgr. Martin Gazak, MicroStep-MIS
Ilkovicova 3, 841 04 Bratislava, Slovakia
Tel: +421 2 60291 816
e-mail:matog@microstep-mis.sk

Try to check open file descriptors in your system.
Rami

Thanks,
In your system did the zombies appear in the process list (sin)?
Marc.


What are ldt/gdts used for?
Marc.

Marc Desjardine <MDesjardine@aptpower.com> wrote:

Thanks,
In your system did the zombies appear in the process list (sin)?
Marc.

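This is the result of a simple test program:

#include <stdlib.h>
#include <unistd.h>

int main( void )
{
    int i;

    /* Each child exits at once; the parent never waits for it. */
    for( i = 0; i < 5; i++ )
        if( fork() == 0 ) exit( -1 );
    for( ;; ) {}    /* parent spins forever, so the zombies persist */
}

The parent process creates 5 children, which become zombies but still
occupy entries in the process table until the parent exits.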

The output from “ps”:
2789 66 1 10o READY 12K parent
2792 66 1 30f DEAD 2789 0K <defunct>
2793 66 1 30f DEAD 2789 0K <defunct>
2794 66 1 30f DEAD 2789 0K <defunct>
2795 66 1 30f DEAD 2789 0K <defunct>
2796 66 1 30f DEAD 2789 0K <defunct>

The output from “sin”:
1 2789 //10/*/fork/parent 9o READY — 4096 12k
1 2792 (zombie) 30f DEAD 2789 0 0
1 2793 (zombie) 30f DEAD 2789 0 0
1 2794 (zombie) 30f DEAD 2789 0 0
1 2795 (zombie) 30f DEAD 2789 0 0
1 2796 (zombie) 30f DEAD 2789 0 0
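
As an aside on the test program above: the zombies stay in the process table because the parent never collects the exit status of its children. The following is a minimal POSIX sketch of the opposite case, where the parent calls waitpid() for every child it created. The name fork_and_reap and the MAX_CHILDREN limit are made up for illustration, and the code has not been tried on QNX 4.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 64    /* arbitrary limit for this sketch */

/* Fork `count` children that exit at once, then reap each one with
 * waitpid(). Returns the number of children that were reaped. */
int fork_and_reap(int count)
{
    pid_t pids[MAX_CHILDREN];
    int started = 0, reaped = 0, i;

    assert(count >= 0 && count <= MAX_CHILDREN);

    for (i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            /* errno is typically EAGAIN when a process or
             * resource limit has been reached. */
            perror("fork");
            break;
        }
        if (pid == 0)
            _exit(0);              /* child: finish immediately */
        pids[started++] = pid;     /* parent: remember the child */
    }

    for (i = 0; i < started; i++) {
        int status;
        pid_t r;

        /* Retry if a signal interrupts the wait. */
        do {
            r = waitpid(pids[i], &status, 0);
        } while (r == -1 && errno == EINTR);

        if (r == pids[i])
            reaped++;              /* this child is no longer a zombie */
    }
    return reaped;
}
```

Calling fork_and_reap(5) from main() should leave no DEAD entries behind, because each child's process table entry is released as soon as its status has been collected.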





Right, we do intentionally clean up after our zombies by calling waitpid()
for each child that is created. I am pretty sure that we don’t have a bunch
of zombies in the list but, as you experienced, I can’t verify this once we
get into this failed state. I can only say that 30 seconds prior to the
failure, there were no zombies accumulated in the system.

Marc.


I would run a high priority shell, and have a secondary computer connected
via serial cable to access that shell. In it you should be able to do a
sin fds, sin ver and sin gdt, which might lead to some more clues. Post what
you find out so we can all help out.

-Adam

Previously, Marc Desjardine wrote in qdn.public.qnx4:

What are ldt/gdts used for?

gdt - global descriptor table
ldt - local descriptor table

This is part of the 386/486/Pentium architecture. Each memory
access in the processor is via a two-part address. The first
part of the address comes from a selector register. This register
holds an index into a descriptor table. There is one bit in this
register that controls whether it is the global or local table that
is used. The descriptor tables hold “linear” addresses. Think of
these as real addresses for right now. The second part of an address
comes from an offset register. Examples are SI, DI, BX, BP, and SP.

There is no absolute need for two tables, but having both makes
implementing processes much easier. There is typically one global
descriptor table for the entire system. Each process will have its
own local descriptor table. Local descriptor tables make the
“fork()” call much easier to implement. Consider what
happens if you allocate memory and store its address in a
far pointer. The pointer holds both a selector and an
offset. If you fork the process, you now have two pointers
with identical selector values. If they both were in the
GDT, then both processes would be pointing to the same data.
The LDT solves this problem. Each process has its own LDT, so the
same selector value refers to a different descriptor in each process.

If this is not confusing enough, check out one of Intel’s programmer
manuals. There’s lots of additional confusing things like pages,
jump games and the like.



Mitchell Schoenbrun --------- maschoen@pobox.com
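
To make the selector description above a little more concrete, here is a small sketch that splits a 16-bit selector into its fields and builds one back up. The bit layout (a 13-bit table index, one table-indicator bit, and a 2-bit requested privilege level, which the post does not mention) is the standard x86 protected-mode layout; the struct and function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Fields of an x86 protected-mode segment selector. */
struct selector_fields {
    unsigned index;    /* bits 15..3: entry number in the table   */
    unsigned use_ldt;  /* bit 2:      0 = GDT, 1 = LDT            */
    unsigned rpl;      /* bits 1..0:  requested privilege level   */
};

/* Split a 16-bit selector value into its three fields. */
struct selector_fields decode_selector(unsigned short sel)
{
    struct selector_fields f;

    f.index   = (unsigned)sel >> 3;
    f.use_ldt = ((unsigned)sel >> 2) & 1u;
    f.rpl     = (unsigned)sel & 3u;
    return f;
}

/* Build a selector value from its fields. */
unsigned short encode_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192u && use_ldt < 2u && rpl < 4u);
    return (unsigned short)((index << 3) | (use_ldt << 2) | rpl);
}
```

For example, the selector 0x000F decodes to index 1 in the LDT with privilege level 3. After a fork(), parent and child can both hold 0x000F, yet each one looks up entry 1 in its own LDT.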

Thanks for the explanation, I appreciate it.
Marc.


I will try that and post the results as soon as I have some. Unfortunately
(or fortunately!!) this problem does not occur very often at all. Under
accelerated testing I still have to wait up to 3 days or so for the problem
to show up.
Marc.


Maybe you should try osinfo. This utility will monitor the system resources
and can potentially point out which resource is being exhausted. A few other
things to monitor are virtual memory consumption (via sin in) and the number
of fds per process (sin fds).

Logging these periodically to see which (if any) of them is increasing might
help.

- Richard
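
One possible way to do the periodic logging suggested above is sketched below. It is only an illustration: the function names, the buffer size and the choice of commands per pass are assumptions, and only "sin in" and "sin fds" come from the post. Note that system() itself has to create a process, so a logger like this will stop producing output once the machine reaches the state where fork() fails; it is useful for the hours leading up to the failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_LINE 256    /* arbitrary command-line limit for this sketch */

/* Run `cmd` through the shell and append its output to `logfile`.
 * Returns 0 if the command ran and exited with status 0, else -1. */
int log_command(const char *cmd, const char *logfile)
{
    static const char overhead[] = " >> \"\" 2>&1";
    char line[MAX_LINE];

    assert(cmd != NULL && logfile != NULL);

    /* sizeof overhead includes room for the terminating NUL. */
    if (strlen(cmd) + strlen(logfile) + sizeof overhead > sizeof line)
        return -1;

    sprintf(line, "%s >> \"%s\" 2>&1", cmd, logfile);
    return system(line) == 0 ? 0 : -1;
}

/* Take one snapshot every `interval_seconds`, forever. */
void log_forever(const char *logfile, unsigned interval_seconds)
{
    for (;;) {
        log_command("date", logfile);      /* timestamp for the snapshot */
        log_command("sin in", logfile);    /* memory usage, per the post */
        log_command("sin fds", logfile);   /* fds per process            */
        sleep(interval_seconds);
    }
}
```

Comparing successive snapshots in the log file should show which figure, if any, keeps growing.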

Core OS Product Group <os@qnx.com> wrote:

I would run a high priority shell, and have a secondary computer connected
via serial cable to access that shell. In it you should be able to do a
sin fds, sin ver and sin gdt, which might lead to some more clues. Post what
you find out so we can all help out.

Actually, if you can’t fork you probably can’t just run sin anything.

BUT… what you can do if you get into this situation as a desperate
last-ditch attempt to grab some info… try:

exec sin fds

This will have the shell do an exec(), rather than trying to create
a new process, and may allow you to sneak in one more command… or
maybe one more command per shell you have available to cannibalise in
this way. Of course, you only get one per shell.

-David

QNX Training Services
I do not answer technical questions by email.

I am having problems recreating the situation. I just can’t seem to get it
to happen. I am wondering what I should be looking for? Is there a count
that may be increasing or fragmentation that occurs? I think I understand
what ldt/gdts are but don’t know what I am seeing or looking for in the
output of sin gdt.

Marc.

I don’t think I’m alone in not remembering what “situation” you are talking
about. Why not be a little more specific?

Mitchell Schoenbrun --------- maschoen@pobox.com

I am sorry, it looks like the head of the thread I was posting to has
disappeared. Here are the original contents:

Posted 2001-12-21 by me:
I am using QNX 4.23A. We are having a problem where all of a sudden the
fork() function (and it would seem, all exec and spawn calls) return error
11 which means “Resource unavailable”. It will continue to do this until I
reboot. I have run a hack of osinfo and logged to file periodically in the
background just before the system fails, but there doesn’t seem to be any
resources that are either increasing or close to the limit. Does anyone
know what resource is unavailable? Has anyone else seen this type of
behaviour? Unfortunately it takes at least three days to recreate the
problem and when it does happen, I can’t do anything because the system is
so crippled.

One response was:
“Richard R. Kramer” <rrkramer@kramer-smilko.com> wrote in message
news:3C23CE7E.D1008400@kramer-smilko.com

The normal osinfo shows everything I have found to matter in this
situation except for ldt/gdt availability. Our solution, which has
always worked for us, is to increase the values for the “-S”
option to Proc.

Richard

And now I am asking what I should be looking for when viewing the output of
sin gdt. Sorry for the confusion.


Marc.

Well, “sin gdt” will only tell you which selectors in
the GDT are in use and what they are set to. Run “sin gdt”
on a newly rebooted machine and you will see that a large
group of them are all zeros, obviously not in use. If
they all show up in use when this event happens, then
you will know that you ran out of room in the GDT.


Mitchell Schoenbrun --------- maschoen@pobox.com