fork(), mmap_device_memory() and shm_ctl()

Igor_Kovalenko2 · May 20, 2002, 8:20pm

“Kris Warkentin” <kewarken@qnx.com> wrote in message
news:ac0td0$1m2$1@nntp.qnx.com…
[…]

Exactly. There are any number of different times when you might take a hit
after the fork() with COW, depending on a lot of different variables.
Without COW, you KNOW that you will take the hit at a certain time in your
execution and that’s it. Igor is looking for best-case or average-case
performance. We have to look at the worst case for true real time.

There is a large number of Unix-originated daemons which do fork() to handle
incoming requests. Apache would be the most notorious example. Of course in
an ideal world QNX would only run software written for QNX, but alas our
world sucks.

This issue is exactly the same as why we don’t have lazy mapping of shared
libraries. Linux et al. will only run the linker when a function is first
called. That means you never really know when the linker will be taking
cycles from your app. We run the linker at startup to make sure everything
is mapped in before we start execution. You take a hit on program startup,
but it’s a known hit at a known time.

I know. However, I am curious why you have a lazy-mapped stack then. Not
afraid of hits when someone allocates a 4 MB array on the stack, are you? And
certainly not afraid of hits on those pesky recursive functions…

– igor
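
On systems that do bind lazily, a program can at least ask for eager binding
of the libraries it loads itself. A minimal sketch, assuming a POSIX dlopen();
the helper name load_bound_now() is made up for illustration:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load a shared object and resolve all of its symbols immediately
 * (RTLD_NOW) instead of on first call (RTLD_LAZY).
 * Returns the handle, or NULL after printing the loader's error. */
void *load_bound_now(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW);

    if (handle == NULL)
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return handle;
}
```

Passing NULL as the path returns a handle for the main program, which is
handy for a quick check. Libraries named at link time are bound by the
dynamic linker according to its own settings, not by this call.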

Kris_Warkentin1 · May 21, 2002, 2:32pm

“Igor Kovalenko” <Igor.Kovalenko@motorola.com> wrote in message
news:acbl2d$br8$1@inn.qnx.com…

There is a large number of Unix-originated daemons which do fork() to handle
incoming requests. Apache would be the most notorious example. Of course in
an ideal world QNX would only run software written for QNX, but alas our
world sucks.

That is certainly a case for having COW as a per-process option. It’s also
a case for using server OSes for server applications. Seriously…if
your performance is bogging down due to forks from a high number of incoming
requests to something like Apache, aren’t you probably using the wrong tool
for the job? BTW, I believe that they may have finally threaded the newest
version of Apache.
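
To make the timing argument concrete, here is a rough sketch of where the
cost moves, assuming a POSIX system. The helper fork_and_touch() is
hypothetical. With copy-on-write, each first write in the child is where a
page would be copied; without it, the whole copy happens inside fork() itself.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, then let the child write one byte to every page of a buffer it
 * inherited from the parent. Returns the child's exit status (0 on
 * success) or -1 if something failed. */
int fork_and_touch(size_t bytes)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = (ps > 0) ? (size_t)ps : 4096;  /* fall back to 4 KB */
    char *buf;
    pid_t pid;
    int status = 0;

    if (bytes == 0)
        return 0;
    buf = malloc(bytes);
    if (buf == NULL)
        return -1;
    memset(buf, 1, bytes);          /* parent dirties the pages first */

    pid = fork();                   /* no COW: the copy cost is paid here */
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        volatile char *p = buf;     /* volatile so the writes are kept */
        size_t off;

        for (off = 0; off < bytes; off += page)
            p[off] = 2;             /* COW: the copy cost is paid here */
        _exit(0);
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        free(buf);
        return -1;
    }
    free(buf);
    return WEXITSTATUS(status);
}
```

Timing the fork() call and the child's loop separately on a given system
would show which side of the trade-off that system has picked.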

I know. However, I am curious why you have a lazy-mapped stack then. Not
afraid of hits when someone allocates a 4 MB array on the stack, are you? And
certainly not afraid of hits on those pesky recursive functions…

Not sure what you mean by this. I believe that a certain amount of stack
space is allocated at execution startup (see “ldrel -s” for how to change
the size) and the program just uses that. Then, if you run out during
execution, you die, which puts the burden on the implementer to decide on an
appropriate stack size.

– igor

Igor_Kovalenko2 · May 22, 2002, 12:09am

“Kris Warkentin” <kewarken@qnx.com> wrote in message
news:acdldd$d3q$1@nntp.qnx.com…

“Igor Kovalenko” <Igor.Kovalenko@motorola.com> wrote in message
news:acbl2d$br8$1@inn.qnx.com…

There is a large number of Unix-originated daemons which do fork() to handle
incoming requests. Apache would be the most notorious example. Of course in
an ideal world QNX would only run software written for QNX, but alas our
world sucks.

That is certainly a case for having COW as a per-process option. It’s also
a case for using server OSes for server applications. Seriously…if
your performance is bogging down due to forks from a high number of incoming
requests to something like Apache, aren’t you probably using the wrong tool
for the job? BTW, I believe that they may have finally threaded the newest
version of Apache.

Yeah Kris, I keep telling the same thing to all the
wannabe-running-webserver-on-QNX-ers but they keep coming and trying to do
that no matter what… Which leaves bad aftertaste and creates unjust
reputation for QNX

I know. However, I am curious why you have a lazy-mapped stack then. Not
afraid of hits when someone allocates a 4 MB array on the stack, are you? And
certainly not afraid of hits on those pesky recursive functions…

Not sure what you mean by this. I believe that a certain amount of stack
space is allocated at execution startup (see “ldrel -s” for how to change
the size) and the program just uses that. Then, if you run out during
execution, you die, which puts the burden on the implementer to decide on an
appropriate stack size.

That was QNX4. In QNX6 stack pages are always mapped lazily and faulted in as
your stack grows (thus you have a ‘real’ and a ‘virtual’ stack).

– igor
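
The upper bound that a lazily mapped stack may grow to (the ‘virtual’ size)
can usually be queried at run time. A small sketch, assuming the platform
reports it through POSIX getrlimit() with RLIMIT_STACK; stack_soft_limit()
is a made-up helper name:

```c
#include <sys/resource.h>

/* Store the soft stack limit in *bytes.
 * *unlimited is set to 1 (and *bytes to 0) when no limit is configured.
 * Returns 0 on success, -1 if the limit could not be read. */
int stack_soft_limit(unsigned long long *bytes, int *unlimited)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    *unlimited = (rl.rlim_cur == RLIM_INFINITY);
    *bytes = *unlimited ? 0 : (unsigned long long)rl.rlim_cur;
    return 0;
}
```

This only reports how far the stack is allowed to grow. How much of it is
actually backed by memory at a given moment is a separate question.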

Kris_Warkentin1 · May 22, 2002, 1:46pm

“Igor Kovalenko” <Igor.Kovalenko@motorola.com> wrote in message
news:acemse$kqk$1@inn.qnx.com…

Not sure what you mean by this. I believe that a certain amount of stack
space is allocated at execution startup (see “ldrel -s” for how to change
the size) and the program just uses that. Then, if you run out during
execution, you die, which puts the burden on the implementer to decide on an
appropriate stack size.

That was QNX4. In QNX6 stack pages are always mapped lazily and faulted in as
your stack grows (thus you have a ‘real’ and a ‘virtual’ stack).

Ah. Now I see what you mean. So if you wanted to make it happen at a
certain time, you’d actually have to write to every page you think you’ll
need to make sure your stack is all mapped in. Annoying. Maybe something
like:

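/* Touch one byte per 4 KB page below the current frame so the lazily
 * mapped stack pages get faulted in now. Assumes 4 KB pages and a
 * stack that grows downward. */
void
force_stack_mapping(unsigned kbytes)
{
    unsigned pages = kbytes / 4, i;
    volatile char *foo = (volatile char *)&foo; /* volatile keeps the writes */

    for (i = 0; i < pages; i++) {
        foo -= 4096;    /* step down one page */
        *foo = 0;       /* the write forces the page to be mapped */
    }
}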

Does that look right? It’s still early and I haven’t had a coffee…

Kris

– igor

Tony_Lee · May 22, 2002, 6:26pm

“Kris Warkentin” <kewarken@qnx.com> wrote in message
news:acg72h$b8v$1@nntp.qnx.com…

“Igor Kovalenko” <Igor.Kovalenko@motorola.com> wrote in message
news:acemse$kqk$1@inn.qnx.com…

Not sure what you mean by this. I believe that a certain amount of stack
space is allocated at execution startup (see “ldrel -s” for how to change
the size) and the program just uses that. Then, if you run out during
execution, you die, which puts the burden on the implementer to decide on an
appropriate stack size.

That was QNX4. In QNX6 stack pages are always mapped lazily and faulted in as
your stack grows (thus you have a ‘real’ and a ‘virtual’ stack).

Ah. Now I see what you mean. So if you wanted to make it happen at a
certain time, you’d actually have to write to every page you think you’ll
need to make sure your stack is all mapped in. Annoying. Maybe something
like:

QNX should add an API to let developers specify the real-time behavior of
an application instead of letting us fight about it on the newsgroup.

For example, if an app is “realtime”, it should always copy all the
modifiable pages on fork(). Otherwise, it should be copy-on-write.
Same for the stack-growing method.

I still can’t get an answer on why my fork does not work.

–
-Tony Lee

Andrew_Thomas1 · May 27, 2002, 4:59pm

Kris Warkentin wrote:

void
force_stack_mapping(unsigned kbytes)
{
    unsigned pages = kbytes / 4, i;
    char *foo = (char *)&foo;

    for (i = 0; i < pages; i++) {
        foo -= 4096;
        *foo = 0;
    }
}

Does that look right? It’s still early and I haven’t had a coffee…

That’s some sneaky stuff going on there. Wouldn’t it be more
portable to do something like:

#include <alloca.h>

void
force_stack_mapping(unsigned kbytes)
{
    unsigned bytes = kbytes * 1024;
    char *foo = alloca(bytes);

    /* touch the first and last byte of the block */
    foo[0] = foo[bytes-1] = 0;
}

Cheers,
Andrew

Kris_Warkentin1 · May 27, 2002, 5:30pm

Probably…but then again, what if the difference between foo[0] and
foo[bytes-1] is greater than one page? I would think that this would only
force mapping of the first and last pages, wouldn’t it? Or would it bring
everything in between in too? Hmm… Either way, I think your method is
better because then you have some error checking with the alloca function
to keep a user error from writing all the way down into your heap or
whatever.

Kris
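
On the question above: writing only foo[0] and foo[bytes-1] touches at most
two pages, and with lazily mapped pages nothing in between would normally be
expected to get faulted in (unless the compiler inserts its own stack
probes). Here is a sketch that writes one byte per page instead, assuming a
downward-growing stack and that pages stay mapped after the function returns
(not verified on QNX6). prefault_stack() is a hypothetical name, and the page
size comes from sysconf() rather than a hard-coded 4096. Note that alloca()
normally gives no error indication if the request is too large, so the caller
still has to pick a sane size.

```c
#include <alloca.h>
#include <stddef.h>
#include <unistd.h>

/* Reserve 'bytes' of stack with alloca() and write one byte per page,
 * walking from the highest address down so the stack grows in its
 * natural direction. Returns the number of writes performed. */
size_t prefault_stack(size_t bytes)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = (ps > 0) ? (size_t)ps : 4096;  /* fall back to 4 KB */
    volatile char *buf;                          /* volatile keeps the writes */
    size_t off, touched;

    if (bytes == 0)
        return 0;

    buf = alloca(bytes);
    off = bytes - 1;            /* start at the highest byte of the block */
    buf[off] = 0;
    touched = 1;

    while (off > 0) {
        off = (off >= page) ? off - page : 0;    /* step down one page */
        buf[off] = 0;
        touched++;
    }
    return touched;
}
```

Calling prefault_stack(256 * 1024) once at startup, before the time-critical
part begins, would put the page faults at a known point, which is the
behavior asked for earlier in the thread.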
