Executable image file times

Thomas_Haupt · March 12, 2003, 1:01pm

Hi,

I’m having an interesting problem here…

In a small system monitoring program, I want to detect (for running processes) whether
the proc’s executable image has changed since the process was started.

To do so, I

1. call qnx_psinfo(),
2. call stat() on the returned un.proc.name[], and
3. compare psdata.un.proc.file_time to st.st_mtime.

This works for almost all processes. That it doesn’t work for Fsys et al. is OK,
but for one of our own programs, I see a difference of exactly one second.

Can you explain why this happens?

Below you will find a list of the processes currently running on my machine, together
with their psinfo() and stat() times. The processes in question are marked ‘>>>’.

Thanks in advance.

PID psinfo() time stat() mtime Executable
4 ps: 0 st: 872093345 /bin/Fsys
5 ps: 0 st: 869144580 /bin/Fsys.aha7scsi
16 ps: 844452529 st: 844452529 //7/bin/Dev32
19 ps: 848595538 st: 848595538 //7/bin/Dev32.ansi
22 ps: 867421112 st: 867421112 //7/bin/Dev32.ser
23 ps: 844452392 st: 844452392 //7/bin/Dev32.pty
24 ps: 872028262 st: 872028262 //7/bin/Fsys.floppy
26 ps: 825352018 st: 825352018 //7/bin/Pipe
31 ps: 870384125 st: 870384125 //7/bin/Net
33 ps: 950805915 st: 950805915 //7/bin/Net.ether905
62 ps: 850934034 st: 850934034 //7/bin/cron
74 ps: 872270841 st: 872270841 //7/bin/Mouse
99 ps: 866052038 st: 866052038 //7/bin/tinit
102 ps: 867352685 st: 867352685 //7/usr/bin/lpsrvr
106 ps: 941036023 st: 941036023 //7/qnx4/photon/bin/phfontpfr
420 ps: 847402003 st: 847402003 //7/bin/ksh
552 ps: 847402003 st: 847402003 //7/bin/ksh
1811 ps: 936381493 st: 936381493 //7/qnx4/photon/bin/Photon
1820 ps: 940448272 st: 940448272 //7/qnx4/graphics/drivers/Null.ms
1822 ps: 937867519 st: 937867519 //7/qnx4/graphics/drivers/Pg.rage
1828 ps: 872100014 st: 872100014 //7/bin/Input
1832 ps: 937325268 st: 937325268 //7/qnx4/photon/bin/pwm
1837 ps: 872100014 st: 872100014 //7/bin/Input
1848 ps: 937324748 st: 937324748 //7/qnx4/photon/bin/pdm
1870 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
1873 ps: 847402003 st: 847402003 //7/bin/ksh
1941 ps: 937324687 st: 937324687 //7/qnx4/photon/bin/helpviewer
5369 ps:1046915285 st:1046915285 //7/usr/fc6/bin/fccore
5378 ps: 847402003 st: 847402003 //7/bin/ksh
7629 ps: 972410351 st: 972410351 //7/usr/tcprt/5.0/usr/ucb/portmap
8567 ps: 937490183 st: 937490183 //7/usr/tcprt/5.0/usr/ucb/Dns
8619 ps: 980529729 st: 980529729 //7/usr/tcprt/5.0/usr/ucb/Tcpip
8635 ps: 945104289 st: 945104289 //7/usr/tcprt/5.0/usr/ucb/routed
8640 ps: 924271653 st: 924271653 //7/usr/tcprt/5.0/usr/ucb/inetd
8646 ps: 950886627 st: 950886627 //7/usr/tcprt/5.0/usr/ucb/lpd
8658 ps: 978533108 st: 978533108 //7/usr/tcprt/5.0/usr/ucb/Nfsd
8662 ps: 979668246 st: 979668246 //7/usr/tcprt/5.0/usr/ucb/NFSfsys
8669 ps: 979668246 st: 979668246 //7/usr/tcprt/5.0/usr/ucb/NFSfsys
8677 ps: 979668246 st: 979668246 //7/usr/tcprt/5.0/usr/ucb/NFSfsys
8684 ps: 910418395 st: 910418395 //7/qnx4/voyager/bin/voyager
8686 ps: 910419644 st: 910419644 //7/qnx4/voyager/bin/voyager.server
8991 ps: 910418569 st: 910418569 //7/qnx4/voyager/bin/vmail
10181 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
10185 ps: 847402003 st: 847402003 //7/bin/ksh
12048 ps:1047470360 st:1047470360 //7/usr/fc6/bin/fcterm
12508 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
12511 ps: 847402003 st: 847402003 //7/bin/ksh
12535 ps:1047458985 st:1047458985 //7/usr/fc6/bin/fcproc
12550 ps: 847402003 st: 847402003 //7/bin/ksh
13048 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
14100 ps:1047470360 st:1047470360 //7/usr/fc6/bin/fcterm

>>> 14105 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
>>> 14107 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
14134 ps:1047035976 st:1047035976 //7/usr/fc6/bin/fcshlib

14656 ps: 965210252 st: 965210252 //7/usr/lib/vedit/vedit
14819 ps:1019042595 st:1019042595 //7/usr/bin/photon/Atom
16255 ps:1047473333 st:1047473333 //7/home/frk/stuff/qnx/psinfo/tst
22753 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
25316 ps: 847402003 st: 847402003 //7/bin/ksh
25386 ps: 965210252 st: 965210252 //7/usr/lib/vedit/vedit
29435 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
30461 ps: 937325131 st: 937325131 //7/qnx4/photon/bin/pterm
30464 ps: 847402003 st: 847402003 //7/bin/ksh


Here is the program that produced the output above:

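#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/psinfo.h>
#include <sys/kernel.h>
#include <sys/stat.h>

int main( void )
{
    struct _psinfo psdata;
    struct stat st;
    pid_t pid;

    /* Walk every possible process ID. */
    for ( pid = 1; pid < 32768; pid++ )
    {
        /* Stop if pid_t wraps around to a non-positive value. */
        if ( pid <= 0 )
            break;

        /* Skip entries flagged _PPF_VID or _PPF_MID, then stat() the
           image file named in the psinfo data. */
        if ( ( qnx_psinfo( PROC_PID, pid, &psdata, 0, 0 ) != -1 )
          && ( psdata.pid == pid )
          && ! ( psdata.flags & _PPF_VID )
          && ! ( psdata.flags & _PPF_MID )
          && ! stat( psdata.un.proc.name, &st ) )
            /* Cast the time values so they match the %ld conversions. */
            printf( "%5d ps:%10ld st:%10ld %s\n",
                    (int) psdata.pid, (long) psdata.un.proc.file_time,
                    (long) st.st_mtime, psdata.un.proc.name );
    }

    return EXIT_SUCCESS;
}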

–
T. Haupt

BitCtrl Systems GmbH
eMail: frk bitctrl de

John_Garvey1 · March 12, 2003, 10:21pm

Thomas Haupt <frk@bitctrl.de> wrote:

This work for almost all processes. That it doesn’t work for Fsys
et.al. is ok,
4 ps: 0 st: 872093345 /bin/Fsys

Processes started from the boot image (.boot) predate the setting
of the system clock and so will have a time of 0 (as you can see).

but for one of our own programs, I see a difference of exactly
one second.

Proc itself uses the st_mtime entry of the executable to determine
if the code is the same so that it can share the code entry (if
you run multiple copies of the same executable it keeps a link
count on the code segments). When it loads a new one it does:

cp->file_time = stat->st_mtime;

to set this up. But there are situations where you cannot share
code (debugging is the obvious example, as breakpoints you plant
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:

if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */

14105 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
14107 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb

So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).
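
As a rough illustration of the two snippets quoted above, here is a minimal, portable sketch of the stamping step. It is not Proc's source: the struct, the function name and the DEMO_ flag values are hypothetical stand-ins, since the real _SPAWN_HOLD and _SPAWN_DEBUG values live in the QNX 4 headers.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical flag bits standing in for _SPAWN_HOLD and _SPAWN_DEBUG. */
#define DEMO_SPAWN_HOLD  0x01u
#define DEMO_SPAWN_DEBUG 0x02u

/* Hypothetical stand-in for the in-memory code entry. */
struct demo_code_entry {
    time_t file_time;
};

/* Record the executable's mtime, then take one second off when the
 * process is spawned held or for debugging, as described in the post. */
void demo_stamp_code_entry(struct demo_code_entry *cp, time_t st_mtime,
                           unsigned flags)
{
    assert(cp != NULL);

    cp->file_time = st_mtime;
    if (flags & (DEMO_SPAWN_HOLD | DEMO_SPAWN_DEBUG))
        --cp->file_time; /* so it won't be shared */
}
```

Stamping an entry with the rtdb mtime from the listing (1046915510) and DEMO_SPAWN_HOLD leaves file_time at 1046915509, which is the one-second difference reported in the original post.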

Bill_Caroselli1 · March 12, 2003, 10:30pm

John Garvey <jgarvey@qnx.com> wrote:

to set this up. But there are situations where you cannot share
code (debugging is the obvious example, as breakpoints you plant
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:

if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */

14105 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
14107 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb

So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).

Wait a minute!

I haven’t actually tried loading a process in the held state yet in
QNX6, but one of the reasons for doing this is so that subsequent
attempts to load that program will be faster because the executable
is already in RAM.

So, you’re saying that loading a process in the held state does nothing
to help in that situation?

Thomas_Haupt · March 13, 2003, 10:07am

Previously, you (John Garvey) wrote:

Thomas Haupt <frk@bitctrl.de> wrote:

This work for almost all processes. That it doesn’t work for Fsys
et.al. is ok,
4 ps: 0 st: 872093345 /bin/Fsys

Processes started from the boot image (.boot) predate the setting
of the system clock and so will have a time of 0 (as you can see).

but for one of our own programs, I see a difference of exactly
one second.

Proc itself uses the st_mtime entry of the executable to determine
if the code is the same so that it can share the code entry (if
you run multiple copies of the same executable it keeps a link
count on the code segments). When it loads a new one it does:

cp->file_time = stat->st_mtime;

to set this up. But there are situations where you cannot share
code (debugging is the obvious example, as breakpoints you plant
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:

if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */

14105 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
14107 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb

So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).

You are absolutely right, that’s exactly what I’m doing…
Maybe this isn’t the really ‘clean’ way of transporting the ‘don’t
share me’ information, but it definitely is quite clever - and to be
honest, it is just the kind of way I like to choose myself now and then…

Thanks a lot!

–
T. Haupt

BitCtrl Systems GmbH
Weissenfelser Str. 67
04229 Leipzig

Phone: +49 (0)341 49067 0
Phax: +49 (0)341 49067 15
eMail: frk bitctrl de

Thomas_Haupt · March 13, 2003, 10:12am

Previously, you (Bill Caroselli) wrote:

John Garvey <jgarvey@qnx.com> wrote:

to set this up. But there are situations where you cannot share
code (debugging is the obvious example, as breakpoints you plant
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:

if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */

14105 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb
14107 ps:1046915509 st:1046915510 //7/usr/fc6/bin/rtdb

So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).

Wait a minute!

I haven’t actually tried loading a process in the held state yet in
QNX6, but one of the reasons for doing this is so that subsequent
attempts to load that program will be faster because the executable
is already in RAM.

So, your saying that loading a process in the held state does nothing
to help in that situation?

I believe so, yes. The reason: if one of the processes is being debugged,
its code must not be shared with other processes, whether debugged or
not. This is because a breakpoint is set by writing a ‘break’
instruction directly into the code segment at the proper address. If the
code were shared with another process running the same program, that process
would stop as well when reaching that address - which is not what we want.

So for debugged processes, we can never help to load faster because we’ll
always need a new copy.
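
A toy sketch of that argument, with code segments modelled as plain byte arrays. The helper names and DEMO_ constants are hypothetical, and 0xCC (the x86 INT3 opcode) is used as the ‘break’ instruction purely for illustration; the thread does not say which instruction the debugger writes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BREAK_OPCODE 0xCCu /* x86 INT3, assumed here for illustration */
#define DEMO_CODE_SIZE    8u    /* size of the toy code segment */

/* Plant a breakpoint by overwriting one byte of the code segment.
 * Returns the original byte so a debugger could restore it later. */
unsigned char demo_plant_breakpoint(unsigned char *code, size_t offset)
{
    unsigned char saved;

    assert(code != NULL);
    assert(offset < DEMO_CODE_SIZE);

    saved = code[offset];
    code[offset] = (unsigned char)DEMO_BREAK_OPCODE;
    return saved;
}

/* Give the debugged process its own copy of the code segment, so that
 * planted breakpoints never show up in the copy other processes run. */
void demo_private_copy(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, DEMO_CODE_SIZE);
}
```

If the breakpoint were planted in the shared array instead of the private copy, every process executing that array would hit it, which is why a debugged process needs its own copy.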

Regards,

Thomas_Haupt · March 13, 2003, 11:04am

Previously, you (John Garvey) wrote:

Proc itself uses the st_mtime entry of the executable to determine
if the code is the same so that it can share the code entry (if
you run multiple copies of the same executable it keeps a link
count on the code segments).
snip

But there are situations where you cannot share
code (debugging is the obvious example, as breakpoints you plant
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:

if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */
snip

Wonderful mechanism, but what happens if a program is started twice
or more with _SPAWN_HOLD or _SPAWN_DEBUG set? Won’t the same conflict
we just avoided arise again?

As in:
$ on -h myprog &
$ slay -s SIGCONT myprog
$ wd myprog

We’d have two instances of ‘myprog’, both with a decremented file_time.
Won’t they share their code, having the same time stamp? Or will both
get their own copy, since neither of them is ‘up to date’? Or what?

Thanks,

John_Garvey1 · March 13, 2003, 1:22pm

Thomas Haupt <frk@bitctrl.de> wrote:

in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:
if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */
Wonderful mechanism, but what happens if a program is started twice
or more with _SPAWN_HOLD or _SPAWN_DEBUG set? Won’t the same conflict
we just avoided arise again?

Notice I used the word “cheesy”?! Actually, I think it works out,
because part of the criteria Proc uses for sharing (code snippet not
shown) is that the program is not being held/debugged (flags). So
even if both images have their file time decremented to the same value,
the fact that the second one comes in with _SPAWN_HOLD|_SPAWN_DEBUG means
that it will not be shared in spite of matching/decremented times.

We’d have two instances of ‘myprog’, both with a decreented file_time.
Won’t they share their code, having the same time stamp? Or will both
get their own copy, since neither of them is ‘up to date’? Or what?

There are a number of criteria for sharing: name, time, flags. As
mentioned above, I think this situation would not be shared (flags).

Back to your original question, have you tried allowing for this
scenario by looking at the psinfo flags and compensating 1 second?!
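
A minimal sketch of that compensation, under stated assumptions: the struct and function names are hypothetical, and the caller is assumed to have already worked out from the psinfo flags whether the process was spawned held or is being debugged (the thread does not name the flag bits to test, so none are guessed here).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical pair of time stamps for one process. */
struct demo_times {
    time_t file_time; /* psdata.un.proc.file_time from qnx_psinfo() */
    time_t st_mtime;  /* st.st_mtime from stat() on the image file  */
};

/* Returns  1 if the image on disk no longer matches the loaded one,
 *          0 if it still matches,
 *         -1 if no comparison is possible (file_time of 0, as seen
 *            for processes started from the boot image).
 * held_or_debugged is non-zero when the caller has determined that the
 * process was spawned held or for debugging. */
int demo_image_changed(const struct demo_times *t, int held_or_debugged)
{
    time_t loaded;

    assert(t != NULL);

    if (t->file_time == 0)
        return -1;

    loaded = t->file_time;
    if (held_or_debugged)
        ++loaded; /* undo the one-second decrement applied at load time */

    return loaded != t->st_mtime;
}
```

With the rtdb values from the listing (file_time 1046915509, st_mtime 1046915510), the function reports a change when held_or_debugged is 0 and no change when it is 1.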

Thomas_Haupt · March 13, 2003, 3:15pm

Previously, you (John Garvey) wrote:

Thomas Haupt <frk@bitctrl.de> wrote:
in one process shouldn’t affect the other). So the cheesy way
Proc does this is to then modify that mtime with:
if (load->flags & (_SPAWN_HOLD | _SPAWN_DEBUG))
    --cp->file_time; /* So it won't be shared. */
Wonderful mechanism, but what happens if a program is started twice
or more with _SPAWN_HOLD or _SPAWN_DEBUG set? Won’t the same conflict
we just avoided arise again?

Notice I used the word “cheesy”?! Actually, I think it works out,
because part of the criteria Proc uses for sharing (code snippet not
shown) is that the program is not being held/debugged (flags). So
even if both images have their file time decremented to the same value,
the fact that the second one comes in with SPAWN_HOLD|SPAWN_DEBUG means
that it will not be shared in spite of matching/decremented times.

Hm - now this leaves me curious: if code isn’t shared whenever one of those flags
is set, then what do we need this decremented-file_time mechanism for?

We’d have two instances of ‘myprog’, both with a decreented file_time.
Won’t they share their code, having the same time stamp? Or will both
get their own copy, since neither of them is ‘up to date’? Or what?

There are are number of criteria for sharing: name, time, flags. As
mentioned above, I think this situation would not be shared (flags).

Back to your original question, have you tried allowing for this
scenario by looking at the psinfo flags and compensating 1 second?!

That’s exactly what I’m just doing, and when implementing the 1s compensation,
I was struck by a thought like ‘hey, hopefully this time will never be
decremented by more than one second, because otherwise my compensation won’t
help a lot’.

E.g. what if a process spawned HELD spawns one of its kind again, also HELD?

And another question passed my mind: How is the shared code problem dealt with
in case we debug a process which is already running?

Thanks for your patience and insightful answers…

Bill_Caroselli1 · March 13, 2003, 6:46pm

Thomas Haupt <frk@bitctrl.de> wrote:

Previously, you (Bill Caroselli) wrote:
John Garvey <jgarvey@qnx.com> wrote:
So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).

Wait a minute!

I haven’t actually tried loading a process in the held state yet in
QNX6, but one of the reasons for doing this is so that subsequent
attempts to load that program will be faster because the executable
is already in RAM.

So, your saying that loading a process in the held state does nothing
to help in that situation?

I believe yes, he is. Reason is: if one of the processes is being debugged,
it’s code must not be shared with other processes, whether debugged or
not. This is because setting breakpoints is implemented by writing ‘break’
instructions at the proper address directly into the code segment. So if the
code was shared with another process running the same program, that process
would stop as well when reaching that address - which is not what we want.

So for debugged processes, we can never help to load faster because we’ll
always need a new copy.

OK. I understand why this must be done for a process being debugged.
But why is it necessary for a process being HELD?

To hold a process all you need to do is remove it from a READY queue,
or not put it there if it was just spawned.

I.e., if two processes are already code sharing and I HOLD one of them,
does that force the kernel to copy the code segment on the fly so it can
drop a breakpoint into one of them? I assume not.

Adam_Mallory1 · March 13, 2003, 8:42pm

Bill Caroselli <qtps@earthlink.net> wrote in message
news:b4qjn1$mar$1@inn.qnx.com…

I.E. If two processes are already code sharing and I HOLD one of them,
does that force the kernel to copy the code segment on the fly so it can
drop a breakpoint into one of them? I assume not.

So, say you modify this ‘shared’ page and don’t copy the page on the write,
when you UNHOLD the process how will we know if the page has been modified?

-Adam

Bill_Caroselli1 · March 13, 2003, 8:51pm

Adam Mallory <amallory@qnx.com> wrote:

Bill Caroselli <qtps@earthlink.net> wrote in message
news:b4qjn1$mar$1@inn.qnx.com…

snip
I.E. If two processes are already code sharing and I HOLD one of them,
does that force the kernel to copy the code segment on the fly so it can
drop a breakpoint into one of them? I assume not.

So, say you modify this ‘shared’ page and don’t copy the page on the write,
when you UNHOLD the process how will we know if the page has been modified?

Of course if you modify the shared page you must make a copy of it
first. No argument there.

I’m saying that I don’t see why it should be necessary to modify a
shared code page just for holding a process.

If QNX has chosen to implement a HOLD state by dropping a software
break point there it seems to me like that is a lot of extra work.

Instead, the kernel just has to remove the threads for that process
from any READY queues and mark the state as HELD.

Adam_Mallory1 · March 13, 2003, 9:48pm

Bill Caroselli <qtps@earthlink.net> wrote in message
news:b4qqvs$t7u$1@inn.qnx.com…

Of course if you modify the shared page you must make a copy of it
first. No argument there.

I’m saying that I don’t see why it should be necessary to modify a
shared code page just for holding a process.

?? I don’t recall seeing anyone saying that you had to modify a shared page
in order to hold a process?? I think I’m not understanding what you’re
trying to say here.

If QNX has chosen to implement a HOLD state by dropping a software
break point there it seems to me like that is a lot of extra work.

?? We definitely don’t drop a breakpoint to put a process into a HOLD state.

-Adam

Bill_Caroselli1 · March 13, 2003, 9:48pm

Adam Mallory <amallory@qnx.com> wrote:

Bill Caroselli <qtps@earthlink.net> wrote in message
news:b4qqvs$t7u$1@inn.qnx.com…

Of course if you modify the shared page you must make a copy of it
first. No argument there.

I’m saying that I don’t see why it should be necessary to modify a
shared code page just for holding a process.

?? I don’t recall seeing anyone saying that you had to modify a shared page
in order to hold a process?? I think I’m not understanding what you’re
trying to say here.

If QNX has chosen to implement a HOLD state by dropping a software
break point there it seems to me like that is a lot of extra work.

?? We definately don’t drop a breakpoint to put a process into a HOLD state.

-Adam <confused>

Thank you. Then I’m no longer confused.

But to answer your question, in the very first reply to the original
question:

John Garvey <jgarvey@qnx.com> wrote:

So, you are either debugging this process or spawning it held
(e.g. “on -h rtdb”).

John_Garvey1 · March 13, 2003, 11:33pm

Thomas Haupt <frk@bitctrl.de> wrote:

Hm - now this leaves me curious: If code isn’t shared if any of those flags
is set, then what do we need this decremented-file_time-mechanism for ?

Process flags are not kept on the in-memory code segment. Otherwise,
yes it could check the DEBUG flag on the new process and not share
and check the flags on any loaded segments and not share. But since
such flags don’t exist, the time associated with the code is modified
instead (in an almost-too-clever encoding). All that aside, I was just
telling you why you saw the 1-second discrepancy …
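
To make the encoding concrete, here is a toy model of the sharing decision as it is described in this thread. It is a sketch under assumptions, not Proc's implementation: the table, the function and the DEMO_ flag values are hypothetical. The loaded segment records only a name and a time (no flags), so a held or debugged load is made unshareable by storing a decremented time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical flag bits standing in for _SPAWN_HOLD and _SPAWN_DEBUG. */
#define DEMO_SPAWN_HOLD  0x01u
#define DEMO_SPAWN_DEBUG 0x02u
#define DEMO_MAX_SEGS    8u

/* A loaded code segment: note that no process flags are stored here. */
struct demo_segment {
    const char *name;
    time_t      file_time;
    int         links; /* number of processes using this segment */
};

struct demo_table {
    struct demo_segment segs[DEMO_MAX_SEGS];
    size_t              count;
};

/* Returns the index of the segment the new process uses, or -1 if the
 * table is full. */
int demo_load(struct demo_table *t, const char *name, time_t st_mtime,
              unsigned flags)
{
    int    wants_private;
    size_t i;

    assert(t != NULL && name != NULL);

    wants_private = (flags & (DEMO_SPAWN_HOLD | DEMO_SPAWN_DEBUG)) != 0;

    /* A normal load may share a segment with matching name and time.
     * Segments loaded held/debugged carry st_mtime - 1, so they never
     * match here even though they store no flags. */
    if (!wants_private) {
        for (i = 0; i < t->count; i++) {
            if (strcmp(t->segs[i].name, name) == 0
                && t->segs[i].file_time == st_mtime) {
                t->segs[i].links++;
                return (int)i;
            }
        }
    }

    if (t->count == DEMO_MAX_SEGS)
        return -1;

    i = t->count++;
    t->segs[i].name = name;
    t->segs[i].file_time = wants_private ? st_mtime - 1 : st_mtime;
    t->segs[i].links = 1;
    return (int)i;
}
```

In this model, two held loads of the same program both end up with the same decremented time but still get separate segments, because the incoming flags prevent the lookup from running at all. That matches the behaviour suggested earlier in the thread.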