Process load

How can I get an accurate idea of process loads?

If I use ‘spin’ the numbers reported are often wrong.

But what about ‘sin threads’ and ‘sin cpu’? Is sin cpu accurate? Is sin
threads accurate if the threads are long-lived (e.g. for a given thread
number, if that thread is not continually being killed and restarted)?

I have a process with a thread which is normally in the INTR state.
sin threads indicates that it takes about 50% of cpu time, but I am told by
the author that this is impossible. Why would the reported value be wrong, and
what is the correct way to measure such things?

Thanks for any help

William Morris

Take a look at the following program - it will give you the
percent of time over each second that the specified process (or
a particular thread in that process) was running.

Note that using pid 1, thread 1 gives you the amount of time that the
cpu is idle; that is what the shelf cpu monitor uses.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <time.h>
#include <inttypes.h>
#include <sys/neutrino.h>

int main( int argc, char *argv[] )
{
    int id;
    pid_t pid = 0;        /* 0 = this process */
    pthread_t tid = 0;    /* 0 = all threads */
    uint64_t start_time[2], end_time[2];
    double percent;

    if ( argc > 1 )
        pid = atoi(argv[1]);
    if ( argc > 2 )
        tid = atoi(argv[2]);

    /* clock id that accumulates cpu time for this pid/tid */
    id = ClockId( pid, tid );

    while(1) {
        /* sample wall-clock time and the pid/tid cpu time together ... */
        ClockTime( CLOCK_REALTIME, NULL, &start_time[0] );
        ClockTime( id, NULL, &start_time[1] );
        sleep(1);
        /* ... and again one second later */
        ClockTime( CLOCK_REALTIME, NULL, &end_time[0] );
        ClockTime( id, NULL, &end_time[1] );

        end_time[0] -= start_time[0];
        end_time[1] -= start_time[1];

        percent = end_time[1];
        percent /= end_time[0];
        percent *= 100.0;

        printf( "PID %d, TID %d, %" PRIu64 "/%" PRIu64 " ticks %d%%\n",
                pid, tid, end_time[1], end_time[0], (int)percent );
    }
    return 0;
}
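For example - assuming the listing is saved as cpuload.c and built with qcc -o cpuload cpuload.c (the file and program names are just for illustration) - running cpuload 1 1 prints the idle thread’s share of each second, while cpuload <pid> <tid> follows one thread of one process.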

cburgess@qnx.com

On 13 Jun 2003 13:30:02 GMT, Colin Burgess <cburgess@qnx.com> wrote:

Take a look at the following program - it will give you the
percent of time over each second that the specified process (or
a particular thread in that process) was running.

Note that using pid 1, thread 1 gives you the amount of time that the
cpu is idle; that is what the shelf cpu monitor uses.

Many thanks. Running this program confirms what sin threads says - thread
3 of the offending process is taking 60%.

However… when the process is running, your program tells me that there is
about 25% idle time. If I slay the offending process, this rises to just
50%. Why not 85% (60+25)?

Does the fact that the thread blocks on interrupts affect any of this?
Regards
William

One thing to check is that you are running the monitor at a higher
priority than anything else in the system. It could be getting preempted
in between ClockTimes, and getting skewed results. In fact, it may
be worth disabling interrupts around the ClockTimes for that matter.
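For what it’s worth, a minimal sketch of both suggestions, assuming the monitor runs as root (needed for the I/O privileges that InterruptDisable()/InterruptEnable() require anyway); priority 63 and the helper names are only illustrative:

#include <string.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>
#include <inttypes.h>
#include <sys/neutrino.h>

/* Illustrative helper: bump the calling thread's priority so the sampling
   loop cannot be preempted by the threads it is measuring.  63 is just an
   arbitrary "higher than anything we measure" choice. */
static int raise_monitor_priority(void)
{
    struct sched_param param;

    memset( &param, 0, sizeof(param) );
    param.sched_priority = 63;
    if ( pthread_setschedparam( pthread_self(), SCHED_FIFO, &param ) != 0 )
        return -1;

    /* I/O privileges are needed before InterruptDisable()/InterruptEnable() */
    return ThreadCtl( _NTO_TCTL_IO, 0 );
}

/* Take the wall-clock and per-thread timestamps back to back with interrupts
   off, so nothing can run between the two ClockTime() calls and skew the pair. */
static void sample_times( int id, uint64_t t[2] )
{
    InterruptDisable();
    ClockTime( CLOCK_REALTIME, NULL, &t[0] );
    ClockTime( id, NULL, &t[1] );
    InterruptEnable();
}

The monitor above would call raise_monitor_priority() once at startup and use sample_times() in place of each ClockTime() pair.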

cburgess@qnx.com

On 13 Jun 2003 15:11:36 GMT, Colin Burgess <cburgess@qnx.com> wrote:

One thing to check is that you are running the monitor at a higher
priority than anything else in the system. It could be getting preempted
in between ClockTimes, and getting skewed results. In fact, it may
be worth disabling interrupts around the ClockTimes for that matter.

That improves it a bit, making it more consistent, but the values are
still far too big. And killing the process doesn’t return the reported load to
the idle task.

Is it possible that the kernel over-counts? For example, if it counts
thread CPU occupancy in clock ticks and a thread executes for part of a
tick (even a small part), the kernel increments its occupancy by one tick.
That would give a frequently triggered thread a large CPU occupancy even if
it did almost nothing each time it was triggered.
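To put illustrative numbers on that: with the usual 1 ms tick, a thread woken
600 times a second that runs for only 10 µs each time really uses about 0.6%
of the cpu, but if each of those short runs were charged a whole tick it would
be billed 600 of the 1000 ticks in a second - roughly the 60% being reported.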

Regards
William

Yes, it’s not entirely accurate. For an exact analysis you’d be best
looking at the instrumented kernel and the System Profiler that comes
with Momentics PE. That would show you every thread state transition
and message pass etc.

Do you have PE?

cburgess@qnx.com

On 13 Jun 2003 17:58:44 GMT, Colin Burgess <cburgess@qnx.com> wrote:

Yes, it’s not entirely accurate. For an exact analysis you’d be best
looking at the instrumented kernel and the System Profiler that comes
with Momentics PE. That would show you every thread state transition
and message pass etc.

Do you have PE?

Unfortunately not.

I am told that the code in question is a continuous loop which just does:
InterruptWait()
InterruptUnmask()
and on every 100th interrupt
MsgSendPulse()
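In case it helps, here is a rough, self-contained reconstruction of that loop
(the IRQ number, the channel handling and the pulse code are placeholders - the
real code presumably sends its pulse to whatever channel consumes it):

#include <sys/neutrino.h>
#include <sys/siginfo.h>

#define TEST_IRQ    0                       /* placeholder interrupt number */
#define PULSE_CODE  _PULSE_CODE_MINAVAIL    /* placeholder pulse code */

int main( void )
{
    struct sigevent event;
    int id, chid, coid;
    unsigned count = 0;

    ThreadCtl( _NTO_TCTL_IO, 0 );           /* I/O privileges for InterruptAttachEvent() */

    /* In the real code the pulse would go to an existing channel;
       one is created here just to keep the sketch self-contained. */
    chid = ChannelCreate( 0 );
    coid = ConnectAttach( 0, 0, chid, _NTO_SIDE_CHANNEL, 0 );

    event.sigev_notify = SIGEV_INTR;
    id = InterruptAttachEvent( TEST_IRQ, &event, _NTO_INTR_FLAGS_TRK_MSK );

    for ( ;; ) {
        InterruptWait( 0, NULL );           /* block until the next interrupt */
        if ( ++count % 100 == 0 )
            MsgSendPulse( coid, 10, PULSE_CODE, 0 );  /* priority 10 is a placeholder */
        InterruptUnmask( TEST_IRQ, id );    /* re-arm the auto-masked interrupt */
    }
    return 0;
}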

I tried a loop like this and failed to get the load to register above 0.5%
in spin using IRQ0 on a 266MHz Pentium. But I don’t know the source of the
interrupts used in the real code.

Regards
William

Wouldn’t it be a lot cheaper to register an interrupt handler and do
the count there, just returning a sigevent every 100th interrupt?
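A minimal sketch of that alternative, again with a placeholder IRQ number: the
handler itself only counts, and only on every 100th interrupt does it return
the sigevent that wakes the thread, so the thread is scheduled 1% as often.

#include <stdio.h>
#include <stdlib.h>
#include <sys/neutrino.h>
#include <sys/siginfo.h>

#define TEST_IRQ  0                     /* placeholder interrupt number */

static struct sigevent event;
static volatile unsigned count;

/* Interrupt handler: runs for every interrupt, but only asks the kernel to
   deliver the event (and wake the attached thread) every 100th time. */
static const struct sigevent *counting_isr( void *area, int id )
{
    if ( ++count % 100 == 0 )
        return &event;                  /* wake the thread blocked in InterruptWait() */
    return NULL;                        /* no wakeup, no scheduling cost */
}

int main( void )
{
    int id;

    ThreadCtl( _NTO_TCTL_IO, 0 );       /* I/O privileges for InterruptAttach() */

    event.sigev_notify = SIGEV_INTR;
    id = InterruptAttach( TEST_IRQ, counting_isr, NULL, 0, 0 );
    if ( id == -1 ) {
        perror( "InterruptAttach" );
        return EXIT_FAILURE;
    }

    for ( ;; ) {
        InterruptWait( 0, NULL );       /* wakes only once per 100 interrupts */
        /* do the per-100-interrupts work here, e.g. the MsgSendPulse() */
    }
    return 0;
}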

cburgess@qnx.com