SYSPAGE_ENTRY(qtime)->cycles_per_sec problem

I am running QNX 6.1 on an x86 machine. I measure a time difference between
two interrupts using ClockCycles().

I measure the same time difference with a high quality 1ns resolution GPS
which I assume is precise enough for my application.

However, when I compute the time difference using ClockCycles() and divide
the difference by SYSPAGE_ENTRY(qtime)->cycles_per_sec, I get a difference
of about 1ms against the GPS.

Recomputing the processor speed from the GPS time difference yields a
difference of about 340kHz on an 860MHz PIII versus the
SYSPAGE_ENTRY(qtime)->cycles_per_sec value.
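
The arithmetic described here can be sketched in plain C as follows. This is only an illustration: the helper names cycles_to_seconds() and relative_error() are hypothetical, and on QNX the inputs would come from ClockCycles() and SYSPAGE_ENTRY(qtime)->cycles_per_sec rather than from constants.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the difference between two cycle-counter readings to seconds.
 * Hypothetical helper: on QNX, cps would be the cycles_per_sec syspage value. */
double cycles_to_seconds(uint64_t start, uint64_t end, uint64_t cps)
{
    assert(cps != 0);
    return (double)(end - start) / (double)cps;
}

/* Relative error of a reported frequency against a reference frequency. */
double relative_error(double reported_hz, double reference_hz)
{
    assert(reference_hz > 0.0);
    double diff = reported_hz - reference_hz;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff / reference_hz;
}
```

With the assumed round figures of 860.34e6 Hz against 860e6 Hz, relative_error() gives about 3.95e-4, so an interval of roughly 2.5 s would be off by about 1 ms.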

How can I obtain a more precise value for the processor cycles?

Thanks,

Marcus

news.fernuni-hagen.de <mdaiber@gmx.net> wrote:

I am running QNX 6.1 on a X86 machine. I measure a time difference between
to interrupts using ClockCycles().

I measure the same time difference with a high quality 1ns resolution GPS
which I assume is precise enough for my application.

However, when I compute the time difference using the clockcycles and divide
the difference by SYSPAGE_ENTRY(qtime)->cycles_per_sec I have a difference
of about 1ms against the GPS;

Recomputing the processor speed with the GPS time difference yields in a
difference of about 340kHz on a 860MHz PIII versus the
SYSPAGE_ENTRY(qtime)->cycles_per_sec value.

So, the reported value is off by about 4 parts in 10,000. Next chip
down the line might be different. Run it hotter/colder, it might be
different. Even which instructions are in cache, and what data is in
cache, could affect this.

How can I obtain a more precise value for the processor cycles.

Looks like you just did.

As an OS, we can’t give you anything better.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

David,
I appreciate your comments; however, QNX claims to produce an operating
system for mission critical embedded applications; my application is
embedded and precise time measurement (without GPS) is mission critical;
your statements do not encourage me to use QNX if that is the general
attitude within your company. I hope some people there strive to be better
than 4 parts in 10000.

Thanks,
Marcus

Marcus Daiber <mdaiber@gmx.net> wrote in message
news:bhtp89$85l$1@inn.qnx.com

David,
I appreciate your comments; however, QNX claims to produce an operating
system for mission critical embedded applications; my application is
embedded and precise time measurement (without GPS) is mission critical;
your statements do not encourage me to use QNX if that is the general
attitude within your company. I hope some people there strive to be better
than 4 parts in 10000.

As per the Intel docs, the rdtsc instruction is not a serializing instruction, so
we cannot tell where in the pipeline the read of the counter actually
occurred, and out-of-order execution just exacerbates this. While this
isn’t directly your issue, the fact remains that without hardware support
for accurate timing, we’re only as good as the hardware can be.

Mission critical applications that need precise time measurement
(resolution beyond what the existing hardware can provide) need
additional hardware to support the resolution you require. I am also not
encouraged to hear that your “precise time measurement” is done via
imprecise mechanisms.

-Adam

Adam,

For mission critical applications that need precise time measurement
(resolution outside what can be provided by existing hardware) , then you
need further hardware to support the resolution you require. I also am not
encouraged to hear that your “precise time measurement” is done via non
precise mechanisms.

Having detected the problem, I am looking for a solution. I have not claimed
mine to be the ultima ratio; in fact I am currently searching for a
solution, which is indeed the main reason I posted my problem. I would
appreciate comments on possible solutions rather than “It doesn’t get any
better”.

I am also aware that there are many oscillators on a standard
x86 system which provide a clock precise enough to measure a time interval
with a resolution of <1ms (which is what I need); if there is a way to access
them, I would appreciate comments.

Marcus


Hi Marcus

There are timer boards out there that do more precise timing. Some you
can configure to generate an interrupt at regular intervals and some
you can just read a value from as often as desired. And of course, some do
both. I have not needed such a board for a while, so I can’t
recommend one. But they are out there.



Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net

Marcus Daiber <mdaiber@gmx.net> wrote in message
news:bhtsqa$aj7$1@inn.qnx.com

After detecting the problem I am looking for a solution, I have not claimed
mine to be the ultima ratio; in fact I am currently searching for a
solution, which is indeed the main reason for me to post my problem; I would
appreciate comments for possible solutions rather than a “It doesn’t get any
better”.

My understanding from your post was that you were saying that the OS should
be providing you more accurate cycle timings, which it cannot, given that the
mechanism (i.e. rdtsc) doesn’t provide the accuracy you want consistently.

If all you want is suggestions on how to improve or what other hardware
solutions there are, I’m sure others with experience in this can comment.

-Adam

Marcus Daiber <mdaiber@gmx.net> wrote:

David,
I appreciate your comments; however, QNX claims to produce an operating
system for mission critical embedded applications; my application is
embedded and precise time measurement (without GPS) is mission critical;
your statements do not encourage me to use QNX if that is the general
attitude within your company. I hope some people there strive to be better
than 4 parts in 10000.

On general purpose hardware, how can we do better than what the hardware
can supply?

Precise time measurement on a computer system is NOT an easy thing. Neither
is accurate time measurement.

The following little program:

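    #include <inttypes.h>
    #include <stdio.h>
    #include <sys/neutrino.h>
    #include <pthread.h>

    #define BILLION 1000000000
    #define NumSamples 30

    int main (void)
    {
        uint64_t cycs [NumSamples];
        int i;

        printf ("%d ClockCycles values:\n", NumSamples);

        /* Take all samples back to back, before doing any output. */
        for (i = 0; i < NumSamples; i++) {
            cycs [i] = ClockCycles ();
        }

        printf ("%llu\n", (unsigned long long) cycs [0]);
        for (i = 1; i < NumSamples; i++) {
            /* Print each sample and its distance from the previous one. */
            printf ("%llu, delta %llu (decimal)\n",
                    (unsigned long long) cycs [i],
                    (unsigned long long) (cycs [i] - cycs [i - 1]));
        }
        return 0;
    }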

Gives:

30 ClockCycles values:
661424286661766
661424286662110, delta 344 (decimal)
661424286662202, delta 92 (decimal)
661424286662294, delta 92 (decimal)
661424286662386, delta 92 (decimal)
661424286662478, delta 92 (decimal)
661424286662570, delta 92 (decimal)

(rest consistently delta 92)

Why’s that first one 4 times the second and all the rest?

Code generation, code ordering, cache hits/misses, CPU pipelining and
instruction re-ordering can all greatly affect something like this.

Since you didn’t give any details on exactly how you calculated
your benchmark, I can’t look at that. (Was ClockCycles long or
short of the GPS? How did you get the time from the GPS? By
serial port? If so, how did you account for the time consumed
in the serial interactions. Or, does this GPS supply a memory-mapped
interface?)

But, you did seem to get a more accurate calibration of ClockCycles().
You didn’t say you couldn’t use that. Maybe you need to (more) accurately
calibrate ClockCycles() on your machines and use it.
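
A minimal sketch of such a calibration, assuming two counter readings and the elapsed time between them as reported by a trusted reference (the GPS in this thread). The function name calibrate_cps() is hypothetical, and how the readings are captured is left out:

```c
#include <assert.h>
#include <stdint.h>

/* Estimate cycles per second from two counter readings and the elapsed
 * reference time between them. Hypothetical helper, not a QNX API. */
double calibrate_cps(uint64_t cycles_start, uint64_t cycles_end, double ref_seconds)
{
    assert(ref_seconds > 0.0);
    /* Unsigned subtraction stays correct across a single counter wrap. */
    return (double)(cycles_end - cycles_start) / ref_seconds;
}
```

The longer the reference interval, the less a fixed uncertainty in reading the counter at each end contributes to the result.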

You also didn’t really state what you actually needed to do: how long
a period do you need to measure time over, and how accurate and precise
do you need the result to be?

Standard PC hardware may NOT be able to achieve this. And, without knowing
what you’re really trying to achieve (rather than just deal with a stated
“finding”), it’s hard to know what constructive suggestion I might make.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

Hi Marcus…

For mission critical timing, DSPs, microprocessors, and many other types
of hardware will do better than x86 + QNX at a fraction of the cost (of
course, there are a number of tradeoffs, but…). For example, we do
some ‘mission critical’ aspects of a helicopter mission (i.e. controls
and the like) with MPC555, and the less -time- critical aspects of the
mission (i.e. mission planning + navigation, etc.) with x86 + QNX.
Motorola MPC555 or MPC565 or MC68332, or… will give timings with
accuracy in the nanoseconds if you want. But something tells me that you
know all of this! :-) Just wanted to tell you that the likes of MPC555
<-> x86 + QNX are viable alternatives for mission critical systems.

Regards…

Miguel.


Sometimes ‘system controller’ chips (aka northbridge) also have
timer/counter registers, especially those used in embedded designs. If you
can get full documentation on your chip, you might find just what you need.
For example, on the controller chip that I have on my PPC board there are 4
timer/counter registers that can generate an interrupt when they count to zero.
Granularity can be in the range of 8 to 240 nanoseconds.

In article <bhu4ir$79q$1@nntp.qnx.com>, dagibbs@qnx.com says…

Hi David,

Your little program perfectly explains that executing the same code can take the CPU a different
amount of time. But the original question was why SYSPAGE_ENTRY(qtime)->cycles_per_sec isn’t precise enough. I
can’t say it’s completely wrong, but it isn’t precise. It seems to have some offset error, and that defeats
the purpose of that system page entry, because you can’t use it to convert ClockCycles() results into
actual time precisely. Could you share with us the algorithm that calculates the cycles_per_sec
entry during startup? Not that we will find anything wrong with that algorithm :-) There could be a lot of
reasons why it isn’t accurate. One of them is “cold” hardware during startup, but I doubt it:
the difference is too big, almost 41 kHz on my 350 MHz system.

My little test program doesn’t require any additional hardware; it uses the RTC chip to provide time marks.
If you’re interested, it’s available here:
http://ed1k.qnx.org.ru/examples/CPUSPEED.TGZ

./speedchk

Measuring CPU speed. Please wait 50 seconds…
Got samples, statistical processing…

CPS from SYSPAGE is 348526700
Average CPS is 348485720
Medium CPS is 348485856.000000
Dispersion is 154.975893Hz
Difference between SYSPAGE entry and actual speed is 40832.000000Hz

Cheers,
Eduard.


In article <MPG.19b80f118652cb299896eb@inn.qnx.com>, ed1k@humber.bay says…

Well, the result I posted previously was from a “cold” machine just booted in the morning. After 3 hours I
rebooted in order to update the syspage entry; here is the result (a bit better, but still…):

Measuring CPU speed. Please wait 50 seconds…
Got samples, statistical processing…

CPS from SYSPAGE is 348517300
Average CPS is 348486264
Medium CPS is 348486400.000000
Dispersion is 176.228122Hz
Difference between SYSPAGE entry and actual speed is 30912.000000Hz

Eduard.

Your actual speed seems very consistent, if the dispersion of 176 is
real. If you calculate a SYSPAGE number for each machine, based on
perhaps hours of testing, is it stable enough to give what you want?

I have desktop, laptop and single board systems that have clock errors
of several seconds per day. In the QNX 2 days there was a command, something
like ‘ticksize’, that I could and did use in sysinit to set each system
to its proper speed and get much less than one second per day of drift. I
couldn’t find anything like it in QNX 4.

John Halpenny

Natural Resources Canada Ressources Naturelles Canada
Geodetic Survey Division Division des levés géodésiques
615 Booth St., Room 498H 615 rue Booth, Pièce 498H
Ottawa, Ontario, Canada
K1A-0E9
Phone: (613) 996-9321

In article <3F4E68B5.84ED3C4D@nrcan.gc.ca>, jhalpenn@nrcan.gc.ca says…

Your actual speed seems very consistent, if the dispersion of 176 is
real. If you calculate a SYSPAGE number for each machine, based on
perhaps hours of testing, is it stable enough to give what you want?

I just wanted to point out that the cycles_per_sec entry in SYSPAGE isn’t quite accurate. I dealt
with measurement theory years ago. Now I am in another country and I don’t have my notes or any books
on the subject. So, if my memory serves me right, I hope that the dispersion is a real one :-) If you
downloaded the sources (this program is for QNX6!), you can see that I just collected an array (101 samples) of the
free-running counter (RDTSC) at intervals of 500 ms, then calculated the array of increments and did
some statistical calculations. As you can see from my two posts above, the calculated CPU speed hardly
changed after 3 hours of running my old IBM 300 GL, but the value calculated by the OS is different
(better by 10 kHz :-))


I have desktop, laptop and single board systems that have clock errors
of several seconds per day. In QNX 2 times there was a command something
like ‘ticksize’ that I could and did use in sysinit to set each system
to its proper speed and get much less than one second per day drift. I
couldn’t find anything like it in QNX 4.

That isn’t a related problem, sorry. The OS gets clock ticks from the 8254 timer to update system time. There is
ClockAdjust() in QNX6, and I believe there are some solutions for QNX4 as well. If you really need
synchronized time across your network you might want to use NTP, though I don’t know if it has been ported (or
could be ported) to QNX4.

Best regards,
Eduard.

In article <MPG.19b854cef62dfd249896ed@inn.qnx.com>, ed1k@humber.bay says…

One more observation. According to a message posted by John A. Murphy quite a long time ago, cycles_per_sec
is calculated from two RDTSC readings taken 10 ms apart. So, I changed my program to use 128 Hz
time marks (7.8125 ms between samples). Here is the result:

348437376 348485760 348485760 348485760 348497792 348498816 348457984
348485760 348533248 348475008 348482560 348496512 348498304 348493824
348454400 348484864 348486656 348484864 348486656 348484864 348505472
348498304 348451712 348484864 348486656 348484864 348486656 348484864
348524800 348481664 348456192 348480384 348486656 348484864 348486656
348484864 348508160 348505088 348448128 348481664 348483968 348484864
348486656 348484864 348510848 348498304 348449024 348484864 348486656
348484864 348493824 348484480 348505856 348501888 348445440 348484480
348484352 348487168 348484352 348487168 348516608 348501888 348437376
348494720 348480896 348481664 348487168 348484352 348519424 348499200
348437376 348484352 348489856 348481664 348487168 348484352 348522112
348495104 348441472 348484352 348483968 348484864 348486656 348484864
348521600 348483072 348453504 348484864 348486656 348490240 348481792
348484352 348524800 348479872 348454016 348484352 348500096 348471424
348484480 348484352
CPS from SYSPAGE is 348532000
Average CPS is 348480000
Medium CPS is 348485408
Dispersion is 1634.668162Hz
Difference between SYSPAGE entry and actual speed is 46592Hz

So, if I use 100 points to calculate the CPU speed it gives me some accuracy, but obviously
two points aren’t enough, because some measurements are outliers.
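
One way to make an estimate from many samples robust against such outliers is to take the median instead of relying on any single pair of readings. The sketch below is an illustration only; median_u64() and mean_u64() are hypothetical helpers and are not taken from the speedchk sources:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort() on uint64_t values. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; note that this sorts the array in place. */
uint64_t median_u64(uint64_t *samples, size_t n)
{
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_u64);
    if (n % 2 == 1) {
        return samples[n / 2];
    }
    /* Even count: midpoint of the two middle values, without overflow. */
    uint64_t lo = samples[n / 2 - 1];
    uint64_t hi = samples[n / 2];
    return lo + (hi - lo) / 2;
}

/* Arithmetic mean of n samples, accumulated in floating point. */
double mean_u64(const uint64_t *samples, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += (double)samples[i];
    }
    return sum / (double)n;
}
```

A single low or high reading shifts the mean but leaves the median where the bulk of the samples sit.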

Just for comparison, here is the result for 500 ms between RDTSCs:

348484514 348485718 348485906 348485782 348485438 348484982 348485558
348485500 348485746 348485564 348485494 348485446 348485696 348485670
348485536 348485634 348485508 348485592 348485542 348485726 348485564
348485452 348485606 348486106 348485106 348485564 348485640 348485424
348485634 348485844 348485942 348485264 348485864 348484984 348485654
348485782 348485760 348485990 348485292 348485032 348485564 348485696
348485474 348485584 348485600 348485564 348485528 348485578 348485614
348485416 348485746 348485550 348485572 348485774 348485466 348485486
348485620 348485466 348485690 348485676 348485452 348485564 348485620
348485676 348486300 348484906 348485970 348485122 348485432 348485844
348485844 348485438 348485844 348485052 348485586 348485368 348485676
348485696 348485480 348485536 348485656 348485696 348485466 348485866
348485340 348485668 348485578 348485788 348485354 348485544 348485528
348485760 348485620 348485382 348485634 348485508 348485676 348485732
348485852 348485276
CPS from SYSPAGE is 348532000
Average CPS is 348485468
Medium CPS is 348485504
Dispersion is 186.260895Hz
Difference between SYSPAGE entry and actual speed is 46496Hz

Actually, I don’t think QNX has a real problem with the accuracy of cycles_per_sec, but this should be
documented (in the ClockCycles help file?). Thus, anyone who is going to use ClockCycles() for precise
time calculations has to find a way to calculate the CPU speed more precisely instead of relying on the
SYSPAGE entry.

Best regards,
Eduard.

ed1k <ed1k@humber.bay> wrote:

In article <bhu4ir$79q$1@nntp.qnx.com>, dagibbs@qnx.com says…

Hi David,

Could you share with us the algorithm of calculating that cycles_per_sec
entry during startup?

It’s shipped with (at least) Momentics PE.

On a self-hosted system, take a look at:

/usr/src/bsp-6.2.1/libs/src/hardware/startup/lib/x86/set_cycles.c

If not self-hosted, prefix the above with $(QNX_TARGET), I think.

In fact, if you have a better algorithm for your system(s)… you could
just build yourself a new startup, using your version of set_cycles.c
rather than the standard one, and all your machines will now have a
better value for cycles_per_sec.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

Could you have the option for a fixed value to be entered from something like
the sysinit file. Each computer should have a fairly stable value once you
calibrate it.

John Halpenny

In article <3F5D037D.58CDF89B@rogers.com>, j.halpenny@rogers.com says…

Could you have the option for a fixed value to be entered from something like
the sysinit file. Each computer should have a fairly stable value once you
calibrate it.

Take a look at startup-bios -f. Once you have a calibrated value, you can build a special image for your
computer :-)

Eduard.



In article <bjitbk$i36$1@nntp.qnx.com>, dagibbs@qnx.com says…

ed1k <ed1k@humber.bay> wrote:
In article <bhu4ir$79q$1@nntp.qnx.com>, dagibbs@qnx.com says…

Hi David,

Could you share with us the algorithm of calculating that cycles_per_sec
entry during startup?

Its shipped with (at least) Momentics PE.

On a self-hosted system, take a look at:

/usr/src/bsp-6.2.1/libs/src/hardware/startup/lib/x86/set_cycles.c

If not self-hosted, prefix the above with $(QNX_TARGET), I think.

Thank you.

In fact, if you have a better algorithm for your system(s)…

No, I don’t. My algorithm isn’t better and definitely it isn’t for start-up. And my goal was just
check out what’s wrong with standard one (which is used now in QNX). Start-up time is better for
such a task as CPU speed measurement than the same measurement in working multi-task system, so your
algorithm should be better. But in fact, my algorithm gives me result much closer to result of other
benchmark tools…

The algorithm I used has some advantages: it runs long enough to collect 100 measurements (this
could be extended) and does some statistical processing of the results. It also has a lot of
disadvantages:

  1. it takes a long time;
  2. it uses floating-point arithmetic;
  3. the method has limited precision. I use periodic interrupts to sample the CPU’s free-running
    counter. That means that after the RTC chip raises an interrupt request, some time passes before
    the rdtsc instruction executes in the ISR: the interrupt latency. The RTC interrupt isn’t a
    high-priority interrupt, and generally speaking, interrupt latency can vary from time to time. The
    fact that I see such a small dispersion in the results indirectly says that interrupt latency in
    QNX doesn’t vary much (that’s also good to know, though it doesn’t say whether the latency is small or big :-))
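
As an illustration of the statistical processing described in the list above, here is a minimal C sketch. It is not the code that was actually used (that was not posted). The names estimate_freq and freq_stats, and the idea of passing in an array of counter readings latched on each periodic interrupt, are assumptions made for the example. It turns successive counter readings into frequency estimates and reports their mean and sample variance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: mean and spread of the frequency estimates. */
struct freq_stats {
    double mean_hz;      /* mean estimated counter frequency, in Hz */
    double variance_hz2; /* sample variance of the estimates, in Hz^2 */
};

/*
 * tsc[i] is the free-running counter value read in the ISR on the i-th
 * periodic interrupt; period_s is the nominal interrupt period in seconds.
 * n readings yield n - 1 frequency estimates.
 */
struct freq_stats estimate_freq(const uint64_t *tsc, size_t n, double period_s)
{
    struct freq_stats s = { 0.0, 0.0 };
    size_t m, i;
    double sum = 0.0, sq = 0.0;

    /* At least two estimates are needed for a sample variance. */
    assert(tsc != NULL && n >= 3 && period_s > 0.0);
    m = n - 1;

    /* First pass: mean of the per-interval frequency estimates. */
    for (i = 1; i < n; i++)
        sum += (double)(tsc[i] - tsc[i - 1]) / period_s;
    s.mean_hz = sum / (double)m;

    /* Second pass: sample variance around that mean. */
    for (i = 1; i < n; i++) {
        double d = (double)(tsc[i] - tsc[i - 1]) / period_s - s.mean_hz;
        sq += d * d;
    }
    s.variance_hz2 = sq / (double)(m - 1);
    return s;
}
```

Because only differences between consecutive readings are used, a constant interrupt latency cancels out; only its variation from one interrupt to the next shows up in the variance.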

you could
just build yourself a new startup, using your version of set_cycles.c
rather than the standard one, and all your machines will now have a
better value for cycles_per_sec.

Yes, I could. But why doesn’t QSSL want it too? Then all our machines (x86 at least) would have a better
value for cycles_per_sec. The file init_qtime.c defines the period of the 8254 clock (838.095345 ns):
#define PC_CLOCK_RATE 838095345UL
This is used in set_cycles.c to calculate another PC-related constant, the number of 8254 ticks
in a 0.01 s period. The problem is that it’s calculated with integer arithmetic,
and the result is 11933. I don’t understand why you need to calculate it rather than just
#define PC_TICKS_IN_10MS 11932
It doesn’t look like a big error (big deal, 11933 instead of 11932), but
11933 * 0.838095345 us = 10000.992 us, i.e. the error is almost 1 us.
My slow CPU runs an extra 347 cycles in that extra 0.992 us, and because we have to multiply the result by
100 (we measured over 0.01 s but need cycles_per_sec), that is a 34700 Hz difference. This is exactly what I
saw as the offset error in my experiments.

So, for now,
systematic error (Hz) = Fcpu (MHz) * 99.2
so the faster the CPU, the bigger the error.
And this error isn’t caused by a hardware limitation.
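
The arithmetic above can be checked with a few lines of C. This is only a sketch: PC_CLOCK_RATE is the constant quoted from init_qtime.c, read here as a period in femtoseconds (consistent with the 0.838095345 us used above), while NOMINAL_10MS_FS and the helper functions are names made up for the example. It does not reproduce the actual computation in set_cycles.c.

```c
#include <assert.h>
#include <stdint.h>

#define PC_CLOCK_RATE   838095345UL       /* 8254 period, taken as femtoseconds */
#define NOMINAL_10MS_FS 10000000000000LL  /* 0.01 s in femtoseconds (hypothetical name) */

/* Real duration, in femtoseconds, of a window of `ticks` 8254 ticks. */
uint64_t window_fs(uint32_t ticks)
{
    assert(ticks > 0);
    /* Widen before multiplying so the product cannot overflow 32 bits. */
    return (uint64_t)ticks * PC_CLOCK_RATE;
}

/* How much longer (positive) or shorter (negative) than 10 ms the window is. */
int64_t window_error_fs(uint32_t ticks)
{
    return (int64_t)window_fs(ticks) - NOMINAL_10MS_FS;
}

/*
 * Offset in Hz of a cycles-per-second value obtained by counting cycles over
 * the window and multiplying by 100, for a CPU truly running at f_hz.
 */
double freq_error_hz(double f_hz, uint32_t ticks)
{
    return f_hz * (double)window_error_fs(ticks) / (double)NOMINAL_10MS_FS;
}
```

With these definitions, window_error_fs(11933) is 991751885 fs (about 0.992 us), and freq_error_hz(350e6, 11933) is about 34711 Hz, in line with the 34700 Hz figure. For comparison, window_error_fs(11932) is 153656540 fs, about 0.154 us.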

Cheers,
Eduard

Don’t forget that the magic number of 838095345UL assumes a crystal
that’s exactly on frequency. The “error” introduced by the integer
arithmetic is on the order of 84 parts per million; the crystal
frequency can easily be off by several times that amount. So the
“error” you attribute to the OS is still small with respect to the
hardware limitation.

Murf