TCP/IP Performance

Dave Edwards <Dave.edwards@abicom-international.com> wrote:

> I’m currently seeing my ISR routines executing with around 1 ms
> resolution. These are triggered by an ISR-delivered event with a
> high priority.
>
> After a long period of experimentation I found that this delay was
> reduced by using the ClockPeriod() function.
>
> I don’t claim to know or understand what is going on here, but it
> appears to me that the servicing of the interrupt via the event
> mechanism occurs at the rate of the microkernel tick period.
>
> Since this discovery, I’ve tried operating within the HW ISR itself (by
> writing my own handler function); this works better, but it can still take
> one tick to pass control over to the software ISR.
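
For reference, the ClockPeriod() call Dave mentions sets the system tick size. A minimal sketch of that workaround on QNX Neutrino (the 100 us value is illustrative, not Dave's actual setting):

#include <stdio.h>
#include <time.h>
#include <sys/neutrino.h>

int main(void)
{
    struct _clockperiod new_period = { 100000, 0 };  /* 100 us, in ns */
    struct _clockperiod old_period;

    /* Read the current tick size (pass NULL so nothing changes yet). */
    if (ClockPeriod(CLOCK_REALTIME, NULL, &old_period, 0) == -1) {
        perror("ClockPeriod (get)");
        return 1;
    }
    printf("current tick: %lu ns\n", old_period.nsec);

    /* Shrink the tick. Note this raises timer-interrupt overhead
       system-wide, so it masks the latency rather than fixing it. */
    if (ClockPeriod(CLOCK_REALTIME, &new_period, NULL, 0) == -1) {
        perror("ClockPeriod (set)");
        return 1;
    }
    return 0;
}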

Let’s be fair. No one is more addicted to the speed of QNX4 over QNX6
than I am. But you’re judging one thing and comparing it to another.

There are two issues involved here.

  1. The clock period, or tick size. This is typically 1 ms. If the
    clock ticked 10 us ago, it ain’t going to go off again for another
    990 us no matter what else your software demands of it.

BUT . . .

  2. What you’re talking about (or should be talking about) is interrupt
    latency. Latency is defined as the time between when a piece of
    hardware generates an interrupt and when the ISR begins to execute.
    Under QNX6 the latency (while not quite as impressive as QNX4’s) is
    still much better than many (most) other OSs’.

Typically, there is also another latency involved here. With QNX the
philosophy is to do as little work as possible in the ISR and then cause
a non-ISR process/thread to do most of the work. So there is a latency
between scheduling that process/thread to start and its actually
starting. This latency is also not as good as QNX4’s, but still isn’t bad.

The real problem most people have is adjusting the priorities of all of
the processes/threads in the system so that the time-critical events
are processed in a timely manner and the lower priority threads still
have enough CPU to do their thing.
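
To make that concrete, here is one hedged sketch of such tuning: raising a single thread to a high fixed priority via the POSIX scheduling API (the value 50 is an arbitrary example, not a recommendation):

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Bump the calling thread so time-critical work preempts housekeeping. */
static int make_time_critical(void)
{
    struct sched_param param;
    int rc;

    param.sched_priority = 50;   /* example value only */

    /* SCHED_FIFO: run until we block or something higher-priority runs. */
    rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (rc != 0) {
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(rc));
        return -1;
    }
    return 0;
}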

It sounds like you’re using a timer interrupt (which is limited to the 1 ms
resolution) and comparing it to what you expect from a hardware
interrupt. That’s apples and oranges, and it’s just not fair.


Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net



> Let’s be fair. No one is more addicted to the speed of QNX4 over QNX6
> than I am. But you’re judging one thing and comparing it to another.
>
> There are two issues involved here.
>
>   1. The clock period, or tick size. This is typically 1 ms. If the
>     clock ticked 10 us ago, it ain’t going to go off again for another
>     990 us no matter what else your software demands of it.
>
> BUT . . .

Agreed and understood.



> 2. What you’re talking about (or should be talking about) is interrupt
> latency. Latency is defined as the time between when a piece of
> hardware generates an interrupt and when the ISR begins to execute.
> Under QNX6 the latency (while not quite as impressive as QNX4’s) is
> still much better than many (most) other OSs’.

This appears fine under QNX6. I’ve confirmed that the ISR latency is low
and acceptable.

> Typically, there is also another latency involved here. With QNX the
> philosophy is to do as little work as possible in the ISR and then cause
> a non-ISR process/thread to do most of the work. So there is a latency
> between scheduling that process/thread to start and its actually
> starting. This latency is also not as good as QNX4’s, but still isn’t bad.

This is where I’m seeing problems. My current driver uses the
InterruptAttachEvent() method to kick off a worker thread that is
waiting for a message from the driver.

The issue that I’m seeing here is that it can take up to 1 ms for the
event to be delivered to the worker thread. This delay can be adjusted
by changing the clock period, or tick time.

I’ve tried changing the priorities of other processes, but it still ends
up being related to the tick duration.


> The real problem most people have is adjusting the priorities of all of
> the processes/threads in the system so that the time-critical events
> are processed in a timely manner and the lower priority threads still
> have enough CPU to do their thing.

Agreed and tried, still no luck.

> It sounds like you’re using a timer interrupt (which is limited to the 1 ms
> resolution) and comparing it to what you expect from a hardware
> interrupt. That’s apples and oranges, and it’s just not fair.

Agreed, but I do not expect the delivery of an event from an ISR to take
1 ms.


Dave


OK. If you say that you’ve adjusted priorities and that makes no difference,
and you say that it seems to be tied to the tick size, then: are you sure
the thread that gets kicked isn’t waiting on anything else when the
interrupt occurs? Like, oh I don’t know, maybe a timer? I know, it’s a
dumb question, like “Is it plugged in?” But what you’re seeing shouldn’t be
happening.


Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net


Absolutely! There is nothing else in the “ISR” function. Here is what
I’m doing; it’s taken from the PCnet example code:

while (1) {
    /* Wait for a pulse from the interrupt. (iov is assumed to be set
       up outside the loop to map a struct _pulse named 'pulse'.) */
    rcvid = MsgReceivev( ext->rxchid, &iov, 1, NULL );

    if( rcvid == -1 ){
        if( errno == ESRCH ){
            /* The channel has gone away - shut the worker down. */
            pthread_exit( NULL );
        }
        continue;
    }

    /* Work out what we have received and act accordingly */
    if (pulse.code & TX_UP_Pulse) {
        /* ... handle the interrupt pulse ... */
    }
}


TX_UP_Pulse is the pulse code attached to the interrupt


I’ve also tried doing more processing in the ISR itself. This exhibits
the effect that the ISR executes in a timely fashion, but the handover
to the “software” ISR can take up to 1 ms.

It’s got me stumped


Dave

What is the priority of the pulse you set during your InterruptAttachEvent()?
Does changing that to 60 help?

-xtang
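
For reference, a minimal sketch of the setup xtang is asking about: attaching an interrupt so it delivers a pulse at an explicit priority. The IRQ number and pulse code below are placeholders, and 60 is just the value he suggests trying.

#include <sys/neutrino.h>
#include <sys/siginfo.h>

#define MY_IRQ       9                            /* placeholder IRQ    */
#define TX_UP_Pulse  (_PULSE_CODE_MINAVAIL + 0)   /* placeholder code   */
#define PULSE_PRIO   60                           /* xtang's suggestion */

/* NB: requires I/O privileges, i.e. ThreadCtl(_NTO_TCTL_IO, 0). */
int attach_my_interrupt(void)
{
    struct sigevent event;
    int chid, coid;

    chid = ChannelCreate(0);
    coid = ConnectAttach(0, 0, chid, _NTO_SIDE_CHANNEL, 0);

    /* The third argument is the priority at which the pulse (and hence
       the receiving thread) gets scheduled when the interrupt fires. */
    SIGEV_PULSE_INIT(&event, coid, PULSE_PRIO, TX_UP_Pulse, 0);

    /* The kernel masks the interrupt on delivery; the worker must call
       InterruptUnmask() after servicing it. */
    return InterruptAttachEvent(MY_IRQ, &event, _NTO_INTR_FLAGS_TRK_MSK);
}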



Let’s speculate a bit…

The mechanism of event delivery is a pulse; that is, an asynchronous
event with payload. Which means they get queued by the kernel. Nothing
unusual so far; very similar to queued signals in Unix. However, unlike
queued signals in Unix, the queue for pulses in QNX is unbounded. It
can grow indefinitely.

This is a major difference from QNX4, where the delivery mechanism was
the proxy. The latter had a canned payload and was not queued; only the
counter was incremented. That meant a Trigger() kernel call from an ISR
resulted in a synchronous message pass to the waiting process, so the
kernel immediately knew that the process had become READY and could do
the scheduling right away.

With QNX6 it is not that simple. One kernel path (in the case of using
InterruptAttachEvent) is servicing interrupts, sending pulses to
registered threads. A pulse has to be queued by that path (producer) and
that queue must be drained by another kernel path (consumer),
delivering the payload to waiting processes and triggering the
scheduler. The question is when that will happen. The consumer path
can’t be entered directly from the producer path, since they have to be
asynchronous and the queue is unbounded. Which means some other event
must cause the kernel to enter the consumer/scheduler path. And what
kind of event could it be? I am speculating; it could be the next clock
tick :-)

– igor


John A. Murphy wrote:

> [snip]
> Over the last 25 years or so we’ve found it to be true in general. It may be worse
> when there’s a network and/or context switches involved, but EVERY transfer consists of
> a setup time, or time_per_transaction, and a transfer time, or time_per_byte. The more
> of those time_per_byte chunks of time that you can stuff under one time_per_transaction
> chunk, the faster you go. When the time_per_transaction is made up of function calls
> and register loads it may not make as big a difference as when it’s made up of context
> switches or packet transmissions, but the principle still holds. The question is “How
> high is high?”, or “How horrible is THAT horrible?”

We’re talking about several orders of magnitude. Bandwidth of raw local
message passing varies from single megabytes per sec to single gigabytes
per sec (physical bandwidth limit, essentially), depending on message
size. I’d say that price is pretty damn horrible.

– igor
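
To put rough numbers on the setup-cost model above (the two cost figures below are invented for illustration, not measurements from this thread), batching more bytes under one per-transaction cost moves throughput by orders of magnitude:

#include <stdio.h>

int main(void)
{
    const double t_trans = 100e-6;  /* made-up 100 us setup per transaction */
    const double t_byte  = 0.1e-6;  /* made-up 0.1 us per byte transferred  */
    const int sizes[] = { 1, 64, 1024, 65536 };
    size_t i;

    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        double total = t_trans + sizes[i] * t_byte;  /* one transaction */
        printf("%6d bytes/transaction -> %10.1f KB/s\n",
               sizes[i], (sizes[i] / total) / 1024.0);
    }
    return 0;
}

With these numbers, 1 byte per transaction gives about 10 KB/s, while 64 KB per transaction approaches the 10 MB/s asymptote: three orders of magnitude, which is the sort of price Igor is describing.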

> What function are you using to measure time, and what is your CPU?

I’ve tried this on a number of CPUs ranging from a 1500 MHz AMD to a
300 MHz NEC Geode.

I measure the ISR time in the following way; it’s a little complicated!

Firstly, I have a protocol on my network driver that operates in a
polling mode. The master node will send a “ping”-type packet to the
remote node (about 60 bytes of data). The remote node will then respond
to the ping, again with 60 bytes of data.

My timing procedure is as follows:

  1. Record the system time just before transmit, using clock_gettime().
  2. Send the packet (this happens instantaneously, as it is already
     loaded into the hardware).
  3. Wait for the “ping ACK” from the remote station in the software ISR
     and record the system time, again using clock_gettime().

What I should then have is the inherent driver delay + 2 × the
ISR-to-SW-ISR delay (RX on remote + RX on master).


The inherent hardware (driver) delay is around 250 us per end. Therefore
I would expect the ACK packet to arrive (assuming 0 us of OS ISR delay)
at around 500 us. I appreciate that this is not going to happen, but I
did not expect to receive the packet 2.5 ms later.

By adjusting the tick time I can change the delay to (2 × tick time +
500 us), which leads me to the conclusion that the SW-ISR message
delivery is directly related to the system tick timer.


I also have another version of this driver, where the remote end does
all the processing within the hardware ISR. In this situation the return
message arrives at the master end at (1 × tick time + 500 us), which
again confirms that the SW-ISR message delivery is related to the tick timer.


Hope this helps

Dave

Isn’t clock_gettime limited to TICK resolution? If so, then your
measurements, but not necessarily the intervals you’re measuring, are
directly related to the TICK size.

Murf


Exactly.

If you want something finer, you might look at
(ClockCycles() - ClockCycles()) / SYSPAGE_ENTRY(qtime)->cycles_per_sec
(assuming you’re running on a platform with an RDTSC-like instruction).

See docs for ClockCycles() for more info.

-seanb
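
A minimal sketch of that technique, following the formula above (ClockCycles() reads a free-running counter, the TSC on x86, and cycles_per_sec comes from the system page; see the ClockCycles() docs for per-platform caveats):

#include <stdio.h>
#include <stdint.h>
#include <sys/neutrino.h>
#include <sys/syspage.h>

int main(void)
{
    uint64_t start, stop;
    uint64_t cps = SYSPAGE_ENTRY(qtime)->cycles_per_sec;
    double elapsed;

    start = ClockCycles();
    /* ... code under test goes here ... */
    stop = ClockCycles();

    /* Convert raw cycles to seconds using the system page frequency. */
    elapsed = (double)(stop - start) / (double)cps;
    printf("elapsed: %.3f us\n", elapsed * 1e6);
    return 0;
}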


Igor Kovalenko <kovalenko@attbi.com> wrote:

> With QNX6 it is not that simple. One kernel path (in the case of using
> InterruptAttachEvent) is servicing interrupts, sending pulses to
> registered threads. A pulse has to be queued by that path (producer) and
> that queue must be drained by another kernel path (consumer),
> delivering the payload to waiting processes and triggering the
> scheduler. The question is when that will happen. The consumer path
> can’t be entered directly from the producer path, since they have to be
> asynchronous and the queue is unbounded. Which means some other event
> must cause the kernel to enter the consumer/scheduler path. And what
> kind of event could it be? I am speculating; it could be the next clock
> tick :-)

Nope. Event delivery & scheduling decisions caused by the event will
happen before code from any user level thread is executed - non-SMP.
In the SMP case things are a little more complicated, but the event
delivery and rescheduling will certainly take place before the next
clock tick.

I think at least part of the problem is what was pointed out earlier.
Using clock_gettime() won’t give any better resolution than a clock
tick. Rerunning the test and employing ClockCycles() to figure the
intervals might give a better handle on what’s going on.


Brian Stecher (bstecher@qnx.com) QNX Software Systems, Ltd.
phone: +1 (613) 591-0931 (voice) 175 Terence Matthews Cr.
+1 (613) 591-3579 (fax) Kanata, Ontario, Canada K2M 1W8

Igor Kovalenko <kovalenko@attbi.com> wrote in message
news:3E3A1B17.6060409@attbi.com


Igor, no, really.

First, a pulse is not ALWAYS queued. A pulse will only be queued IF the
target channel has no receiving thread. If there is a thread waiting, the
pulse is delivered immediately (data transferred immediately), and the
thread is made ready (at the pulse priority) and scheduled to run (once
the kernel interrupt handler returns). So if he has a high-priority pulse,
his receiving thread should be scheduled immediately. That’s why I asked
what priority he used in his pulse.

Second, if there is no thread receive-blocked on the channel, pulses will
be queued based on pulse priority, which means the highest-priority pulse
is delivered FIRST. Another thing we do is boost the server threads that
are servicing the channel so that no priority inversion can happen.

So my suspicion is that either he is using a “normal priority” pulse,
which causes his thread to take time to be scheduled to RUN, or, as
others have pointed out, his time measurement is tick-bounded.

-xtang


Mario Charest <postmaster@127.0.0.1> wrote:

> “Dave Edwards” <Dave.edwards@abicom-international.com> wrote in message
> news:3E3A51AF.8070603@abicom-international.com…
>
>> What function are you using to measure time, and what is your CPU?
>>
>> 1. Record the system time just before transmit, using clock_gettime().
>
> Ouch, clock_gettime() is limited by tick size resolution. That is what I
> thought you were doing. That’s a good sign for me; my confidence level
> in the OS is increasing, as I assumed the OS was doing the right thing and
> you weren’t ;-) As another poster said, use ClockCycles(), but beware that on
> some processors (<= 486) ClockCycles() is also limited by tick size resolution,
> as these CPUs offer no hardware clock counter.

[slightly off topic]
I thought that on the 486 it was read in from the 8253/8254 chip, so that
it was more accurate than the tick size resolution (but still less accurate than
the “real” rdtsc opcode)… Am I confused?

Cheers,
-RK


Robert Krten, PARSE Software Devices +1 613 599 8316.
Realtime Systems Architecture, Books, Video-based and Instructor-led
Training and Consulting at www.parse.com.
Email my initials at parse dot com.


Robert Krten <nospam84@parse.com> wrote:

> [slightly off topic]
> I thought that on the 486 it was read in from the 8253/8254 chip, so that
> it was more accurate than the tick size resolution (but still less accurate
> than the “real” rdtsc opcode)… Am I confused?

You’re not confused (at least about this :-) ). Interpolating between
the ticks for RDTSC emulation is the one reason the timer_value kernel
callout exists.


Brian Stecher (bstecher@qnx.com) QNX Software Systems, Ltd.
phone: +1 (613) 591-0931 (voice) 175 Terence Matthews Cr.
+1 (613) 591-3579 (fax) Kanata, Ontario, Canada K2M 1W8

Arrrrrrrgh!

Well, that about messes up all the timing statements that I’ve made to date!

I still have a performance issue, but I’ll now have to go over the
system again and redo the timing!

Thanks for the info; any chance that the manual could be updated?


Dave




Mario Charest <postmaster@127.0.0.1> wrote:

Let’s not write off what he’s done so quickly.

So the results are limited to timer tick size resolution. So what?

If we assume that in reality the ISR is performing better than he’s
seeing, then most of his measurements should show 0 ms, or occasionally
1 ms if the actual time was, say, 980 us past the last clock tick. But
he’s seeing 2 ms! In fact, he said 2.5 (of course I don’t know how he’s
measuring the .5 part!). So in reality there must be a latency of more
than 1 full ms.


Bill Caroselli – Q-TPS Consulting
1-(626) 824-7983
qtps@earthlink.net

“Brian Stecher” <bstecher@qnx.com> wrote in message
news:b1e2f7$s9t$1@nntp.qnx.com


> Nope. Event delivery & scheduling decisions caused by the event will
> happen before code from any user level thread is executed - non-SMP.
> In the SMP case things are a little more complicated, but the event
> delivery and rescheduling will certainly take place before the next
> clock tick.

What if an RR thread has its timeslice about to expire? To hold the above
statement true (and not postpone the scheduling) you have to be able to
complete delivery of ALL queued pulses with priority >= that thread’s
within a timeframe less than 1 tick long. I am very curious how that is
done. From a purely logical perspective, how do you maintain a queue of
asynchronous events without running the producer and consumer
asynchronously (that is, at a potentially different pace)?

The queue CAN be indefinitely long, right? In fact it could be full of
pulses with even higher priority already, before the pulse in question is
sent. And if you were delivering ALL eligible pulses before the next tick,
that COULD potentially take an indefinitely long time? Then there has to
be a bound on how many events can be delivered ‘per pass’. If there is,
then there has got to be a potential latency, at least in the worst case.
For some people I know, the word ‘queue’ is just another word for
‘latency’. I don’t like poking in the dark like this, but the docs are
rather scarce on the subject.

> I think at least part of the problem is what was pointed out earlier.
> Using clock_gettime() won’t give any better resolution than a clock
> tick. Rerunning the test and employing ClockCycles() to figure the
> intervals might give a better handle on what’s going on.

Indeed. But I am still dying to know the answers to the above :-)

Regards,
– igor

Gents,

I’m seeing a large latency within the system. The question now is: where
is it coming from? It would appear that my initial statement about the
delay being caused by the ISR is flawed, so I’m going to have to
investigate the timing using the approach described previously.

I’m going to rerun the ISR timing tests this week and post my results.
Hopefully these will clear up the issues.

Watch this space

Dave


BTW, the 0.5 ms latency is due to the physical time taken to write data
into the NIC and then initiate transmit. This is a wireless device
without a DMA facility, and hence the data has to be written (and read)
using the processor.


