Sorry to keep bothering you guys regarding this project, but my boss is
adamant that we get consistent packet round-trip times of under 100
microseconds, and we seem to be inching closer…
For those coming in late, here’s what we’ve got:
We have two programs: a background program called DQE, and a simple
text-mode interactive program called UA. The user can use the UA to
send a message (via MsgSend()) telling the DQE to send a packet over
the network. The DQE on the other machine receives the packet and
sends it right back. When the first machine’s DQE sends the packet, it
stores a time value (from ClockCycles()) in the packet, so that when
the packet comes back it can calculate the round-trip time.
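Roughly, the timing side of the DQE looks like this (a simplified
sketch, not our actual code; cycles_to_usec is just an illustrative
helper name):

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/neutrino.h>   /* ClockCycles() */
    #include <sys/syspage.h>    /* SYSPAGE_ENTRY(qtime) */

    /* Convert a ClockCycles() delta to microseconds using the
     * calibrated rate of the free-running counter. */
    static double cycles_to_usec(uint64_t cycles)
    {
        return (double)cycles * 1e6 /
               (double)SYSPAGE_ENTRY(qtime)->cycles_per_sec;
    }

    int main(void)
    {
        uint64_t t_send = ClockCycles(); /* stored in the outgoing packet */
        /* ... frame goes out, the peer echoes it back ... */
        uint64_t t_recv = ClockCycles(); /* taken when the echo arrives */

        printf("round trip: %.1f usec\n", cycles_to_usec(t_recv - t_send));
        return 0;
    }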
The network consists of just these two computers, with Netgear 83815
Ethernet cards connected directly by a crossover cable, so there is no
other network traffic to interfere. The networking code just sends
raw Ethernet frames with our desired data (e.g. the clock time) in the
payload; no TCP/IP or UDP or anything.
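For what it’s worth, the frame is laid out roughly like this (the field
names and the 0x88B5 “local experimental” EtherType are illustrative,
not what our actual code uses):

    #include <stdint.h>

    #define ETH_ALEN 6

    struct timing_frame {
        uint8_t  dst[ETH_ALEN];   /* peer's MAC address */
        uint8_t  src[ETH_ALEN];   /* our MAC address */
        uint16_t ether_type;      /* e.g. 0x88B5 (local experimental) */
        uint64_t send_cycles;     /* ClockCycles() value at transmit time */
        uint8_t  pad[46 - sizeof(uint64_t)]; /* fill to the 46-byte minimum payload */
    } __attribute__((packed));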
As I said, the UA can send a “send a packet” command message to the
DQE. But the user can also tell the UA to send 10,000 “send a packet”
messages in a tight loop (with a nap(1) after each one). The UA can also
send a “print timing stats” message to the DQE, which will then print
out min, max, and average packet trip times, and a simple histogram.
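The burst side of the UA is essentially this (a sketch under a couple
of assumptions: the channel name "dqe", the use of name_open(), and the
CMD_SEND_PACKET code are made up for illustration, and nanosleep()
stands in for the nap(1)):

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>
    #include <sys/neutrino.h>   /* MsgSend() */
    #include <sys/dispatch.h>   /* name_open(), name_close() */

    #define CMD_SEND_PACKET 1   /* hypothetical command code */

    int main(void)
    {
        int coid = name_open("dqe", 0); /* assumes the DQE did name_attach("dqe") */
        if (coid == -1) {
            perror("name_open");
            return 1;
        }

        uint16_t cmd = CMD_SEND_PACKET;
        struct timespec pause = { 0, 1000000 }; /* ~1 ms, like the nap(1) */

        for (int i = 0; i < 10000; i++) {
            /* Ask the DQE to send one packet; no reply data expected. */
            if (MsgSend(coid, &cmd, sizeof(cmd), NULL, 0) == -1) {
                perror("MsgSend");
                break;
            }
            nanosleep(&pause, NULL);
        }

        name_close(coid);
        return 0;
    }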
Thanks to the wonderful people on this newsgroup, we have learned that
running the DQE at a higher priority than the UA (and pretty much
everything else) gives us good packet times. When neither machine is
running the Photon environment, we consistently get round-trip times of
less than 100 microseconds. However, a single packet sent on its own
often measures longer than the average over 10,000 packets (~84 usec
vs. ~62 usec; before we raised the DQE’s priority the gap was much
worse, with a single packet often taking ~118 usec or more). Remember,
the loop is in the UA, not the DQE; the UA sends 10,000 “send a packet”
requests.
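For reference, the priority bump amounts to something like this (the
value 60 is just an example, not our exact number, and high priorities
may require running as root):

    #include <stdio.h>
    #include <string.h>
    #include <pthread.h>
    #include <sched.h>

    int main(void)
    {
        struct sched_param param;
        param.sched_priority = 60;  /* illustrative; above Photon, shells, etc. */

        /* Raise this thread to FIFO scheduling at the chosen priority. */
        int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
        if (rc != 0) {
            fprintf(stderr, "pthread_setschedparam: %s\n", strerror(rc));
            return 1;
        }
        /* ... the DQE main loop would run here at the elevated priority ... */
        return 0;
    }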
Now, my boss insists that since QNX supports “hard realtime” (which is
our goal), and since QNX strictly honors priority (no thread will ever
be preempted by a lower-priority thread), our results shouldn’t get
worse when we run these programs in a terminal window in the Photon
environment. But they do. So that’s what I’m banging my head on now:
how can we get a background task that sends and receives packets in
under 100 usec even while Photon is running and we’re working on other
things? If the DQE is the highest-priority task, nothing else should be
preempting it, right? Could it just be a matter of context-switch time?
Is my boss’s understanding of “realtime operating systems” correct in
this regard?
Perhaps some results will illustrate this better:
Typical results for 10,000 packets when running with Photon are:
Min: 62 usec. Max: 189 usec. Avg: 62 usec.
Histogram:
40-79 usec: 9456 packets
80-99 usec: 469 packets
100-199 usec: 76 packets
Typical results when running in a text-only environment on both
machines are:
Min: 62 usec. Max: 88 usec. Avg: 62 usec.
Histogram:
40-79 usec: 9999 packets
80-99 usec: 1 packets
100-199 usec: 0 packets