Hello. My company has been struggling with a problem for months now,
and we have gotten nowhere. We are seeing data transfer times and
latencies much greater than we expected. Although the problem is fairly
simple in concept, it’s hard to describe. I’m including an extensive
description below of exactly what’s happening and what we’ve already
tried.
First, some background. We are developing a driving simulator
application. An STS (student training station) consists of three QNX
nodes and associated support computers (which vary in number, but
perform such operations as controlling the steering feedback motor and
generating the graphics). There is also an IOS (instructor/operator
station), which is a Windows NT (not my fault) machine. A group of four
STS’s and one IOS is known as a pod. A 100 mbit ethernet network
connects each of the top-level STS nodes and the IOS machine through a
hub; that is, one wire runs from the hub to each STS and one to the
IOS.
STS’s are capable of running network scenarios, in which the involved
STS’s can fully interact with one another. The STS’s share data using
UDP multicast.
Each STS generates about 3.5 kbytes of data per 30 Hz frame; this is
roughly 100 kbytes per second per STS. Even with four STS’s involved,
this is still under half a megabyte per second; this is relatively
insignificant compared to the theoretical throughput of a 100 mbit
ethernet. There is a certain amount of other traffic: some data is
going to the IOS via UDP unicast to drive gauge displays (speedometer,
etc), and control data also uses this network. In total, however, this
additional data does not approach the UDP multicast in volume. Assuming
it is an additional quarter megabyte per second, the combined traffic
amounts to about 6% of the available bandwidth.
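For anyone who wants to check the arithmetic, here is a rough sketch of the utilization estimate. It counts payload bytes only (no UDP/IP/ethernet headers), takes "kbyte" to mean 1000 bytes, and uses the quarter megabyte per second guess for the other traffic; the names are made up for illustration.

```python
# Figures taken from the description above; "kbyte" is assumed to be 1000 bytes.
BYTES_PER_FRAME = 3500           # data generated by one STS per frame
FRAME_RATE_HZ = 30
NUM_STS = 4
OTHER_BYTES_PER_SEC = 250_000    # assumed "quarter megabyte" of unicast/control data
LINK_BITS_PER_SEC = 100_000_000  # 100 mbit ethernet


def utilization_percent(bytes_per_sec, link_bits_per_sec=LINK_BITS_PER_SEC):
    """Share of the link's raw bit rate used by a given byte rate (payload only)."""
    return 100.0 * bytes_per_sec * 8 / link_bits_per_sec


multicast_bytes_per_sec = BYTES_PER_FRAME * FRAME_RATE_HZ * NUM_STS
total_bytes_per_sec = multicast_bytes_per_sec + OTHER_BYTES_PER_SEC

print(f"multicast only:     {utilization_percent(multicast_bytes_per_sec):.1f}%")
print(f"with other traffic: {utilization_percent(total_bytes_per_sec):.1f}%")
```

Header overhead would push both figures up somewhat, which is why the numbers quoted in this message are rounded upward.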
This data is bursty; it tends to all be sent at once, in 30 Hz bursts.
This will lead to collisions, but our understanding is that collisions
alone could not explain the slowdowns.
The hardware we’re using, as well as what we’ve already tried and what
is readily available for additional tests, is listed at the end of this
message.
The actual time needed for transfers varies hugely from one run to
another. For a four-player network game, the network data transfer
takes anywhere from 6 to 25 milliseconds; the average seems to be about
8-12 ms. Our back-of-the-envelope calculations led us to believe that
it should take only about 2-4 ms. The time is at its worst right after
rebooting the computers, then improves after each run until it
stabilizes at some value, which varies but is generally in the 8-12 ms
range mentioned above.
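As a sanity check on the back-of-the-envelope figure, here is a sketch of the raw wire time for one burst. It is only an approximation under stated assumptions: each STS's 3500 bytes are sent as UDP datagrams that each fit in one standard 1500-byte ethernet frame (how the data is actually packetized is not described here), IP headers carry no options, and collisions, backoff, and driver or stack latency are ignored. The function name is hypothetical.

```python
import math

# Standard per-frame ethernet overhead in bytes:
# preamble + start delimiter (8), header (14), FCS (4), interframe gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
IP_UDP_HEADERS = 20 + 8   # IPv4 header without options + UDP header
MTU = 1500                # assumed standard ethernet MTU


def burst_wire_time_ms(payload_bytes, senders, link_bits_per_sec):
    """Time on the wire for every sender to transmit one frame's worth of data."""
    max_payload = MTU - IP_UDP_HEADERS              # UDP payload per ethernet frame
    packets = math.ceil(payload_bytes / max_payload)
    wire_bytes = payload_bytes + packets * (IP_UDP_HEADERS + ETH_OVERHEAD)
    return 1000.0 * senders * wire_bytes * 8 / link_bits_per_sec


print(f"{burst_wire_time_ms(3500, 4, 100_000_000):.2f} ms")
```

Under these assumptions the four bursts occupy roughly 1.2 ms of wire time, so an estimate of 2-4 ms already leaves some room for software overhead and a few collisions.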
This is where it gets weird. Despite the fact that everything is purely
100 mbit, the hub shows very high (~40%) 10 mbit usage. This correlates
with the multicast traffic; if I disable the multicast, the 10 mbit
activity goes away. We have tried replacing the 10/100 hub with a purely
100 mbit hub, and the app behaves the same as it does with the 10/100
hub, so the 10 mbit activity is presumably a firmware bug in the hub.
3Com insists that it couldn’t be. Despite this evidence, it still seems
suspicious to me that the 10 mbit activity (~40%) corresponds so closely
to the predicted 100 mbit activity (~4%).
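To put numbers on that suspicion, here is a small sketch comparing what the same multicast payload would look like at 100 mbit and at 10 mbit. It counts payload bytes only, takes "kbyte" as 1000 bytes, and ignores headers and collisions; the names are made up for illustration.

```python
PAYLOAD_PER_BURST = 3500 * 4   # bytes: four STS's, one frame's worth each
BURSTS_PER_SEC = 30            # 30 Hz frame rate


def link_figures(link_bits_per_sec):
    """Return (milliseconds per burst, percent utilization) for a given link speed."""
    burst_ms = 1000.0 * PAYLOAD_PER_BURST * 8 / link_bits_per_sec
    utilization = 100.0 * PAYLOAD_PER_BURST * BURSTS_PER_SEC * 8 / link_bits_per_sec
    return burst_ms, utilization


for mbit in (100, 10):
    burst_ms, utilization = link_figures(mbit * 1_000_000)
    print(f"{mbit:>3} mbit: {burst_ms:.2f} ms per burst, {utilization:.1f}% utilization")
```

At 10 mbit this works out to about 11 ms per burst and about 34% utilization, which is in the neighborhood of both the 8-12 ms we measure and the ~40% the hub reports. That may well be coincidence, and the test with the purely 100 mbit hub argues against it, but it is part of why the 10 mbit reading still looks suspicious.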
We have also tried replacing the Corman cards with 3Com cards, which use
a different chip and a different driver. The 3Com cards perform worse
than the Cormans (not surprising, based on previous experience with 3Com
products, but we thought it was worth testing).
Hopefully at least a few people understood that description.
In short, my questions are:
- Is anyone else using UDP multicast extensively, and if so, have you
  noticed any slowdowns?
- What else could we look at? We’ve tried everything we could think
  of.
Thanks. Any and all suggestions are appreciated; we’re stumped.
Josh Hamacher and Dean Douthat
FAAC Incorporated
Current Configuration:
  Software:
    QNX OS 4.25D
    Tcpip 5.00X
    Net.ct100tx 4.25C
  Hardware:
    Corman FE-120 Network Cards
    All involved computers are roughly 800 MHz Pentium III’s with plenty
    of RAM.
Also Tested:
  Software:
    Net.ct100tx 4.25E
  Hardware:
    3Com 3C905B-TX-NM
Available for testing:
  Software:
    QNX OS 4.25E
  Hardware:
    Corman FE-122 Network Cards