Ian Zagorskih <ianzag@megasignal.com> wrote:
We have a Fast Ethernet network with 32 PCs running QNX 4.25E.
26 nodes are using RTL8139-based NICs (Net.rtl) while 6 nodes are using
Intel 82559-based NICs (Net.ether82552).
The network is completely switched (i.e. no collisions) and it
is normally moderately fast. Copying a big file between two nodes
using “cp -V” gives a transfer rate of about 1300 Kb/sec (almost
all CPUs are old Pentium 90 or 120 MHz).
The problem is that when we reboot one or two nodes, sometimes
the whole network becomes very slow. The “sin net” command takes 10 to
20 seconds to complete (while it normally takes about 1-2 sec);
copying the same file as above results in a transfer rate of about 150-200
Kb/sec.
Even commands affecting a single node (such as “sin args”) are
slowed down.
The only way I found to bring the network back to a sane state is
to reboot all the nodes.
Are you running a nameloc process on one or more nodes? How is node mapping done
I’m running 10 nameloc processes, on nodes 7, 9, 11, 13, 16, 19, 21, 25, 30 and 33
(this is because the network is very sparse, with groups of 2-4 PCs connected
by long fiber-optic (F.O.) links, and I want every group to keep working even
if the F.O. links go down, so I have a nameloc for every group).
Node numbers range from 1 to 35. Numbers 1, 2 and 3 are not used; they are
reserved for connecting notebooks when I need to. 1 and 2 are masked, while 3
is simply deleted from the netmap, so I can connect a notebook numbered 3 from
any place on the network and have it automatically inserted in the netmap.
Every nameloc uses the same ‘-e 35’ argument to avoid polling nodes
beyond node 35 (I have other licenses installed which I don’t use at the
moment).
Node mapping is done statically using /etc/config/netmap which is the same
on all nodes.
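To make the layout above easier to check, here is a minimal Python sketch of the node bookkeeping. It is only an illustration: it does not talk to QNX, the function names are hypothetical, and it assumes every node number from 4 to 35 is assigned to a PC (which is consistent with the 32 PCs mentioned earlier).

```python
# Node layout as described in this post (QNX 4 logical node numbers).
MAX_NODE = 35             # matches the '-e 35' argument given to every nameloc
RESERVED = {1, 2, 3}      # kept free for notebooks (1 and 2 masked, 3 not in the netmap)
NAMELOC_NODES = [7, 9, 11, 13, 16, 19, 21, 25, 30, 33]


def active_nodes(max_node=MAX_NODE, reserved=RESERVED):
    """Return the node numbers permanently assigned to a PC.

    Assumption: every number from 1 to max_node that is not reserved is in use.
    """
    return [n for n in range(1, max_node + 1) if n not in reserved]


def nameloc_status(powered_off):
    """Split the nameloc nodes into (still up, switched off)."""
    off = set(powered_off)
    up = [n for n in NAMELOC_NODES if n not in off]
    down = [n for n in NAMELOC_NODES if n in off]
    return up, down


if __name__ == "__main__":
    nodes = active_nodes()
    print(len(nodes), "active nodes,", len(NAMELOC_NODES), "of them running nameloc")
    # Hypothetical maintenance case: nodes 9 and 12 are switched off.
    up, down = nameloc_status([9, 12])
    print("nameloc still up on:", up)
    print("nameloc switched off on:", down)
```

A helper like `nameloc_status` could be used to note, for each maintenance shutdown, whether a nameloc node was among the machines switched off, which is the correlation suspected further down.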
on each node? Is there some primary “server” node (usually node 1) which
never reboots?
All nodes are up 24 hours a day (it’s a supervisory system). When one or
two nodes are switched off for maintenance, the problem sometimes
arises, and the slowdown persists even after the missing
nodes are up again.
Now that you make me think about it, perhaps the slowdown is
more likely to happen if one of the switched-off nodes was running a
nameloc, but I’m not sure.
Thanks,
Alessandro
Gemmo Impianti S.p.A. |
Divisione Sistemi     | Alessandro Sala
Viale Tunisia, 39     | Responsabile Software e Sistemi
20124 Milano          |