Net -t option?

Brown_Richard · December 1, 2000, 12:17pm

From the use message, this is my understanding of how the -t option works:

If a node determines that it cannot reach another node on a specific LAN, it
associates a timestamp with that entry in its table. From that point on, the
failed node/LAN combination will not be retried for the number of seconds
given by the -t option, unless what?

I assume that it will re-enable the node/LAN combination if it gets a packet
from it. Is this correct?

What happens if there are 2 LANs? For simplicity, let’s assume 2 nodes, 2
hubs, the default -t option (40 seconds), and that both nodes initiate some
communication with each other. What happens if I power down the LAN1 hub and
allow each node to see that LAN1 has failed, then power it back on at 10
seconds and power down the LAN2 hub at 20 seconds? Will Net wait the 20
remaining seconds before attempting communication over LAN1?

Richard

Brown_Richard · December 7, 2000, 11:43am

Is anyone going to respond to this?

Bill_at_Sierra_Desig · December 12, 2000, 12:09am

Hi Richard,

I assume that you are talking about a fully redundant network, i.e. several
CPUs each connected to two different LANs. If my assumption is not correct,
then everything that I am about to say does not apply.

I worked for 6 years in broadcasting. We sent audio over Ethernet. We were
able to start an audio file playing on a diskless workstation, i.e. the DSP
was on a diskless CPU and the software was reading the audio file across the
network. When we used dual LANs (customer’s choice, based on cost), we were
able to unplug an Ethernet and the audio would just keep right on playing
without a single hiccup. We could then plug in that Ethernet and unplug the
other Ethernet and the audio would still keep right on playing. You could
toggle back and forth with no problem. Now here’s the good part. You could
unplug BOTH Ethernets and the audio would still keep playing. The trick was
twofold.

We used a three-second buffer, so one of the Ethernets had to be plugged
back in within 3 seconds. But the more important trick was that the
Net.ether* driver settings had to be changed.

The defaults for -t (number of 50 ms periods before a timeout, default = 20)
and -n (number of retries, default = 3) were way too large for us. We used
-t2 -n1.

Here’s what happens under the hood.

Some program sends a message to Net to send to some other CPU. Net decides
what paths are available to get that message from the local node to the
other node. It then passes the message to one of the Net.ether* drivers.
That driver begins trying to send the packet. After the -t timeout (N * 50
ms) it will retry, up to the -n number of retries. Not until all of this
fails will that driver go back to Net and say “I can’t get this packet out”.
If Net has an alternate path, it will try that one (repeating the above
procedure). Not until all paths have failed will Net claim that it can’t
transmit the packet.
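
The sequence described above can be sketched as a small model. This is only an illustration of the description in this post, not Net's actual code: the function names are made up, and it assumes that "retries" means one initial attempt plus up to -n further attempts, which the thread does not confirm.

```python
from typing import Callable, Optional, Sequence


def driver_send(attempt: Callable[[], bool], retries: int) -> bool:
    """Model of one driver: an initial attempt plus up to `retries` more.

    Assumption: the first send is not counted as a retry.
    """
    for _ in range(1 + retries):
        if attempt():  # True stands for "an ACK arrived before the timeout"
            return True
    return False  # the driver reports "I can't get this packet out"


def net_send(paths: Sequence[str],
             attempt_on: Callable[[str], bool],
             retries: int) -> Optional[str]:
    """Model of Net: try each path in turn, return the LAN that worked."""
    for lan in paths:
        if driver_send(lambda: attempt_on(lan), retries):
            return lan
    return None  # every path failed, so the packet cannot be transmitted


# Usage: LAN1 is unplugged, LAN2 is healthy, driver loaded with -n1.
attempts = []


def fake_attempt(lan: str) -> bool:
    attempts.append(lan)
    return lan == "LAN2"


result = net_send(["LAN1", "LAN2"], fake_attempt, retries=1)
print(result, attempts)
```

In this model the second LAN is not touched until the first driver has used up all of its attempts, which is why the per-driver timeout and retry count dominate the failover time.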

The default values are good and necessary for single-LAN setups (which are
the majority of cases), but with redundant LANs it just takes too damn long
to time out and try the alternate path.
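
To put rough numbers on "too long", here is a back-of-the-envelope calculation. It is a sketch under assumptions: the 50 ms period and the defaults come from this post, but whether the first send counts toward -n is not stated, so the helper (a hypothetical name) takes that as a parameter.

```python
TICK_MS = 50  # per this thread, the driver's -t counts 50 ms periods


def driver_give_up_ms(t_periods: int, n_retries: int,
                      initial_attempt: bool = True) -> int:
    """Worst-case time before one driver gives up on a packet.

    Assumption: every attempt waits the full timeout. With
    initial_attempt=True there is one send plus n_retries retries.
    """
    attempts = n_retries + (1 if initial_attempt else 0)
    return t_periods * TICK_MS * attempts


default_ms = driver_give_up_ms(20, 3)  # driver defaults: -t20 -n3
tuned_ms = driver_give_up_ms(2, 1)     # values used in this post: -t2 -n1
print(default_ms, tuned_ms)            # 4000 200
```

Under this model the defaults take 3 to 4 seconds to give up on a dead LAN, depending on how the first attempt is counted, which is at or beyond the three-second audio buffer mentioned earlier. With -t2 -n1 the same failure is detected in 100 to 200 ms.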

So, what are the consequences? If the -t value is too small, the driver may
think that it can’t get the packet out when in fact it was successful and it
just didn’t get an ACK back yet. Or it may give up on a given LAN
prematurely. Obviously, for this to work, you need very clean LANs.
Otherwise you will get network failures all over the place.

BTW, if the receiving QNX node receives the same packet more than once, it
does handle it gracefully. It will log an error and do the right thing.

One more point. The biggest villain on Ethernet is collisions. We found
that if you have a file server and a bunch of workstations, you can greatly
reduce collisions by lying to the Ethernet drivers. The trick was that we
wanted all traffic going from the file server on one LAN and all of the
traffic going to the server on the other LAN. The drivers have a -r
MediaRate option. Net will choose the faster Net.* driver based on the
media rate BUT ONLY if one is 10 times faster than the other. So, on the
file server the Net drivers are loaded as follows:
Net.etherwhatever -l 1 -r 1000000 &
Net.etherwhatever -l 2 -r 100000 &
And on each of the workstations it is reversed:
Net.etherwhatever -l 1 -r 100000 &
Net.etherwhatever -l 2 -r 1000000 &
Otherwise, almost all of your traffic will go over the first LAN in both
directions, as indicated by the ‘netinfo -l’ statistics.
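
The selection rule described here can be written out as a tiny model. This is a sketch of the rule as stated in this post, not of Net's real implementation: the function name is hypothetical, and falling back to the lowest-numbered LAN is a simplification of "almost all of your traffic will go over the first LAN".

```python
from typing import Dict


def preferred_lan(rates: Dict[int, int]) -> int:
    """Pick the LAN a node sends on, given {logical LAN: advertised -r rate}.

    The fastest LAN wins only if it is at least 10 times faster than every
    other LAN; otherwise the first (lowest-numbered) LAN is assumed.
    """
    fastest = max(rates, key=lambda lan: rates[lan])
    others = [rate for lan, rate in rates.items() if lan != fastest]
    if others and all(rates[fastest] >= 10 * rate for rate in others):
        return fastest
    return min(rates)


server = preferred_lan({1: 1000000, 2: 100000})       # file server settings
workstation = preferred_lan({1: 100000, 2: 1000000})  # workstation settings
honest = preferred_lan({1: 1000000, 2: 1000000})      # same rate on both
print(server, workstation, honest)                    # 1 2 1
```

With the reversed -r values the server sends on LAN 1 and the workstations send on LAN 2, so the two directions no longer compete on the same wire. With equal rates both sides end up on LAN 1.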

Hope some of this helps.


Brown_Richard · December 12, 2000, 12:43pm

Bill,

Thanks for your reply. We do have a fully redundant LAN. Every machine has 2
cormans: one corman from each machine is plugged into one switch and the
other into another switch.

We use -t2 -n2 on the Net.**** drivers, but I was wondering how the -t option
to Net (NOT Net.****) affects things. Does it remember when a path is bad
and not retry that path until the -t seconds have expired? From your posting
it appears that Net tried the failed path before the timeout expired. I
wonder if there is an exception so that when 0 good paths are available it
will ignore the timeout (which would make sense).

With respect to the -r option: I was not aware of how Net chooses which LAN
to use. This is good to know. I’ll tuck it away for future consideration.

Cheers,


Bill_at_Sierra_Desig · December 12, 2000, 5:42pm

OK.

I have never used the Net -t option. ‘use Net’ wasn’t a lot of help to me, but
the helpviewer page on Net provided more information.

Apparently, once Net discovers that a driver has failed to send a packet it
won’t even try to use that driver again for -t N seconds. Also, this only
applies if there are multiple driver options. This could be useful to know.
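
Applied to the two-hub scenario from the first post, that rule can be modelled as below. This is a hedged sketch, not Net's code: the class and its names are invented for illustration, and the `retry_if_all_failed` switch covers the exception Richard speculates about, which nobody in this thread confirms.

```python
from typing import Dict, List, Optional


class PathTable:
    """Toy model of Net's -t hold-down as described in this thread."""

    def __init__(self, lans: List[str], holddown_s: float,
                 retry_if_all_failed: bool) -> None:
        self.lans = lans
        self.holddown_s = holddown_s
        self.retry_if_all_failed = retry_if_all_failed  # unconfirmed behavior
        self.failed_at: Dict[str, float] = {}

    def mark_failed(self, lan: str, now: float) -> None:
        self.failed_at[lan] = now

    def usable(self, lan: str, now: float) -> bool:
        t = self.failed_at.get(lan)
        return t is None or now - t >= self.holddown_s

    def choose(self, now: float) -> Optional[str]:
        if len(self.lans) == 1:
            return self.lans[0]  # hold-down only applies with several drivers
        for lan in self.lans:
            if self.usable(lan, now):
                return lan
        if self.retry_if_all_failed:
            # Assumed exception: retry the path that failed longest ago.
            return min(self.lans, key=lambda lan: self.failed_at[lan])
        return None


# Richard's scenario with -t 40: LAN1 fails at 0 s, LAN2 fails at 20 s.
strict = PathTable(["LAN1", "LAN2"], 40, retry_if_all_failed=False)
strict.mark_failed("LAN1", 0)
at_10 = strict.choose(10)  # LAN1 hub is back, but still held down
strict.mark_failed("LAN2", 20)
at_20 = strict.choose(20)
at_40 = strict.choose(40)
print(at_10, at_20, at_40)  # LAN2 None LAN1
```

Under the strict reading, nothing is sent between 20 and 40 seconds even though the LAN1 hub has been back since 10 seconds. Setting `retry_if_all_failed=True` makes the model return LAN1 at 20 seconds instead. Which of the two Net actually does is the open question in this thread.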
