Question about the 'cp' command

I have an odd one here that maybe someone has seen. I have over a dozen systems running QNX 4.25 on an Ethernet network. We are using the Versalogic VSBC-6 266 MHz with 64 MB RAM; call them nodes 1 thru 12. The Ethernet is 10 Mbit on the SBC, but the network is linked to the rest of the office network, which is 10/100.

We are designing a distributed database system, so I’ve been testing some file copy times between the nodes using the ‘cp’ command. What I have is a case where on one node, and one node only, the ‘cp’ takes ten times as long as on any other node.

The file I’m copying is 1 MB in size. The average time to copy is 2 seconds, at about 400 Kbytes per second (according to the ‘cp -V’ option). On one node this time is as long as 40 seconds, at about 16 Kbytes per second.

The problem is when I copy from node 1 to node 6 only. Node 1 can copy to all other nodes at the faster rate. All other nodes can copy to node 6 at the faster rate. Node 6 can copy to all other nodes, even node 1, at the faster rate. It is only when copying from node 1 to node 6.

Here is another odd thing. If I use FTP to copy the file from node 1 to node 6 it transfers at about 400 Kbytes per second, so I don’t believe it is a network hardware issue.

The ‘cp’ command to node 6, and its result, looks like this:

cp -V /data/testfile //6/data/testfile
100.00% (1028/1028 kbytes, 22 kp/s)

To other nodes it averages this:

cp -V /data/testfile //4/data/testfile
100.00% (1028/1028 kbytes, 426 kp/s)

Here are the versions we are running:

PROGRAM NAME VERSION DATE
sys/Proc32 Proc 4.25L Feb 15 2001
sys/Proc32 Slib16 4.23G Oct 04 1996
sys/Slib32 Slib32 4.24B Aug 12 1997
/bin/Fsys Fsys32 4.24V Feb 18 2000
/bin/Fsys Floppy 4.24B Aug 19 1997
/bin/Fsys.eide eide 4.25A Feb 09 2000
//200/bin/Dev32 Dev32 4.23G Oct 04 1996
//200/bin/Dev32.ansi Dev32.ansi 4.23H Nov 21 1996
//200/bin/Dev32.ser Dev32.ser 4.23I Jun 27 1997
//200/bin/Dev32.par Dev32.par 4.25A Jan 08 2001
//200/bin/Dev32.pty Dev32.pty 4.23G Oct 04 1996
//200/bin/Iso9660fsys Iso9660fsys 4.23D Mar 20 2000
//200/bin/Pipe Pipe 4.23A Feb 26 1996
//200/bin/Net Net 4.25C Aug 30 1999
//200/bin/Net.tulip Net.tulip 4.25Q Aug 30 1999
//200//usr/ucb/Socket Socket 4.25C Aug 19 1998
//200/bin/Mqueue mqueue 4.24A Aug 30 1999
//200/bin/cron cron 4.23B Oct 30 1997
//200//bin/Photon Photon 1.14B Sep 03 1999
//200/*/bin/phfontpfr Photon Font 1.14H Jan 17 2001

Here are the results from ‘sin in’:

Node CPU Machine Speed Memory Ticksize Display Flags
2 686/687 PCI 34711 39567k/66711k 1.0ms VGA Color -3P±---------8P

Heapp Heapf Heapl Heapn Hands Names Sessions Procs Timers Nodes Virtual
0 0 29736 0 64 100 64 2000 125 266 27M/180M

Boot from Hard at Sep 24 14:39 Locators: 200

All systems are the same hardware and software.
Thanks in advance.

Ivan Bannon
RJG Inc.

“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9opruj$kkk$1@inn.qnx.com

The problem is when I copy from node 1 to node 6 only. Node 1 can copy to all other nodes at the faster rate. All other nodes can copy to node 6 at the faster rate. Node 6 can copy to all other nodes, even node 1, at the faster rate. It is only when copying from node 1 to node 6.

What does netinfo tell you for both nodes 1 and 6 during the file copy?


// wbr

Here is another odd thing. If I use FTP to copy the file from node 1 to node 6 it transfers at about 400 Kbytes per second, so I don’t believe it is a network hardware issue.

Not necessarily; TCP/IP is much better than FLEET at handling a bad network.

What if you swap the cable of node 6 with another node?
What if you swap the cable of node 1 with another node?

How many namelocs do you have running?

Are all netmaps the same?

Are all licenses the same?

netinfo shows no increase in either Tx or Rx errors, though there is an occasional Tx collision error, but I believe this is normal. We are running all the network drivers with the “-P” option.
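
(For reference, a minimal sketch of capturing those counters around a copy. It assumes “on -n <node>” to run a command on another node and that plain “netinfo” with no arguments dumps the stats; adjust to taste:

on -n 1 netinfo > /tmp/ni1.before
on -n 6 netinfo > /tmp/ni6.before
cp -V /data/testfile //6/data/testfile
on -n 1 netinfo > /tmp/ni1.after
on -n 6 netinfo > /tmp/ni6.after
diff /tmp/ni1.before /tmp/ni1.after
diff /tmp/ni6.before /tmp/ni6.after

Any counter that moves during a slow copy will show up in the diffs.)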


I have swapped the cables for both nodes 1 and 6. I have also swapped out the hub that they are connected to. The problem still remains. There are 2 namelocs running, on nodes 1 and 2. All netmaps are identical; some have a couple more nodes on them, but they are the same for nodes 1 thru 6. I will check the licenses.
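
(A rough sketch for double-checking that the netmaps really match, assuming again that “on -n <node>” works and that “netmap” with no arguments prints the live map:

on -n 1 netmap > /tmp/map1
on -n 6 netmap > /tmp/map6
diff /tmp/map1 /tmp/map6

An empty diff means the two live maps are identical.)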

We are looking at going to all TCP/IP for file transfers because of the routing situation, so this may be a moot problem for us. This type of problem just shakes my boss’s confidence in QNX a bit, since I and others convinced them to use it for this system.

Thanks for the quick feedback Mario.

Ivan


Mario,

I went and checked the nameloc situation. I forgot that we are running nameloc now on all nodes, but with the following command line:

nameloc -s$NODE -e$NODE

Could this be a problem? We have to do this because on this system nodes will come on and off line often, so we needed nameloc to run on each node for licensing reasons. We use TCP/IP messages to determine who is on the network, so we don’t need nameloc to find QNX nodes; this is what builds our netmap.

Again, thanks for the help.

Ivan


“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9oq06t$mvm$1@inn.qnx.com

netinfo shows no increase in either Tx or Rx errors, though there is an occasional Tx collision error, but I believe this is normal. We are running all the network drivers with the “-P” option.

Have you tried without -P on node 6 (and why do you need -P)? Does node 6 have the same CPU and network card model as the others?
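
For what it’s worth, the change under test is just the driver line in the node’s sysinit, along these lines (Net.tulip is the driver from the version list above; a sketch, not the exact boot script):

Net &
# current setup on all nodes:
Net.tulip -P &
# the variant to try on node 6:
# Net.tulip &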

“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9oq1v0$nvo$1@inn.qnx.com

I went and checked the nameloc situation. I forgot that we are running nameloc now on all nodes, but with the following command line:

nameloc -s$NODE -e$NODE

Could this be a problem?

Hmm, that seems very odd. It’s the first time I’ve seen someone limit nameloc to the local node. I have no idea what side effects that could have (if any).

Nameloc is a nasty beast, as it deals with licenses. Licenses are meant to be shared across the network, and you are attempting to “limit” the scope of nameloc; I’m not sure how friendly that is.

Still, I see potential for problems, but I can’t imagine why only node 1 to node 6 is affected.

We have to do this because on this system nodes will come on and off line often, so we needed nameloc to run on each node for licensing reasons.

There are various strategies to work around this problem. The idea is to continually monitor whether more than 2 (an arbitrary number) namelocs are running, and to start/stop nameloc accordingly. I’ve seen people write scripts for that; a rough sketch follows. I would do it from within a program, since I’m no good at writing scripts :wink:
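
A minimal sketch of that monitoring idea (assumptions: a plain /bin/sh loop, a target of two locators, and that the node list printed after “Locators:” by “sin in” reflects the nodes currently running nameloc, as discussed later in this thread):

while true ; do
    # count the nodes listed after "Locators:" in the sin in output
    have=`sin in | sed -n 's/^.*Locators://p' | wc -w`
    # start a local locator if the network has too few
    if test "$have" -lt 2 ; then
        nameloc &
    fi
    sleep 10
done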


We use TCP/IP messages to determine who is on the network, so we don’t need nameloc to find QNX nodes; this is what builds our netmap.

Nameloc is not used to find QNX nodes per se. It’s used to share global names and to handle licenses.

Some people don’t like that, because if TCP/IP isn’t set up properly you won’t be able to set up the netmap, which means you won’t be able to use FLEET to possibly fix the problem. A chicken-and-egg thing.

Again, thanks for the help.

My pleasure.


This type of problem just shakes my boss’s confidence in QNX a bit, since I and others convinced them to use it for this system.

Your boss doesn’t know a thing about OSes and computers in general, does he? lol!


“Mario Charest” <mcharest@clipzinformatic.com> wrote in message
news:9oq64a$qfb$1@inn.qnx.com

Nameloc is not used to find QNX nodes per se. It’s used to share global names and to handle licenses.

But using nameloc greatly speeds up discovering nodes going down. I guess when nameloc cannot access a remote node it reports it as “down” to Net, so every time I run the “alive” command it responds much faster compared to when I have no namelocs running at all across the net.

Some people don’t like that, because if TCP/IP isn’t set up properly you won’t be able to set up the netmap, which means you won’t be able to use FLEET to possibly fix the problem. A chicken-and-egg thing.

To fix such problems Net has the “-A” option, which isn’t used in the default configuration, including a Net started by “nettrap start”. It works perfectly: unknown external nodes are mapped on request with root privileges :slight_smile: You only need to know the destination node’s MAC address.

Returning to the main problem:

  1. Is it so critical to launch nameloc on every node with node polling forcibly disabled?
  2. As I can see from the “sin in” output, there’s only one nameloc running, at node 200. Is it always running this way?
  3. Is it possible to leave only nodes 1 and 6 online, start nameloc at node 1 without node polling limits, and run the same tests?

// wbr

“ian zagorskih” <ianzag@novosoft-us.com> wrote in message
news:9oq8jf$s00$1@inn.qnx.com


But using nameloc greatly speeds up discovering nodes going down. I guess when nameloc cannot access a remote node it reports it as “down” to Net, so every time I run the “alive” command it responds much faster compared to when I have no namelocs running at all across the net.

Yes, but what I don’t understand is this: if you limit nameloc polling to itself, how can it get the status of other nodes? I will have to look at what nameloc is doing, really doing.

To fix such problems Net has the “-A” option, which isn’t used in the default configuration, including a Net started by “nettrap start”. It works perfectly: unknown external nodes are mapped on request with root privileges :slight_smile: You only need to know the destination node’s MAC address.

Indeed; if you can live with the fact that one MAC address must be known, all is well. -A used to be the default, but that was changed to solve security issues.

Returning to the main problem:

  1. Is it so critical to launch nameloc on every node with node polling forcibly disabled?

I think it is; it defeats the purpose of nameloc.

  2. As I can see from the “sin in” output, there’s only one nameloc running, at node 200. Is it always running this way?

Node 200? Do you mean you have 200 nodes/licenses/namelocs?

  3. Is it possible to leave only nodes 1 and 6 online, start nameloc at node 1 without node polling limits, and run the same tests?

Yes. In theory the best is to have two namelocs on the whole network.

// wbr

We actually have 250 node licenses for QNX, Photon, and TCP/IP. All units are exactly the same in hardware and software version.

As to the nameloc issue: when we did monitor and try to keep 2 or 3 nameloc processes running on the network, we encountered a problem when a node went offline. When that happened, our software running on the same nodes as nameloc took a “serious” hit in trying to communicate over the network. Things just bogged down way too much.

These systems control plastic injection molding processes, and we sample data at up to 1000 times per second. A one, or terrible two, second delay means trouble. But we need to run nameloc for licensing. We are also not using any global naming in our own software, even though Socket uses one. We have played with the poll period and timeout periods with little success.

We have turned on the “-P” option because, and I do not know why, it appears to increase our network throughput. (I know this bears more investigation, but I have others here who are convinced it helps :-[)

What is so interesting about this is that it only affects the ‘cp’ command, and only from node 1 to node 6. If I try, say, “cat /data/testfile >//6/data/testfile”, it completes in a couple of seconds. I’ve checked the hard drive and looked at options in Fsys, Fsys.eide, Net, Net.ether9000, and now nameloc. It’s an odd little puzzle. My next step is to take both these nodes off the network, put them on their own little network, just the two of them, and try it. I’ll let you know what happens.

Again, thanks everyone for the input. If anyone knows of a way to get nameloc to not hold up processes communicating over FLEET when a node goes offline, let me know.



Ok, it looks like we have found our culprit, and it is network hardware, namely the Ethernet cabling by the looks of it. This is what I’ve found out so far.

When I moved node 6 into my office and put it on a small hub with node 1, the copy speed was OK, 600+ Kbytes per second. Then I moved node 6 onto another hub on the office network, with the same result: good copy speed. Then I put another node into the lab, where node 6 was, and it exhibited the same problem as node 6.

I looked at the Ethernet wires run to 4 outlets and found that on a single 8-wire Cat-5 run, I.S. had connected two outlet sockets?!? Can you split a standard Cat-5 cable this way? It looks like they did match the twisted pairs, but it seems shaky to me. Now I’ll have to check out the rest of the wiring and see where else this occurs.

Thanks, Mario, for the clue on FLEET vs. TCP/IP for handling network problems. This gave me the best point of attack. And thanks everyone for the feedback. Long live QDN…

Ivan



“Mario Charest” <mcharest@clipzinformatic.com> wrote in message
news:9oq998$shj$1@inn.qnx.com

Yes, but what I don’t understand is this: if you limit nameloc polling to itself, how can it get the status of other nodes? I will have to look at what nameloc is doing, really doing.

Don’t forget that all cards are running in promiscuous mode, so nameloc could actually sniff the net, at least the available segment. According to the docs, it’s assumed that when a global name is registered at some node, it informs all name locators across the network. As I understand it, they don’t use global names in direct form, but Socket/Tcpip does. Next, I can only guess how this notification happens: by sending one broadcast packet, or by sending one packet per found name locator. If the second, and if regardless of local nameloc options a node sends the notification to -all- found namelocs across the LAN, then remembering that there’s one nameloc per node and >200 nodes… wow, this must be fun. Also, if there’s a duplex protocol and a response from every nameloc is expected, even with some basically small timeout, this can lead to a really huge overall time…

Well, it seems I’ve drawn too dark a picture :slight_smile: Just some ideas, nothing more. I fully agree that it’s difficult to say anything for sure before some tests with netsniff; probably just for fun I’ll try it later. Or maybe the network support group people will be so kind as to enlighten us a little about this question?

But the overall idea is: running 200 nameloc applications inside one logical network, one per node, makes me feel that something’s really wrong with the network architecture, regardless of how dynamic it is.

Indeed; if you can live with the fact that one MAC address must be known, all is well. -A used to be the default, but that was changed to solve security issues.

Not so sure, due to:

  1. It wasn’t fixed in patch D; tested.
  2. There weren’t any notes in the patch E readme that anything at all was changed in Net, only in Proc32 (with no comments, but for sure not directly related to the feature noted above).

So, not so sure… :slight_smile: I have no running patch E machine behind me, so it’s difficult to test.

Node 200? Do you mean you have 200 nodes/licenses/namelocs?

According to the “sin in” output:

Boot from Hard at Sep 24 14:39 Locators: 200

…yes, there are.

Yes. In theory the best is to have two namelocs on the whole network.

Agreed. Maybe more, but 200… IMHO that’s too high :slight_smile:

// wbr

“ian zagorskih” <ianzag@novosoft-us.com> wrote in message
news:9oqesq$2g0$1@inn.qnx.com

Don’t forget that all cards are running in promiscuous mode, so nameloc could actually sniff the net, at least the available segment.

I believe not; when promiscuous mode is used, packets not targeted to the node do not make it to any application, unless they specifically request it (netraw).


According to the docs, it’s assumed that when a global name is registered at some node, it informs all name locators across the network. As I understand it, they don’t use global names in direct form, but Socket/Tcpip does. Next, I can only guess how this notification happens: by sending one broadcast packet, or by sending one packet per found name locator.

QNX (FLEET) never uses broadcast.

Global names are not sent to other nodes. When a name is requested, the request goes to a node that has a nameloc running.

If the second, and if regardless of local nameloc options a node sends the notification to -all- found namelocs across the LAN, then remembering that there’s one nameloc per node and >200 nodes… wow, this must be fun. Also, if there’s a duplex protocol and a response from every nameloc is expected, even with some basically small timeout, this can lead to a really huge overall time…

Doesn’t work like that. Nameloc is fairly steady: it polls nodes one by one, at the rate specified on the command line (1 second by default).
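
For example, something like this would slow the background refresh poll (a sketch only; -p is the slow-poll-period option named in the nameloc docs quoted later in this thread, so check the units it expects):

nameloc -p 10 &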

Not so sure, due to:

  1. It wasn’t fixed in patch D; tested.
  2. There weren’t any notes in the patch E readme that anything at all was changed in Net, only in Proc32 (with no comments, but for sure not directly related to the feature noted above).

So, not so sure… :slight_smile: I have no running patch E machine behind me, so it’s difficult to test.

I think this was only for a beta.

According to the “sin in” output:

Boot from Hard at Sep 24 14:39 Locators: 200

…yes, there are.

He could have had more than 200 nodes. That just means node 200 is running nameloc.


“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9oqbmf$nj$1@inn.qnx.com

We actually have 250 node licenses for QNX, Photon, and TCP/IP. All units are exactly the same in hardware and software version.

As to the nameloc issue: when we did monitor and try to keep 2 or 3 nameloc processes running on the network, we encountered a problem when a node went offline. When that happened, our software running on the same nodes as nameloc took a “serious” hit in trying to communicate over the network. Things just bogged down way too much.

Yes, I’ve seen that too; I agree that ain’t nice. There are solutions to this, though.

These systems control plastic injection molding processes, and we sample data at up to 1000 times per second. A one, or terrible two, second delay means trouble. But we need to run nameloc for licensing. We are also not using any global naming in our own software, even though Socket uses one.

You can start it with the -L option (from memory) to force it not to use a global name.

We have played with the poll period and timeout periods with little success.

We have turned on the “-P” option because, and I do not know why, it appears to increase our network throughput. (I know this bears more investigation, but I have others here who are convinced it helps :-[)

I don’t see how that can be.

What is so interesting about this is that it only affects the ‘cp’ command, and only from node 1 to node 6. If I try, say, “cat /data/testfile >//6/data/testfile”, it completes in a couple of seconds. I’ve checked the hard drive and looked at options in Fsys, Fsys.eide, Net, Net.ether9000, and now nameloc. It’s an odd little puzzle.

cp uses 16k blocks, which possibly translates into bigger packets at a higher rate.
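
One way to test that theory (a sketch; the bs values are just sample sizes to sweep, and if your dd doesn’t take the k suffix, bs=1024 and bs=16384 do the same):

time dd if=/data/testfile of=//6/data/testfile.dd bs=1k
time dd if=/data/testfile of=//6/data/testfile.dd bs=16k

If only the 16k run crawls, the block size is implicated rather than cp itself.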


“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9oqecj$25q$1@inn.qnx.com

I looked at the Ethernet wires run to 4 outlets and found that on a single 8-wire Cat-5 run, I.S. had connected two outlet sockets?!? Can you split a standard Cat-5 cable this way? It looks like they did match the twisted pairs, but it seems shaky to me.

That doesn’t sound good. I don’t know enough about electrical design to tell whether that could cause the problem, though.

Now I’ll have to check out the rest of the wiring and see where else this occurs.

At least you have a lead. Now you may be able to tell your boss that he should lose confidence in networking instead :wink:


“Ivan Bannon” <ivan.bannon@rjginc.com> wrote in message
news:9oqecj$25q$1@inn.qnx.com

I looked at the Ethernet wires run to 4 outlets and found that on a single 8-wire Cat-5 run, I.S. had connected two outlet sockets?!? Can you split a standard Cat-5 cable this way? It looks like they did match the twisted pairs, but it seems shaky to me. Now I’ll have to check out the rest of the wiring and see where else this occurs.

Maybe I understand you incorrectly, but you mean there are more than two (I guess four) sockets sharing the same cable? Then IMHO it depends. Inside a UTP Category 5 cable there are 8 wires, 4 twisted pairs, and 10Base-T Ethernet (802.3i) uses only two of the pairs, RX and TX (pins 1-2 and 3-6 of the jack), so you can actually have a maximum of two independent Ethernet lines per cable. So my guesses are:

  1. The folks who routed the LAN cables used one UTP Cat-5 cable for two Ethernet channels, just to save space and materials.
  2. The additional jacks are attached in parallel on the main media, so you can sniff the media in passive mode.
  3. There were too many jacks, so they still had to attach them somewhere…

I guess the first is the most probable case. But IMHO, with this scheme the level of induced noise is notably higher, and as a result the error rate should also be higher.

// wbr

“Mario Charest” <mcharest@clipzinformatic.com> wrote in message
news:9oqg9b$3dp$1@inn.qnx.com

I believe not; when promiscuous mode is used, packets not targeted to the node do not make it to any application, unless they specifically request it (netraw).

If nameloc uses the same hack with manual aliasing that netraw uses, you cannot be sure whether it uses the raw network interface or not, unless you have the source code, of course. Well, at least manually hacked GDT/LDT descriptors aren’t listed by “sin mem” or qnx_psinfo(), so in order to discover the real state of things you’d have to do it by hand, walking across the GDT and all LDTs and building a real memory map.

Also, from the netraw package I don’t see any interface, except the raw network packet interface, that nameloc could use to poll the nodes mapped with netmap. What am I missing, though?

QNX (FLEET) never uses broadcast.

Yes, agreed. Just a mistake of mine :slight_smile:

Global names are not sent to other nodes. When a name is requested, the request goes to a node that has a nameloc running.

Then I just have the wrong nameloc docs :frowning:

From http://qdn.qnx.com/support/docs/qnx4/utils/n/nameloc.html

—cut—

Upon starting, nameloc will immediately poll each node in the network for
its list of global names. It will then go into a slow poll mode to refresh
this information. This slow poll period may be changed using the -p option.
The slow polling is not the major means by which nameloc keeps informed of
process global names in the network. Each time a name is registered or
removed, all name locators are immediately informed. The slow poll is a
safety net to handle extraordinary error conditions that might cause it to
miss an attach or detach request.

—cut—

I think this was only for a beta.

Anyway, with the Net -A option this is configurable. Just reminding myself not to forget to switch this option on for release stations…

// wbr

“ian zagorskih” <ianzag@novosoft-us.com> wrote in message
news:9oqqha$966$1@inn.qnx.com

If nameloc uses the same hack with manual aliasing that netraw uses, you cannot be sure whether it uses the raw network interface or not, unless you have the source code, of course. Well, at least manually hacked GDT/LDT descriptors aren’t listed by “sin mem” or qnx_psinfo(), so in order to discover the real state of things you’d have to do it by hand, walking across the GDT and all LDTs and building a real memory map.

Also, from the netraw package I don’t see any interface, except the raw network packet interface, that nameloc could use to poll the nodes mapped with netmap. What am I missing, though?

Nameloc doesn’t use netraw.

Then I just have the wrong nameloc docs :frowning:

From http://qdn.qnx.com/support/docs/qnx4/utils/n/nameloc.html

Yes and no. There is a difference between nodes running nameloc and the ones that don’t. Global names are sent from one nameloc to another when names are added or removed, but that only occurs between nodes running nameloc. (In fact that is what the doc says, but I’m not 100% sure that is the case.) Nothing is sent to nodes NOT running nameloc.