QNET problems in 6.2

Hello,
I have problems with QNET since I upgraded to 6.2. I have 2 boxes running
6.2 on the same subnet, network cards are 3Com905-TX. Sometimes after I
reboot one of them, I can’t see the second one in the /net. On both
machines there is only localhost name in the /net directory. When I do ls
/net/second_machine I get “No route to host” though I can ping or telnet
between the boxes. I tried to restart io-net manually with different
parameters - nothing works. I must do a HW reset of both machines (sometimes
several times) before I finally get both nodes into /net.

Thanks in advance,
Jan.

That is the correct behavior for a “name conflict”.

The “hostname” (plus domain, together called the FQNN) is the “globally
unique address” of a node. That means we try to prevent two nodes from
coming up with the same name (it’s like an IP network, where you can’t have
2 machines with the same IP).

As part of QNET startup, it tries to determine whether there is a “name
conflict” on the network. If it finds one (in your case, one of the
“localhost” nodes wins and the second one detects the conflict), QNET puts
itself into a mode where it only configures itself but won’t talk to anybody
on the network.

So in your case, if you had a 3rd machine with some other name (other than
localhost), you would actually see that one “localhost” is okay but the
other is unreachable.

To avoid a name conflict, either use “Configure->Network Config” to set the
correct hostname for each machine (it will be remembered and set again the
next time you boot), or use the QNET host=“name.domain” option to pass one
to QNET.
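
Since every node needs its own name, it may help to check a list of planned hostnames for duplicates before booting the machines. This is only a minimal sketch: duplicate_names is a hypothetical helper, not part of QNET, and it compares names exactly as typed.

```shell
#!/bin/sh
# duplicate_names: hypothetical helper, not a QNET tool.
# Prints every name that appears more than once in its arguments.
duplicate_names() {
    # One name per line, sorted so identical names are adjacent;
    # uniq -d then prints one copy of each repeated name.
    printf '%s\n' "$@" | sort | uniq -d
}

# Example call (two nodes left at the default name):
#   duplicate_names localhost node1.esys.cz localhost
```

If the function prints nothing, the planned names are all different.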

Once you are in the name-conflict state, you can also reset your hostname
(/bin/hostname newname). QNET will notice that the hostname has changed and
retry with the new name; if it doesn’t conflict with anybody, QNET will then
fully start up.
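
To see whether a node is still isolated after changing its hostname, one rough check is to look for any entry under /net other than the node's own name. The sketch below assumes that /net entries can be compared by their short name (the part before the first dot); has_peers and the NET_ROOT override are hypothetical names introduced here, not part of QNET.

```shell
#!/bin/sh
# NET_ROOT is a hypothetical override so the check can be tried
# against an ordinary directory; by default it looks at /net.
NET_ROOT="${NET_ROOT:-/net}"

# has_peers SELF: return 0 if $NET_ROOT lists any node other than SELF.
# Assumption: names are compared by their short form (before the first dot).
has_peers() {
    self="${1%%.*}"
    for path in "$NET_ROOT"/*; do
        # Skip the unexpanded pattern (empty directory) and stale entries.
        [ -e "$path" ] || continue
        name="${path##*/}"
        if [ "${name%%.*}" != "$self" ]; then
            return 0
        fi
    done
    return 1
}

# Example use on a node:
#   if has_peers "$(hostname)"; then echo "other nodes visible"
#   else echo "only the local node is listed"; fi
```

This only tells you what /net currently lists; it does not say why a peer is missing.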

-xtang

I’m sorry for the misunderstanding. When I wrote that “there is only
localhost name in the /net directory” I meant that there is only the name
of that particular host. Of course I have different hostnames on both nodes.
These hostnames are node1.esys.cz and node2.esys.cz. When I restart, for
example, node1, I then see only node1 in the /net directory of
node1.esys.cz and only node2 in the /net directory of node2.esys.cz. The
only way to get both node1 and node2 into both /net directories is to HW
reset both machines.

Thanks,
Jan.

Jan Ptacek wrote:

I’m sorry for the misunderstanding. When I wrote that “there is only
localhost name in the /net directory” I meant that there is only the name
of that particular host. Of course I have different hostnames on both nodes.
These hostnames are node1.esys.cz and node2.esys.cz. When I restart, for
example, node1, I then see only node1 in the /net directory of
node1.esys.cz and only node2 in the /net directory of node2.esys.cz. The
only way to get both node1 and node2 into both /net directories is to HW
reset both machines.

I think there is a problem with the 3Com905 driver. When we upgraded to
QNX 6.1, I spent a week running around installing 3Com905 cards in
everyone’s boxes, because they were the only ones that worked reliably.
After installing QNX 6.2, some of the cards worked for a while, but many
refused to work right off the bat, and the ones that did work eventually
stopped working. Suffice it to say, I spent another week running
around swapping out the 3Com cards for the SMC and Intel cards that were
originally in the machines.

The 3Com cards were all the same exact model and revision, out of a
25-pack that was purchased when we found out they worked with QNX 6.1.

Some symptoms we saw:

  1. No TCP/IP connectivity, but qnet worked.
  2. Refusal to acknowledge that there was a valid link between the card
    and the network switch. A Windows system with the exact same card had no
    problem communicating on that network.