IPC with nid,pid,chid - what nid?

I have some questions about interprocess communication in Neutrino
that I hope don’t make me look too ignorant.

I have a hypothetical network of three machines, Alpha, Bravo and
Charlie, and they run processes A, B, and C respectively. I want to
perform interprocess communication among A, B, and C by using
ConnectAttach() with the (nid, pid, chid) tuple (which I’ll refer to
as NPC from now on), that uniquely identifies a listener - in this
case, corresponding to a process.

Case 1:

A knows the NPC of B, but B and C do not know the NPC of A. A would
like to tell B and C its NPC by transmitting it in a message. The
message content is constrained to be a fixed size, and 3 integers will
work fine. The same message should be sent to both B and C. What
does A send to B and C as its NID?

Case 2:

A does not know the NPC of B and C, but does know their queue names on
a networked queue server. A would like to send its NPC in a
fixed-length message to B and C, without knowing on which node either
B or C is running. What does A send to B and C? A would like to send
the same message to both B and C.

Case 3:

A knows the NPC of B, but C does not. A would like to send the NPC of
B to C. What does A send to C to identify B?


All of these cases really boil down to the general question: How is a
connection uniquely identified on a network such that the information
can be transmitted around the network without regard to the node of
the sender or receiver of that information? I have intentionally
constrained the message length in my questions to be fixed to
disqualify passing the fully qualified node identifier as a string.
This reflects a real-world constraint that I am working under.

The question regarding the queue seems to be the most difficult, since
there is no _msg_info for the incoming message, and even if there
were, it would identify the queue server, not the originator of the
message.

Thanks,
Andrew

I thought it would be better to start by explaining the nd.

On Neutrino, the “nd” (node descriptor) is no longer “globally
unique”. Basically, every node keeps a table of “known nodes”
and uses the index as the nd. So “nd = 2” on node A has
no meaning on node B (it could be invalid on node B, or
point to a totally different node). The only “globally unique”
thing is the name.domain string.

This means that transferring an “nd” across nodes is totally useless.
If you really need to, you could use “netmgr_ndtostr()” to translate
an nd to its “name.domain” string and pass that across, so the
other side might be able to do a netmgr_strtond() on it.

The most interesting function would be “netmgr_remote_nd(rnd, lnd)”.
What does this function do? It answers this question:

Let’s say I am on node A, I have a table like:

nodeb.domain is nd = 2
nodec.domain is nd = 3

But I want to know, on nodec.domain, what is the nd for nodeb.domain?
Thus I call:

nd_on_c = netmgr_remote_nd(3, 2);

Now you have nd_on_c. It is useless on nodea, but it is safe
to transfer to nodec (with the correct pid, chid), so nodec can
use the (nd, pid, chid) set to actually connect to nodeb.

So your case 1 is solved by calling netmgr_remote_nd(nd_of_b, ND_LOCAL_NODE)
and netmgr_remote_nd(nd_of_c, ND_LOCAL_NODE), and passing the results
to nodes B and C.

Your case 3 is solved by calling netmgr_remote_nd(nd_of_c, nd_of_b)
and passing the result to node C.

For case 2, A could either call netmgr_strtond() to get B/C’s
nd and then go to case 1, or A could “open(/net/b/server)”
and do a “ConnectServerInfo()” on the returned fd to find
out B’s nd (on A), and then fall back to case 1.
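
To make the per-node table idea concrete, here is a small sketch in plain C. It does not call the real netmgr functions: model_strtond() and model_remote_nd() are hypothetical stand-ins that only imitate the behaviour described above, and the tables for nodeb and nodec are invented for illustration (only nodea's table comes from the example).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KNOWN 8
#define ND_INVALID (-1)

/* Hypothetical model of one node: a private table of known nodes.
 * known[nd] is the name.domain string for that nd, or NULL if unused.
 * Index 0 stands for the node itself (the role of ND_LOCAL_NODE). */
struct node_model {
    const char *known[MAX_KNOWN];
};

/* Stand-in for netmgr_strtond(): look a name up in ONE node's table. */
int model_strtond(const struct node_model *n, const char *name)
{
    assert(n != NULL && name != NULL);
    for (int nd = 0; nd < MAX_KNOWN; nd++)
        if (n->known[nd] != NULL && strcmp(n->known[nd], name) == 0)
            return nd;
    return ND_INVALID;
}

/* Stand-in for netmgr_remote_nd(remote_nd, local_nd), called on `self`:
 * "which nd does the node I call remote_nd use for the node I call
 * local_nd?"  `net` is the set of all node models (the "network"). */
int model_remote_nd(const struct node_model *self,
                    const struct node_model *net, int count,
                    int remote_nd, int local_nd)
{
    assert(remote_nd >= 0 && remote_nd < MAX_KNOWN);
    assert(local_nd >= 0 && local_nd < MAX_KNOWN);
    const char *remote_name = self->known[remote_nd];
    const char *target_name = self->known[local_nd];
    if (remote_name == NULL || target_name == NULL)
        return ND_INVALID;
    for (int i = 0; i < count; i++)
        if (net[i].known[0] != NULL &&
            strcmp(net[i].known[0], remote_name) == 0)
            return model_strtond(&net[i], target_name);
    return ND_INVALID;
}

/* nodea's table matches the example above (nodeb = 2, nodec = 3).
 * The other two tables are made up. */
const struct node_model example_net[3] = {
    { .known = { [0] = "nodea.domain", [2] = "nodeb.domain", [3] = "nodec.domain" } },
    { .known = { [0] = "nodeb.domain", [1] = "nodec.domain", [2] = "nodea.domain" } },
    { .known = { [0] = "nodec.domain", [1] = "nodea.domain", [4] = "nodeb.domain" } },
};
```

With these tables, model_remote_nd(&example_net[0], example_net, 3, 3, 2) plays the part of netmgr_remote_nd(3, 2) on nodea and gives the nd that nodec uses for nodeb. Passing 0 as the last argument gives the case 1 answers, which come out different for nodeb and nodec.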

Hopefully, this could be helpful.

-xtang


Previously, David Gibbs wrote in qdn.public.qnxrtp.os:

Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header.

Why not use IP address & pid?

Yes, the node may have two (or more) IP addresses – but if any
particular program gets the IP address once, and always uses that
IP address for identification, this should work for you.

Mostly because there is no guarantee that a node is running the TCP
stack, which is both optional and orthogonal to QNET communication.
When I said “socket” earlier, read “arbitrary file descriptor”.

Cheers,
Andrew

Xiaodan Tang <xtang@qnx.com> wrote:

I thought it would be better to start by explaining the nd.

On Neutrino, the “nd” (node descriptor) is no longer “globally
unique”. Basically, every node keeps a table of “known nodes”
and uses the index as the nd. So “nd = 2” on node A has
no meaning on node B (it could be invalid on node B, or
point to a totally different node). The only “globally unique”
thing is the name.domain string.

One more point should be made… never store an ‘nd’. Over time, the
‘nd’ for a given node is allowed to change value. If you want to remember
another node, you should always remember the node name.

Think of the node descriptor as one of those short-lived atomic particles
that we’ll probably use to name a program someday :-).


Brian Stecher (bstecher@qnx.com) QNX Software Systems, Ltd.
phone: +1 (613) 591-0931 (voice) 175 Terence Matthews Cr.
+1 (613) 591-3579 (fax) Kanata, Ontario, Canada K2M 1W8

Brian Stecher <bstecher@qnx.com> wrote:

One more point should be made… never store an ‘nd’. Over time, the
‘nd’ for a given node is allowed to change value. If you want to remember
another node, you should always remember the node name.

But the nd is guaranteed to keep pointing to the same node as long as I
have an open connection to that node, right?


Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.

Wojtek Lerch <wojtek@qnx.com> wrote:

Brian Stecher <bstecher@qnx.com> wrote:
One more point should be made… never store an ‘nd’. Over time, the
‘nd’ for a given node is allowed to change value. If you want to remember
another node, you should always remember the node name.

But the nd is guaranteed to keep pointing to the same node as long as I
have an open connection to that node, right?

Correct… usually. You can’t depend on this behaviour to try and keep an
nd active. If the remote machine is determined to be dead for some reason,
that open connection won’t keep the nd from going stale. You’ll get an
error the next time you attempt to use the connection. Using the nd
you’re trying to keep valid to create a new connection will possibly hook
up to a completely different machine.


Brian Stecher (bstecher@qnx.com) QNX Software Systems, Ltd.

Thank you all for responding. I’m going to try to paraphrase what you
all said:

Case 1:

A knows the NPC of B, but B and C do not know the NPC of A. A would
like to tell B and C its NPC by transmitting it in a message. The
message content is constrained to be a fixed size, and 3 integers will
work fine. The same message should be sent to both B and C. What
does A send to B and C as its NID?

Case 2:

A does not know the NPC of B and C, but does know their queue names on
a networked queue server. A would like to send its NPC in a
fixed-length message to B and C, without knowing on which node either
B or C is running. What does A send to B and C? A would like to send
the same message to both B and C.

These are impossible. A cannot identify itself using the same message to
both B and C, unless it uses the fully qualified node name, as a
variable-length string, which violates one of my hypothetical criteria.
Process A can try to send its “nd according to the receiver”, but

  1. then the message will be different to each of B and C (this breaks cases
    1, 2 and 3)
  2. by the time the message arrives, the nd may no longer be valid (this
    breaks case 2, at least)
  3. in order to determine the node number, a networked call will have to be
    made to netmgr_remote_nd(), or at the very least, a MsgSend will have
    to be made. Since nd is transient, this call must be made prior to
    every networked message that contains an identifier!
  4. A will have to know the node numbers of B and C (this breaks case 2).

Case 3:

A knows the NPC of B, but C does not. A would like to send the NPC of
B to C. What does A send to C to identify B?

This is impossible to do reliably. The nd that is transmitted may or may
not be correct by the time the message arrives. Think of a queued message,
that may be in transit for a little bit of time. Also, B’s nd might change
by the time the message is transmitted to C.

In all cases, there is no way for a process to transmit its NPC without
knowing the node ID of the receiver.

The only “solution” seems to be to use the fully qualified node name. First
off, this is very sad. Even IP has a fixed-length identifier in the form of
an IP address and port pair. Secondly, can FQNN be trusted? If A sends its
FQNN to B, what guarantee is there that B agrees on the MAC-to-name mapping?
A might think that C is called “charlie.local.net”, but B might have a hosts
file (or whatever) that says that C is “cosmos.local.net”.

Have I interpreted you correctly?

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header. I need this because messages can come from
both asynchronous and synchronous sources, or via a socket, but still
be from the same sender. I need to be able to correlate the message
back to the sender, not back to the socket, queue or connection id.
Since messages can be sent to queues, it is not generally true that the
sender knows the node of the receiver, so I need a way to uniquely
identify the sender without regard to the receiver. Not to mention
that there could be several readers on a single queue, each running on
a different machine. A variable-length header in messages is really evil,
and wastes bandwidth besides. Under QNX4, I could use (nid,pid).
Under any TCP/IP implementation I could use (address,port).
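
As a rough illustration of the fixed-length header being asked for, the sketch below packs three 32-bit integers into a 12-byte buffer in big-endian order. The struct and function names are hypothetical, and the code only shows the wire layout: it says nothing about whether the nd value inside means anything to the receiver, which is the actual problem discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPC_WIRE_SIZE 12   /* three 32-bit fields, always the same size */

/* Hypothetical fixed-size sender identifier. */
struct npc {
    uint32_t nd;
    uint32_t pid;
    uint32_t chid;
};

/* Store one 32-bit value in big-endian (network) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xffu);
    p[1] = (unsigned char)((v >> 16) & 0xffu);
    p[2] = (unsigned char)((v >> 8) & 0xffu);
    p[3] = (unsigned char)(v & 0xffu);
}

/* Load one 32-bit value stored by put_u32(). */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write the identifier into a 12-byte header area. */
void npc_encode(const struct npc *id, unsigned char out[NPC_WIRE_SIZE])
{
    assert(id != NULL && out != NULL);
    put_u32(out, id->nd);
    put_u32(out + 4, id->pid);
    put_u32(out + 8, id->chid);
}

/* Read the identifier back out of a 12-byte header area. */
void npc_decode(const unsigned char in[NPC_WIRE_SIZE], struct npc *id)
{
    assert(in != NULL && id != NULL);
    id->nd = get_u32(in);
    id->pid = get_u32(in + 4);
    id->chid = get_u32(in + 8);
}
```

Fixing the byte order keeps the header readable between machines of different endianness, which matters once the same bytes may travel through a queue or a socket.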

Cheers,
Andrew

Previously, David Gibbs wrote in qdn.public.qnxrtp.os:

Andrew Thomas <Andrew@cogent.ca> wrote:
Previously, David Gibbs wrote in qdn.public.qnxrtp.os:
Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header.

Why not use IP address & pid?

Well, there is another reason as well. The purpose of having process
A identify itself to process B is so that B can subsequently send a
message back to A. I could generate a random number in a huge range,
and drive the chance of a collision to nearly infinitesimal, but that
gets me no closer to bi-directional communication.

So is the bottom line that the three scenarios that I outlined are
unsolvable, given the constraint of fixed-length messages, in
Neutrino?

Cheers,
Andrew

Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header.

Why not use IP address & pid?

Yes, the node may have two (or more) IP addresses – but if any
particular program gets the IP address once, and always uses that
IP address for identification, this should work for you.

-David

QNX Training Services
dagibbs@qnx.com

Andrew Thomas <Andrew@cogent.ca> wrote:

Previously, David Gibbs wrote in qdn.public.qnxrtp.os:
Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header.

Why not use IP address & pid?

Yes, the node may have two (or more) IP addresses – but if any
particular program gets the IP address once, and always uses that
IP address for identification, this should work for you.

Mostly because there is no guarantee that a node is running the TCP
stack, which is both optional and orthogonal to QNET communication.
When I said “socket” earlier, read “arbitrary file descriptor”.

I thought (and I might be mistaken) that QNET uses IP for addressing.
So, I think your machine will always have an IP address – it may
not always have the rest of the stack, and I don’t know whether or
not the IP is readily available.

-David

QNX Training Services
dagibbs@qnx.com

Andrew Thomas <Andrew@cogent.ca> wrote:

Previously, David Gibbs wrote in qdn.public.qnxrtp.os:
Andrew Thomas <Andrew@cogent.ca> wrote:
Previously, David Gibbs wrote in qdn.public.qnxrtp.os:
Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

By the way, to add context to my questions, I am actually trying to find
a way that a process can identify itself in every message that it sends,
as part of the header.

Why not use IP address & pid?

Well, there is another reason as well. The purpose of having process
A identify itself to process B is so that B can subsequently send a
message back to A. I could generate a random number in a huge range,
and drive the chance of a collision to nearly infinitesimal, but that
gets me no closer to bi-directional communication.

So is the bottom line that the three scenarios that I outlined are
unsolvable, given the constraint of fixed-length messages, in
Neutrino?

If you only want to “identify” process A in the messages A sends out
(my understanding is that these messages may not even be MsgSend/Receive/Reply,
but could be coming through a socket?), then you’d better think
of something other than (nd, pid, chid). IP address/Ethernet MAC address,
CPU id, a fixed table kept in a file, phone number :-), anything that is
fixed length, “globally unique”, and that all your nodes agree on.

Yes, the ONLY purpose of transferring (nd/pid/chid) is so that the
receiver can use this “nd/pid/chid” set to connect back, or
to a third machine.

In most cases, we suggest that applications use a resource manager, and
that servers (channels) take a name in the name space, so others can “open()”
it. That is to say, having A send B a file name and tell B to “open()” back
is more reliable.

There are servers that use the technique of gathering
“nd_of_me_on_his_machine/mypid/mychid” and passing it to the remote,
so the remote can connect back. Proc, for example, does this
all the time.

The concern is what happens if the “nd” I got changes after I send
it back (Brian’s warning :-)).

The only reason that an “nd” can become invalid or change is that
QNET detects that the machine is down/unreachable and thus marks the
nd “invalid”, and it will “stay” in invalid mode for
“a while”.

So if A calls remote_nd(), gets B’s view of A, stuffs it in a message, and
sends it, there is no problem at all. (The only reason
B’s view of A would change is that A is gone.) Even if the message
goes to B, then A goes away, then B turns its nd of A invalid,
B can still detect that the nd in the message from A is invalid.

The only problem that could happen is that B turns its nd of A invalid,
then after “a while” B releases the nd and reuses that
nd to point to somebody else, and only then starts processing A’s message;
that will cause B to go to the wrong node. The only protection
against this situation is that this “a while” is quite long
(10 minutes, and we plan to make it user configurable, along
with a lot of other timers).
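
The lifetime rules described above can be sketched as a tiny state machine. This is only a model of the behaviour as explained in this post, not QNET code: the type and function names are hypothetical, node_id is an invented stand-in for "which machine", and the 10-minute grace period is taken from the figure quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define ND_GRACE_SECONDS (10 * 60)   /* the "a while" mentioned above */

enum nd_state { ND_FREE, ND_VALID, ND_STALE };

/* Hypothetical model of one entry in a node's nd table. */
struct nd_slot {
    enum nd_state state;
    int node_id;        /* invented stand-in for the machine it points to */
    long stale_since;   /* time in seconds when the node was declared dead */
};

/* The machine was detected as down: the slot goes stale, it is not freed. */
void nd_mark_down(struct nd_slot *s, long now)
{
    assert(s != NULL);
    if (s->state == ND_VALID) {
        s->state = ND_STALE;
        s->stale_since = now;
    }
}

/* A slot can be given to another node only once the grace period is over. */
int nd_can_reuse(const struct nd_slot *s, long now)
{
    assert(s != NULL);
    if (s->state == ND_FREE)
        return 1;
    return s->state == ND_STALE &&
           now - s->stale_since >= ND_GRACE_SECONDS;
}

/* Point the slot at node_id if allowed; returns 1 on success, 0 otherwise. */
int nd_assign(struct nd_slot *s, int node_id, long now)
{
    if (!nd_can_reuse(s, now))
        return 0;
    s->state = ND_VALID;
    s->node_id = node_id;
    s->stale_since = 0;
    return 1;
}

/* Resolve an nd found in a message: the node it points to, or -1 if the
 * slot is stale or free (the "B can still detect it" case). */
int nd_resolve(const struct nd_slot *s)
{
    assert(s != NULL);
    return s->state == ND_VALID ? s->node_id : -1;
}
```

In this model a message that is processed while the slot is stale is rejected, and the wrong-node hazard only appears if the message is processed after the slot has been reassigned, that is, more than ND_GRACE_SECONDS after the original node went away.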

-xtang

If you only want to “identify” process A in the messages A sends out
(my understanding is that these messages may not even be MsgSend/Receive/Reply,
but could be coming through a socket?), then you’d better think
of something other than (nd, pid, chid). IP address/Ethernet MAC address,
CPU id, a fixed table kept in a file, phone number :-), anything that is
fixed length, “globally unique”, and that all your nodes agree on.

That’s not at all what I want. I want to be able to send A’s nid/pid/chid
so that the receiving process can send a message back at a later date.
My goal involves communication among the tasks.

As far as other unique identifiers. What if the processes are on the
same node, and there is no ethernet card? If the processes are peers,
then who gives out the globally unique identifiers? Who stores the
files? Who serializes access so there are no races? I’m looking for
global uniqueness, and nid/pid/chid are supposed to be globally unique
on a QNET.

Yes, the ONLY purpose of transferring (nd/pid/chid) is so that the
receiver can use this “nd/pid/chid” set to connect back, or
to a third machine.

That is my purpose. However, as I read your responses, this
information is either not valid, or not knowable. So the “ONLY”
purpose, which I see as being a perfectly valid thing to want to
do, is not even possible, or am I still not understanding?

In most cases, we suggest that applications use a resource manager, and
that servers (channels) take a name in the name space, so others can “open()”
it. That is to say, having A send B a file name and tell B to “open()” back
is more reliable.

But that implies that A and B are both root privileged, and have a
pile of code in them to do resource management. How does the
resmgr library work with Photon? That also breaks one of the
requirements - that the nid/pid/chid be a fixed length, and
preferably small. When you say “most cases”, what do you suggest
for the rest of the cases, like mine?

There are servers that use the technique of gathering
“nd_of_me_on_his_machine/mypid/mychid” and passing it to the remote,
so the remote can connect back. Proc, for example, does this
all the time.

But this won’t work through a queue.

The concern is what happens if the “nd” I got changes after I send
it back (Brian’s warning :-)).

The only reason that an “nd” can become invalid or change is that
QNET detects that the machine is down/unreachable and thus marks the
nd “invalid”, and it will “stay” in invalid mode for
“a while”.

OK, so let’s say that nd is effectively safe. The question still
applies. How to transmit nid/pid/chid to identify A when the
node of B is not known.

If it will make the problem easier, let’s just look at two cases:
MsgSend() and queued messages. Forget about sockets for
the time being. I think that if we can produce a solution for
transmitting a message via a queue where the receiver’s node
is not known, then that will be good enough.

You also didn’t answer the question of whether the FQNN
as a string is actually usable, given that two nodes may have
different ideas of the MAC-to-name mapping.

Cheers,
Andrew

Previously, Xiaodan Tang wrote in qdn.public.qnxrtp.os:

All right. So many questions, let’s do it one by one.

Actually, the question was rhetorical. In a network of peers, you
cannot store globally unique information on a single node without
making it a global point of failure. The idea of a file in a
well-known place is about as far from fault-tolerant design as we can
get while staying with QNX. You (QSSL) battled this same problem with
nameloc in QNX4.

Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

As far as other unique identifiers. What if the processes are on the
same node, and there is no ethernet card? If the processes are peers,
then who gives out the globally unique identifiers? Who stores the
files? Who serializes access so there are no races? I’m looking for

By “other unique identifiers”, I mean anything that IS “fixed-length and
globally unique” to your process. A file in a well-known place which contains
“Id FQNN” pairs, all your processes talking to a third-party server to get
that identifier… Anything, but this is not what you would like, right? :-)

global uniqueness, and nid/pid/chid are supposed to be globally unique
on a QNET.

No, your assumption is wrong. Let’s confirm it again: nd (not nid)/pid/chid
IS NOT globally unique; it is only unique within one node. The only globally
unique thing IS the FQNN, and yes, it is not fixed length.

This is the point that we are ultimately coming to. In order for
process B to attach to process A, it must have the nd/pid/chid for A.
However, A cannot tell B its nd/pid/chid without knowing which node B
is on. A cannot tell the nd of B to C, even by FQNN (which is not
globally unique, see below).

There are servers that use the technique of gathering
“nd_of_me_on_his_machine/mypid/mychid” and passing it to the remote,
so the remote can connect back. Proc, for example, does this
all the time.

But this won’t work through a queue.

Can you explain more detail?

A queue, generally, hides the receiver from the sender. A sends
messages to B only through an agreed-upon mailbox at a third-party
location. A, B and the queue server can all be on different nodes. A
has no way to know which node B is on. Therefore
“nd_of_me_on_his_machine” is not generally knowable. Before you say
“gee that’s getting pretty hypothetical”, this is the case that
actually caused me to ask the question.

OK, so let’s say that nd is effectively safe. The question still
applies. How to transmit nid/pid/chid to identify A when the
node of B is not known.

I am not so sure of the question. Case 1) in your
original post can be done by:

A obtains A’s_nd_on_node_B/A’s pid/A’s chid and sends it to B;
A obtains A’s_nd_on_node_C/A’s pid/A’s chid and sends it to C;

The only “problem” is that the 2 messages are different. Why is
this so important?

I’m coming around to the idea that this is the least of many evils,
and now just have to figure out whether B’s FQNN can be known by A in
all cases. This still does not work in the case of a queue, where A
doesn’t know which node B is on. The question is whether I can live
with that restriction.

If it will make the problem easier, let’s just look at two cases:
MsgSend() and queued messages. Forget about sockets for
the time being. I think that if we can produce a solution for
transmitting a message via a queue where the receiver’s node
is not known, then that will be good enough.

I am a little bit confused. When you say “queue”, are you talking
about mqueue?

I’m talking about the general concept of a message sent via an
agreed-upon third party.

In the case of MsgSend(), you need to ConnectAttach() first, which
requires a (nd, pid, chid). If you don’t know the nd of B, then
you call netmgr_strtond() to get the nd.

If I need to call netmgr_strtond() every time I want to send a message
to B, that is really expensive. If I want to find
“my_node_according_to_him” for every message, this is intolerably
expensive. Remember that I’m looking for a way that B can send a
message back to A based on the information that A sent.

Or are you saying you don’t even know the name of the node you want
to talk to?

Correct.

Currently, if you don’t know the name of the node you want to talk
to, you can’t talk to it.

Wrong. I can use a queue. In the more general case, I can use any
non-MsgSend() mechanism to transmit a message, and it will have the
same properties in this regard as a queue.

You also didn’t answer the question of whether the FQNN
as a string is actually usable, given that two nodes may have
different ideas of the MAC-to-name mapping.

Since the FQNN is supposed to be “globally unique”, this will fail. I.e.,

if A knows C as “foo”, but B knows C as “bar”, and A sends a
message to B saying “talk to the pid/chid on foo”, B will
return an error saying “I don’t know ‘foo’”.

(There is a way in QNET to “alias” different names to the same node,
but let’s not get into that for now.)

In the above case, who has the knowledge that “foo” is “bar”?

Nobody. That was my question. It seems that FQNN is not sufficient
for A to identify B to C. The frustrating thing here is that if
you stick to open() calls and resource managers, everything is fine.
I just happen to be trying to color outside the rather narrow lines
that seem to have been painted by the nd/pid/chid implementation.

When all is said and done, I think the answer here is:

  1. A must always know B’s node FQNN in order for A to identify itself
    to B. There is no way for A to use the same information to
    identify itself to two different processes unless it does so using
    its FQNN. This is annoying and won’t work in the case where A does
    not know which node B is on. It is certainly a step backward from
    QNX4.

  2. Both A and B must be willing to accept a finite chance that if they
    attempt to attach a connection to one another using nd/pid/chid,
    that connection might in fact go to a completely different process
    on a completely different node (the nd and/or pid are stale). The
    chance seems remote, but it hurts the mathie in me. It may
    actually be lower than the equivalent chance of this happening with
    QNX4 because the PID space is larger in QNX6.

  3. All nodes on the network have to agree on the FQNNs of all other
    nodes, otherwise the whole nd/pid/chid scheme appears to fall
    apart. This appears to be a real step-to-the-left relative to the
    experiences of TCP/IP developers, where a name is just a
    convenience to refer to the globally unique identifier, and the
    globally unique identifier is sacred. What we appear to have
    concluded is that QNET has no globally unique identifiers. (And
    no, MAC address isn’t it either. Not all machines have ethernet
    cards. I’ve read the posts saying that QNET is not tied to the
    physical layer).

When I started this thread, what I was really looking for is the
implementation limitations of QNET so I knew how to go about solving
my problems. Do the above points accurately describe QNET?

Cheers,
Andrew

All right. So many questions, let’s do it one by one.

Andrew Thomas <andrew-s-thomas@home.nospam.com> wrote:

As far as other unique identifiers. What if the processes are on the
same node, and there is no ethernet card? If the processes are peers,
then who gives out the globally unique identifiers? Who stores the
files? Who serializes access so there are no races? I’m looking for

By “other unique identifiers”, I mean anything that IS “fixed-length and
globally unique” to your process. A file in a well-known place which contains
“Id FQNN” pairs, all your processes talking to a third-party server to get
that identifier… Anything, but this is not what you would like, right? :-)

global uniqueness, and nid/pid/chid are supposed to be globally unique
on a QNET.

No, your assumption is wrong. Let’s confirm it again: nd (not nid)/pid/chid
IS NOT globally unique; it is only unique within one node. The only globally
unique thing IS the FQNN, and yes, it is not fixed length.

There are servers that use the technique of gathering
“nd_of_me_on_his_machine/mypid/mychid” and passing it to the remote,
so the remote can connect back. Proc, for example, does this
all the time.

But this won’t work through a queue.

Can you explain more detail?

The concern is what happens if the “nd” I got changes after I send
it back (Brian’s warning :-)).

The only reason that an “nd” can become invalid or change is that
QNET detects that the machine is down/unreachable and thus marks the
nd “invalid”, and it will “stay” in invalid mode for
“a while”.

OK, so let’s say that nd is effectively safe. The question still
applies. How to transmit nid/pid/chid to identify A when the
node of B is not known.

I am not so sure of the question. Case 1) in your
original post can be done by:

A obtains A’s_nd_on_node_B/A’s pid/A’s chid and sends it to B;
A obtains A’s_nd_on_node_C/A’s pid/A’s chid and sends it to C;

The only “problem” is that the 2 messages are different. Why is
this so important?

If it will make the problem easier, let’s just look at two cases:
MsgSend() and queued messages. Forget about sockets for
the time being. I think that if we can produce a solution for
transmitting a message via a queue where the receiver’s node
is not known, then that will be good enough.

I am a little bit confused. When you say “queue”, are you talking
about mqueue?

In the case of MsgSend(), you need to ConnectAttach() first, which
requires a (nd, pid, chid). If you don’t know the nd of B, then
you call netmgr_strtond() to get the nd.

Or are you saying you don’t even know the name of the node you want
to talk to?

Currently, if you don’t know the name of the node you want to talk
to, you can’t talk to it.

There are name_*() functions that allow a server to register a name
“across the network”, so clients can “name_locate()” the service
they want (kind of like QNX4 name registration and location). But these
are not released yet.

You also didn’t answer the question of whether the FQNN
as a string is actually usable, given that two nodes may have
different ideas of the MAC-to-name mapping.

Since the FQNN is supposed to be “globally unique”, this will fail. I.e.,

if A knows C as “foo”, but B knows C as “bar”, and A sends a
message to B saying “talk to the pid/chid on foo”, B will
return an error saying “I don’t know ‘foo’”.

(There is a way in QNET to “alias” different names to the same node,
but let’s not get into that for now.)
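
The failure mode can be shown with a toy lookup. This is a sketch with made-up tables and a hypothetical lookup_nd() helper, not the real netmgr_strtond(); it only illustrates that a name is resolved against the receiver's own view of the network.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry in one node's private name table. */
struct name_entry {
    const char *name;   /* the name this node uses for some other node */
    int nd;             /* this node's local nd for it */
};

/* Resolve a name against ONE node's table; -1 means "I don't know it". */
int lookup_nd(const struct name_entry *table, size_t count, const char *name)
{
    assert(name != NULL);
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].nd;
    return -1;
}
```

If A's table holds { "foo", 3 } for node C and B's table holds { "bar", 2 } for the same machine, then lookup_nd() on A's table finds "foo", while the same call on B's table returns -1, which corresponds to B's “I don’t know ‘foo’” error.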

In the above case, who has the knowledge that “foo” is “bar”?

-xtang

Previously, William M. Derby Jr. wrote in qdn.public.qnxrtp.os:

I’ve been dealing with the same sort of problems on my system as I
port to Neutrino and this is what I came up with – perhaps it will be
somewhat useful. I too did not want every task in my system to be a
resource manager and hence run at root privileges. The scheme I came
up with is to create 1 resource manager called namesrvr which handles
the filesystem in, say, /dev/mynames. Processes register the connection
information by creating a “file” in the namesrvr and writing the
nid:pid:chid of the connection for the process into the file. The file
handle is not closed until the program detaches the connection or
terminates - which automatically closes the file. Closed files cease to
exist or the contents are reset to -1:-1:-1. (I haven’t fully settled
on this yet.) If you run a namesrvr on each machine, you can search
for the name of the file on all the /net/XXXX/dev/mynames. In this
way you will also be able to extract the name of the node, which should
allow for correction of the local node NID… (i.e. the entry in /net
will be your local name)

We have a similar thing here. We are planning to make the name server
update other name servers on the network automatically as well. We
want to avoid the case where a program is obliged to do network
messaging to look up a name.

Since you only register on your machine, all you need to worry about
is the uniqueness of the local name - a remote name will always have
the node name as a qualifier to separate duplicates. As far as
sending the name - you could register the name for the connection as
“nd_pid_chid”, send the tuple in your message and then search for it
in the net directory. You could additionally add the MAC address of
your card (one would do) to the registered name and send that as
well… So maybe you register “MAC_pid_chid” as your name and pass
them numerically as part of your message… The size is fixed and
relatively small and generating the filename to search for is
trivial…

The trouble is that in nd_pid_chid, nd is not useful. It will
always be zero to the process that registers the name. So names are
only unique to pid_chid. You need to store the FQNN on your
nameserver as well. MAC address may or may not exist.
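
The “MAC_pid_chid” naming idea can be sketched in portable C. This is only an illustration under stated assumptions: the `conn_id` struct and the `conn_id_name()` / `conn_id_path()` helpers are hypothetical names, the /dev/mynames directory is the one from Bill's description, and the MAC field assumes the node actually has a network card (which, as noted above, may not be true). The node part of the path still has to come from somewhere, for example by trying each entry under /net.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size identifier that a process carries in a message. */
struct conn_id {
    uint8_t mac[6];  /* MAC address of one network card, if the node has one */
    int32_t pid;     /* process ID of the registering process */
    int32_t chid;    /* channel ID inside that process */
};

/* Format the registered name "MAC_pid_chid" into buf.
 * Returns the length of the name, or -1 if buf is too small. */
int conn_id_name(const struct conn_id *id, char *buf, size_t len)
{
    assert(id != NULL && buf != NULL);
    int n = snprintf(buf, len, "%02x%02x%02x%02x%02x%02x_%ld_%ld",
                     (unsigned)id->mac[0], (unsigned)id->mac[1],
                     (unsigned)id->mac[2], (unsigned)id->mac[3],
                     (unsigned)id->mac[4], (unsigned)id->mac[5],
                     (long)id->pid, (long)id->chid);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Build the path a client would try to open for a given node name,
 * e.g. /net/<node>/dev/mynames/<MAC_pid_chid>.
 * Returns the path length, or -1 on an empty node name or short buffer. */
int conn_id_path(const char *node, const struct conn_id *id,
                 char *buf, size_t len)
{
    char name[48];  /* 12 hex digits + 2 separators + two 32-bit decimals */

    assert(node != NULL && buf != NULL);
    if (strlen(node) == 0 || conn_id_name(id, name, sizeof name) < 0)
        return -1;
    int n = snprintf(buf, len, "/net/%s/dev/mynames/%s", node, name);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A receiver that gets the three numeric fields in a fixed-length message can rebuild the same name and look for it under each node's /net entry; the node whose directory contains the file is the one to attach to.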

In this model, all client programs simply need to have read
permission of the created files in /dev/mynames… I do not plan any
fancy permission scheme, but it would certainly be possible. This model
also makes it really easy to retain the QNX4 name_locate
functionality with a set of wrapper functions… The reason for the
resmgr at all is to automatically reap stale registrations…

Would something like this work for you?

It will, with some modifications. :-) We also store a queue name and a
domain, and perform the name search with ioctl (for speed) as well as
allowing file system reads. It reduces the problems a fair bit, but
does not eliminate them. For example, I can screw up our name server
by running the name server on one machine and then creating a prefix
link from the other machine to its /dev/nserve directory. Now there
are two different nodes sharing the same name server and some
assumptions we made become false.

I wonder how many more people are writing name servers?

Cheers,
Andrew

On Fri, 23 Feb 2001 15:15:45 -0500, “Andrew Thomas”
<andrew-s-thomas@home.nospam.com> wrote:

If you only want to “identify” process A in the messages A sends out
(my understanding is that these messages may not even be MsgSend/Receive/Reply,
but could be coming through a socket?), then you’d better think
of something other than (nd, pid, chid). IP address, Ethernet MAC address,
CPU id, a fixed table kept in a file, phone number :-), anything that is
fixed length, “globally unique”, and that all your nodes agree on.

That’s not at all what I want. I want to be able to send A’s nid/pid/chid
so that the receiving process can send a message back at a later date.
My goal involves communication among the tasks.

As far as other unique identifiers. What if the processes are on the
same node, and there is no ethernet card? If the processes are peers,
then who gives out the globally unique identifiers? Who stores the
files? Who serializes access so there are no races? I’m looking for
global uniqueness, and nid/pid/chid are supposed to be globally unique
on a QNET.

Yes, the ONLY purpose of transferring (nd/pid/chid) is so that the
receiver can use this “nd/pid/chid” tuple to connect back, or
to connect to a third machine.

That is my purpose. However, as I read your responses, this
information is either not valid, or not knowable. So the “ONLY”
purpose, which I see as being a perfectly valid thing to want to
do, is not even possible, or am I still not understanding?

In most cases, we suggest that applications use a resource manager, and
that servers (channels) take a name in the name space, so others can “open()”
it. That is to say, having A send B a file name and tell B to “open()” it
is more reliable.

But that implies that A and B are both root privileged, and have a
pile of code in them to do resource management. How does the
resmgr library work with Photon? That also breaks one of the
requirements - that the nid/pid/chid be a fixed length, and
preferably small. When you say “most cases”, what do you suggest
for the rest of the cases, like mine?

I’ve been dealing with the same sort of problems on my system as I
port to Neutrino and this is what I came up with – perhaps it will be
somewhat useful. I too did not want every task in my system to be a
resource manager and hence run at root privileges. The scheme I came
up with is to create one resource manager called namesrvr which handles
the filesystem in, say, /dev/mynames. Processes register the connection
information by creating a “file” in the namesrvr and writing the
nid:pid:chid of the connection for the process into the file. The file
handle is not closed until the program detaches the connection or
terminates, which automatically closes the file. Closed files cease to
exist or the contents are reset to -1:-1:-1. (I haven’t fully settled
on this yet.) If you run a namesrvr on each machine, you can search
for the name of the file on all the /net/XXXX/dev/mynames. In this
way you will also be able to extract the name of the node, which should
allow for correction of the local node NID… (i.e. the entry in /net
will be your local name)

Since you only register on your machine, all you need to worry about
is the uniqueness of the local name - a remote name will always have the
node name as a qualifier to separate duplicates. As far as sending the
name - you could register the name for the connection as “nd_pid_chid”,
send the tuple in your message and then search for it in the net
directory. You could additionally add the MAC address of your card
(one would do) to the registered name and send that as well… So maybe
you register “MAC_pid_chid” as your name and pass them numerically
as part of your message… The size is fixed and relatively small and
generating the filename to search for is trivial…

In this model, all client programs simply need to have read permission
of the created files in /dev/mynames… I do not plan any fancy
permission scheme, but it would certainly be possible. This model also makes
it really easy to retain the QNX4 name_locate functionality with a set
of wrapper functions… The reason for the resmgr at all is to
automatically reap stale registrations…

Would something like this work for you?

-Bill

There are servers that use the technique of gathering
“nd_of_me_on_his_machine/mypid/mychid” and passing it back to the remote side,
so the remote side can connect back. Proc, for example, does this
all the time.

But this won’t work through a queue.

The concern is what happens if the “nd” I got changes after I send
it back (Brian’s warning :-) ).

The only reason that an “nd” could become invalid or change is that
QNET detects the machine is down/unreachable and thus sets the
nd to “invalid”, and it will “stay” in invalid mode for
“a while”.

OK, so let’s say that nd is effectively safe. The question still
applies. How to transmit nid/pid/chid to identify A when the
node of B is not known.

If it will make the problem easier, let’s just look at two cases:
MsgSend() and queued messages. Forget about sockets for
the time being. I think that if we can produce a solution for
transmitting a message via a queue where the receiver’s node
is not known, then that will be good enough.

You also didn’t answer the question of whether the FQNN
as a string is actually usable, given that two nodes may have
different ideas of the MAC-to-name mapping.

Cheers,
Andrew

OK, I am starting to understand what you really want :-)

Andrew Thomas <Andrew@cogent.ca> wrote:

Previously, Xiaodan Tang wrote in qdn.public.qnxrtp.os:

In the case of MsgSend(), you need to ConnectAttach() first, which
requires a (nd, pid, chid). If you don’t know the nd of B, then
you call netmgr_strtond() to get the nd.

If I need to call netmgr_strtond() every time I want to send a message
to B, that is really expensive. If I want to find
“my_node_according_to_him” for every message, this is intolerably
expensive. Remember that I’m looking for a way that B can send a
message back to A based on the information that A sent.

I thought A only needs to do this once. If B does a ConnectAttach() back
to A, B can keep the connection, so that A no longer needs to
tell B again where to connect.

Once A and B have a connection, B can MsgSend() to A, and A can
MsgDeliverEvent() to B.

Currently, if you don’t know the node’s name you want to talk
to, you can’t talk to them.

Wrong. I can use a queue. In the more general case, I can use any
non-MsgSend() mechanism to transmit a message, and it will have the
same properties in this regard as a queue.

Well, what I meant by “talk” is ConnectAttach() and MsgSend().

Now, if you are talking about a “non-MsgSend() mechanism”, you
can of course send out “whoever receives this message, send me
back your name, and I will send you a nd/pid/chid to connect to me”
over this “non-MsgSend() mechanism”, can’t you?

if A knows C as “foo”, but B knows C as “bar”, and A sends a
message to B saying “talk to the pid/chid on foo”, B will
return an error saying “I don’t know ‘foo’”.

(There is a way in QNET to “alias” different names to the same node,
but let’s not get into that yet.)

In the above case, who has the knowledge that “foo” is “bar”?

Nobody. That was my question. It seems that FQNN is not sufficient
for A to identify B to C. The frustrating thing here is that if
you stick to open() calls and resource managers, everything is fine.
I just happen to be trying to color outside the rather narrow lines
that seem to have been painted by the nd/pid/chid implementation.

We consider the above case a “misconfigured” network.
If C has 10.1 and 11.1, A knows C as 10.1, and B knows C as 11.1,
how can A ask B to connect to C?
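
The “foo” versus “bar” failure can be made concrete with a small portable C model. This is a simulation only, not QNET code: the `node_view` table and the three helper functions are hypothetical, and they merely stand in for the roles that netmgr_strtond() and its reverse lookup play on a real node. The point it illustrates is that an nd is only meaningful on the node that produced it, so handing a connection to a third party means translating nd to name on the sender and name to nd on the receiver, which works only when both sides agree on the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 8

/* One node's private view of the network. The nd values are local:
 * the same remote machine may have a different nd in another view. */
struct node_view {
    size_t count;
    struct {
        const char *name;  /* node name as this node knows it */
        int nd;            /* node descriptor, valid only in this view */
    } entry[MAX_NODES];
};

/* Local nd -> node name, or NULL if this view has no such nd. */
const char *view_nd_to_name(const struct node_view *v, int nd)
{
    assert(v != NULL);
    for (size_t i = 0; i < v->count; i++)
        if (v->entry[i].nd == nd)
            return v->entry[i].name;
    return NULL;
}

/* Node name -> local nd, or -1 if this view does not know the name. */
int view_name_to_nd(const struct node_view *v, const char *name)
{
    assert(v != NULL && name != NULL);
    for (size_t i = 0; i < v->count; i++)
        if (strcmp(v->entry[i].name, name) == 0)
            return v->entry[i].nd;
    return -1;
}

/* Translate an nd from the sender's view into the receiver's view by
 * going through the name. Returns -1 when the receiver does not know
 * the sender's name for that node ("I don't know 'foo'"). */
int translate_nd(const struct node_view *from,
                 const struct node_view *to, int nd)
{
    const char *name = view_nd_to_name(from, nd);
    return name ? view_name_to_nd(to, name) : -1;
}
```

If A's view lists C as “foo” and B's view also lists “foo”, the translation yields B's own nd for C. If B's view lists the same machine as “bar”, the translation returns -1, which is exactly the misconfigured case described above.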

When all is said and done, I think the answer here is:

  1. A must always know B’s node FQNN in order for A to identify itself
    to B. There is no way for A to use the same information to
    identify itself to two different processes unless it does so using
    its FQNN. This is annoying and won’t work in the case where A does
    not know which node B is on. It is certainly a step backward from
    QNX4.

“A must know B to identify itself (using nd/pid/chid) to B”.

“A could broadcast (A’s FQNN/pid/chid) so whoever receives the message
could connect back to A”. I don’t know why you claim it won’t work.

If A doesn’t know which node B is on, A cannot use the MsgSend() mechanism
to “talk to” B. If A can somehow use some other way to
transfer messages to B, A could either ask for B’s FQNN (so A could
send back nd/pid/chid), or A could send B A’s FQNN (to let B
connect back).

  2. Both A and B must be willing to accept a finite chance that if they
    attempt to attach a connection to one another using nd/pid/chid,
    that connection might in fact go to a completely different process
    on a completely different node (the nd and/or pid are stale). The
    chance seems remote, but it hurts the mathie in me. It may
    actually be lower than the equivalent chance of this happening with
    QNX4 because the PID space is larger in QNX6.

This is true, and that’s the reason for the warning that an application
should never “store” an nd.

  3. All nodes on the network have to agree on the FQNNs of all other
    nodes, otherwise the whole nd/pid/chid scheme appears to fall
    apart. This appears to be a real step-to-the-left relative to the
    experiences of TCP/IP developers, where a name is just a
    convenience to refer to the globally unique identifier, and the
    globally unique identifier is sacred. What we appear to have
    concluded is that QNET has no globally unique identifiers. (And
    no, MAC address isn’t it either. Not all machines have ethernet
    cards. I’ve read the posts saying that QNET is not tied to the
    physical layer).

The FQNN IS (must be) globally unique. If it is not (you have 2
nodes with the same name, for example), QNET will not function properly.

When I started this thread, what I was really looking for is the
implementation limitations of QNET so I knew how to go about solving
my problems. Do the above points accurately describe QNET?

Yes. And this is very good. Sometimes I feel I have been bound to the
MsgSend()/Receive()/Reply() world too long, and need this kind of
fresh air to open my mind :-)

Oh, the “global name” service (name_*) is implemented inside QNET
(internally). So a client can ask for a service without knowing
which node the service is on (kind of like the name server in
your other post).

-xtang

I’m new to this correctness / mathematically provable thing BUT…


assuming RTC’s in all nodes

how about Machine ID + Boot Time + PID … etc ?
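
As a rough illustration of that suggestion, here is a portable C sketch of a fixed-length identifier built from machine ID, boot time and PID. Everything in it is an assumption: the `proc_guid` layout, the field widths, the 16-byte wire size and the helper names are made up for the example, and it presumes every node can supply a machine ID and a real-time clock. It only identifies a process; it does not by itself give a receiver something it can pass to ConnectAttach().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GUID_WIRE_SIZE 16  /* 8 + 4 + 4 bytes, always the same length */

/* Hypothetical identifier: machine ID + boot time + PID. */
struct proc_guid {
    uint64_t machine_id;  /* assumed to be unique per machine */
    uint32_t boot_time;   /* seconds since the epoch at boot, from the RTC */
    uint32_t pid;         /* process ID on that machine */
};

/* Store the low nbytes of v at p, most significant byte first. */
static void put_be(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = nbytes - 1; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Read nbytes at p as a big-endian unsigned value. */
static uint64_t get_be(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Pack into a fixed-size, byte-order-independent buffer. */
void guid_pack(const struct proc_guid *g, uint8_t out[GUID_WIRE_SIZE])
{
    assert(g != NULL && out != NULL);
    put_be(out, g->machine_id, 8);
    put_be(out + 8, g->boot_time, 4);
    put_be(out + 12, g->pid, 4);
}

/* Unpack a buffer produced by guid_pack(). */
void guid_unpack(const uint8_t in[GUID_WIRE_SIZE], struct proc_guid *g)
{
    assert(g != NULL && in != NULL);
    g->machine_id = get_be(in, 8);
    g->boot_time = (uint32_t)get_be(in + 8, 4);
    g->pid = (uint32_t)get_be(in + 12, 4);
}

/* Two identifiers name the same process only if all fields match. */
int guid_equal(const struct proc_guid *a, const struct proc_guid *b)
{
    return a->machine_id == b->machine_id &&
           a->boot_time == b->boot_time &&
           a->pid == b->pid;
}
```

Including the boot time means a PID reused after a reboot yields a different identifier, which addresses the stale-PID worry, though not PID reuse within a single boot.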

Andrew Thomas <Andrew@cogent.ca> wrote:

Hey Andrew…

  1. A must always know B’s node FQNN in order for A to identify itself
    to B. There is no way for A to use the same information to
    identify itself to two different processes unless it does so using
    its FQNN. This is annoying and won’t work in the case where A does
    not know which node B is on. It is certainly a step backward from
    QNX4.

Really? I don’t think so at all. Instead of having to map a node#
to a specific MAC address and keeping that constant on all machines you
only need to keep a mapping of a name to an IP address in /etc/hosts
and you can change one machine in the system without having to touch
all the other machines. And since you control the entries in /etc/hosts
you can force the names to a fixed length.

Here is something I am not really clear on in all your postings. You are
worried that getting the nd from the name will cost you too much if you have
to do it every time you want to send a message. Are you tearing down the
connection between every message? That is the only time you need
to know the nd; once you have a connection established you can keep using it
until it becomes invalid and then, and only then, do you have to look up
the nd of the remote node again. This is very different from QNX4, in which
the connection is tied to the PID. And if you are tearing down the connection
between every message then the cost of looking up the nd is gonna be 0
relative to the cost of setting up the connection over the network. ;-) Maybe
I am missing something in what you said…

chris

Chris McKillop <cdm@qnx.com>    “The faster I go, the behinder I get.”
Software Engineer, QSSL         – Lewis Carroll –

Chris makes a good point that we seem not to have made clear
in previous posts.

Even though keeping an nd is not a good thing, keeping a coid is totally
safe.

Once you have established a connection (ConnectAttach()), that connection
will always be valid until you ConnectDetach(). If the node you are
talking to is gone (even if it comes back later), the next time you
use the connection you will get an error.

-xtang
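
The “look up the nd once, keep the coid, redo the lookup only after an error” pattern can be sketched as follows. To keep the sketch runnable off-target it does not call Neutrino directly: the `conn_ops` hooks, the `cached_conn` struct and the `fake_*` stand-ins are all hypothetical, and on a real system the hooks would presumably wrap netmgr_strtond(), ConnectAttach(), MsgSend() and ConnectDetach().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hooks standing in for the real kernel/network calls. */
struct conn_ops {
    int (*resolve)(const char *node);                    /* name -> nd or -1 */
    int (*attach)(int nd, int pid, int chid);            /* -> coid or -1 */
    int (*send)(int coid, const void *msg, size_t len);  /* 0 or -1 */
    void (*detach)(int coid);
};

struct cached_conn {
    const char *node;  /* node name; the nd itself is never stored */
    int pid, chid;
    int coid;          /* -1 while not connected */
};

/* Send using the cached coid. The name lookup and attach happen only when
 * there is no valid coid. On a send error the coid is dropped, so the next
 * call starts again from the node name. */
int cached_send(struct cached_conn *c, const struct conn_ops *ops,
                const void *msg, size_t len)
{
    assert(c != NULL && ops != NULL);
    if (c->coid < 0) {
        int nd = ops->resolve(c->node);
        if (nd < 0)
            return -1;
        c->coid = ops->attach(nd, c->pid, c->chid);
        if (c->coid < 0)
            return -1;
    }
    if (ops->send(c->coid, msg, len) == 0)
        return 0;
    ops->detach(c->coid);  /* peer gone: the old connection stays dead */
    c->coid = -1;
    return -1;
}

/* Fake implementations so the sketch can be exercised without QNET. */
int resolve_calls = 0;  /* how many name lookups were performed */
int peer_up = 1;        /* toggled to simulate the remote node going away */

int fake_resolve(const char *node) { (void)node; resolve_calls++; return 1; }
int fake_attach(int nd, int pid, int chid)
{
    (void)nd; (void)pid; (void)chid;
    return peer_up ? 3 : -1;
}
int fake_send(int coid, const void *msg, size_t len)
{
    (void)coid; (void)msg; (void)len;
    return peer_up ? 0 : -1;
}
void fake_detach(int coid) { (void)coid; }

const struct conn_ops fake_ops = {
    fake_resolve, fake_attach, fake_send, fake_detach
};
```

Note that after a failure the stored pid/chid may be stale as well if the peer process restarted, so a real client would probably have to learn them again rather than blindly reattaching.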

