Previously, Mitchell Schoenbrun wrote in qdn.public.qnxrtp.os:
This question has been out here for a few days without
reply. It can’t be all that difficult unless for some reason
this interface is not public. Is that the case?
Previously, Mitchell Schoenbrun wrote in qdn.public.qnxrtp.os:
In trying to understand the new NTO message passing I’ve
searched down to where I understand that on the server side
I call resmgr_attach() to attach to a piece of name space
and create a channel to it. On the client side I call
open() to find the channel and make a connection to it.
I can’t find anything lower level. I presume that resmgr_attach()
calls ChannelCreate() and then some routine to bind this channel
to the name space, and that open(), knowing the path requested,
retrieves the channel from the kernel and then calls ConnectAttach()
to make a connection.
Can anyone point me at the missing routines? I’m not
against using the higher level interface. I’d just like to
know what is going on below.
Well, I’ll take a shot at it, Mitchell:
You don’t need resmgr_attach() unless you are writing a resource
manager. At the lowest level, a connection from client to server is
made by the client to an nd/pid/chid tuple on the server, where:
  nd   = node descriptor. This is not a node number, but a number
         that is meaningful only on the client’s machine. Another
         machine cannot use this number to refer to the server’s
         machine. In addition, this nd could change whenever the
         server’s machine drops off the network or is rebooted.
  pid  = the pid of the server, on the server’s node.
  chid = a channel ID on the server. The server had to create
         this.
The client calls
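coid = ConnectAttach (nd, pid, chid, _NTO_SIDE_CHANNEL, 0);
to produce a connection ID, which plays the role of a file descriptor
for this client/server connection. (Passing _NTO_SIDE_CHANNEL as the
index keeps the coid out of the range used for real file descriptors;
the last argument is the flags word.) This connection ID is only
meaningful to the client. The server doesn’t even know that it was
created.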
To send a message from client to server, the client calls:
MsgSend (coid, sendbuffer, sendbytes, recbuffer, recbytes);
which is almost exactly the same as the QNX4 Send() call, except that
the parameters are in a different order.
On the server side, the server calls:
chid = ChannelCreate (0);
to simply create the next available channel. This is the chid that
the client needs to use when calling ConnectAttach().
Now, the server waits for messages:
rcvid = MsgReceive (chid, rcvmsg, maxlen, NULL);
The rcvid is a magic cookie that is used in MsgDeliverEvent and
MsgReply to refer to the client. There is no useful information
embedded in it. The lifetime of a rcvid is “long”, where “long” is
not well documented, but it is definitely longer-lived than the
particular send/receive/reply transaction that created it. You can,
for example, MsgReply to a rcvid, and then use the rcvid later in a
MsgDeliverEvent, and that rcvid will still be valid.
The server does its processing, then calls:
MsgReply (rcvid, status, replymsg, length);
which concludes the S/R/R transaction.
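The calls above can be strung together into a minimal sketch of one S/R/R exchange. The message layout (request_t, reply_t), the OP_DOUBLE opcode, and the function names handle_request, serve and ask are all made up for illustration. The Neutrino-specific parts are guarded by __QNXNTO__ so that the processing step can be built and tried on any machine; treat this as a sketch under those assumptions, not a reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#ifdef __QNXNTO__
#include <sys/neutrino.h>
#endif

/* Hypothetical message layout shared by client and server. */
typedef struct { int op; int value; } request_t;
typedef struct { int result; } reply_t;
enum { OP_DOUBLE = 1 };            /* hypothetical opcode */

/* The "processing" step: no kernel calls, so it builds and runs anywhere.
   The return value is the status handed to MsgReply():
   0 = ok, 1 = unknown opcode. */
int handle_request(const request_t *req, reply_t *rep)
{
    assert(req != NULL && rep != NULL);
    memset(rep, 0, sizeof *rep);
    if (req->op != OP_DOUBLE)
        return 1;
    rep->result = req->value * 2;
    return 0;
}

#ifdef __QNXNTO__
/* Server: create a channel, then receive / process / reply forever. */
int serve(void)
{
    int chid = ChannelCreate(0);
    if (chid == -1)
        return -1;
    /* The client still has to learn this process's pid and chid somehow. */
    for (;;) {
        request_t req;
        reply_t rep;
        int rcvid = MsgReceive(chid, &req, sizeof req, NULL);
        if (rcvid == -1)
            return -1;
        MsgReply(rcvid, handle_request(&req, &rep), &rep, sizeof rep);
    }
}

/* Client: attach to nd/pid/chid, send one request, block for the reply. */
int ask(uint32_t nd, pid_t pid, int chid, int value, int *result)
{
    request_t req = { OP_DOUBLE, value };
    reply_t rep;
    int coid = ConnectAttach(nd, pid, chid, _NTO_SIDE_CHANNEL, 0);
    if (coid == -1)
        return -1;
    long status = MsgSend(coid, &req, sizeof req, &rep, sizeof rep);
    ConnectDetach(coid);
    if (status != 0)       /* -1 = kernel error, 1 = server rejected the op */
        return -1;
    *result = rep.result;
    return 0;
}
#endif
```

On success MsgSend() returns the status that the server passed to MsgReply(), which is why the sketch uses 0 and 1 rather than -1 for its own status codes.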
Now, the 64-dollar question(s):
The server created a chid, which the client needs, yet there is no
apparent way for the client to get this chid. How do I get the chid
to the client? For that matter, how do I get the pid to the client,
and what the heck is nd?
The answer is, “Deal with it.” You can write the chid and pid to a
file, or pass it to the client in its argument list, or pass it
through a pipe, or whatever. The original assumption in QNX6 was that
all servers must be resource managers. With a resource manager you
use resmgr_attach() to create an entry in the filesystem name space,
and an open() on that name generates a coid for the client directly,
bypassing the need for nd/pid/chid. Later, the POSIX purists buckled
under public pressure, and the functions name_attach() and name_open()
were added as analogs to the QNX4 qnx_name_attach() and
qnx_name_locate() calls. The problem is that they do not currently
work over the network, so you have to live with local-node
communication. Like resmgr_attach(), name_attach() creates a
filesystem name, and like open(), name_open() generates a coid.
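The "write the chid and pid to a file" option mentioned above might look something like the following. This is only a sketch in portable C: the function names publish_endpoint and lookup_endpoint and the one-line "pid chid" file format are assumptions made for illustration, not anything QNX provides.

```c
#include <assert.h>
#include <stdio.h>

/* Server side: record where it can be reached, as one line "pid chid".
   Returns 0 on success, -1 on any I/O failure. */
int publish_endpoint(const char *path, long pid, int chid)
{
    assert(path != NULL);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fprintf(f, "%ld %d\n", pid, chid) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Client side: read the pid and chid back.
   Returns 0 on success, -1 if the file is missing or malformed. */
int lookup_endpoint(const char *path, long *pid, int *chid)
{
    assert(path != NULL && pid != NULL && chid != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %d", pid, chid);
    fclose(f);
    return fields == 2 ? 0 : -1;
}
```

The server would call publish_endpoint() with getpid() and the value returned by ChannelCreate(), and the client would feed the values from lookup_endpoint() to ConnectAttach(). Both sides still have to agree on the file name in advance, and a stale file left behind by a dead server will happily hand out a pid and chid that no longer exist.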
As for the nd, this is another magic cookie, only valid on the node on
which it was created. That is, if you have machines A, B, C, then A
could have nd_of_B = 43, nd_of_C = 22. B could have nd_of_A = 22,
nd_of_C = 99. You cannot share nds among tasks on different
machines, though they are sharable among tasks on the same machine.
The nd of your own node is always 0.
You create an nd by doing a name lookup on the node name, which you got
because you just knew it. The node name is typically the hostname
of the machine, unless you specifically stated otherwise in the
parameter list to npm-qnet.so. The call to create an nd is:
nd = netmgr_strtond (nodename, NULL);
The node name is referred to as the Fully Qualified Node Name, or
FQNN. So, in addition to passing the chid and pid of the server to
the client, you must also pass the variable-length FQNN string to the
client. If you are relying on your TCP setup to generate the hostname
for the FQNN, you have to be careful. In QNET, the FQNN must be
unique (one-to-one mapping with a node). In TCP, no such restriction
exists. Consequently, a well-formed TCP network naming strategy can
be a malformed QNET network naming strategy.
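Because an nd is only valid on the machine that created it, the server has to hand out its FQNN and let each client resolve that to an nd locally. A rough sketch of the client side follows. The endpoint_t record, the "fqnn pid chid" text format, the FQNN_MAX limit, and the function names parse_endpoint and attach_endpoint are all assumptions for illustration; only netmgr_strtond() and ConnectAttach() are real Neutrino calls, and they are guarded by __QNXNTO__ so the parsing half builds anywhere.

```c
#include <assert.h>
#include <stdio.h>
#ifdef __QNXNTO__
#include <sys/netmgr.h>
#include <sys/neutrino.h>
#endif

#define FQNN_MAX 255   /* hypothetical cap on node name length */

/* What the server publishes: its node name, pid and channel ID. */
typedef struct {
    char fqnn[FQNN_MAX + 1];
    long pid;
    int  chid;
} endpoint_t;

/* Parse a record of the form "fqnn pid chid".
   Returns 0 on success, -1 if any field is missing. */
int parse_endpoint(const char *record, endpoint_t *ep)
{
    assert(record != NULL && ep != NULL);
    int fields = sscanf(record, "%255s %ld %d", ep->fqnn, &ep->pid, &ep->chid);
    return fields == 3 ? 0 : -1;
}

#ifdef __QNXNTO__
/* Turn the FQNN into an nd on *this* machine, then attach.
   Returns a coid, or -1 on failure. */
int attach_endpoint(const endpoint_t *ep)
{
    int nd = netmgr_strtond(ep->fqnn, NULL);
    if (nd == -1)
        return -1;
    return ConnectAttach(nd, (pid_t)ep->pid, ep->chid, _NTO_SIDE_CHANNEL, 0);
}
#endif
```

The record deliberately carries the name rather than the nd itself: an nd computed by the server, or by a client on another machine, would mean nothing to the process reading it.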
So, the API to perform low-level messaging is simple, and very similar
to QNX4, but the specific information required by that API is subject
to a catch-22. In order for the client to connect to the server, the
server must first send a message to the client with the information
the client needs. But of course the server cannot know how to send
the message to the client. So you need a mailbox, effectively, where
the server places this information. With a resource manager, this is
formalized through the file system name space. The server’s message
to the client is the registration of a name. If you just want to whip
up two processes that send messages like you did in QNX4, then you
need another way.
Incidentally, this problem always existed in QNX4 as well. You needed
to know something about the server - its nid/pid. The QNX4 nameloc
program and the name registry in Proc solved this mapping for you.
The complexity in QNX6 stems from just one thing - nobody has written
a global naming service yet. The resmgr_attach() and name_attach()
functions, since they operate on the file name space of the local
machine, don’t really handle one class of problems: “I know that
service XYZ exists, but I don’t know (or care) which node it is on.
I just want to use it.” The solutions to this question are generally
either an exhaustive search of all available nodes, or storage of
global information at one agreed-upon central location. Both are poor
solutions.
Hope this helps to shed some light,
Andrew