Inter-node spawn confused about node name

Now this is weird.

I have a program on node “fox” that spawns another program on node
“bobcat”. This works, and the program really launches on “bobcat”;
I can see it in “ps” on “bobcat”.

In that program, I call

char mynodename[512];
int cnt = netmgr_ndtostr(0,ND_LOCAL_NODE,mynodename,sizeof(mynodename));

which should return “bobcat”. But, in fact, it returns
“fox” (or “fox.overbot.org”, actually), even though the program
is running on “bobcat”.

What’s going on here? Does the node ID inherit across spawn?
That’s not right.

John Nagle

John Nagle <nagle@overbot.com> wrote:

What’s going on here? Does the node ID inherit across spawn?
That’s not right.

Yes it is - since your process is still rooted at the node you spawned it
from. So if you try to access /tmp it will be /tmp on fox. If you want it
to be rooted on the other node just chroot() before you start the process.

chris


Chris McKillop <cdm@qnx.com>
Software Engineer, QSSL
http://qnx.wox.org/
“The faster I go, the behinder I get.” -- Lewis Carroll
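
To check which node a process is really running on, as opposed to the node it is rooted on, one option is uname(), which the getnodend() code later in this thread also relies on. The sketch below is only an illustration: running_node_name() is a hypothetical helper name, and it assumes uname() reports the executing node ("bobcat" in this thread) even when the root is still "fox".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical helper (not from the thread): copy the name of the node this
 * process is executing on into buf. Returns 0 on success, -1 if uname()
 * fails or buf is too small to hold the name and its terminator. */
int running_node_name(char *buf, size_t len)
{
    struct utsname info;

    assert(buf != NULL && len > 0);     /* caller must supply a real buffer */
    if (uname(&info) < 0)
        return -1;

    size_t n = strlen(info.nodename);
    if (n >= len)
        return -1;                      /* name would not fit */
    memcpy(buf, info.nodename, n + 1);  /* copy including the '\0' */
    return 0;
}
```

Comparing this name with what netmgr_ndtostr() reports for ND_LOCAL_NODE is one way to detect that the process was spawned from another node.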

OK. I accept that “rooted on” and “running on” can be
different.

It would be appreciated if this were to be
documented in the QNX help files.

John Nagle
Team Overbot

But the node number for ConnectAttach has to be the
real remote node number. That is, if node “fox” spawns
a process on node “bobcat”, and the process on “bobcat”
wants to talk to a channel opened by the process on “fox”,
it has to do a ConnectAttach specifying the node descriptor
of “fox”, not “bobcat”. ConnectAttach needs a valid
nd/pid/chid triplet. You can’t use a node descriptor of 0
to refer to another node. So a “rooted” node descriptor
is useless for ConnectAttach.

Even worse, if I actually do

netmgr_strtond("fox",0)

on “bobcat”, I get 0 returned. That’s just wrong.

How do I get real node descriptors and numbers in
a spawned process? “chroot” only works if you’re
“root”, and I’m trying hard to avoid running
programs as root.

John Nagle

Here’s the solution - how to look up a node descriptor given
a node name. This works even if the program running it was
launched via an inter-node spawn.

John Nagle
Team Overbot

//
// getnodend - get usable node descriptor given node identifier string
//
// This works even if the calling process is spawned from a remote node.
//
// When a process is spawned by a remote root node, calls to the "netmgr_"
// functions behave as if the process is on the remote root node. But
// calls to ConnectAttach require the node number as valid for the
// local node. Thus, some conversions are required.
//
#include <assert.h>
#include <sys/netmgr.h>
#include <sys/utsname.h>

int getnodend(const char* remotename)
{
    assert(remotename);                         // must be valid remote node name
    // Get local node name
    struct utsname thisnodeinfo;                // local info
    if (uname(&thisnodeinfo) < 0) return(-1);
    // get local node's node descriptor in root node space
    int localrootnd = netmgr_strtond(thisnodeinfo.nodename,0);
    if (localrootnd < 0) return(-1);
    // get remote node's node descriptor in root node space
    int remoterootnd = netmgr_strtond(remotename,0);
    if (remoterootnd < 0) return(-1);
    // Now we have all the node descriptors.
    // Translate the remote node descriptor into local node space.
    int remotelocalnd = netmgr_remote_nd(localrootnd, remoterootnd);
    if (remotelocalnd < 0) return(-1);
    return(remotelocalnd);  // valid local node descriptor for remote node
}

If you spawn() a process on another node, as Chris points out, your root
is still “/net/fox”. So anything that involves a pathname goes back to the
original node.

The netmgr_*() functions make a connection to “/dev/netmgr”, which, in the
spawned case, actually connects to “/net/fox/dev/netmgr”. That’s why you got
that name.

ConnectAttach(), on the other hand, doesn’t involve a pathname, so …

If you want the process to run on the remote node as if you had started it
on the remote node, you should do a chroot("/net/bobcat"); before spawn().

-xtang
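
The explanation above can be made concrete with a toy model. It is only a sketch under stated assumptions and not how Qnet represents descriptors internally: every toy_* name is hypothetical, descriptor 0 stands for "the node that answered the query" (as ND_LOCAL_NODE does), and toy_remote_nd() plays the role that netmgr_remote_nd() has in the getnodend() code above.

```c
#include <assert.h>
#include <string.h>

/* Toy model only. Descriptor 0 means "the node that answered the query";
 * any other node gets a nonzero descriptor that is meaningful only to the
 * node that handed it out. */
#define TOY_LOCAL_NODE 0

static const char *const toy_nodes[] = { "fox", "bobcat" };
#define TOY_NODE_COUNT ((int)(sizeof toy_nodes / sizeof toy_nodes[0]))

/* Position of a node name in toy_nodes, or -1 if unknown. */
static int toy_index(const char *name)
{
    for (int i = 0; i < TOY_NODE_COUNT; i++)
        if (strcmp(toy_nodes[i], name) == 0)
            return i;
    return -1;
}

/* Descriptor for `target` as answered by the network manager on `viewer`. */
int toy_strtond(const char *viewer, const char *target)
{
    assert(viewer && target);
    int v = toy_index(viewer), t = toy_index(target);
    if (v < 0 || t < 0)
        return -1;
    return (v == t) ? TOY_LOCAL_NODE : t + 1;
}

/* Node name that `nd` refers to when interpreted on `viewer`, or NULL. */
static const char *toy_ndtostr(const char *viewer, int nd)
{
    if (nd == TOY_LOCAL_NODE)
        return toy_index(viewer) >= 0 ? viewer : NULL;
    return (nd >= 1 && nd <= TOY_NODE_COUNT) ? toy_nodes[nd - 1] : NULL;
}

/* Re-express `nd` (valid on `viewer`) as a descriptor valid on the node
 * that `viewer` knows as `new_viewer_nd`. */
int toy_remote_nd(const char *viewer, int new_viewer_nd, int nd)
{
    const char *new_viewer = toy_ndtostr(viewer, new_viewer_nd);
    const char *target = toy_ndtostr(viewer, nd);
    if (new_viewer == NULL || target == NULL)
        return -1;
    return toy_strtond(new_viewer, target);
}
```

In this model a process running on "bobcat" but rooted on "fox" has its lookups answered by fox, so toy_strtond("fox", "fox") yields 0. Used directly on bobcat, that 0 would mean bobcat itself. Passing it through toy_remote_nd("fox", toy_strtond("fox", "bobcat"), 0) gives the nonzero descriptor that bobcat would use for fox.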

John Nagle <nagle@overbot.com> wrote:
: OK. I accept that “rooted on” and “running on” can be
: different.

: It would be appreciated if this were to be
: documented in the QNX help files.

We recently looked at all of the docs concerning Qnet, and we plan to
improve them for 6.3. Your problem sounds like a good one to address.
Thanks for the suggestion.


Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems

“chroot” is an interesting option. The documentation says
that you have to be “root” to use it, but that is apparently incorrect.
Also, “chroot” is per-process, not per-thread, so I’d have
to do some extra local process forking.

John Nagle
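
For reference, the chroot() step being discussed might look like the sketch below. reroot_to_node() is a hypothetical helper, the "/net/<node>" path follows the form used earlier in this thread, and the _DEFAULT_SOURCE guard is only an assumption to expose chroot() when building against glibc. Because chroot() changes the root of the whole calling process, every thread is affected, which is why an extra helper process would be needed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE     /* assumption: exposes chroot() on glibc hosts */
#endif
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make "/net/<node>" the root of the CALLING process,
 * so that processes it starts afterwards inherit that root.
 * Returns 0 on success, -1 on failure. */
int reroot_to_node(const char *node)
{
    char path[256];

    assert(node != NULL);
    int n = snprintf(path, sizeof path, "/net/%s", node);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;              /* node name too long for the buffer */
    if (chroot(path) < 0)
        return -1;              /* e.g. no such path, or not permitted */
    return chdir("/");          /* keep the working directory inside the new root */
}
```

A caller would invoke reroot_to_node("bobcat") in the process that is about to do the spawn, and treat a return of -1 as "fall back to translating node descriptors instead".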

As a final note here, QNX doesn’t support “fork” for multi-threaded
programs. You’re supposed to use “spawn” in that case. So a solution
that requires “fork” is undesirable.

John Nagle
