Opening a resource manager on a different node?

Hi. I’m continuing to have problems getting our app to span multiple
nodes. Here’s what I’m doing:

This is more or less a direct translation of a QNX 4 app. The main
process, called exec, runs on one node. It uses the resource manager
framework and registers the name “/project/exec”. Based on a config
file, it spawns subordinate processes. These subordinate processes work
fine and the app runs as expected (just slowly - it’s very CPU
intensive) as long as everything is kept on a single node.

Optionally, one or more of the subordinate processes can be spawned on
different nodes. For testing purposes, we have two nodes (called dev-1
and dev-2). Both are running the latest 6.2.1 NC download (the latest
as of a few weeks ago, at least). Exec and most subordinate processes
run on dev-1. I’m trying to spawn a process on dev-2.

After being spawned, the subordinate processes try to open exec’s “file”
using open(). I’ve already discovered that I can’t just open “./exec”,
like I did under QNX 4; instead, I open “/net/dev-1/project/exec”. That
leads me to the first problem: the first call to open() always fails in
the process running on dev-2 (with errno 3 - “No such process”). If I
just repeat the call, the second open() succeeds, or seems to.
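
A minimal sketch of how such a path can be assembled, assuming the /net prefix used in this thread. The helper name node_path() is hypothetical and not a QNX API; it only joins the node name and the registered name and rejects results that would not fit in the buffer.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build "/net/<node><local_path>" into buf.
 * Returns 0 on success, -1 if an argument is invalid or the result
 * would not fit. node_path() is a hypothetical helper, not a QNX API.
 */
int node_path(char *buf, size_t size, const char *node, const char *local_path)
{
    int n;

    if (buf == NULL || size == 0 || node == NULL || node[0] == '\0' ||
        local_path == NULL || local_path[0] != '/') {
        return -1;
    }

    n = snprintf(buf, size, "/net/%s%s", node, local_path);
    if (n < 0 || (size_t)n >= size) {
        return -1;   /* output error or truncated result */
    }
    return 0;
}
```

With node "dev-1" and local path "/project/exec" this fills the buffer with "/net/dev-1/project/exec".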

I then call fcntl() on the resulting file descriptor, and that call
works. But when I call MsgSend() a few lines later with the file
descriptor, MsgSend() fails. Retrying it makes no difference; the call
always fails. In this case, errno is 89 - “Function not implemented”.

In summary, here are my questions:

  1. Is there a better, more QNX-6-ish way of spanning multiple nodes? I
    haven’t had much luck trying to cram everything into the QNX 4 style.

  2. Why does the first call to open() fail, but the second call succeeds?

  3. Why does MsgSend() always fail?

Thanks for any help; I can post code samples or answer questions as
necessary.

Josh Hamacher
FAAC Incorporated

Hi Josh,

  1. Is there a better, more QNX-6-ish way of spanning multiple nodes? I
    haven’t had much luck trying to cram everything into the QNX 4 style.

Not that I know of; this is the way I do it.

  2. Why does the first call to open() fail, but the second call succeeds?

I never experienced this, probably because I wrapped the open() call in a
function that always retries. I did this because I know that a node
isn’t always visible when I want it to be.
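
A minimal sketch of such a retry wrapper, for illustration only. The name open_retry() and its parameters are hypothetical, this is not the actual wrapper referred to above, and it uses only POSIX calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try open() up to `attempts` times, sleeping `delay_sec` seconds after
 * each failure except the last. Returns the file descriptor, or -1 with
 * errno set from the last failed attempt (EINVAL if attempts <= 0).
 * open_retry() is an illustrative helper, not part of any QNX library.
 */
int open_retry(const char *path, int flags, int attempts, unsigned delay_sec)
{
    int saved = EINVAL;
    int fd;
    int i;

    for (i = 0; i < attempts; i++) {
        fd = open(path, flags);
        if (fd != -1) {
            return fd;
        }
        saved = errno;            /* remember why this attempt failed */
        if (i + 1 < attempts) {
            sleep(delay_sec);     /* give the remote node time to appear */
        }
    }

    errno = saved;
    return -1;
}
```

A caller might use open_retry("/net/dev-1/project/exec", O_RDONLY, 10, 1) and log errno only if the final attempt also fails.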

  3. Why does MsgSend() always fail?

It may be because you are using POSIX I/O handling in your resource manager.
If so, it is only looking for _IO_* messages; if it doesn’t find one, it
returns an error to the sender. If you call read() it will probably work
(read() wraps a MsgSend()). There is also an io_func that supports raw
messages, but I don’t have the docs in front of me, so I don’t remember what
it is. My memory is good, it just doesn’t last very long.

Regards,
Kevin

message_attach() is the call you are thinking of…

http://www.qnx.com/developer/articles/index.html?article=jan2601


chris

Chris McKillop <cdm@qnx.com>
Software Engineer, QSSL
“The faster I go, the behinder I get.” (Lewis Carroll)
http://qnx.wox.org/

Just a shot in the dark, but if this program is running at boot time, QNET
takes a few seconds to orient itself in the network. During that time, you
will get errors when referencing other nodes (since none exist as far as
QNET is concerned).

Daryl Low

Thanks to everyone who replied. I tried to consolidate all of my
answers into this message.

  2. Why does the first call to open() fail, but the second call succeeds?

    I never experienced this, probably because I wrapped the open into a
    function that will always try again. I did this because I know that
    sometimes a node isn’t always visible when I want it to be.

I find it a little alarming and a lot inconvenient that the first open()
fails. Granted, the second open() has always succeeded (so far…), but
this still feels wrong. Both nodes have been up for weeks and I can
browse both hard drives from /net/ on either machine, so it’s not a
bootup timing issue or anything like that. The resource manager’s
prefix is visible in the file system before I ever issue the command
that spawns the subordinate process, so it’s not a race condition. Is
this just a Neutrino-ism that I’m going to have to accept?

  3. Why does MsgSend() always fail?

    It may be because you are using POSIX I/O handling in your resource manager.
    If so, it is only looking for _IO_* messages; if it doesn’t find one, it
    returns an error to the sender. If you call read() it will probably work
    (read() wraps a MsgSend()). There is also an io_func that supports raw
    messages, but I don’t have the docs in front of me, so I don’t remember
    what it is. My memory is good, it just doesn’t last very long.

Good call on read(). I inserted a dummy read() call immediately before
the MsgSend(), and it succeeded (or at least it claims it read 56 bytes,
even though I’m using the default io_read handler and would have
expected zero bytes).

I guess I don’t understand what you’re getting at, though. I’ve been
using message type _IO_MSG for all of my internal communications, which
works fine when run on a single node. What aspect of node-to-node
communications causes _IO_MSG to not work while _IO_READ does?

As per your suggestion (at least based on what Chris said), I tried
message_attach() with MSG_FLAG_DEFAULT_FUNC. That didn’t seem to make
any difference: the MsgSend() still failed, and my default handler never
received anything. Did I misinterpret your suggestion? I couldn’t find
anything that looked promising for raw messages, although that tickled
my memory as well and I think I read about something like that myself.

Josh

Ping?

Josh


Josh Hamacher <jh@faac.com> wrote:

Ping?

Do you have a stand-alone test case that shows the failure? Then we could
see what you are doing in your code and it will be much easier to comment
on specifics.

chris


Uggh. Oops.

I started stripping things down as much as I could to create the test
case. And discovered a silly error on my part.

But…

I now have a different problem: the open() call always fails. The
resource manager runs on dev-1 and registers the name “/project/exec”.
This is visible from dev-2; a ‘ls -l /net/dev-1/project/exec’ results in
this output:

nrw-rw-rw- 1 root root 0 Jun 10 18:39 /net/dev-1/project/exec

But, when I do an open() from the subsystem, spawned by exec onto dev-2,
it always fails with ESRCH. I ran into this problem originally and
thought I had gotten around it, but that was the error I mentioned
above. So I just can’t open the resource manager. Any hints? I’ll
continue stripping down the code to post an example of this problem.

Josh

Here is a test case that illustrates the current problem (open() always
failing). There are three files: resmgr.cpp, client.cpp, and Makefile;
the names are self-explanatory.

There are two constants defined at the top of resmgr.cpp; you can change
these to reflect the names of the nodes to be used.

To run the test, compile the two programs and place them both in “/tmp/”
(I have the startup path for the client hardcoded to that location).
Start the resource manager. After five seconds, it spawns the client on
a remote node. The client attempts to open the resource manager once
per second for ten seconds. It exits; 18 seconds after starting, the
resource manager also exits.

All output is to the syslog; here’s an example of what I’ve been getting:

Jun 10 20:33:45 nto resmgr[11743269-1]: spawning client
Jun 10 20:27:55 nto client[1691772-1]: making attempt 1 on /net/dev-1/resmgr
Jun 10 20:27:55 nto client[1691772-1]: Unable to open Exec’s file; errno 309
Jun 10 20:27:56 nto client[1691772-1]: making attempt 2 on /net/dev-1/resmgr
Jun 10 20:27:56 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:57 nto client[1691772-1]: making attempt 3 on /net/dev-1/resmgr
Jun 10 20:27:57 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:58 nto client[1691772-1]: making attempt 4 on /net/dev-1/resmgr
Jun 10 20:27:58 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:59 nto client[1691772-1]: making attempt 5 on /net/dev-1/resmgr
Jun 10 20:27:59 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:00 nto client[1691772-1]: making attempt 6 on /net/dev-1/resmgr
Jun 10 20:28:00 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:01 nto client[1691772-1]: making attempt 7 on /net/dev-1/resmgr
Jun 10 20:28:01 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:02 nto client[1691772-1]: making attempt 8 on /net/dev-1/resmgr
Jun 10 20:28:02 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:03 nto client[1691772-1]: making attempt 9 on /net/dev-1/resmgr
Jun 10 20:28:03 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:04 nto client[1691772-1]: making attempt 10 on /net/dev-1/resmgr
Jun 10 20:28:04 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:05 nto client[1691772-1]: retries exhausted
Jun 10 20:33:58 nto resmgr[11743269-1]: exiting

Thanks for looking at this.

Josh

==================================
Makefile:


all: resmgr client

client: client.cpp
	QCC -o $@ client.cpp

resmgr: resmgr.cpp
	QCC -o $@ resmgr.cpp

resmgr.cpp


#include <errno.h>
#include <fcntl.h>
#include <spawn.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
#include <unix.h>

#include <sys/iofunc.h>
#include <sys/dispatch.h>
#include <sys/netmgr.h>

// Node that runs this resource manager, and node the client is spawned on.
const char *local_node = "dev-1";
const char *remote_node = "dev-2";

static dispatch_t *dpp;
static resmgr_io_funcs_t io_funcs;

int set_up_timing(void);
int timer_handler(message_context_t *ctp, int code, unsigned flags,
                  void *handle);

int main(int argc, char *argv[])
{
    iofunc_attr_t attr;
    resmgr_connect_funcs_t connect_funcs;
    dispatch_context_t *ctp;
    resmgr_attr_t resmgr_attr;

    // Daemonize ourselves.
    switch (fork()) {
    case -1:
        syslog(LOG_ERR, "fork() failed");
        exit(EXIT_FAILURE);
        break;

    case 0:
        break;

    default:
        exit(EXIT_SUCCESS);
        break;
    }

    if (setsid() == -1) {
        syslog(LOG_ERR, "setsid() failed");
        exit(EXIT_FAILURE);
    }

    openlog("resmgr", LOG_PID, LOG_LOCAL0);

    if ((dpp = dispatch_create()) == NULL) {
        syslog(LOG_ERR, "dispatch_create() failed");
        exit(EXIT_FAILURE);
    }

    memset(&resmgr_attr, 0, sizeof resmgr_attr);
    resmgr_attr.nparts_max = 1;
    resmgr_attr.msg_max_size = 2048;

    // Install the default POSIX connect and I/O handlers.
    // NOTE: this call was cut off after "&" in the post; "io_funcs);" is
    // the assumed ending.
    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);

    iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);

    // Register the name "/resmgr".
    // NOTE: this call was cut off after "&connect_funcs," in the post; the
    // remaining arguments and the "== -1" check are assumed.
    if (resmgr_attach(dpp, &resmgr_attr, "/resmgr", _FTYPE_ANY, 0,
                      &connect_funcs, &io_funcs, &attr) == -1) {
        syslog(LOG_ERR, "resmgr_attach() failed");
        exit(EXIT_FAILURE);
    }

    ctp = dispatch_context_alloc(dpp);

    if (set_up_timing() == -1) {
        syslog(LOG_ERR, "set_up_timing() failed");
        exit(EXIT_FAILURE);
    }

    // Main loop: handle resource manager messages and timer pulses.
    while (1) {
        if ((ctp = dispatch_block(ctp)) == NULL) {
            syslog(LOG_ERR, "dispatch_block() failed");
            exit(EXIT_FAILURE);
        }
        dispatch_handler(ctp);
    }
}

//------------------------------------------------------------------------------
// Set the clock period to 0.5 ms and start a 10 ms periodic timer that
// delivers a pulse to timer_handler().
//------------------------------------------------------------------------------
int set_up_timing(void)
{
    struct _itimer itime;
    struct _clockperiod period;
    struct sigevent timerEvent;
    int timerId;

    memset(&period, 0, sizeof(period));
    period.nsec = 500000;

    if (ClockPeriod(CLOCK_REALTIME, &period, NULL, 0) == -1) {
        syslog(LOG_ERR, "ClockPeriod() failed");
        return -1;
    }

    // NOTE: this call was cut off after "&time" in the post; the handler
    // name, the NULL handle and the "== -1" check are assumed.
    if ((timerEvent.sigev_code = pulse_attach(dpp, MSG_FLAG_ALLOC_PULSE,
                                              0, &timer_handler, NULL)) == -1) {
        syslog(LOG_ERR, "pulse_attach() failed");
        return -1;
    }

    // NOTE: this call was cut off after "==" in the post; "-1" is assumed.
    if ((timerEvent.sigev_coid = message_connect(dpp,
                                     MSG_FLAG_SIDE_CHANNEL)) == -1) {
        syslog(LOG_ERR, "message_connect() failed");
        return -1;
    }

    timerEvent.sigev_notify = SIGEV_PULSE;
    timerEvent.sigev_priority = -1;
    timerEvent.sigev_value.sival_int = 0;

    if ((timerId = TimerCreate(CLOCK_REALTIME, &timerEvent)) == -1) {
        syslog(LOG_ERR, "TimerCreate() failed");
        return -1;
    }

    // First expiry after 10 ms, then every 10 ms.
    memset(&itime, 0, sizeof(itime));
    itime.nsec = 10000000;
    itime.interval_nsec = 10000000;
    if (TimerSettime(timerId, 0, &itime, NULL) == -1) {
        syslog(LOG_ERR, "TimerSettime() failed");
        return -1;
    }

    return 0;
}

//------------------------------------------------------------------------------
// Called on every 10 ms timer pulse. Spawns the client on the remote node
// after 500 ticks (5 s) and exits after 1800 ticks (18 s).
//------------------------------------------------------------------------------
int timer_handler(message_context_t *ctp, int code, unsigned flags,
                  void *handle)
{
    static int tickCount = 0;

    tickCount++;

    if (tickCount == 500) {
        char *argv[3];
        struct inheritance inherit;
        memset(&inherit, 0, sizeof(inherit));

        syslog(LOG_INFO, "spawning client");

        // Ask spawn() to start the process on remote_node.
        inherit.flags |= SPAWN_SETND;
        inherit.nd = netmgr_strtond(remote_node, NULL);

        // The client receives our node name as its only argument.
        argv[0] = strdup("client");
        argv[1] = strdup(local_node);
        argv[2] = NULL;
        if (spawn("/tmp/client", 0, NULL, &inherit, argv, NULL) == -1) {
            syslog(LOG_ERR, "spawn() failed; errno %d", errno);
            return -1;
        }
        free(argv[1]);
        free(argv[0]);
    }

    if (tickCount == 1800) {
        syslog(LOG_INFO, "exiting");
        exit(EXIT_SUCCESS);
    }

    return 0;
}


client.cpp


#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    char buf[128];
    int execfd = -1;
    int iii;

    openlog("client", LOG_PID, LOG_LOCAL0);

    // argv[1] is the name of the node running the resource manager.
    if (argc < 2) {
        syslog(LOG_ERR, "missing node name argument");
        exit(EXIT_FAILURE);
    }

    snprintf(buf, sizeof(buf), "/net/%s/resmgr", argv[1]);

    // Try to open the resource manager once per second, up to ten times.
    iii = 1;
    while ((execfd == -1) && (iii < 11)) {
        syslog(LOG_INFO, "making attempt %d on %s", iii, buf);
        if ((execfd = open(buf, O_RDONLY)) == -1) {
            syslog(LOG_ERR, "Unable to open Exec's file; errno %d", errno);
        }
        sleep(1);
        iii++;
    }
    if (execfd == -1) {
        syslog(LOG_INFO, "retries exhausted");
        exit(EXIT_FAILURE);
    }

    syslog(LOG_INFO, "reached here, must have opened resource manager");

    return 0;
}

well, it worked for me! :v)

Jun 10 17:24:21 nto resmgr[4927546-1]: spawning client
Jun 10 17:22:06 nto client[4186138-1]: making attempt 1 on /net/cburgess/resmgr
Jun 10 17:22:07 nto client[4186138-1]: reached here, must have opened resource manager
Jun 10 17:24:34 nto resmgr[4927546-1]: exiting


cburgess@qnx.com

Are you sure that dev-2 sees dev-1 as dev-1? Try spawning ls /net instead
of your client…
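
The same check can be sketched in C, so that the spawned process itself reports whether a given name shows up under /net. dir_has_entry() is an illustrative helper built on POSIX opendir()/readdir(), not a QNX call, and whether /net lists a node before it has been accessed is not something this thread confirms.

```c
#include <dirent.h>
#include <string.h>

/*
 * Return 1 if directory `dir` contains an entry called `name`,
 * 0 if it does not, and -1 if `dir` cannot be opened.
 * dir_has_entry() is an illustrative helper, not a QNX call.
 */
int dir_has_entry(const char *dir, const char *name)
{
    DIR *dp = opendir(dir);
    struct dirent *ent;
    int found = 0;

    if (dp == NULL) {
        return -1;
    }
    while ((ent = readdir(dp)) != NULL) {
        if (strcmp(ent->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(dp);
    return found;
}
```

Run on dev-2, a call such as dir_has_entry("/net", "dev-1") would show whether the name the client is given matches what that node actually sees.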

Josh Hamacher <jh@faac.com> wrote:

This is a multi-part message in MIME format.
--------------080408070601020609090603
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Here is a test case that illustrates the current problem (open() always
failing). There are three files: resmgr.cpp, client.cpp, and Makefile;
the names are self-explanatory.

There are two constants defined at the top of resmgr.cpp; you can change
these to reflect the names of the nodes to be used.

To run the test, compile the two programs and place them both in “/tmp/”
(I have the startup path for the client hardcoded to that location).
Start the resource manager. After five seconds, it spawns the client on
a remote node. The client attempts to open the resource manager once
per second for ten seconds. It exits; 18 seconds after starting, the
resource manager also exits.

All output is to the syslog; here’s an example of what I’ve been getting:

Jun 10 20:33:45 nto resmgr[11743269-1]: spawning client
Jun 10 20:27:55 nto client[1691772-1]: making attempt 1 on /net/dev-1/resmgr
Jun 10 20:27:55 nto client[1691772-1]: Unable to open Exec’s file; errno 309
Jun 10 20:27:56 nto client[1691772-1]: making attempt 2 on /net/dev-1/resmgr
Jun 10 20:27:56 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:57 nto client[1691772-1]: making attempt 3 on /net/dev-1/resmgr
Jun 10 20:27:57 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:58 nto client[1691772-1]: making attempt 4 on /net/dev-1/resmgr
Jun 10 20:27:58 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:27:59 nto client[1691772-1]: making attempt 5 on /net/dev-1/resmgr
Jun 10 20:27:59 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:00 nto client[1691772-1]: making attempt 6 on /net/dev-1/resmgr
Jun 10 20:28:00 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:01 nto client[1691772-1]: making attempt 7 on /net/dev-1/resmgr
Jun 10 20:28:01 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:02 nto client[1691772-1]: making attempt 8 on /net/dev-1/resmgr
Jun 10 20:28:02 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:03 nto client[1691772-1]: making attempt 9 on /net/dev-1/resmgr
Jun 10 20:28:03 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:04 nto client[1691772-1]: making attempt 10 on
/net/dev-1/resmgr
Jun 10 20:28:04 nto client[1691772-1]: Unable to open Exec’s file; errno 3
Jun 10 20:28:05 nto client[1691772-1]: retries exhausted
Jun 10 20:33:58 nto resmgr[11743269-1]: exiting

Thanks for looking at this.

Josh

==================================
Makefile:



all: resmgr client

client: client.cpp
	QCC -o $@ client.cpp

resmgr: resmgr.cpp
	QCC -o $@ resmgr.cpp


resmgr.cpp



#include <errno.h>
#include <fcntl.h>
#include <spawn.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>   // fork(), setsid()
#include <unix.h>

#include <sys/iofunc.h>
#include <sys/dispatch.h>
#include <sys/netmgr.h>

const char *local_node = "dev-1";
const char *remote_node = "dev-2";

static dispatch_t *dpp;
static resmgr_io_funcs_t io_funcs;

int set_up_timing(void);
int timer_handler(message_context_t *ctp, int code, unsigned flags,
                  void *handle);

int main(int argc, char *argv[])
{
    iofunc_attr_t attr;
    resmgr_connect_funcs_t connect_funcs;
    dispatch_context_t *ctp;
    resmgr_attr_t resmgr_attr;

    // Daemonize ourselves.
    switch (fork()) {
    case -1:
        syslog(LOG_ERR, "fork() failed");
        exit(EXIT_FAILURE);
        break;

    case 0:
        break;

    default:
        exit(EXIT_SUCCESS);
        break;
    }

    if (setsid() == -1) {
        syslog(LOG_ERR, "setsid() failed");
        exit(EXIT_FAILURE);
    }

    openlog("resmgr", LOG_PID, LOG_LOCAL0);

    if ((dpp = dispatch_create()) == NULL) {
        syslog(LOG_ERR, "dispatch_create() failed");
        exit(EXIT_FAILURE);
    }

    memset(&resmgr_attr, 0, sizeof resmgr_attr);
    resmgr_attr.nparts_max = 1;
    resmgr_attr.msg_max_size = 2048;

    // Fill both function tables with the default iofunc handlers.
    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);

    iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);

    // Register the name "/resmgr" in the local pathname space.
    if (resmgr_attach(dpp, &resmgr_attr, "/resmgr", _FTYPE_ANY, 0,
                      &connect_funcs, &io_funcs, &attr) == -1) {
        syslog(LOG_ERR, "resmgr_attach() failed");
        exit(EXIT_FAILURE);
    }

    ctp = dispatch_context_alloc(dpp);

    if (set_up_timing() == -1) {
        syslog(LOG_ERR, "set_up_timing() failed");
        exit(EXIT_FAILURE);
    }

    // Main loop: handles client messages and the timer pulse.
    while (1) {
        if ((ctp = dispatch_block(ctp)) == NULL) {
            syslog(LOG_ERR, "dispatch_block() failed");
            exit(EXIT_FAILURE);
        }
        dispatch_handler(ctp);
    }
}

//------------------------------------------------------------------------------
//------------------------------------------------------------------------------
int set_up_timing(void)
{
    struct _itimer itime;
    struct _clockperiod period;
    struct sigevent timerEvent;
    int timerId;

    // Set the system clock period to 0.5 ms.
    memset(&period, 0, sizeof(period));
    period.nsec = 500000;

    if (ClockPeriod(CLOCK_REALTIME, &period, NULL, 0) == -1) {
        syslog(LOG_ERR, "ClockPeriod() failed");
        return -1;
    }

    // The end of this call was cut off in the post; the handler is
    // timer_handler and the final handle argument is assumed to be NULL.
    if ((timerEvent.sigev_code = pulse_attach(dpp, MSG_FLAG_ALLOC_PULSE,
                                              0, &timer_handler, NULL)) == -1) {
        syslog(LOG_ERR, "pulse_attach() failed");
        return -1;
    }

    if ((timerEvent.sigev_coid = message_connect(dpp,
                                                 MSG_FLAG_SIDE_CHANNEL)) == -1) {
        syslog(LOG_ERR, "message_connect() failed");
        return -1;
    }

    timerEvent.sigev_notify = SIGEV_PULSE;
    timerEvent.sigev_priority = -1;
    timerEvent.sigev_value.sival_int = 0;

    if ((timerId = TimerCreate(CLOCK_REALTIME, &timerEvent)) == -1) {
        syslog(LOG_ERR, "TimerCreate() failed");
        return -1;
    }

    // Deliver a pulse every 10 ms, starting 10 ms from now.
    memset(&itime, 0, sizeof(itime));
    itime.nsec = 10000000;
    itime.interval_nsec = 10000000;
    if (TimerSettime(timerId, 0, &itime, NULL) == -1) {
        syslog(LOG_ERR, "TimerSettime() failed");
        return -1;
    }

    return 0;
}

//------------------------------------------------------------------------------
//------------------------------------------------------------------------------
int timer_handler(message_context_t *ctp, int code, unsigned flags,
                  void *handle)
{
    static int tickCount = 0;

    tickCount++;

    // 500 ticks of 10 ms: spawn the client on the remote node after 5 s.
    if (tickCount == 500) {
        char *argv[3];
        struct inheritance inherit;
        memset(&inherit, 0, sizeof(inherit));

        syslog(LOG_INFO, "spawning client");

        inherit.flags |= SPAWN_SETND;
        inherit.nd = netmgr_strtond(remote_node, NULL);

        argv[0] = strdup("client");
        argv[1] = strdup(local_node);
        argv[2] = NULL;
        if (spawn("/tmp/client", 0, NULL, &inherit, argv, NULL) == -1) {
            syslog(LOG_ERR, "spawn() failed; errno %d", errno);
            return -1;
        }
        free(argv[1]);
        free(argv[0]);
    }

    // 1800 ticks of 10 ms: exit 18 s after starting.
    if (tickCount == 1800) {
        syslog(LOG_INFO, "exiting");
        exit(EXIT_SUCCESS);
    }

    return 0;
}


client.cpp



#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>   // sleep()

int main(int argc, char* argv[])
{
    char buf[128];
    int execfd = -1;
    int iii;

    openlog("client", LOG_PID, LOG_LOCAL0);

    // argv[1] is the name of the node running the resource manager.
    sprintf(buf, "/net/%s/resmgr", argv[1]);

    // Try up to ten times, one second apart.
    iii = 1;
    while ((execfd == -1) && (iii < 11)) {
        syslog(LOG_INFO, "making attempt %d on %s", iii, buf);
        if ((execfd = open(buf, O_RDONLY)) == -1) {
            syslog(LOG_ERR, "Unable to open Exec's file; errno %d", errno);
        }
        sleep(1);
        iii++;
    }
    if (execfd == -1) {
        syslog(LOG_INFO, "retries exhausted");
        exit(EXIT_FAILURE);
    }

    syslog(LOG_INFO, "reached here, must have opened resource manager");

    return 0;
}



cburgess@qnx.com

This is getting more and more confusing (to me, at least).

Spawning ‘ls /net’ resulted in the appropriate output; dev-1 and dev-2
both showed up in the output list. But when I changed it to spawn ‘ls
/net/dev-1/’, I got this output:

ls: No such process (/net/dev-1/)

A third change, to ‘ls /net/dev-2/’, gave the expected result (listing
of ‘/’ on dev-2).

But I have a telnet session open on dev-2, and if I run ‘ls /net/dev-1/’
from the command line, it works.

As a final test, I tried spawning date on the remote node. Since the
clocks on the two machines differ by about 8 minutes, it was easy to
verify the results. ‘on -n dev-2 date’ does display dev-2’s clock.

So is something misconfigured in my network?

Josh