Neutrino IPC over network

Amit_Bhatnagar · January 4, 2001, 3:20am

Hello, I have a problem that I can’t figure out.

What I want is to have 3 apps running on 3 different machines.
The key here is that all of the apps can send and receive IPC
between each other (so none of them are really a “server”).

Now if these 3 apps were on the same box… it’d be easy… I
would get app1 to create a channel and spawn() apps 2 & 3,
passing app1’s pid/chid to each via command line args. Apps
2 & 3 would create their chid and pass it back to app1 via
a MsgSend. With a couple of MsgSend()'s by app1, all of the
apps have everything they need to know for IPC to occur between
any of the processes.

Now taking scalability into account, there is a possibility
that the processes could be on separate machines. In this
case, the above procedure wouldn’t work because I couldn’t
find out the PID of the other apps on different boxes (aka,
I can’t spawn them over the network).

Is writing these apps as a resource manager my only option?
I can’t seem to find any -clear- documentation on how to
set up a resource manager for the aforementioned problem.

Any help would be greatly appreciated.
Amit.

system · January 4, 2001, 1:53pm

Well, the answers kind of depend on your timeframe. There
will eventually be global names available for just this
kind of problem (name_attach), so all 3 apps could register
a global name, and the others could look it up. But this
functionality is not yet available, and I am not sure
what kind of a timeframe is expected.

You would not need to do anything special to set up the app
as a resource manager for this purpose, but the names will
still have to be searched for. If your apps all registered
a name like /dev/app1, you could readdir() through /net to
get a list of machines available on the network, and then
look for the other apps names on each machine. If you find
the name, do an open to get a connection. If you want to be
easily able to switch over to global names when they become
available you may want to make a loader app that spawns your
apps, then registers the name itself. It could then just handle
opens and reads, and return a structure containing a nid/pid/chid
of the app when anyone does a read from it. Then your apps would
just have to do the readdir() of /net, and search for the names.
When global names become available, you could just change that
function to call name_open().
Hope this helps,

Peter
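
As a rough illustration of the lookup Peter describes, the sketch below walks a /net-style directory and checks each node for a registered name. It is written in Python purely for brevity (a QNX program would do the same with readdir() and open() from C). The function name `find_registered_name` and the `net_root` parameter are made up for this example; `/dev/app1` is the example name from the post.

```python
import os


def find_registered_name(name, net_root="/net"):
    """Return every path net_root/<node>/<name> that exists, one per node.

    Hypothetical helper: each directory entry under net_root is assumed
    to be one machine on the network, as described in the post.
    """
    found = []
    for node in sorted(os.listdir(net_root)):
        # "/dev/app1" on node "nodeA" becomes "/net/nodeA/dev/app1"
        candidate = os.path.join(net_root, node, name.lstrip("/"))
        if os.path.exists(candidate):
            found.append(candidate)
    return found
```

Each returned path could then be opened to get a connection to that app. If global names become available later, only this one function would need to change.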

Steve_Munnings_Corma · January 4, 2001, 4:52pm

Since this appears to be RTP, I cannot give THE definitive answer (only QSSL
could), but if you cannot spawn over the network now, I believe that the
plan is to allow you to spawn over the network in the future. In QNX4, it
is a piece of cake! This type of facility is part of QNET functionality,
and I believe that QNET is still in beta. Also, when QNET is fully
functional, another alternative is to use a “qnx_name_locate()” equivalent.
qnx_name_locate() is a QNX4 thing that I would be absolutely shocked if the
RTP QNET did not support.

Jay_Hogg · January 5, 2001, 2:39pm

One other major issue…

You need to look at how you design the application because normally
you should never have 2 applications that “send” to each other
because of the possibility of a deadlock - Notification is usually done
via Proxies(qnx4)/Pulses(nto) and somebody is in charge.

3+ processes just means the entire network of machines can become
deadlocked…

Jay

David_Gibbs1 · January 5, 2001, 5:04pm

Jay Hogg <jh@fastlane.net.r-e-m-o-v-e> wrote:

One other major issue…

You need to look at how you design the application because normally
you should never have 2 applications that “send” to each other
because of the possibility of a deadlock - Notification is usually done
via Proxies(qnx4)/Pulses(nto) and somebody is in charge.

Of course, this isn’t quite as true when you get into threaded applications.
If a process has a Send thread and a Receive thread, then a pair of processes
can be “peers” and send to each other. (Only true under Neutrino.)

Still, this isn’t usually a good design – client-server architecture
generally does work better. And, even under Neutrino, all of the
process location code, and channel setup code, etc, does assume
a client-server architecture. You’d still have to do a bit of extra
shoe-horning to get peer-to-peer.

-David

QNX Training Services
dagibbs@qnx.com
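
The difference between single-threaded peers and peers with a dedicated receive thread can be modelled in a few lines. The following is a toy Python model, not Neutrino code: queues stand in for channels, the `Peer` class and the `exchange` function are invented for the illustration, and a timeout is used only so that the deadlock shows up as a `None` result instead of hanging the program (a real MsgSend() without a timeout would simply stay blocked).

```python
import queue
import threading


class Peer:
    """Toy model of a process that owns one channel (not the QNX API)."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()  # stands in for the channel

    def send(self, target, msg, timeout):
        """Blocking send: return the reply, or None if none arrived in time."""
        reply_box = queue.Queue(maxsize=1)
        target.inbox.put((msg, reply_box))
        try:
            return reply_box.get(timeout=timeout)
        except queue.Empty:
            return None

    def serve_one(self, timeout):
        """Receive one message and reply to it; return False if none arrived."""
        try:
            msg, reply_box = self.inbox.get(timeout=timeout)
        except queue.Empty:
            return False
        reply_box.put(f"{self.name} got {msg}")
        return True


def exchange(dedicated_receiver, timeout=0.5):
    """Let two peers send to each other at the same time; return their replies."""
    a, b = Peer("A"), Peer("B")
    results = {}

    def run(me, other):
        rx = None
        if dedicated_receiver:
            # A separate thread receives while this thread is busy sending.
            rx = threading.Thread(target=me.serve_one, args=(timeout,))
            rx.start()
        results[me.name] = me.send(other, "hello", timeout)
        if rx is not None:
            rx.join()
        # Without a receive thread nobody ever serves the inbox: each peer
        # is stuck in its own send, so both sends time out.

    threads = [threading.Thread(target=run, args=pair) for pair in ((a, b), (b, a))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this model `exchange(dedicated_receiver=False)` leaves both peers without a reply, while `exchange(dedicated_receiver=True)` lets both sends complete, because the thread that receives never has to wait on its own send.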

Mario_Charest1 · January 5, 2001, 8:19pm

“Amit Bhatnagar” <amit@dal.ca> wrote in message
news:935958$4vu$1@News.Dal.Ca…

Steve Munnings, Corman Technologies <steve@cormantech.com> wrote:

: Amit Bhatnagar <amit@dal.ca> wrote in message
: news:930q5k$8vm$2@News.Dal.Ca…
:> Hello, I have a problem that I can’t figure out.
:
:> what I want is to have 3 apps running on 3 different machines.
:> The key here is that all of the apps can send and receive IPC
:> between each other (so none of them are really a “server”).

[snip]…

well…after much reading… my next question is “What is wrong
with doing it as a resource manager?”

Absolutely nothing.

What is in question is that your resource managers send
messages to each other. Making them resource managers doesn’t
reduce the risk of potential deadlock.

Jay_Hogg · January 6, 2001, 6:46am

Amit Bhatnagar wrote in message <935ksi$ai8$1@News.Dal.Ca>…

Mario Charest <mcharest@void_zinformatic.com> wrote:

:> well…after much reading… my next question is “What is wrong
:> with doing it as a resource manager?”

: Absolutely nothing.

: What is in question is that your resource managers send
: messages to each other. Making them resource managers doesn’t
: reduce the risk of potential deadlock.

Ohh I see, so I am tackling the problem in a good way (a la
resource manager), however the potential deadlock is because
one app may not be able to reply to the sending app (thus making
the sender reply blocked). Is that right? How else is there
potential deadlock?

If you don’t have enough threads to handle the maximum number of
requests that could be sent to you at one time.

With 2 nodes it is 2 threads.
N1t1 sends to N2 (and t1 or t2 can handle it) while
N2tx sends to N1 (and the other thread is available)
(note this doesn’t account for the processes that really need the RM!)

With ‘N’ nodes it gets tricky unless you run with N+1 threads because there
is always a chance that you could get 2 processes attempting to send to
each other and no threads are available to handle the request. If you have
mutexes/condvars you could get into a hopeless deadlock that would be
really hard to debug.

In Nto it is possible (QNX4 was a definitive not possible unless you had a
method of synchronization between processes) but I think you need to
look carefully at what you are trying to do. This design has a process
that is both a “server” and a “peer” - one way would be to have a single
thread in each RM that is responsible for talking to all the other peers
so some amount of segmentation exists (a thread that does MsgReceive
will never issue a MsgSend (or call a function that does) to a process
it could receive from)

Just my 2 cents - the simpler the design, the easier to implement, the
more robust down the road, and you can hand it off to somebody else.

(Mario and David - Yes I forgot about threads and dispatch handling
in Nto, but I also believe there are right and wrong places to use
threads and a lot of threads are used because someone couldn’t find
a “correct” way to do something; the thread was a work-around as
opposed to a “design” issue.)

Jay

Amit_Bhatnagar · January 8, 2001, 3:58pm

Jay Hogg <jh@fastlane.net.r-e-m-o-v-e> wrote:
: If you don’t have enough threads to handle the maximum number of
: requests that could be sent to you at one time.

what actually happens when a process sends a message to another process
(and assume that it is on a different box, as in my scenario), and the
receiving app is not ready to receive the message (ie, not waiting on
MsgReceive())? Also what happens when the processes -are- on the same box,
in this case?

David_Gibbs1 · January 8, 2001, 6:49pm

Amit Bhatnagar <amit@dal.ca> wrote:

Jay Hogg <jh@fastlane.net.r-e-m-o-v-e> wrote:
: If you don’t have enough threads to handle the maximum number of
: requests that could be sent to you at one time.

what actually happens when a process sends a message to another process
(and assume that it is on a different box, as in my scenario), and the
receiving app is not ready to receive the message (ie, not waiting on
MsgReceive()) ?

The sending process (really thread) is put into the SEND blocked state.
Some part of the message (the whole thing in QNX4, where messages were
limited to 64K; the first 8K in the current Neutrino implementation)
will be copied across the network in advance to reduce latency, so that
it is already available when the receiving side calls MsgReceive().

Also what happens when the processes -are- on the same box,
in this case?

The process (thread) is put in the SEND blocked state.

-David

QNX Training Services
dagibbs@qnx.com

Mario_Charest1 · January 9, 2001, 8:27pm

“Amit Bhatnagar” <amit@dal.ca> wrote in message
news:93fm49$ept$1@News.Dal.Ca…

David Gibbs <dagibbs@qnx.com> wrote:
: Amit Bhatnagar <amit@dal.ca> wrote:
:> Jay Hogg <jh@fastlane.net.r-e-m-o-v-e> wrote:
:> : If you don’t have enough threads to handle the maximum number of
:> : requests that could be sent to you at one time.

:> what actually happens when a process sends a message to another process
:> (and assume that it is on a different box, as in my scenario), and the
:> receiving app is not ready to receive the message (ie, not waiting on
:> MsgReceive()) ?

: The sending process (really thread) is put into the SEND blocked state.

[snip.,…]

So I guess I should look at a multi-threaded solution. But say I have
10 threads, and it receives 11 requests… what will happen to the 11th
request? Will it stick around until one of the worker threads frees up,
or will I lose that request altogether?

It will wait (block) until a thread is freed.
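
A small model can make the 11-requests-on-10-threads case concrete. The sketch below is Python, not Neutrino code, and `run_requests` is an invented name: a fixed pool of worker threads handles more requests than there are workers, and the extra request simply waits its turn. One difference from the real system is worth noting: here the pool's internal queue holds the waiting request, whereas under Neutrino the sending thread itself stays SEND blocked until a receiver calls MsgReceive().

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_requests(workers, requests, service_time=0.05):
    """Serve `requests` jobs on `workers` threads.

    Returns (ids of completed requests, peak number handled at once).
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(req_id):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(service_time)  # pretend to do the work
        with lock:
            active -= 1
        return req_id

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order once each is done
        done = list(pool.map(handle, range(requests)))
    return done, peak
```

With `run_requests(10, 11)` all 11 request ids come back and the peak never exceeds 10: the eleventh request is delayed, not lost.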

Another thing that I would like to implement is setting up a timeout
for each message send that I would do. If I don’t receive a reply for
any given MsgSend() within a certain amount of time, I can record that
as an error and properly shut down the app (avoiding the deadlock).

Yes, you can do that; check TimerTimeout(). But it’s not a way to avoid
deadlock: you are not avoiding deadlocks, you are detecting them and
working around them, which is very different.

You should avoid deadlock at the source, in the design.

Does this make sense? Is it quite easy to implement?