QNX messaging model

Hi, my department is about to embark on its first QNX project
and I need some feedback on QNX process and messaging …

We are designing a layered software architecture for a multi-CPU datacom
system. Each CPU can run up to 4 layers of software:

L1: OS and drivers
L2: real-time monitoring applications

  • must handle events from hardware (polled and interrupt driven)
  • must configure the hardware communication services (e.g. telnet
    server)

L3: managed objects …
These objects

  • must manage configuration commands received from the L4 layer
    (store and apply the data to hardware via L2)
  • must report on HW performance queries from the L4 layer
  • must handle events and fault reports originating from the L2 layer
    and generate appropriate alarms to the L4 layer

L4: external interfaces, e.g. TL-1 agent used to configure and query the NE

Each layer contains a number of components that can be mapped to a QNX
process. A component at a layer may need to communicate with components
above/below on the same CPU or at the same layer (on the same CPU or
across a CPU boundary).

Problem:

What messaging model should we adopt between the components?
Although we tried to abstract our design as much as possible from the
underlying QNX OS it is difficult to avoid. The use of the QNX message
passing seems to be the key to making effective use of the OS.

Our current view:

The design model assumes that all processes will frequently be blocked.
Ideally, the CPU should be idling a majority of the time; otherwise the
system will be inadequate to respond to a number of events occurring
together.
The following paragraphs discuss our current view on how the system
components use a Send hierarchy to avoid deadlocks.

Using the components we identified in our design, a diagram similar to the
root structure of a tree can be drawn. At the bottom tips of the roots are
the processes that interface to the hardware. They always Receive() from
processes above and never Send() to processes above them. Higher up in
the tree are lower-priority processes that have a broader overview of the
entire system but do less detail work: they Send() to the processes below
and Receive() from the processes above. The performance of the system can be
tuned by adjusting the priority levels of the processes.

So we have a Send hierarchy… What this means is that two processes should
never send messages to each other directly; rather, they should be
organized such that each process occupies a level. All Send() calls go from
one level to a lower level, never to the same or a higher level.
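
As a rough illustration of the rule above, the check below flags any Send that does not go to a strictly lower level. It is a sketch in Python rather than QNX code; the component names, the level numbers and the send_violations helper are all made up for this example.

```python
# Hypothetical components and levels; a higher number means higher up the tree.
LEVELS = {"tl1_agent": 4, "managed_obj": 3, "monitor": 2, "driver": 1}

# Hypothetical (sender, receiver) pairs, one per Send relationship in the design.
SENDS = [
    ("tl1_agent", "managed_obj"),
    ("managed_obj", "monitor"),
    ("monitor", "driver"),
]


def send_violations(levels, sends):
    """Return the (sender, receiver) pairs that break the Send hierarchy.

    A Send is allowed only when the receiver sits on a strictly lower level
    than the sender, so same-level and upward Sends are both reported.
    """
    return [(src, dst) for src, dst in sends if levels[dst] >= levels[src]]


if __name__ == "__main__":
    print("clean design:", send_violations(LEVELS, SENDS))
    # An upward Send from the driver is exactly what the rule forbids.
    print("with upward Send:", send_violations(LEVELS, SENDS + [("driver", "monitor")]))
```

Running a check like this over the list of intended Sends during design review may catch a cycle before it turns into a deadlock on the target.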

But there are cases where it is necessary for a process A down the tree to
get a message to a process B up the tree (e.g. to report a fault
condition). Since all processes are typically Receive() blocked (on
processes above or below them), they can receive a message. So process
A down the tree can use a pulse to tell process B to Send() a
message, and A can then use Reply() to pass the data.
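
The pulse-then-Send sequence described above can be tried out off-target. The sketch below is plain Python with threads and queues, not the QNX API: the Channel class, FAULT_PENDING and the message strings are hypothetical stand-ins. Here send() plays the role of a blocking Send(), pulse() the role of a non-blocking pulse, and putting a value into reply_box the role of Reply().

```python
import queue
import threading

FAULT_PENDING = 1  # hypothetical pulse code


class Channel:
    """Toy stand-in for a message channel (not the QNX API)."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, msg):
        """Queue a message and block until the receiver replies."""
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put(("msg", msg, reply_box))
        return reply_box.get()

    def pulse(self, code):
        """Queue a small notification and return without blocking."""
        self._inbox.put(("pulse", code, None))

    def receive(self):
        """Block until a message or pulse arrives."""
        return self._inbox.get()


def lower_a(chan_a, chan_b):
    """Process A, low in the tree: never Sends upward, only pulses."""
    fault = "link down on port 3"      # made-up fault data
    chan_b.pulse(FAULT_PENDING)        # cannot block, so cannot deadlock
    kind, msg, reply_box = chan_a.receive()
    if kind == "msg" and msg == "get_fault":
        reply_box.put(fault)           # the Reply carries the data upward


def upper_b(chan_a, chan_b, results):
    """Process B, higher in the tree: Sends downward when pulsed."""
    kind, code, _ = chan_b.receive()
    if kind == "pulse" and code == FAULT_PENDING:
        results.append(chan_a.send("get_fault"))


def run_demo():
    chan_a, chan_b = Channel(), Channel()
    results = []
    threads = [
        threading.Thread(target=lower_a, args=(chan_a, chan_b)),
        threading.Thread(target=upper_b, args=(chan_a, chan_b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

The only blocking Send in the exchange goes from B down to A, so the hierarchy is preserved even though the data ends up flowing upward.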

Another case is when two processes on the same level need to notify
one another of some event. They cannot blindly do a Send() since it could
result in a deadlock. One approach we are thinking of taking is to send a
pulse, which notifies the other process to do a Send(). That Send() is safe
because the process that sent the pulse has been designed to Reply()
immediately and is waiting to do so.

Questions

  • Send hierarchy … from my investigation this seems to be the QNX way to
    design messaging… But should we be using pulses in such a generic
    way? Is this approach common? Can we avoid the pulses and get messages
    flowing safely up and down without risking deadlocks?

  • One concern I have is that this approach using pulses will make our
    design very difficult to port to another OS. Any suggestions on how to
    abstract this messaging model from the applications?

D Vezeau <danny.vezeau@sympatico.ca> wrote:

  • Send hierarchy … from my investigation this seems to be the QNX way to
    design messaging… But should we be using pulses in such a generic
    way? Is this approach common? Can we avoid the pulses and get messages
    flowing safely up and down without risking deadlocks?

Sticking to a strict send-hierarchy is always a good idea, esp. for your
first cut of the project. Pulses are exactly what you want.
There are ways of breaking send-hierarchies by having multiple threads,
two threads per process for two processes. Thread A in process 1 is
effectively at a lower level in the send hierarchy than thread A in process 2.
Thread B in process 1 is effectively at a higher level in the send hierarchy
than thread B in process 2. In this manner, you can “virtually” break
the send hierarchy. I’d love to know if you implement that, because that’s
the topic of one of the chapters (titled “Bad Boys” :-) ) in my upcoming book,
and having a real-world “we needed to do this because” kind of thing is
always useful.
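
One possible reading of the two-threads-per-process arrangement, simulated in plain Python rather than with QNX calls. Everything here is an assumption made for illustration: the Channel class, the thread roles and the message strings. Each process gets one thread that only receives and replies, plus one thread that only sends, so both processes can Send to each other at the same moment without either receiver being stuck in a Send.

```python
import queue
import threading


class Channel:
    """Toy stand-in for a message channel (not the QNX API)."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, msg):
        """Queue a message and block until the receiver replies."""
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put((msg, reply_box))
        return reply_box.get()

    def receive(self):
        """Block until a message arrives."""
        return self._inbox.get()


def server_thread(name, chan):
    """Receive-only thread: it never Sends, so it can always Reply promptly."""
    while True:
        msg, reply_box = chan.receive()
        if msg is None:                # hypothetical shutdown marker for the demo
            reply_box.put(None)
            return
        reply_box.put(f"{name} handled {msg}")


def client_thread(peer_chan, msg, out, key):
    """Send-only thread: blocks in send() until the peer's server replies."""
    out[key] = peer_chan.send(msg)


def run_demo():
    chan1, chan2 = Channel(), Channel()   # one channel per process
    out = {}
    servers = [
        threading.Thread(target=server_thread, args=("process1", chan1)),
        threading.Thread(target=server_thread, args=("process2", chan2)),
    ]
    clients = [
        threading.Thread(target=client_thread, args=(chan2, "event from 1", out, "p1")),
        threading.Thread(target=client_thread, args=(chan1, "event from 2", out, "p2")),
    ]
    for t in servers + clients:
        t.start()
    for t in clients:
        t.join()
    chan1.send(None)                      # stop both servers
    chan2.send(None)
    for t in servers:
        t.join()
    return out


if __name__ == "__main__":
    print(run_demo())
```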

  • One concern I have is that this approach using pulses will make our
    design very difficult to port to another OS. Any suggestions on how to
    abstract this messaging model from the applications?

There’s no real difference (heh) between a pulse and a signal at some
level of abstraction… I’ve worked at a company where they were very
concerned that they might need to change OSs. So they went and put stupid
cover functions over all of the QNX native calls. Their code turned out
to be slower. I told them over and over that if they change OSs, whether
they called their message passing function MsgSend or our_stupid_MsgSend
will be the least of their problems. “Whatever” was their reply.
So now, they’ve committed a large part of their resources to using QNX,
and are wondering why they are having performance problems.

Sigh.

The lesson learned there is that if you want cover functions to hide the
OS implementation, you have two choices. You can just put a layer of
gunk between yourself and the OS, and change nothing in the fundamental
paradigm that you’re using to communicate between your processes, or
you can use an “already portable” set of functions (like sockets, signals,
pipes, whatever) and incur efficiency losses. Or you can bite the bullet,
design your system cleanly, and be able to migrate functionality easily
to another OS because your design is so clean.

Cheers,
-RK

[If replying via email, you’ll need to click on the URL that’s emailed to you
afterwards to forward the email to me – spam filters and all that]
Robert Krten, PDP minicomputer collector http://www.parse.com/~pdp8/