John Nagle wrote:
Andrew Thomas wrote:
Armin Steinhoff wrote:
That’s the reason why middleware like PVM and MPI have been created.
These packages are doing message passing on top of the socket lib.
Or, if you have legacy QNX4 code, or just like the kernel-mediated
message passing, proxies, etc. of QNX, you could use the SRRIPC module
for Linux. It is basically QNX4 message passing, proxies, timers, and
to some degree user-space interrupts for Linux.
None of those do as good a job of message passing as QNX,
because the message passing and scheduling aren’t as well
coupled. SRRIPC takes two copies and an extra trip through
the scheduler to do what QNX does with one copy and one context switch.
SRRIPC was done for the 2.2.x Linux kernel, and it’s not
clear how well it works with the 2.6 kernel. It never really
got out of beta, either. You definitely
want a 2.6 Linux kernel for anything even vaguely real-time. That’s
the one with the low-latency fixes.
Still, it’s important to have a migration path from
QNX available. We have to be realistic about the future of QNX
since the acquisition.
John Nagle
Hi John,
SRRIPC has been running in production systems for years on 2.4 kernels,
and has been stable on 2.6 kernels for quite some time. People have
successfully ported large, complex QNX4 systems to Linux using the
SRRIPC module with almost no code change.
It does indeed do two copies - once into kernel space, and once back out
again - with every message pass. At some point it becomes memory
bandwidth limited, but this point occurs between 1 and 2 GB per second
on a typical machine. There was a patch floating around for single-copy
messaging in the SRRIPC module, but it was against a quite old version,
so we never incorporated it.
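To make the copy counts concrete, here is a toy Python model. It is not SRRIPC or QNX code: the class and method names are made up for illustration, and it only tallies bytes moved between buffers. It says nothing about scheduling or real throughput.

```python
# Toy model of copy counts only. All names here are hypothetical;
# this is not the SRRIPC or QNX API.


class TwoCopyChannel:
    """Message is staged in a kernel-side buffer between sender and receiver."""

    def __init__(self):
        self.kernel_buf = bytearray()
        self.bytes_copied = 0

    def send(self, user_msg):
        # Copy 1: sender's user space -> kernel space.
        self.kernel_buf = bytearray(user_msg)
        self.bytes_copied += len(user_msg)

    def receive(self, user_buf):
        # Copy 2: kernel space -> receiver's user space.
        n = min(len(self.kernel_buf), len(user_buf))
        user_buf[:n] = memoryview(self.kernel_buf)[:n]
        self.bytes_copied += n
        return n


class OneCopyChannel:
    """Sender stays blocked, so data can go straight to the receiver's buffer."""

    def __init__(self):
        self.pending = memoryview(b"")
        self.bytes_copied = 0

    def send(self, user_msg):
        # No copy yet: only a reference to the blocked sender's buffer is kept.
        self.pending = memoryview(user_msg)

    def receive(self, user_buf):
        # The only copy: sender's user space -> receiver's user space.
        n = min(len(self.pending), len(user_buf))
        user_buf[:n] = self.pending[:n]
        self.bytes_copied += n
        return n


def bytes_moved(channel, msg):
    """Pass one message through the channel and return total bytes copied."""
    inbox = bytearray(len(msg))
    channel.send(msg)
    channel.receive(inbox)
    assert bytes(inbox) == msg
    return channel.bytes_copied


if __name__ == "__main__":
    msg = bytes(1024)
    print("two-copy:", bytes_moved(TwoCopyChannel(), msg))
    print("one-copy:", bytes_moved(OneCopyChannel(), msg))
```

For a 1024-byte message the two-copy path moves 2048 bytes and the one-copy path moves 1024, which is why memory bandwidth becomes the limit sooner with two copies.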
I’m not sure you’re correct about the two trips through the scheduler.
What makes you say that? It looks to me like there are the same number
of scheduling passes as in QNX.
In any case, as you imply, Linux is not truly real-time. There is no
priority inheritance in the SRRIPC module and memory access in the
kernel can cause paging. That’s hardly the point. For a stock Linux
kernel, there is no faster way to do message passing, and no other way
to cleanly get the QNX4 message passing semantics. To give you an idea,
we tested a couple of TCP-based approximations of QNX4 messaging and
they were up to 5000 times slower than the SRRIPC module. They did not
do a good job with proxies (especially proxies on timers), did not
deliver task death notification, and did not offer any help in interrupt
handling.
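For a sense of what a socket-based approximation of Send/Receive/Reply might look like, here is a minimal Python sketch. It is not the code that was tested: the length-prefixed framing and the function names are assumptions, and `socket.socketpair()` stands in for a real TCP connection so the example runs in one process.

```python
import socket
import struct
import threading

# Assumed framing: a 4-byte big-endian length prefix before each message.
HEADER = struct.Struct("!I")


def _recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def _write_frame(sock, payload):
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _read_frame(sock):
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)


def send(sock, msg):
    """Rough stand-in for a blocking Send(): deliver msg, wait for the reply."""
    _write_frame(sock, msg)
    return _read_frame(sock)


def serve_one(sock, handler):
    """Rough stand-in for Receive() followed by Reply(), for one message."""
    request = _read_frame(sock)
    _write_frame(sock, handler(request))


def demo():
    client, server = socket.socketpair()
    with client, server:
        worker = threading.Thread(target=serve_one, args=(server, bytes.upper))
        worker.start()
        reply = send(client, b"ping")
        worker.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

Even this bare version needs at least three socket calls on each side per exchange, and nothing in it provides proxies, timers, or task death notification; those would all have to be built on top.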
The SRRIPC module routinely runs at about half the speed of QNX4
messaging on the same hardware. Blame the
two-copy message passing.
Cheers,
Andrew