Real time problem...

Hi!
Any answer greatly appreciated!

Intro:
I have made a QNX 6.0 resource manager, following the QNX documentation. I
am working against a PCI card (using a PLX 9050 controller), writing data
using the C function memcpy in an interrupt handler thread, which is unblocked
every time my ISR determines that the PCI card has generated the interrupt.
Interrupt handling and writing data to the card at high bit rates works great,
at nearly 40 Mbit/s! I am using a data analyser to check what has gone through
the PCI card, and it verifies that the data coming through is correct. As
mentioned, I am using memcpy, copying 1880 bytes every time an interrupt
occurs.

Strange things start to happen when I, for example, move a window (a terminal
window in Photon). Occasional use of printf() also caused the same
problem. The problem seems to be that the data copying with memcpy stops
or halts. The result is corrupted data at the output of the PCI card.

At low bit-rates this problem does not occur.

I am aware that printf() or moving a window takes a lot of processor
resources, but it should not interfere with a memcpy function using the PCI
bus for writing data to the PCI card. I would expect only a lower
throughput.

I have also tried different priorities on the interrupt thread, but with no
better result.

Has anyone gone through a similar problem, and solved it?


Regards,

Oystein Solvberg
Telenor Satellite Broadcasting
Norway

Øystein Sølvberg wrote:

I am aware that printf() or moving a window takes a lot of processor
resources, but it should not interfere with a memcpy function using the PCI
bus for writing data to the PCI card. I would expect only a lower
throughput.

I have also tried different priorities on the interrupt thread, but with no
better result.

The CPU load incurred by moving the windows, or doing a printf is not
the problem (as you feel it shouldn’t be). The problem is most likely
either the video card hogging the PCI bus, or the video driver spending
a lot of time in an interrupt handler (assuming you’ve verified that the
video driver isn’t running at a high prio). This problem basically
boils down to the old “hardware and software both have to be real-time
compatible to build a real-time system” adage. I suspect the video
driver developers at QSSL do the best they can to try and make the video
drivers as real-time as possible, but the hardware developers probably
don’t make it easy (since 99.99999999% of their market has nothing to do
with real-time, and product differentiation relies on moving as many
pixels as possible between system ram and video ram in the minimum
amount of time).

Has anyone gone through a similar problem, and solved it?

Many I think, although I’m not one of them (I use graphics on QNX, and I
do hard real-time on QNX but not both in the same space and time).

Rennie

Hi Oystein,

I saw Rennie already gave you some good advice. I believe in this
particular case something is not quite right with the hardware. Otherwise, I
think, changing priorities should have helped, or at least caused some
change in behaviour.

The golden rule is: fewer parts, more reliable system. When you need
real-time you must be absolutely sure what every part (hardware or software)
of your system is doing at every particular moment.

I’m using a PLX9080 based FPMC (a few of them per computer). The purpose of
the system is to collect data from 1 to 4 external intelligent devices (1
FPMC per device), do some math when data from all devices
is obtained and make the result available to clients connected to this box
via TCP/IP. I don’t measure the productivity of the system in KB/s or MB/s.
What matters is rather the time needed to communicate with all devices. All
FPMC<->device transactions are synchronous: every so often the computer sends a
request to all devices and waits a fixed time for a response. The amount of data
sent back and forth is pretty small (about 100-4000
bytes), but “so often” has to be constant (late data is lost data). We
achieved a 0.7 ms period on a P III 1.2 GHz (35% CPU load) and 1.3 ms on a
P 233 MHz (95% CPU load). Of each of those times, 0.5 ms is taken by the
external device to respond. The good news is that the FPMCs transfer data from
the FPGA FIFO to computer memory via DMA. No matter how many devices are
connected, the data from all of them is available practically simultaneously.
The resource manager just puts all the data into one IOV.

One of the components of success is to assign proper priorities to every
task. Make the priority of the tasks whose behaviour you cannot predict
lower than that of the critical tasks. In my case, for example, the system
has no other interface than the network (telnet, ftp, MODBUS), but the network
is wide open to any authorized client at any time. So, the resource
manager priority is higher than that of io-net by just 1.

Sincerely,

Serge

P.S. I used to read this newsgroup through http://groups.google.com, and
sometimes I post messages via the same interface. I did the same with this
message a couple of days ago. When I looked at the newsgroup from Outlook
Express I couldn’t find it. I believe that messages posted that way never go
to inn.qnx.com, but are just kept somewhere on google (i.e. everything
written with google can be read with google only). Sorry for being off-topic.

Rennie Allen <rallen@csical.com> wrote:

I am aware that printf() or moving a window takes a lot of processor
resources, but it should not interfere with a memcpy function using the PCI
bus for writing data to the PCI card. I would expect only a lower
throughput.

Just to clarify, when you call “printf” the output is going to a pterm, right?

I have also tried different priorities on the interrupt thread, but with no
better result.

The CPU load incurred by moving the windows, or doing a printf is not
the problem (as you feel it shouldn’t be). The problem is most likely
either the video card hogging the PCI bus, or the video driver spending

There have been problems like this in the past, but they were caused by
driver bugs, and were fixed. Once upon a time, with certain graphics
drivers, you would lose characters in a pterm when running qtalk. This
was because the scrolling/blitting was starving the serial driver’s
interrupt handler. I am aware of 3 drivers that had problems like
this, and I believe they were all fixed by 6.1. The problems were
caused by writing draw commands into the adapter’s command FIFO while
there was not sufficient room in the FIFO. This would cause the
graphics card to generate wait states until FIFO space became available,
but in the meantime, the CPU would be stuck waiting for the PCI write
cycle to finish.

a lot of time in an interrupt handler (assuming you’ve verified that the

None of the standard graphics drivers that are currently shipping attach
interrupt handlers (and if they did, they wouldn’t need to spend much
time in the handlers).

video driver isn’t running at a high prio). This problem basically
boils down to the old “hardware and software both have to be real-time
compatible to build a real-time system” adage. I suspect the video
driver developers at QSSL do the best they can to try and make the video
drivers as real-time as possible, but the hardware developers probably
don’t make it easy (since 99.99999999% of their market has nothing to do
with real-time, and product differentiation relies on moving as many
pixels as possible between system ram and video ram in the minimum
amount of time).

Has anyone gone through a similar problem, and solved it?

You could try to eliminate the possibility of a graphics driver bug
by running the vesabios or vga drivers, and seeing if the problem
still occurs. These drivers do dumb PCI frame buffer writes, and do
not do any register writes or bus-mastering while drawing.

If this makes a difference, then there is reason to suspect the
graphics driver, in which case you could post the output of the
‘pci’ utility.

Dave

David Donohoe wrote:

Rennie Allen <rallen@csical.com> wrote:

Just to clarify, when you call “printf” the output is going to
a pterm, right?

This is the assumption I am making.

None of the standard graphics drivers that are currently shipping attach
interrupt handlers (and if they did, they wouldn’t need to spend much
time in the handlers).

This is good information (not something most people can check for
themselves, since they don’t have access to every card supported -
and even if they did, the lack of a “pidin irq” makes it
difficult to determine whether a driver is attaching an IRQ) :-)

If this makes a difference, then there is reason to suspect the
graphics driver, in which case you could post the output of the
‘pci’ utility.

There is already good reason to suspect the graphics driver, since the
original poster stated that the problem occurred not only when moving the
window (in which case it could be either the input driver or the graphics
driver), but also when an application simply issued printfs to the tty
connected to (I am assuming) a pterm.

Your suggestion for ruling out the graphics driver should confirm or deny
the suspicion.

Some of the older video drivers would lock out interrupts. It was especially
evident when moving a window, as you describe. This is all fixed in 6.2 (see
the thread entitled ‘long interrupt latency’ from March in this newsgroup).

Thanks to everyone for answers!
Just to clarify, the printf is going to a pterm.

I have tested the system a bit more and found an error in my code that sets
the interrupt thread priority. As a result, in my previous tests the
interrupt thread ran at the default priority (10r) all the time. I have
corrected the error, checked that the priority is now really changed, and
tested again.
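
A mistake of this kind is easy to miss, because a priority request can fail or be applied to the wrong thread without any visible symptom. The original code is not shown in this thread, so the following is only a minimal sketch, assuming the thread is controlled through the POSIX pthread scheduling calls. The helper names set_and_verify_priority and check_current_thread are hypothetical. The idea is simply to read the settings back after requesting them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Request a scheduling policy and priority for a thread, then read the
 * settings back to confirm that they really took effect.
 * Returns 0 on success, the pthread error code if a call failed,
 * or -1 if both calls succeeded but the values read back differ.
 */
int set_and_verify_priority(pthread_t thread, int policy, int priority)
{
    struct sched_param wanted = { 0 };
    struct sched_param actual = { 0 };
    int actual_policy = -1;
    int rc;

    wanted.sched_priority = priority;
    rc = pthread_setschedparam(thread, policy, &wanted);
    if (rc != 0)
        return rc;              /* for example EPERM or EINVAL */

    rc = pthread_getschedparam(thread, &actual_policy, &actual);
    if (rc != 0)
        return rc;

    if (actual_policy != policy || actual.sched_priority != priority)
        return -1;              /* request accepted but not applied */
    return 0;
}

/* Harmless self-test: re-apply the calling thread's current settings. */
void check_current_thread(void)
{
    struct sched_param current = { 0 };
    int policy = -1;
    int rc = pthread_getschedparam(pthread_self(), &policy, &current);

    assert(rc == 0);
    rc = set_and_verify_priority(pthread_self(), policy,
                                 current.sched_priority);
    assert(rc == 0);
    (void)rc;                   /* silence warnings when NDEBUG is defined */
}
```

Inside the interrupt thread, a call such as set_and_verify_priority(pthread_self(), SCHED_RR, 19) would request priority 19 with round-robin scheduling (the round-robin policy is an assumption carried over from the 10r default). A non-zero return value means the thread is not running at the priority you think it is.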

As Rennie Allen and Serge Yuschenko say, changing the priority should have
helped, or at least caused some change in behaviour.

Processes and priority related to graphics environment in my system are as
follows:

Photon, 10 XPhoton, 10 io-graphics, 12 devi-hirun (15,10,12)

My new test results now show a difference, as follows:

  1. Interrupt thread priority: 10 (processor load 20%), errors detected when
    moving a window.
  2. Interrupt thread priority: 19 (processor load 20%), no errors detected
    when moving a window.
  3. Interrupt thread priority: 10 (processor load 99.9%), errors detected
    when moving a window.
  4. Interrupt thread priority: 19 (processor load 99.9%), no errors detected
    when moving a window.

The 20% processor load is the result of interrupts generated at a constant
rate of 2500 interrupts per sec.

The 100% processor load is the result of interrupts generated at a constant
rate of 2500 interrupts per sec. plus a simple program running at priority
10r doing a while (TRUE) loop.
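
The figures quoted in this thread fit together: 1880 bytes per interrupt at 2500 interrupts per second gives 37.6 Mbit/s, which matches the "nearly 40 Mbit/s" of the first post, and it leaves 400 microseconds between interrupts for each copy. A small sketch of that arithmetic follows. The function names are illustrative and are not taken from the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Figures quoted in the thread. */
enum { BYTES_PER_INTERRUPT = 1880, INTERRUPTS_PER_SEC = 2500 };

/* Sustained payload rate in bits per second. */
long throughput_bps(long bytes_per_irq, long irqs_per_sec)
{
    return bytes_per_irq * 8L * irqs_per_sec;
}

/* Time between two interrupts, in microseconds: the budget for one copy. */
long budget_us(long irqs_per_sec)
{
    assert(irqs_per_sec > 0);
    return 1000000L / irqs_per_sec;
}

/* Print both figures for the values quoted above. */
void print_budget(void)
{
    printf("throughput: %ld bit/s\n",
           throughput_bps(BYTES_PER_INTERRUPT, INTERRUPTS_PER_SEC));
    printf("budget per interrupt: %ld us\n",
           budget_us(INTERRUPTS_PER_SEC));
}
```

Any delay of the interrupt thread longer than that 400 microsecond budget means the next block is not on the card in time, which is consistent with the corruption appearing only at high bit rates.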

So, after all it seems that setting the priority of the interrupt thread
higher than the graphics processes (as listed above) results in no errors in
the data transmission to the PCI card.

It is strange however that at lower priorities (10), the memcpy is
interrupted, when it says in the QNX library documentation that memcpy is
thread safe, interrupt safe and signal safe. Am I wrong here?

Oystein Solvberg

“Øystein Sølvberg” <oystein.solvberg@telenor.com> wrote:

It is strange however that at lower priorities (10), the memcpy is
interrupted, when it says in the QNX library documentation that memcpy is
thread safe, interrupt safe and signal safe. Am I wrong here?

That memcpy() is defined as interrupt safe doesn’t mean it will complete
uninterrupted, it means that it is safe to call that function in an
interrupt service routine. That it is signal safe means it is safe
to call it in a signal handler – you could in fact be interrupted in
the middle of a memcpy, either by a hardware interrupt, by a signal,
or by preemption by another task or other possible things.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.
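
This can be demonstrated on any POSIX system. The sketch below is not from the thread: it assumes a 1 ms interval timer (setitimer) delivering SIGALRM as a stand-in for the interruptions discussed here, and copy_under_signals is a hypothetical helper name. It copies a buffer over and over while the signal handler keeps firing, and checks that every copy is still byte-for-byte identical to the source.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks;     /* times the handler has run */

static void on_alarm(int signo)
{
    (void)signo;
    ticks++;
}

/*
 * Copy a 1 MiB buffer `rounds` times while SIGALRM arrives every millisecond.
 * Returns the number of signals handled, or -1 if the setup failed or any
 * copy differed from the source.
 */
int copy_under_signals(int rounds)
{
    static unsigned char src[1 << 20], dst[1 << 20];
    struct itimerval every_ms = { { 0, 1000 }, { 0, 1000 } };
    struct itimerval off = { { 0, 0 }, { 0, 0 } };
    struct sigaction sa, old_sa;
    int ok = 1;

    assert(rounds > 0);
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 31u);      /* arbitrary test pattern */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    ticks = 0;
    if (setitimer(ITIMER_REAL, &every_ms, NULL) != 0) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    for (int r = 0; r < rounds && ok; r++) {
        memset(dst, 0, sizeof dst);
        memcpy(dst, src, sizeof src);           /* may be interrupted */
        ok = (memcmp(dst, src, sizeof src) == 0);
    }

    setitimer(ITIMER_REAL, &off, NULL);         /* stop the timer first */
    sigaction(SIGALRM, &old_sa, NULL);          /* then restore the handler */
    return ok ? (int)ticks : -1;
}
```

A non-negative return value means the copies were all correct, however many times the handler ran in between. What an interruption does change is when the copy finishes, and that is the part that matters for the PCI card.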

“Øystein Sølvberg” <oystein.solvberg@telenor.com> wrote:

It is strange however that at lower priorities (10), the memcpy is
interrupted, when it says in the QNX library documentation that memcpy is
thread safe, interrupt safe and signal safe. Am I wrong here?

Most likely your thread is not pre-empted while doing the memcpy. Data
processing does not start immediately after the interrupt has been triggered.
Usually, all the handler does is arm the next interrupt by fixing up a couple
of hardware registers and return a pulse to your program. If you use
InterruptAttach()/InterruptWait() you can see it for yourself (in the case of
InterruptAttachEvent()/MsgReceive() your program receives a pulse and must
do all the work that was split between the interrupt handler and the program in
the previous case). If your thread’s priority is not high enough to win the
struggle for the CPU, it can be pre-empted upon receiving the pulse, before it
even starts any data processing. In the worst case those pulses can be queued,
and when your thread is finally resumed it receives a whole bunch of proxies at
once.


Regards,

Serge
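
The queuing effect described above can be put into numbers with a deliberately simplified model. This is a sketch under stated assumptions, not a description of the real driver: one pulse is queued per interrupt, an interrupt fires at t = 0, the thread cannot run until blocked_us microseconds later, each block has to be written before the next interrupt arrives, and interrupts that arrive while the thread is catching up are ignored. The function names and the copy time are hypothetical.

```c
#include <assert.h>

/*
 * Pulses waiting when the thread finally gets the CPU: one for the
 * interrupt at t = 0, plus one for every further period that elapsed
 * while the thread was kept off the CPU.
 */
long pulses_queued(long period_us, long blocked_us)
{
    assert(period_us > 0 && blocked_us >= 0);
    return 1 + blocked_us / period_us;
}

/*
 * Number of queued blocks that are written too late. Block k belongs to
 * the interrupt at k * period_us and must be finished before the next
 * interrupt at (k + 1) * period_us. The thread starts working at
 * blocked_us and needs copy_us microseconds per block.
 */
long late_blocks(long period_us, long copy_us, long blocked_us)
{
    long pending = pulses_queued(period_us, blocked_us);
    long now = blocked_us;
    long late = 0;

    assert(copy_us >= 0);
    for (long k = 0; k < pending; k++) {
        long deadline = (k + 1) * period_us;

        now += copy_us;             /* time at which block k is finished */
        if (now > deadline)
            late++;
    }
    return late;
}
```

With the 400 microsecond period from this thread and an assumed 80 microseconds per copy, late_blocks(400, 80, 100) is 0, while a 2 ms stall gives pulses_queued(400, 2000) == 6 and late_blocks(400, 80, 2000) == 6: every queued block misses its slot, even though each individual memcpy runs correctly.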

“Øystein Sølvberg” <oystein.solvberg@telenor.com> wrote in message
news:al4rrn$niv$1@inn.qnx.com

Thanks to everyone for answers!
Just to clarify, the printf is going to a pterm.

It is strange however that at lower priorities (10), the memcpy is
interrupted, when it says in the QNX library documentation that memcpy is
thread safe, interrupt safe and signal safe. Am I wrong here?

Thread/interrupt/signal safe doesn’t mean that it won’t
be interrupted. To put it simply, it means the function
doesn’t use any global variable and can be called
“simultaneously” by multiple threads. In the example of
memcpy, you can have 100 threads all losing the CPU
while in the middle of memcpy, and when they get
the CPU back, they will all resume where they left off
without any side effect (aside from timing).

The fact that your process is affected by this has nothing
to do with function being thread/interrupt/signal safe. It’s
a timing issue.

“Serge Yuschenko” <serge.yuschenko@rogers.com> wrote in message
news:al6eep$1l6$1@inn.qnx.com

your thread is resumed it receives a whole bunch of proxies at once.

^^^^^^^
Sorry. I meant pulses, of course. QNX4 habit.

Serge