Serial not writing out constantly

I am seeing an issue where serial output through a PC/104 serial card will sometimes (randomly, of course) stop for 100 ms and then blast three or four messages at once. I have a process which writes out data every 10 ms and expects a return message from a device.

In the System Profiler I see the task sending properly to ser8250. It then gets msgblocked on a read from ser8250, and the interrupt line for the serial device goes dead.

The logic analyzer hooked up to the 232 port sees no transmit from the serial device and then 3 right in a row, so I would say the analyzer and the interrupt traces would agree with each other.
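One way to quantify this pattern is to list every stall in the capture. The following is a minimal sketch, assuming the transmit start times can be exported from the analyzer as milliseconds; the function name, the 100 ms threshold, and the sample values are hypothetical and only illustrate the idea.

```python
def find_transmit_gaps(timestamps_ms, max_gap_ms=100.0):
    """Return (previous, next, gap) tuples where the spacing between
    consecutive transmit timestamps is at least max_gap_ms."""
    ordered = sorted(timestamps_ms)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur - prev
        if delta >= max_gap_ms:
            gaps.append((prev, cur, delta))
    return gaps


# Hypothetical capture: steady traffic, a stall, then three frames close together.
sample = [0.0, 10.0, 20.0, 130.0, 132.0, 134.0, 144.0]
for prev, cur, delta in find_transmit_gaps(sample):
    print(f"no transmit between {prev} ms and {cur} ms ({delta} ms)")
```

Running the same function over the interrupt timestamps from the kernel trace would show whether the stalls line up on both sides.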

How do I dig deeper to find the point where the data goes from the OS/software to the serial hardware?

Thanks

It sounds like you might be being interrupted by a higher-priority, CPU-bound thread. If you happen to have the new schedule partition software, you could try putting the driver and your processes in a separate partition to test this theory.

Maschoen

That is what I thought from the logic analyzer output, but the kernel trace isn’t showing ser8250 being interrupted or blocked. Is this not the last process to touch the data before it goes to hardware?

What state is the driver in during this period? If it is in RUNNING mode, then it is chewing up cpu cycles. This would imply a bug in the driver, which is very unlikely, or some malfunctioning hardware. If it is in READY mode, then some other higher priority thread is in RUNNING mode. If it is not in RUNNING or READY mode, then it is blocked.

BTW, is there any output flow control on the port that might be causing this?

It doesn’t appear to be busy or waiting. I am trying to dig deeper into the kernel trace, but I am fairly new to it so it takes me a little longer than you guys.

I am assuming now (after some reading) that ser8250 is the last piece of software to touch the string before it goes to hardware.

By not busy or waiting, I assume you mean devc-ser8250.

I’ve never used the kernel trace for QNX 6, but it should be indicating thread state transitions. The QNX states are described in the System Architecture Manual, and “busy” and “waiting” are not among them. As I tried to explain above, there are only three interesting possibilities. If your thread is RUNNING, then it is the active thread on the CPU. There is usually only one of these unless you have a multiprocessor system. If your thread is READY, then a thread at the same or higher priority is RUNNING. If it is at the same priority as your thread, then you are probably time-slicing with that thread. All other states are blocked states. In a blocked state, a thread will not run until some event occurs to change its state to READY. Typically this means the thread is waiting for something to happen.
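The three cases can be written down as a small lookup, which may help when reading state names off a trace. This is only a sketch of the reasoning in this post: the function name is hypothetical, and it deliberately lumps every state other than RUNNING and READY together as blocked.

```python
def interpret_thread_state(state):
    """Map a thread state name to the interpretation described in this post."""
    state = state.strip().upper()
    if state == "RUNNING":
        # The thread is the active thread on the CPU.
        return "running: the thread itself is consuming CPU cycles"
    if state == "READY":
        # Runnable, but something at the same or higher priority holds the CPU.
        return "ready: a thread at the same or higher priority is RUNNING"
    # Simplification: every other state is treated as a blocked state.
    return "blocked: waiting for an event before it can become READY"


for name in ("RUNNING", "READY", "RECEIVE"):
    print(name, "->", interpret_thread_state(name))
```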

There is one more idea I’ve just thought of. Is it possible that the thread that opens the serial device and writes to it is being blocked by some higher-priority thread? That would imply that you have some thread running that occasionally becomes CPU-bound. Are you doing any heavy calculations?

I PM’ed you, but I see that my interrupt 5 (which is for this serial port) is not showing any activity for almost 250 ms. There are plenty of snippets of time in there where you see procnto-idle active, so I don’t think something is stopping the interrupt from getting the transmit data from ser8250 and putting it out.

I see the ser8250 code logs to the slogger. I dumped the log with sloginfo to a file, which captured ~7 seconds of entries, and I am not seeing any errors. I don’t know if that means anything to anyone, but I thought I would post it anyway.

What causes the serial interrupt (interrupt 5 in my case) to grab/start transmitting data from ser8250?

Eric,

Is there any chance another device is sharing interrupt 5 and might on occasion cause issues for you, such as the delay you’re seeing?

The other thing I can think of to mention is that if your messages are really small (a couple of bytes), there are some send/receive buffers in the driver that might be waiting to be filled before doing the actual hardware transmit.

Tim

My messages are 16 bytes, and if I look back in time I see other TX and RX transactions that occur without a hiccup. I shut off all other serial devices while I am tracking this down, so I know another serial device is not using the interrupt, but I will look into other devices. I do find it odd that I am not seeing any events/activity on the interrupt (according to the kernel trace and System Profiler), which would lead me to assume that no other device is using this interrupt.

I have verified with pidin irqs that no other process has attached to irq 5 but ser8250.

Eric,

One more thing.

Is the serial port set up to do hardware flow control or software only? What about your remote device? Does it do any hardware flow control?

You can do a ‘stty < /dev/ser1’ to see the state of everything if you’re not sure.

Tim

Here is the output. I am unsure what the device is set up for, but I can look into that.

Name: /dev/ser3
Type: serial
Opens: 2
+raw +echoctl
intr=^C quit=^\ erase=^? kill=^U eof=^D start=^Q stop=^S susp=^Z
lnext=^V min=01 time=00 pr1=^[ pr2=5B left=44 right=43 up=41
down=42 ins=40 del=50 home=48 end=59
par=none bits=8 stopb=1 baud=9600 rows=0,0
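As a side calculation from these settings (9600 baud, 8 data bits, no parity, 1 stop bit), the time a 16-byte message occupies the wire can be estimated. This is a minimal sketch; the function name is hypothetical, and it assumes standard asynchronous framing with one start bit per character.

```python
def frame_time_ms(n_bytes, baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Time on the wire for n_bytes of asynchronous serial data, in ms."""
    bits_per_char = 1 + data_bits + parity_bits + stop_bits  # 1 start bit
    return n_bytes * bits_per_char * 1000.0 / baud


message_ms = frame_time_ms(16, 9600)
print(round(message_ms, 2))  # 16.67
```

If every 10 ms write really is a full 16-byte message, that is longer than the write period, so it may be worth checking whether output is queuing up in the driver. Whether that accounts for the stalls described here is not established.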

Eric,

Definitely no hardware flow control on the QNX side.

Tim

How did you see that?

Check the errata for serial port issues on your pc/104 card. I know of one pc/104 card that has erratic serial port problems when running with fifos enabled. You might turn off the fifos to see if it helps any.

I have tried ser8250 with the FIFOs both enabled and disabled; actually, we have been running with the FIFOs disabled.
I am using a WinSystems PCM-COM4 PC/104 board, so I will look on their site for known issues.

Thanks

Eric,

Check stty in the helpviewer (or run ‘use stty’ from the command line) and you’ll see everything that can be returned and what it means.

Since your output shows neither ihflow nor ohflow, you don’t have hardware flow control enabled.
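The same check can be scripted against saved stty output. This is a rough sketch that assumes enabled flags are printed with a leading "+", as "+raw" and "+echoctl" are in the output posted earlier; the function name is hypothetical and only a few lines of that output are reproduced.

```python
def hardware_flow_flags(stty_text):
    """Report whether +ihflow / +ohflow appear in stty output."""
    tokens = stty_text.split()
    return {
        "ihflow": "+ihflow" in tokens,  # input hardware flow control
        "ohflow": "+ohflow" in tokens,  # output hardware flow control
    }


# Selected lines from the output posted earlier in this thread.
posted = """Name: /dev/ser3
Type: serial
Opens: 2
+raw +echoctl
par=none bits=8 stopb=1 baud=9600 rows=0,0
"""
print(hardware_flow_flags(posted))
```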

Tim