QNX 4.25 lock-up on framing error on TL16C754B quad UART

Summary
A framing error on the TL16C754B locks up QNX 4.25. The driver spins in a tight loop reading the IIR and IER, but never reads the LSR to clear the framing error.

Does QNX 4.25 Dev.ser have a “verbose” option that would show us what it is autodetecting the UART as?

Is there any history or knowledge available showing a problem with the TL16C754B quad UART and QNX 4.25’s Dev.ser?

Problem Description
We have encountered a problem with an embedded system that runs QNX 4.25. It has a Texas Instruments TL16C754B quad UART connected to an STPC Elite. The system works fine in all cases except one: when the port is set to a high baud rate, say 57600, and the other end sends a character at a lower baud rate, say 9600, QNX locks up until the watchdog reboots it.

The 9600 baud character causes a framing error on the UART. Upon receipt of this particular interrupt, QNX locks into a tight loop of about 8-20 µs per iteration. It looks like QNX can’t handle the framing error interrupt, at least on this particular quad UART, the T.I. TL16C754B.

Autodetect UART?
Most operating systems use an elaborate detection scheme to figure out which UART they have on hand, so they can both use the advanced features of the part (like the FIFO) and avoid its bugs (some UARTs have broken LCRs, etc.). Perhaps QNX is detecting this one incorrectly?

I captured the entire detection sequence on the analyzer, but I can’t make sense of it.

Datasheet
When I refer to table and page numbers, I’ll be referring to the datasheet for the quad UART, which T.I. calls SLLS397A.

“Good” Character Received
Step  UART  Address  Data Read  Data Write  Comment
00    -     -        -          -           IRQ goes active after character received
01    3     2        C1         -           IIR read of UART 3 (shares IRQ), no interrupt
02    1     2        CC         -           IIR read, sees binary XX001100, see table 4 page 11, RX timeout
03    1     0        0D         -           The carriage return (0x0D) that I sent from the terminal, IRQ released after this
04    1     1        60         -           IER read, don’t know why
05    3     2        C1         -           IIR read of UART 3, no interrupt
06    1     2        C1         -           IIR read of UART 1, no interrupt

“Bad” Character Received, QNX Locks Up
Step  UART  Address  Data Read  Data Write  Comment
00    -     -        -          -           IRQ goes active during character receive (framing error)
01    3     2        C1         -           IIR read of UART 3 (shares IRQ), no interrupt
02    1     2        C6         -           IIR read, sees binary XX000110, receiver line status (LSR) interrupt
03    1     1        E1         -           IER read, don’t know why
04    3     2        C1         -           IIR read of UART 3 (shares IRQ), no interrupt
05    1     2        C6         -           IIR read, sees binary XX000110, receiver line status (LSR) interrupt
06    1     1        E1         -           IER read, don’t know why
07    3     2        C1         -           IIR read of UART 3 (shares IRQ), no interrupt
08    1     2        C6         -           IIR read, sees binary XX000110, receiver line status (LSR) interrupt
09    1     1        E1         -           IER read, don’t know why
10    3     2        C1         -           IIR read of UART 3 (shares IRQ), no interrupt
11    1     2        C6         -           IIR read, sees binary XX000110, receiver line status (LSR) interrupt
12    1     1        E1         -           IER read, don’t know why

You can find the Dev.ser source code at

ftp://ftp.qnx.com/usr/free/qnx4/os/samples/Dev_drivers

Hi,

That helped a lot! We found our problem and fixed it.

The only concern is that this source code appears older than the Dev.ser that came with 4.25G, and has a slightly different use message. Should I be worried about that? The use message differs in that it refers to PCMCIA by its new proper name, PCCARD. No worries there, but it makes me wonder if it was just recompiled and re-issued, or whether other changes were made.

What We Did to Fix Dev.ser
What we did was almost trivial. We changed Dev/ser/async.c, line 48, 0x1e → 0x9E. Explanation below.

Changes from Existing Code to Dev.ser for 4.25G
A colleague reviewed the release notes for Dev.ser 4.25G, and they state:

  • Modified the driver to work with the new pccard driver.
  • Fixed a problem with the driver where it would drop characters if you use the -t command-line option.

The first one isn’t a big concern to me, but the second one is. The TL16C754B is an enhanced UART with a big FIFO. Could it drop characters with the older driver?

Since the Dev.ser 4.23 source code is out there (I assume that it was from 4.23), would QNX post the 4.25G source code as well?

Error in Address Line A2 of Analyzer Trace
By the way, in my earlier note, I had said that the UART’s Interrupt Enable Register (IER) was being read, and I didn’t know why. Now I know that I had a bad A2 address bit on my analyzer trace. It’s reading the Line Status Register (LSR), not the IER. I now understand what was going on.

Why the Changes Worked
In directory Dev/ser, file async.c, line 48, change (lsr & 0x1e) to (lsr & 0x9E). Why? On the original i8250 and similar UARTs, bit 7 of the LSR always reads as zero, but on the TL16C754B quad UART that we use, bit 7 of the LSR is set for the error we are getting here (on 16550-style FIFO parts this bit indicates at least one error somewhere in the receive FIFO). It turns out that the UART has a byte in its FIFO, although it doesn’t show as a receive byte when you read the Interrupt Identification Register (IIR) because there is an error. [ kind of dumb, but it appears to be true ]

If you make this change, the driver will pick up bit 7 of the LSR, read the byte out of the receive FIFO, and the system will recover.