USB 2.0 QNX problems

Hello.

We are experiencing the following problem with a USB 2.0 host doing
bulk transfers under QNX 6.3.0. Every few milliseconds the host stops
receiving data from the USB device, effectively pausing the stream for
a relatively short time. The duration of activity and the duration of
the pause depend on the URB buffer size. With some buffer sizes the
pause gets longer, with others shorter. The relationship is not linear;
it has several maxima.

Our program is built to always keep several active URB requests in the
USB host queue. The number of active requests (i.e. incoming bulk
transfers) never drops below 5 of the 7 allocated; we double-checked
this with software hooks. So the host stops receiving data while
several incoming bulk requests are still queued.

WBR, Maxim Shvyndia aka Max Shan.
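
The "software hooks" check described above can be pictured as a counter that is bumped on every submission and completion and remembers the lowest queue depth it saw. The following is only a minimal sketch of that idea, not the poster's code: the class name, method names, and the completion pattern in the loop are all hypothetical.

```python
class InFlightTracker:
    """Counts queued-but-not-completed requests and remembers the minimum."""

    def __init__(self, allocated):
        self.allocated = allocated  # total URBs the application owns (7 in the post)
        self.in_flight = 0          # URBs currently queued to the host
        self.low_water = None       # smallest depth seen right after a completion

    def submitted(self):
        if self.in_flight >= self.allocated:
            raise RuntimeError("more submissions than allocated URBs")
        self.in_flight += 1

    def completed(self):
        if self.in_flight == 0:
            raise RuntimeError("completion without a matching submission")
        self.in_flight -= 1
        if self.low_water is None or self.in_flight < self.low_water:
            self.low_water = self.in_flight


tracker = InFlightTracker(allocated=7)
for _ in range(7):
    tracker.submitted()          # prime the queue with every allocated URB

# Hypothetical completion pattern: now and then two URBs finish before
# the application gets a chance to resubmit them.
for i in range(10):
    tracker.completed()
    if i % 3 == 0:
        tracker.completed()
        tracker.submitted()
    tracker.submitted()

print("lowest depth:", tracker.low_water, "current depth:", tracker.in_flight)
```

If the lowest depth stays well above zero, as reported in the post, the stalls cannot be blamed on the application starving the queue.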

What type of device are you working with?

Are there any errors being reported in the URB status?

Are you setting timeout values on the URBs?
If so, are transactions timing out?

Can you run the USB driver in verbose mode and see if any errors
are reported? E.g. io-usb -vvvvv -dehci pindex=0,verbose=5

Can you try the SP1 devu-ehci.so driver?

Henry

Thanks for the reply, Henry.
On Tue, 8 Feb 2005 09:34:16 -0500, Henry VanDyke <henry@qnx.com> wrote:

What type of device are you working with?
A camera we developed ourselves. With the Windows driver it doesn't
show the same pauses.

Are there any errors being reported in the URB status?
No.

Are you setting timeout values on the URBs?
The timeout value is the default.

If so, are transactions timing out?
No.

Can you run the USB driver in verbose mode and see if any errors
are reported? E.g. io-usb -vvvvv -dehci pindex=0,verbose=5
Yes, I will, and I'll report tomorrow.

Can you try the SP1 devu-ehci.so driver?
Tried it. Same results.

WBR, Maxim Shvyndia aka Max Shan.

Also you may want to lower the interrupt threshold on the EHCI driver.
io-usb -dehci pindex=0,int_thresh=1

This allows the EHCI chipset to interrupt more frequently and thus increases
performance.

Henry
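
As a rough illustration of what the threshold means in time: a USB 2.0 high-speed micro-frame lasts 125 us, and the EHCI specification defines its interrupt threshold in micro-frames. The sketch below assumes that the int_thresh driver option is passed straight through to that EHCI field; the thread does not confirm this, and the function name is made up for the example.

```python
MICROFRAME_US = 125  # length of one USB 2.0 high-speed micro-frame, in microseconds

# Threshold values defined by the EHCI specification, in micro-frames.
# Assumption: the io-usb int_thresh option maps directly onto this field.
EHCI_THRESHOLDS = (1, 2, 4, 8, 16, 32, 64)


def max_interrupt_delay_us(int_thresh):
    """Longest time the controller may hold back a completion interrupt."""
    if int_thresh not in EHCI_THRESHOLDS:
        raise ValueError(f"{int_thresh} is not an EHCI-defined threshold")
    return int_thresh * MICROFRAME_US


for thresh in EHCI_THRESHOLDS:
    print(f"int_thresh={thresh:2d} -> up to {max_interrupt_delay_us(thresh)} us")
```

Under that assumption, int_thresh=1 corresponds to one micro-frame, which is the smallest threshold the EHCI register defines.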

On Tue, 8 Feb 2005 13:28:27 -0500, Henry VanDyke <henry@qnx.com> wrote:

Also you may want to lower the interrupt threshold on the EHCI driver.
io-usb -dehci pindex=0,int_thresh=1

This allows the EHCI chipset to interrupt more frequently and thus
increases performance.

Yes, thanks. Performance increased and the pauses were reduced, but the
result is still not fully satisfactory.
Is it possible to set an interrupt threshold lower than 125 us?

Can you run the USB driver in verbose mode and see if any errors
are reported? E.g. io-usb -vvvvv -dehci pindex=0,verbose=5
No errors.

WBR, Maxim Shvyndia aka Max Shan.

What size of transfers are you doing in usbd_io requests?

What is the endpoint size?

Are you spending a lot of time in the bulk callback function?

Henry

On Wed, 9 Feb 2005 10:19:42 -0500, Henry VanDyke <henry@qnx.com> wrote:

What size of transfers are you doing in usbd_io requests?
I am using 88K, but I have tried a range from 10K to 135K.

What is the endpoint size?
512 and 1024.

Are you spending a lot of time in the bulk callback function?
9 us.

Henry, I suppose that these pauses appear at the end of a URB transfer,
when the last transaction on the bus for that URB has completed but the
interrupt threshold has not yet elapsed.
Correct me if I'm wrong.

The question is: how do we get rid of these gaps?
The Windows USB stack has no such pauses in the stream.


WBR, Maxim Shvyndia aka Max Shan.

How big are these pauses?
Can you try issuing usbd_io requests in multiples of 16K?

Let me give the full information.
We are using QNX 6.3.0 SP1.

The host system CPU is a P4, 2.4 GHz, 1 MB cache. On-board memory is 1 GB.
The USB host is USB 2.0, EHCI type, embedded in the Intel ICH5 south bridge.
The device under test is based on the Cypress CY7C68013 MCU. The USB 2.0
cable is 1.5 m long. The device is always ready to send data to the host
controller. The application allocates a few hundred URBs and queues all of
them into the USB 2.0 stack, then sends a command to the device to start
streaming. The application uses USB 2.0 bulk transfers. In each experiment
the URB buffer size is set to the value given in the buffer column (see the
tables below).
The bandwidth column gives the achieved bandwidth, the period column is
the time between two adjacent gaps, and the gap column is the duration
of a gap. Measurements were made with a Tektronix TDS-2024 oscilloscope.
The gaps were measured as recurring periods of inactivity during which
the USB host was not ready to receive data. Measurements were made
for the default int_thresh and for int_thresh = 1.

default int_thresh (int_thresh = 8)
buffer   bandwidth     period    gap
128K     42.5 MB/s     3.0 ms    300 us
64K      31.224 MB/s   2.0 ms    600 us
32K      31.2 MB/s     1.02 ms   300 us
16K      15.62 MB/s    1.02 ms   640 us

int_thresh = 1
buffer   bandwidth     period    gap
128K     43.46 MB/s    2.8 ms    160 us
64K      41.664 MB/s   1.5 ms    160 us
32K      41.3 MB/s     740 us    56 us
16K      31.056 MB/s   490 us    140 us

Other buffer sizes give similar results.


As you can calculate,
period = buffer/bandwidth (approximately).

Therefore, a gap occurs once per URB.
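
The relation period = buffer/bandwidth can be checked against the posted numbers. The sketch below copies the eight measurements from the tables and assumes binary units (1K = 1024 bytes, 1 MB = 1024 * 1024 bytes); the thread does not state the units, but the figures line up best that way.

```python
KIB = 1024
MIB = 1024 * 1024  # assumption: the posted MB/s figures use binary megabytes

# (setting, buffer in K, bandwidth in MB/s, measured period in ms), from the tables
ROWS = [
    ("default", 128, 42.5, 3.0),
    ("default", 64, 31.224, 2.0),
    ("default", 32, 31.2, 1.02),
    ("default", 16, 15.62, 1.02),
    ("int_thresh=1", 128, 43.46, 2.8),
    ("int_thresh=1", 64, 41.664, 1.5),
    ("int_thresh=1", 32, 41.3, 0.74),
    ("int_thresh=1", 16, 31.056, 0.49),
]


def predicted_period_ms(buffer_k, bandwidth_mb_s):
    """Time needed to move one buffer at the achieved bandwidth."""
    return (buffer_k * KIB) / (bandwidth_mb_s * MIB) * 1000.0


errors = []
for setting, buffer_k, bandwidth, measured in ROWS:
    predicted = predicted_period_ms(buffer_k, bandwidth)
    error = abs(predicted - measured) / measured
    errors.append(error)
    print(f"{setting:13s} {buffer_k:4d}K  predicted {predicted:.3f} ms  "
          f"measured {measured:.2f} ms  ({error:.1%} off)")
```

Every row agrees with the measured period to within a few percent, which is what one gap per URB would produce.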

The maximum available bandwidth is 46.2 MB/s, achieved on the same hardware
with the Windows 2000/XP USB 2.0 stack with all the latest updates installed.

We can't achieve it on QNX because of these pauses.
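
To see how much of the shortfall the gaps could account for, one can compute the idle fraction gap/period for each row and project the bandwidth the bus would reach if the gaps vanished. This is only a back-of-the-envelope sketch: it assumes that the measured period includes the gap, which the post does not state explicitly, and the function names are invented for the example.

```python
# (setting, buffer in K, bandwidth in MB/s, period in us, gap in us), from the tables
MEASUREMENTS = [
    ("default", 128, 42.5, 3000, 300),
    ("default", 64, 31.224, 2000, 600),
    ("default", 32, 31.2, 1020, 300),
    ("default", 16, 15.62, 1020, 640),
    ("int_thresh=1", 128, 43.46, 2800, 160),
    ("int_thresh=1", 64, 41.664, 1500, 160),
    ("int_thresh=1", 32, 41.3, 740, 56),
    ("int_thresh=1", 16, 31.056, 490, 140),
]


def idle_fraction(period_us, gap_us):
    """Share of each period during which the host accepts no data."""
    return gap_us / period_us


def gap_free_bandwidth(bandwidth_mb_s, period_us, gap_us):
    """Projected bandwidth if the same data moved with the gap removed."""
    return bandwidth_mb_s / (1.0 - idle_fraction(period_us, gap_us))


projections = []
for setting, buffer_k, bandwidth, period_us, gap_us in MEASUREMENTS:
    projected = gap_free_bandwidth(bandwidth, period_us, gap_us)
    projections.append(projected)
    print(f"{setting:13s} {buffer_k:4d}K  idle {idle_fraction(period_us, gap_us):.1%}  "
          f"projected {projected:.1f} MB/s")
```

Under that assumption the projections fall between roughly 42 and 47 MB/s, in the neighbourhood of the 46.2 MB/s Windows figure, which is consistent with the gaps being the main source of the lost bandwidth.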

The question is: how do we get rid of these gaps?
The Windows USB stack has no such pauses in the stream.

WBR, Maxim Shvyndia aka Max Shan.

These pauses are likely because the USB stack enqueues the TDs (transfer
descriptors) for one URB at a time. When the transfer for a URB completes,
the TDs for the next URB are then enqueued. During this time the USB chip
is likely idle (at least for that endpoint).
The USB chip may also read the schedule only at the beginning of a
microframe, when it would be empty. To alleviate this, TDs from multiple
URBs would need to be chained on the USB chip for the endpoint. We can look
into doing this in the future.

Henry


Thanks for the reply. Is there a workaround in the meantime?
The loss of bandwidth is critical for our application.

WBR, Maxim Shvyndia aka Max Shan.

There is no workaround. But as the performance numbers you posted show,
larger buffer sizes mean more TDs are hooked up for each URB, which
means the endpoint goes idle less often.

Henry
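
The effect of buffer size can be illustrated with a very simple model: each URB costs its transfer time at the peak rate plus one fixed gap. This is a sketch under stated assumptions, not a description of the driver. The 46.2 MB/s peak is the Windows figure from the thread, the 160 us gap is one of the measured values, and the measurements show that the real gap is not constant. Binary units are assumed.

```python
KIB = 1024
MIB = 1024 * 1024  # assumption: binary units, as in the posted figures


def effective_bandwidth_mb_s(buffer_k, peak_mb_s, gap_us):
    """Bandwidth when every URB pays its transfer time plus one fixed gap."""
    buffer_mb = (buffer_k * KIB) / MIB
    transfer_s = buffer_mb / peak_mb_s
    return buffer_mb / (transfer_s + gap_us * 1e-6)


PEAK_MB_S = 46.2   # gap-free rate reported for the Windows stack
GAP_US = 160       # illustrative per-URB gap taken from the int_thresh = 1 table

sizes_k = [16, 32, 64, 128, 256]
results = [effective_bandwidth_mb_s(size, PEAK_MB_S, GAP_US) for size in sizes_k]

for size, bandwidth in zip(sizes_k, results):
    print(f"{size:4d}K buffer -> {bandwidth:.1f} MB/s")
```

In this model the bandwidth rises with the buffer size and approaches, but never reaches, the peak rate, because the fixed gap is spread over more data per URB.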


On Wed, 23 Feb 2005 10:45:40 -0500, Henry VanDyke <henry@qnx.com> wrote:


There is no workaround. But as the performance numbers you posted show,
larger buffer sizes mean more TDs are hooked up for each URB, which
means the endpoint goes idle less often.

What is the maximum buffer size?

The size would be limited by the amount of available contiguous memory and
the number of TDs available in the USB driver. The current driver requires
that enough TDs be available for enqueueing to the hardware to satisfy the
URB request.

The current EHCI driver pre-allocates a pool of 20 TDs for USB transfers.
Each TD enqueued on the USB hardware can carry 16-20K.

One of the TDs is reserved, so the maximum transfer size that could be
enqueued would be 19*20K. You can increase the number of pre-allocated TDs
using the num_td option for the EHCI driver.

Henry
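
The arithmetic above can be sketched as follows. The pool size of 20, the single reserved TD, and the 16-20K per-TD capacity come from the post; the function names are made up, 1K is assumed to be 1024 bytes, and the rounding in tds_required is a guess at how the driver would count TDs rather than documented behaviour.

```python
import math

KIB = 1024          # assumption: "K" in the post means 1024 bytes
RESERVED_TDS = 1    # the post says one TD of the pool is reserved


def max_transfer_bytes(num_td, bytes_per_td):
    """Largest single URB the TD pool can carry."""
    return max(num_td - RESERVED_TDS, 0) * bytes_per_td


def tds_required(transfer_bytes, bytes_per_td):
    """Pool size needed for one URB of the given size (hypothetical rounding)."""
    return math.ceil(transfer_bytes / bytes_per_td) + RESERVED_TDS


# Default pool of 20 TDs, best case (20K per TD) and worst case (16K per TD).
print(max_transfer_bytes(20, 20 * KIB) // KIB, "K best case")
print(max_transfer_bytes(20, 16 * KIB) // KIB, "K worst case")

# Pool size needed for a 128K URB if each TD only carries 16K.
print(tds_required(128 * KIB, 16 * KIB), "TDs for a 128K URB")
```

Sizing against the 16K figure is the conservative choice, since the post gives a range rather than a guaranteed 20K per TD.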