DMA Speed

I am currently designing a DMA engine in a PCI driver and I have come
across an interesting design decision. There seem to be two very
different ways to handle DMA accesses and I am curious to hear what
other people think is faster. Below are the two broad designs I am
considering:

1) On driver startup, allocate a very large block (4M - 16M) of
contiguous memory by calling qnx_segment_alloc_flags. This memory block
will then be managed via a free-memory linked list (a sketch of such a
scheme follows below). Every time a DMA request is made, the list will
have to be managed accordingly. This requires a significant amount of
overhead to manage the list, but only one call to qnx_segment_alloc_flags
will ever be made.

2) Allocate smaller contiguous blocks dynamically by calling
qnx_segment_alloc_flags when DMA requests are made. Here there will be
no free-memory list to manage, but there will be a large number of calls
to qnx_segment_alloc_flags.
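
Concretely, option 1 would look something like the sketch below. The names
(dma_pool_init, dma_pool_alloc, dma_pool_free) are placeholders I made up;
the block itself would come from the single startup call to
qnx_segment_alloc_flags, and no QNX-specific calls are shown here.

/*
 * Minimal sketch of option 1: one contiguous block obtained once at
 * startup, carved up by a first-fit free list.  Names are hypothetical,
 * not from any QNX header.
 */
#include <stddef.h>

typedef struct free_block {
    size_t             size;   /* bytes in this region, header included */
    struct free_block *next;   /* next free region                      */
} free_block_t;

#define POOL_ALIGN sizeof(free_block_t)

static free_block_t *free_list;

/* Hand the allocator the block obtained once at driver startup.
 * 'base' is assumed to be suitably aligned. */
void dma_pool_init(void *base, size_t size)
{
    free_list = (free_block_t *)base;
    free_list->size = size;
    free_list->next = NULL;
}

/* First-fit allocation; returns NULL when no free region is big enough. */
void *dma_pool_alloc(size_t nbytes)
{
    /* header + payload, rounded up so every region stays aligned */
    size_t need = ((sizeof(free_block_t) + nbytes + POOL_ALIGN - 1)
                   / POOL_ALIGN) * POOL_ALIGN;
    free_block_t **prev = &free_list;
    free_block_t  *cur;

    for (cur = free_list; cur != NULL; prev = &cur->next, cur = cur->next) {
        if (cur->size < need)
            continue;
        if (cur->size - need >= sizeof(free_block_t)) {
            /* split: keep the tail of this region on the free list */
            free_block_t *rest = (free_block_t *)((char *)cur + need);
            rest->size = cur->size - need;
            rest->next = cur->next;
            *prev      = rest;
            cur->size  = need;
        } else {
            *prev = cur->next;   /* region fits exactly: use all of it */
        }
        return (char *)cur + sizeof(free_block_t);
    }
    return NULL;                 /* pool exhausted or too fragmented */
}

/* Return a buffer to the free list (no coalescing of neighbours shown). */
void dma_pool_free(void *p)
{
    free_block_t *blk = (free_block_t *)((char *)p - sizeof(free_block_t));
    blk->next = free_list;
    free_list = blk;
}

Note that without coalescing of adjacent free regions a pool like this will
fragment over long uptimes, which is really the same concern I have about
the OS allocator in option 2; a real driver would merge neighbouring free
regions in dma_pool_free, or avoid the list entirely.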

Speed is the most important priority. However, if the second option is
chosen, will there be ample free contiguous blocks of memory after weeks
of operation without rebooting the OS?

Thanks for any input,
Steve

“Steve” <stlang@vt.edu> wrote in message news:3A1A51C1.B1151154@vt.edu

> I am currently designing a DMA engine in a PCI driver and I have come
> across an interesting design decision. There seem to be two very
> different ways to handle DMA accesses and I am curious to hear what
> other people think is faster. Below are the two broad designs I am
> considering:
>
> 1) On driver startup, allocate a very large block (4M - 16M) of
> contiguous memory by calling qnx_segment_alloc_flags. This memory block
> will then be managed via a free-memory linked list. Every time a DMA
> request is made, the list will have to be managed accordingly. This
> requires a significant amount of overhead to manage the list, but only
> one call to qnx_segment_alloc_flags will ever be made.
>
> 2) Allocate smaller contiguous blocks dynamically by calling
> qnx_segment_alloc_flags when DMA requests are made. Here there will be
> no free-memory list to manage, but there will be a large number of calls
> to qnx_segment_alloc_flags.

Option 1 is definitely better; calls to qnx_segment_alloc… are costly.
Couldn’t you handle the DMA memory as a circular buffer instead
of a free-memory linked list?
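
Something along these lines (all names and sizes here are made up for
illustration, and it assumes fixed-size buffers that are retired in the
same order they are handed out):

/* Hypothetical ring scheme over one contiguous DMA-safe block obtained
 * at startup: fixed-size slots handed out at 'head' and retired at
 * 'tail' in FIFO order, so there is no list to walk at all. */
#include <stddef.h>

#define DMA_BUF_SIZE  4096U   /* assumed size of one transfer buffer   */
#define DMA_BUF_COUNT 256U    /* power of two keeps the modulo cheap   */

typedef struct {
    char     *base;           /* contiguous block from startup          */
    unsigned  head;           /* count of slots handed out              */
    unsigned  tail;           /* count of slots retired                 */
} dma_ring_t;

void dma_ring_init(dma_ring_t *r, void *base)
{
    r->base = (char *)base;
    r->head = r->tail = 0;
}

/* Take the next slot; NULL while all DMA_BUF_COUNT slots are outstanding. */
void *dma_ring_get(dma_ring_t *r)
{
    if (r->head - r->tail == DMA_BUF_COUNT)      /* ring is full */
        return NULL;
    return r->base + (size_t)(r->head++ % DMA_BUF_COUNT) * DMA_BUF_SIZE;
}

/* Retire the oldest outstanding slot once its transfer has completed. */
void dma_ring_put(dma_ring_t *r)
{
    if (r->tail != r->head)
        r->tail++;
}

The catch is that buffers must be released in roughly the order they were
issued; if your transfers can complete out of order, you are back to
something like the free list (or a bitmap) from option 1.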


> Speed is the most important priority. However, if the second option is
> chosen, will there be ample free contiguous blocks of memory after weeks
> of operation without rebooting the OS?

> Thanks for any input,
> Steve

“Mario Charest” <mcharest@void_zinformatic.com> wrote in message
news:8ve889$1qf$1@nntp.qnx.com
|
| “Steve” <stlang@vt.edu> wrote in message news:3A1A51C1.B1151154@vt.edu
| > I am currently designing a DMA engine in a PCI driver and I have come
| > across an interesting design decision. There seem to be two very
| > different ways to handle DMA accesses and I am curious to hear what
| > other people think is faster. Below are the two broad designs I am
| > considering:
| >
| > 1) On driver startup, allocate a very large block (4M - 16M) of
| > contiguous memory by calling qnx_segment_alloc_flags. This memory block
| > will then be managed via a free-memory linked list. Every time a DMA
| > request is made, the list will have to be managed accordingly. This
| > requires a significant amount of overhead to manage the list, but only
| > one call to qnx_segment_alloc_flags will ever be made.
| >
| > 2) Allocate smaller contiguous blocks dynamically by calling
| > qnx_segment_alloc_flags when DMA requests are made. Here there will be
| > no free-memory list to manage, but there will be a large number of calls
| > to qnx_segment_alloc_flags.
| >
|
| Option 1 is definitely better, calls to qnx_segment_alloc… are costly.
| Couldn’t you handle the DMA memory as circular memory instead
| of memory link list?

I agree with Mario. Grab what you’re going to need when you start up; it
eliminates possible conflicts later on. Come up with a cute ring buffer scheme
and your free list management overhead will be minimal.

-Warren

“Warren Peece” <warren@nospam.com> wrote in message
news:8vedsm$i7c$1@inn.qnx.com

> “Mario Charest” <mcharest@void_zinformatic.com> wrote in message
> news:8ve889$1qf$1@nntp.qnx.com…
> |
> | “Steve” <stlang@vt.edu> wrote in message news:3A1A51C1.B1151154@vt.edu…
> | > I am currently designing a DMA engine in a PCI driver and I have come
> | > across an interesting design decision. There seem to be two very
> | > different ways to handle DMA accesses and I am curious to hear what
> | > other people think is faster. Below are the two broad designs I am
> | > considering:
> |
> | > 1) On driver startup, allocate a very large block (4M - 16M) of
> | > contiguous memory by calling qnx_segment_alloc_flags. This memory block
> | > will then be managed via a free-memory linked list. Every time a DMA
> | > request is made, the list will have to be managed accordingly. This
> | > requires a significant amount of overhead to manage the list, but only
> | > one call to qnx_segment_alloc_flags will ever be made.
> |
> | > 2) Allocate smaller contiguous blocks dynamically by calling
> | > qnx_segment_alloc_flags when DMA requests are made. Here there will be
> | > no free-memory list to manage, but there will be a large number of calls
> | > to qnx_segment_alloc_flags.
> |
> |
> | Option 1 is definitely better; calls to qnx_segment_alloc… are costly.
> | Couldn’t you handle the DMA memory as a circular buffer instead
> | of a free-memory linked list?
>
> I agree with Mario.

Phew ;-)

Previously, Steve wrote in qdn.public.qnx4:

> Speed is the most important priority. However, if the second option is
> chosen, will there be ample free contiguous blocks of memory after weeks
> of operation without rebooting the OS?

What makes you think that the OS’s allocation scheme will have less
overhead than yours?


Mitchell Schoenbrun --------- maschoen@pobox.com