DMA and physical addresses

I am using QNX v6.3 and I have two related questions:

  1. For DMA purposes, I need to allocate a physically contiguous block of
    memory, but I’m not particularly concerned about the physical address.
    I would like to use the mmap() function with the MAP_PHYS and MAP_ANON
    flags set, but the documentation indicates that mmap_device_memory()
    should be used instead. Unfortunately, this function expects a physical
    address.

  2. Is there any way to obtain a logical-to-physical address conversion (or,
    perhaps at a lower level, can the page tables for a particular process be
    obtained)? I found the function drvr_mphys(), but I think this function
    is only available if I use the Network DDK.

Regards,
Tom

Tom Labno <tlabno@birinc.com> wrote:

I am using QNX v6.3 and I have two related questions:

  1. For DMA purposes, I need to allocate a physically contiguous block of
    memory, but I’m not particularly concerned about the physical address.
    I would like to use the mmap() function with the MAP_PHYS and MAP_ANON
    flags set, but the documentation indicates that mmap_device_memory()
    should be used instead. Unfortunately, this function expects a physical
    address.

No, mmap_device_memory() should not be used for allocating DMA safe
memory. I’ve not seen the docs suggest that – if you can point me
to where it said that, maybe we can clarify whatever it says.

mmap_device_memory() is for accessing RAM on an I/O card, not for
allocating system RAM.
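
For contrast, its typical use looks something like this (an untested
sketch; the physical address is made up, a real one comes from PCI
config or bus enumeration):

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical example: map 64K of RAM on an I/O card whose window
 * sits at physical 0xA0000000 (a made-up address). */
#define CARD_PHYS  0xA0000000ULL
#define CARD_SIZE  (64 * 1024)

volatile uint8_t *map_card( void )
{
    void *card = mmap_device_memory( NULL, CARD_SIZE,
            PROT_READ | PROT_WRITE | PROT_NOCACHE, 0, CARD_PHYS );
    if ( card == MAP_FAILED ) {
        perror( "mmap_device_memory" );
        return NULL;
    }
    return card;
}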

  2. Is there any way to obtain a logical-to-physical address conversion (or,
    perhaps at a lower level, can the page tables for a particular process be
    obtained)? I found the function drvr_mphys(), but I think this function
    is only available if I use the Network DDK.

See my previous post… but for memory that is allocated contiguously,
you can call mem_offset().

e.g. allocate a 2 MB DMA-safe buffer and get its physical address:

ptr = mmap( 0, 2 * 1024 * 1024, PROT_READ|PROT_WRITE|PROT_NOCACHE,
MAP_PHYS|MAP_ANON, NOFD, 0 );
mem_offset( ptr, NOFD, 1, &physical_addr, NULL );

NOTE: if you know that your hardware will invalidate the CPU cache
due to DMA operations, you need not set the PROT_NOCACHE flag, which
can improve performance when accessing the memory inside your driver.
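
Fleshed out with error checking, the whole thing might look like this
(an untested sketch):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define DMA_BUF_SIZE  (2 * 1024 * 1024)

int main( void )
{
    void   *ptr;
    off_t   physical_addr;
    size_t  contig_len;

    /* Physically contiguous, DMA-safe, uncached 2 MB buffer. */
    ptr = mmap( 0, DMA_BUF_SIZE, PROT_READ | PROT_WRITE | PROT_NOCACHE,
                MAP_PHYS | MAP_ANON, NOFD, 0 );
    if ( ptr == MAP_FAILED ) {
        perror( "mmap" );
        return EXIT_FAILURE;
    }

    /* Virtual-to-physical translation; contig_len reports how many
     * bytes are physically contiguous from that address.  Use
     * mem_offset64() if physical addresses can exceed 32 bits. */
    if ( mem_offset( ptr, NOFD, DMA_BUF_SIZE, &physical_addr,
                     &contig_len ) == -1 ) {
        perror( "mem_offset" );
        return EXIT_FAILURE;
    }

    printf( "virtual %p -> physical 0x%llx, %u contiguous bytes\n",
            ptr, (unsigned long long)physical_addr,
            (unsigned)contig_len );
    return EXIT_SUCCESS;
}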

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com

I was not aware that “mmap()” allocated (or reserved) memory.
Is this really the case when using NOFD and an offset of 0?

Also, the following note is included on the mmap() help page:

→ You should use mmap_device_memory() instead of MAP_PHYS.

I interpreted this as a warning that although the MAP_PHYS flag is available
in the mmap() call, its use is obsolete and mmap_device_memory() should be
used instead.




“Tom Labno” <tlabno@birinc.com> wrote in message
news:cmb0po$d8p$1@inn.qnx.com

I was not aware that “mmap()” allocated (or reserved) memory.
Is this really the case when using NOFD and an offset of 0?

Offset does not matter (unless you want a certain piece of physical memory).
Yes, unlike Linux, in QNX mmap() can be used to allocate memory (in fact,
that is what lies under malloc()). It behaves the same way as BSD-style
‘anonymous mmap’ or SysV-style ‘mmap of /dev/zero’, so it is actually
somewhat portable. You need NOFD (or just -1, for portability) and MAP_ANON.
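
The portable spelling, minus the QNX-specific bits, would be something
like this (remember that without MAP_PHYS you get no guarantee of
physical contiguity):

#include <stddef.h>
#include <sys/mman.h>

void *alloc_anon( size_t len )
{
    /* fd of -1, offset of 0, MAP_ANON: BSD-style anonymous mmap,
     * which also works on QNX.  Add MAP_PHYS (QNX-only) when the
     * buffer must be physically contiguous. */
    void *p = mmap( NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANON, -1, 0 );
    return ( p == MAP_FAILED ) ? NULL : p;
}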

Also, the following note is included on the mmap() help page:

→ You should use mmap_device_memory() instead of MAP_PHYS.

I interpreted this as a warning that although the MAP_PHYS flag is available
in the mmap() call, its use is obsolete and mmap_device_memory() should be
used instead.

mmap_device_memory() is a cover for mmap64() and will use MAP_PHYS
internally. The tricky bit here is that POSIX defines the offset argument
of mmap() as a signed integer, so you can be caught by surprise if you pass
a physical address higher than 2 GB (QNX has changed its mind about how
strictly to interpret that from time to time).

You can use mmap64() directly, nothing wrong with that. But MAP_PHYS is
unportable anyway.
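
e.g. something like this (untested; 0xC0000000 is just a made-up example
above the 2 GB line):

ptr = mmap64( 0, len, PROT_READ|PROT_WRITE|PROT_NOCACHE,
MAP_PHYS|MAP_SHARED, NOFD, 0xC0000000ULL );

The 64-bit offset of mmap64() sidesteps the signed 32-bit off_t problem
entirely.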

– igor


Tom Labno <tlabno@birinc.com> wrote:

I was not aware that “mmap()” allocated (or reserved) memory.
Is this really the case when using NOFD and an offset of 0?

Also, the following note is included on the mmap() help page:

→ You should use mmap_device_memory() instead of MAP_PHYS.

I interpreted this as a warning that although the MAP_PHYS flag is available
in the mmap() call, its use is obsolete and mmap_device_memory() should be
used instead.

Yes, that could be confusing, especially with the “warning” label.

It should probably be re-written to say,

→ You should use mmap_device_memory() instead of MAP_PHYS unless
allocating physically contiguous memory.


(stever, you got this, or should I PR it?)

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com

David Gibbs <dagibbs@qnx.com> wrote:

It should probably be re-written to say,

→ You should use mmap_device_memory() instead of MAP_PHYS unless
allocating physically contiguous memory.



(stever, you got this, or should I PR it?)

Please create a PR. Thanks.


Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems

Steve Reid <stever@sreid.ott.qnx.com> wrote:

David Gibbs <dagibbs@qnx.com> wrote:
It should probably be re-written to say,

→ You should use mmap_device_memory() instead of MAP_PHYS unless
allocating physically contiguous memory.



(stever, you got this, or should I PR it?)

Please create a PR. Thanks.

PR 22348.

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com

Thanks for the information.
This has helped me a lot.

I have other issues, though…

  1. If I use mmap() to allocate contiguous memory, how do I free it? I tried
    munmap(), but it doesn’t appear to work. This is not critical, since I
    was able to re-create the logical-to-physical mappings for buffers
    allocated using malloc(), memalign() or mmap(), and I can use chained
    DMA.

  2. The purpose of all of this was to prevent a data copy when sending data
    between processes. I have the following three options (I think):

a) Use Neutrino MsgSend()/MsgReceive().
This will be blocking (there are ways around this) and involves a data copy.

b) Use conventional shared memory.
This may work fine, but either I will have to create a separate shared
memory object for each buffer sent, or I will need to create a single
chunk of shared memory and write my own memory manager.

c) Use my method of determining the list of physical addresses for
the buffer, send a message (using message queues) to the next process
(including this list), and mark the buffer read-only using mprotect().
After the next process has finished processing the data, a return
message is sent informing the source process that it may remove the
write protection and re-use the buffer.

My problem with (c), is how do I handle the case where the sending
process dies before the receiver processes the data? Is there a way
to lock the memory region during this transaction so that it cannot
be re-allocated?

Is there a better way of doing this?

-Tom

Tom Labno <tlabno@birinc.com> wrote:

Thanks for the information.
This has helped me a lot.

I have other issues, though…

  1. If I use mmap() to allocate contiguous memory, how do I free it? I tried
    munmap(), but it doesn’t appear to work. This is not critical, since I
    was able to re-create the logical-to-physical mappings for buffers
    allocated using malloc(), memalign() or mmap(), and I can use chained
    DMA.

munmap() should free the memory allocated by mmap().

(If setting up a named shared memory area, the memory is managed
by reference count, each fd & mapping is a reference, as is the name.
When all maps are unmapped, all fds are closed, and the name is
unlinked, then the memory is freed.)

If it doesn’t, that is a bug – what makes you think it wasn’t
freed?

Note that if you malloc() your memory, the memory returned, while
virtually contiguous, need not be physically contiguous. You don’t know
where the page boundaries are, and the behaviour of mem_offset() is not
defined for malloc()ed data. (The mem_offset() docs say the fd must be
the one passed to mmap() for the data – you don’t know that any
particular fd you pass is the same one malloc() might have used to get
the memory.)
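
For reference, the free side should just be the same base address and
length you mapped; a mismatched length would be one plausible way for
munmap() to appear not to work:

if ( munmap( ptr, 2 * 1024 * 1024 ) == -1 )
    perror( "munmap" );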

  2. The purpose of all of this was to prevent a data copy when sending data
    between processes. I have the following three options (I think):

How much data are you trying to move around? Sometimes there is as
much work in not copying the data as there is in copying it.

a) Use Neutrino MsgSend()/MsgReceive().
This will be blocking (there are ways around this) and involves a data copy.

Blocking can be useful – it is built-in synchronisation.

b) Use conventional shared memory.
This may work fine, but either I will have to create a separate shared
memory object for each buffer sent, or I will need to create a single
chunk of shared memory and write my own memory manager.

For a simple, fixed-size allocator, this is pretty easy. Hopefully
you don’t need a general-purpose allocator.

Note, you can combine this scheme with message-passing – send the
offset (and size if relevant) of the chunk to be processed on to
the next process in the chain.
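
As a rough sketch (the object name /dma_pool, the slot size, and the
bitmap scheme are all invented for illustration), a fixed-size allocator
over one named shared-memory object can be as small as a bitmap, and the
“message” to the next process is just a slot offset. Note that a plain
shared-memory object isn’t physically contiguous; for the DMA leg you’d
still track physical addresses separately (I think shm_ctl() with
SHMCTL_PHYS can help there):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#define POOL_NAME  "/dma_pool"     /* invented name */
#define SLOT_SIZE  (64 * 1024)
#define NSLOTS     16

static char    *pool;
static unsigned free_mask = (1u << NSLOTS) - 1;   /* 1 = slot free */

/* Producer side: create and map the pool.  The consumer does the
 * same with O_RDWR only (no O_CREAT). */
int pool_init( void )
{
    int fd = shm_open( POOL_NAME, O_CREAT | O_RDWR, 0666 );
    if ( fd == -1 || ftruncate( fd, (off_t)SLOT_SIZE * NSLOTS ) == -1 )
        return -1;
    pool = mmap( NULL, (size_t)SLOT_SIZE * NSLOTS,
                 PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
    return ( pool == MAP_FAILED ) ? -1 : 0;
}

/* Fixed-size allocator: grab a free slot, return its offset (or -1).
 * The offset is what gets messaged down the chain. */
long slot_alloc( void )
{
    for ( int i = 0; i < NSLOTS; i++ ) {
        if ( free_mask & (1u << i) ) {
            free_mask &= ~(1u << i);
            return (long)i * SLOT_SIZE;
        }
    }
    return -1;
}

void slot_free( long offset )
{
    free_mask |= 1u << ( offset / SLOT_SIZE );
}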

c) Use my method for determining the list of physical addresses for
the buffer, send a message (using message queues) to the next process
(including this list) and marking the buffer as read-only using mprotect().
After the next process has completed processing the data, a return
message will be sent which informs the source process that he may
remove the write protection and re-use the buffer.

This is a bit ugly. Again, how much data are you handling? Message
queues involve a double S/R/R combination – extra overhead in
context switching and kernel calls that may eat up your savings
in data-copy times.

My problem with (c), is how do I handle the case where the sending
process dies before the receiver processes the data? Is there a way
to lock the memory region during this transaction so that it cannot
be re-allocated?

About the only way to do this is to associate the memory with a name
in the pathname space.
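
Roughly (names as in the pool sketch above, all invented): once the
receiver has opened and mapped the named object, its fd and mapping hold
references, so the memory can’t be recycled out from under it even if
the sender dies. It is only freed after the last unmap/close and the
shm_unlink():

/* Receiver side: these two references keep the object alive. */
int fd = shm_open( POOL_NAME, O_RDWR, 0 );
char *pool = mmap( NULL, (size_t)SLOT_SIZE * NSLOTS,
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );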

Is there a better way of doing this?

Possibly, but it may require stepping back another step or two,
examining data flow and assumptions about data flow, then designing.

Who is the ultimate “owner” of the data? Can somebody up-stream
allocate the memory, then send a request down to the driver doing
the DMA, saying, in essence, fill this physical area with the data
and reply when you’re done? Or, start filling this physical area
with data, and notify me every so many chunks (maybe with a pulse)?
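
If you go the pulse route, the skeleton is roughly this (untested;
consumer_pid and process_chunks() are stand-ins):

#include <sys/neutrino.h>

/* Consumer: wait for pulses on its channel; a return of 0 from
 * MsgReceive() means a pulse arrived.  The pulse value can carry,
 * say, a chunk count or a slot offset. */
int chid = ChannelCreate( 0 );
struct _pulse pulse;
if ( MsgReceive( chid, &pulse, sizeof pulse, NULL ) == 0 )
    process_chunks( pulse.value.sival_int );   /* hypothetical */

/* Driver: attach to the consumer's channel and fire a pulse every
 * so many chunks; priority -1 means "deliver at my priority". */
int coid = ConnectAttach( 0, consumer_pid, chid, _NTO_SIDE_CHANNEL, 0 );
MsgSendPulse( coid, -1, _PULSE_CODE_MINAVAIL, chunks_done );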

-David

David Gibbs
QNX Training Services
dagibbs@qnx.com