Questions regarding PCI busmaster DMA on Neutrino 2.0

I have some questions regarding PCI busmaster DMA on Neutrino 2.0. If I
were to allocate a physically contiguous memory buffer by calling

addr = mmap(0, size, PROT_WRITE, MAP_PHYS | MAP_ANON,
            NOFD, 0);

Is the address in addr a virtual address in my address space or a
physical address? If it’s a virtual address, how do I get the
corresponding physical address?

Alternatively, is it possible to DMA directly from a regular memory
buffer (not physically contiguous)? In many other OS, one can get a
list of the physical addresses of all the physical pages the virtual
buffer is made of. Then the DMA engine can be programmed (if supported)
to DMA from the physical addresses one page at a time (chain DMA).
There seems to be support for this in QNX 4 but I can’t find it in
Neutrino 2.0.



hhpt@my-deja.com wrote:

I have some questions regarding PCI busmaster DMA on Neutrino 2.0. If I
were to allocate a physically contiguous memory buffer by calling

addr = mmap(0, size, PROT_WRITE, MAP_PHYS | MAP_ANON,
            NOFD, 0);

Is the address in addr a virtual address in my address space or a
physical address? If it’s a virtual address, how do I get the
corresponding physical address?

The above won’t even necessarily give physically contiguous
memory.

The code you need is:

ptr = mmap( 0, size, PROT_READ|PROT_WRITE|PROT_NOCACHE,
            MAP_ANON|MAP_PHYS, NOFD, 0 );
/* if on x86, you may also want MAP_NOX64K | MAP_BELOW16M */
mem_offset( ptr, NOFD, 1, &phys_addr, NULL );

Then ptr will be the virtual address of the DMA safe buffer, and
phys_addr will be the corresponding physical address.
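For what it’s worth, here is a minimal, self-contained sketch of that whole
sequence with error checking. The 64 KB buffer size is just an example, and
the x86-only flags are only mentioned in a comment:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>

#define DMA_BUF_SIZE (64 * 1024)   /* example size only */

int main(void)
{
    void  *ptr;
    off_t  phys_addr;

    /* Physically contiguous, uncached, anonymous memory.  On x86 you
       may also want MAP_NOX64K | MAP_BELOW16M, depending on the device. */
    ptr = mmap(0, DMA_BUF_SIZE,
               PROT_READ | PROT_WRITE | PROT_NOCACHE,
               MAP_PHYS | MAP_ANON, NOFD, 0);
    if (ptr == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }

    /* Translate the virtual address into the physical address that
       the busmaster should be programmed with. */
    if (mem_offset(ptr, NOFD, 1, &phys_addr, NULL) != 0) {
        perror("mem_offset");
        return EXIT_FAILURE;
    }

    printf("virtual %p -> physical 0x%llx\n",
           ptr, (unsigned long long)phys_addr);

    /* ... hand phys_addr to the device, use ptr from the CPU ... */

    munmap(ptr, DMA_BUF_SIZE);
    return EXIT_SUCCESS;
}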


Alternatively, is it possible to DMA directly from a regular memory
buffer (not physically contiguous)? In many other OS, one can get a
list of the physical addresses of all the physical pages the virtual
buffer is made of. Then the DMA engine can be programmed (if supported)
to DMA from the physical addresses one page at a time (chain DMA).

You could probably do this; look at mem_offset() again. But it would be
messy, since you’d have to work out exactly how your virtual buffer is
laid out in physical memory.
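
Here’s a rough sketch of that approach, assuming your device supports
chained (scatter-gather) descriptors. It relies on the contiguous-length
output of mem_offset() to walk the buffer one physically contiguous run at
a time; the sg_entry structure and MAX_SG limit are made up for
illustration, and the real descriptor layout would come from your hardware:

#include <sys/types.h>
#include <sys/mman.h>

/* Hypothetical scatter-gather entry; the real layout is device-specific. */
struct sg_entry {
    off_t  phys;
    size_t len;
};

#define MAX_SG 64   /* arbitrary limit for this sketch */

/* Build a scatter-gather list for an ordinary (non-contiguous) buffer.
   Returns the number of entries, or -1 on overflow/translation failure. */
static int build_sg_list(void *buf, size_t size, struct sg_entry *sg)
{
    char  *vaddr = buf;
    size_t left  = size;
    int    n     = 0;

    while (left > 0) {
        off_t  phys;
        size_t contig;

        if (n == MAX_SG)
            return -1;

        /* How much of the buffer, starting at vaddr, is physically
           contiguous, and where does it sit in physical memory? */
        if (mem_offset(vaddr, NOFD, left, &phys, &contig) != 0)
            return -1;

        if (contig > left)
            contig = left;

        sg[n].phys = phys;
        sg[n].len  = contig;
        n++;

        vaddr += contig;
        left  -= contig;
    }
    return n;
}

You’d also have to make sure the buffer stays resident at the same physical
pages for the duration of the transfer, and the cache-coherency question
that comes up later in this thread still applies to an ordinary (cacheable)
buffer.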

-David

David wrote:

The code you need is:

ptr = mmap( 0, size, PROT_READ|PROT_WRITE|PROT_NOCACHE,
            MAP_ANON|MAP_PHYS, NOFD, 0 );
/* if on x86, you may also want MAP_NOX64K | MAP_BELOW16M */
mem_offset( ptr, NOFD, 1, &phys_addr, NULL );

Then ptr will be the virtual address of the DMA safe buffer, and
phys_addr will be the corresponding physical address.

Thanks for pointing out the mem_offset() function.

I’m a little confused about the mmap() call, though. The difference between
your code and the original code is the PROT_READ and PROT_NOCACHE flags.
Why is the PROT_NOCACHE flag necessary in order to get a physically
contiguous block? I have not done any coding so far, but it’s not intuitive
to me why this matters here. Are you instead concerned about DMA
coherency? This leads to another question of mine. In general (not specific
to x86), one has to worry about DMA coherency. By this I mean that when the
CPU writes to a memory location, it may write only to the CPU cache and not
to main memory. If the DMA engine (on a PCI device) then reads this memory
location, it may read stale data. To deal with this problem, some other OSes
(let’s pick Windows NT) have a call to “flush” that memory region from the
CPU cache to main memory before DMA can be initiated. I don’t see such a
call in Neutrino, and it looks like the approach Neutrino assumes is that one
has to allocate noncacheable memory for this purpose. But access to
noncacheable memory might be less efficient.

  1. So, is there such a “flush” call in Neutrino?

  2. Is DMA coherency even an issue on x86 at all? I heard that x86 has
    hardware-enforced DMA coherency. In fact, the flush call in Windows NT is a
    null operation when the target is x86. It’s just there for source code
    portability to other RISC platforms that actually need it.

-Kim

Kim Liu <kliu@terayon.com> wrote:

The code you need is:

ptr = mmap( 0, size, PROT_READ|PROT_WRITE|PROT_NOCACHE,
            MAP_ANON|MAP_PHYS, NOFD, 0 );
/* if on x86, you may also want MAP_NOX64K | MAP_BELOW16M */
mem_offset( ptr, NOFD, 1, &phys_addr, NULL );

Then ptr will be the virtual address of the DMA safe buffer, and
phys_addr will be the corresponding physical address.

Thanks for pointing out the mem_offset() function.

I’m a little confused about the mmap() call, though. The difference between
your code and the original code is the PROT_READ and PROT_NOCACHE flags.

Yup.

Why is the PROT_NOCACHE flag necessary in order to get a physically
contiguous block?

It is not necessary for getting a physically contiguous block.

I have not done any coding so far, but it’s not intuitive
to me why this matters here. Are you instead concerned about DMA
coherency?

Well, it’s actually about memory coherency for memory that can be accessed
independently from two different places (the DMA controller and the main
processor).
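
To make the distinction concrete, here is a hedged comparison, in the same
style as the snippet above (the contiguity comes from MAP_PHYS | MAP_ANON;
whether the cacheable version is safe depends on whether the platform keeps
the cache and the DMA engine coherent in hardware):

/* Physically contiguous but cacheable: PROT_NOCACHE is not what makes
   the block contiguous, so this still gets contiguous memory, but the
   CPU cache and the DMA engine may disagree about its contents unless
   the hardware keeps them coherent. */
ptr = mmap( 0, size, PROT_READ|PROT_WRITE,
            MAP_PHYS|MAP_ANON, NOFD, 0 );

/* Physically contiguous and uncached: every CPU access goes to main
   memory, so the device always sees current data, at some cost in
   CPU access speed. */
ptr = mmap( 0, size, PROT_READ|PROT_WRITE|PROT_NOCACHE,
            MAP_PHYS|MAP_ANON, NOFD, 0 );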

This leads to another question of mine. In general (not specific
to x86), one has to worry about DMA coherency. By this I mean that when the
CPU writes to a memory location, it may write only to the CPU cache and not
to main memory. If the DMA engine (on a PCI device) then reads this memory
location, it may read stale data. To deal with this problem, some other OSes
(let’s pick Windows NT) have a call to “flush” that memory region from the
CPU cache to main memory before DMA can be initiated. I don’t see such a
call in Neutrino, and it looks like the approach Neutrino assumes is that one
has to allocate noncacheable memory for this purpose. But access to
noncacheable memory might be less efficient.

  1. So, is there such a “flush” call in Neutrino?

I don’t know of any “memory flush” type call for Neutrino.

  2. Is DMA coherency even an issue on x86 at all? I heard that x86 has
    hardware-enforced DMA coherency. In fact, the flush call in Windows NT is a
    null operation when the target is x86. It’s just there for source code
    portability to other RISC platforms that actually need it.

I don’t know the CPU architecture well enough to comment. Somebody else
might, though…

-David