Aligning RAM for DMA access

I need to allocate blocks of memory on a 16-byte boundary for DMA access by a PCI card. Obviously memalign is the solution, except that this would be in process memory and therefore useless (I am assuming this is why the system resets when I pass the addresses to the PCI card and it attempts DMA).
So is there a method of aligning mmapped memory?
If I use memalign and then pass the resulting address as the location to map to in mmap, will this work? Or do I have to resort to the method used previously: adding 16 to the required memory size, then adding 15 to the returned address and masking out the bottom 4 bits, which requires keeping two addresses?
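
For reference, the manual method described above might look roughly like the following. This is only a sketch: alloc_aligned16 and raw_out are made-up names, and plain malloc stands in for whatever allocator is really used. It shows why two addresses have to be kept: the aligned one for the card, and the raw one for free().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: over-allocate by 16 bytes, then round the address
 * up to the next 16-byte boundary. *raw_out receives the pointer that
 * must later be passed to free(); the return value is the aligned one. */
static void *alloc_aligned16(size_t size, void **raw_out)
{
    void *raw = malloc(size + 16);
    if (raw == NULL)
        return NULL;
    *raw_out = raw;
    /* Add 15 and mask out the bottom 4 bits. */
    return (void *)(((uintptr_t)raw + 15) & ~(uintptr_t)15);
}
```

The caller uses the returned pointer for the buffer and calls free() on the raw pointer only. Note that this aligns the virtual address and says nothing about whether the memory is suitable for DMA.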

Memory returned from mmap() should be page-aligned, so of course it will also be 16-byte aligned.

I don’t think there is anything in the specs that says so, at least not that I could find. It’s probably best to check the physical address of the block of memory and add an offset if need be.

It seems memalign was good enough. I was getting confused by the mem_offset help description, which mentioned mapped memory, and I assumed this meant mmapped. If I memalign and then use the resulting address in an mmap call, it seems to corrupt the previous block of memory I allocated for a completely different object: doing a free of that other block immediately after the mmap call generates a SIGSEGV error, suggesting the mmap may be overwriting the block descriptor for the previous block somehow.
This seems to suggest that memalign and mmap take different amounts of memory for the same size value.
Or possibly it’s something more complicated.
Anyway problem solved.
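
For anyone landing here later, a minimal sketch of the allocation that ended up working, without handing the address to mmap afterwards. It uses posix_memalign rather than memalign, which is a substitution on my part, and alloc_dma_block is a made-up name. It only guarantees alignment of the virtual address.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a block aligned on a 16-byte boundary,
 * or NULL on failure. The block is released with free(). */
static void *alloc_dma_block(size_t size)
{
    void *p = NULL;
    /* The alignment must be a power of two and a multiple of
     * sizeof(void *); 16 satisfies both. */
    if (posix_memalign(&p, 16, size) != 0)
        return NULL;
    return p;
}
```

The returned pointer goes straight to free() when done, so there is no second address to keep track of.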