PCI burst mode transfer

I am developing a PCI peripheral card and have a question about burst mode transfer.
I have mapped the dual-port RAM on my card with mmap_device_memory and can read and write to the card a single 32-bit word at a time.
My question is: how do I implement a burst mode transfer?
My card uses an Altera FPGA with a PCI target IP core supplied by Altera. Memory is arranged as a 1k block of pre-fetchable memory. I would like to initiate a read and/or write from the PC and burst transfer a large block of contiguous addresses.
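For reference, the word-at-a-time access described above can be sketched as a pair of copy loops over the mapped window. This is only a sketch: card_read_block and card_write_block are hypothetical names, and the card pointer is assumed to be whatever mmap_device_memory returned for the 1k block. Whether the bridge merges these accesses into bursts is outside the code's control.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy nwords 32-bit words from the mapped card window into host memory.
 * 'card' is assumed to point at the mapped dual-port RAM; 'volatile'
 * stops the compiler from dropping or merging the device accesses. */
void card_read_block(const volatile uint32_t *card, uint32_t *dst, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = card[i];
}

/* Copy nwords 32-bit words from host memory into the mapped card window. */
void card_write_block(volatile uint32_t *card, const uint32_t *src, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        card[i] = src[i];
}
```

A 1k block is 256 words, so reading all of it would be card_read_block(card, buf, 256) with buf holding at least 256 words.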

You need to find out if your card supports DMA. If it does, you can use DMA to do the transfer. Programming it involves allocating a memory buffer, telling the card where to put the data, and then triggering it to start. Some IRQ handling is also needed in most cases, if the card uses an IRQ to notify the host.
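A rough sketch of those steps, assuming a completely hypothetical register layout (struct dma_regs, DMA_CTRL_START and DMA_STATUS_DONE are made up for illustration; a real card defines its own). Allocating the buffer is left out because it is OS-specific: the buffer has to be physically contiguous and its bus address known. Completion is polled here rather than signalled by IRQ, to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DMA register block; not taken from any real card. */
struct dma_regs {
    volatile uint32_t host_addr;  /* bus address of the host buffer */
    volatile uint32_t length;     /* transfer length in bytes */
    volatile uint32_t control;    /* bit 0: start the transfer */
    volatile uint32_t status;     /* bit 0: transfer done */
};

#define DMA_CTRL_START  0x1u
#define DMA_STATUS_DONE 0x1u

/* Tell the card where to put the data, then trigger it to start.
 * bus_addr must be the bus/physical address of the buffer, not a
 * virtual address from the process. */
void dma_start(struct dma_regs *regs, uint32_t bus_addr, uint32_t len)
{
    regs->host_addr = bus_addr;
    regs->length = len;
    regs->control = DMA_CTRL_START;
}

/* Poll the status register up to max_polls times.
 * Returns 0 once the done bit is set, -1 if it never was. */
int dma_wait(const struct dma_regs *regs, unsigned max_polls)
{
    while (max_polls--) {
        if (regs->status & DMA_STATUS_DONE)
            return 0;
    }
    return -1;
}
```

With an IRQ, dma_wait would be replaced by blocking until the card's interrupt arrives and then checking the status register.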

It really depends on the hardware and firmware you have. If you have dual-ported memory, for example, then you can simply share the physical memory, with no transferring needed.

I have no bus-mastering on the card (current revision). We were hoping to not have to implement bus mastering, but it looks now like we may have to.
What we know so far is that the CPU will write to the card using burst transfers, but that reads from the card are done one address at a time. Each read requires 14 clock samples. I believe that this is a fact of life with the PC architecture; at least, that is what others are telling me from their similar experience. I guess the answer is to redesign the card with PCI bus mastering, so that I can do burst transfers from the card to memory directly. If anyone knows of any shortcuts, I would still be quite interested.
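To put a number on that cost, here is a small back-of-the-envelope calculation. The 14 clocks per read comes from the measurement above; the 33 MHz, 32-bit bus is an assumption, and the burst figure is the ideal peak of one word per clock, which real transfers will not reach.

```c
#include <stdint.h>

/* Assumed bus parameters: 33 MHz clock, 32-bit (4-byte) data path. */
#define PCI_CLOCK_HZ    33000000.0
#define BYTES_PER_WORD  4
/* Measured figure from the post: clocks per single-word read. */
#define CLOCKS_PER_READ 14

/* Bus clocks needed to read 'bytes' as individual single-word reads. */
uint64_t single_read_clocks(uint64_t bytes)
{
    uint64_t words = (bytes + BYTES_PER_WORD - 1) / BYTES_PER_WORD;
    return words * CLOCKS_PER_READ;
}

/* Convert a clock count to microseconds at the assumed bus clock. */
double clocks_to_us(uint64_t clocks)
{
    return (double)clocks * 1e6 / PCI_CLOCK_HZ;
}

/* Sustained rate in bytes per second for back-to-back single reads. */
double single_read_rate(void)
{
    return PCI_CLOCK_HZ * BYTES_PER_WORD / CLOCKS_PER_READ;
}

/* Ideal burst peak in bytes per second: one word every clock. */
double burst_peak_rate(void)
{
    return PCI_CLOCK_HZ * BYTES_PER_WORD;
}
```

For the 1k block, single_read_clocks(1024) gives 3584 clocks, roughly 109 microseconds at the assumed clock. single_read_rate() works out to about 9.4 MB/s against an ideal burst peak of 132 MB/s, a factor of 14.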

I did hear of some special burst instructions being added to x86 in the past but I assume that they are for main memory and won’t impact on PCI.

One way that might speed things up is to change the MMU mode for that address space to cacheable and explicitly invalidate the range every time you want to read new data. The CPU’s data cache should then prefetch and buffer with burst transactions as you read from the block.

If that doesn’t make a difference then, yeah, upgrade to DMA. :)