Resource manager eats too much processor

The best example I can think of is a video card. The processor can update video memory directly, just like any other memory, except for the caching issue. At the same time, the video hardware reads that memory and turns it into some kind of video signal, e.g. RGB or composite.

Basically, it is memory that two different pieces of hardware can read and write at the same time.
In some sense, all RAM has this property, in that both the CPU and DMA processors can access it
at the same time, so it might be wiser to restrict the term to memory that lives on a peripheral.

How this is accomplished is a hardware issue, and quite beyond my experience, although to make it
clear what a dinosaur I am, I’ll mention that the original IBM PC had a color “CGA” adapter with
dual-ported memory that was slightly broken. If you accessed the memory from the CPU while the video
hardware was also reading it, you would get white flecks of “snow” on the screen. The proper
procedure was to poll an I/O port until a bit indicated that the video gun was in either horizontal
or vertical retrace. You could squeeze one byte in during a horizontal retrace, and about 240 during
a vertical one. This problem was fixed on some non-IBM CGA-compatible adapters, and IBM
fixed it in their next product, the EGA adapter.
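
For anyone curious, the polling loop looked roughly like this. It is only a sketch of the technique described above: port 0x3DA and its status bit come from the CGA documentation, and inb() stands in for whatever port-read primitive your compiler or OS actually provides.

```c
/* Sketch of the "snow-free" CGA write described above. Port 0x3DA is the CGA
 * status register; bit 0 is set while the display is in a retrace interval.
 * inb() is an assumed helper, not a standard function. */
#include <stdint.h>

#define CGA_STATUS 0x3DA
#define IN_RETRACE 0x01

extern uint8_t inb(uint16_t port);   /* assumed: read one byte from an I/O port */

void cga_poke(volatile uint8_t *video, uint8_t value)
{
    while (inb(CGA_STATUS) & IN_RETRACE)
        ;                            /* let any retrace already in progress end */
    while (!(inb(CGA_STATUS) & IN_RETRACE))
        ;                            /* wait for the next retrace to begin */
    *video = value;                  /* squeeze one byte in per retrace */
}
```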

I believe that with modern hardware, if the memory is modified via DMA, the CPU will flush the cache automatically. I think it’s called bus snooping. It’s part of the mechanism that keeps CPU caches coherent in a multi-core/multi-CPU environment.

I believe you are right Mario, but of course DMA channels come with the processor and motherboard, whereas many add-on devices, such as video cards, do not. On the other hand, AGP video cards that use system RAM may not have this problem.

Maschoen’s example with CGA is correct, but the one with DMA is not. The CPU and the DMA processor cannot access memory at the same time. Specifically (to explain in the old terminology of the 8086 and such, which is easy to understand), the DMA processor asserts the HOLD signal (or BREQ, “bus request”, if you prefer the Motorola flavour), the CPU turns all of its bus signals to the hi-Z state and responds with HLDA. As soon as the DMA processor receives that “hold acknowledge”, it starts its operation on memory. This is still true for today’s hardware, with the exception that a modern CPU may continue executing out of its cache until it needs access to memory.
And yes, modern hardware will flush the cache automatically, so the DMA buffer need not be uncachable. However, I can’t understand the observed performance degradation coming just from that NO_CACHE flag.
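
For concreteness, here is roughly how the two allocations might look, assuming the NO_CACHE flag in question corresponds to PROT_NOCACHE in QNX Neutrino’s mmap(); treat this as a sketch rather than the actual code from the resource manager being discussed.

```c
/* Minimal sketch: MAP_PHYS | MAP_ANON asks QNX for physically contiguous
 * memory suitable for handing to a bus-mastering device, and PROT_NOCACHE
 * makes the mapping uncached. Whether PROT_NOCACHE is really needed depends
 * on whether bus snooping keeps the caches coherent on the platform. */
#include <stddef.h>
#include <sys/mman.h>

void *alloc_dma_buffer(size_t size, int uncached)
{
    int prot = PROT_READ | PROT_WRITE;

    if (uncached)
        prot |= PROT_NOCACHE;        /* every CPU access goes straight to RAM */

    void *buf = mmap(0, size, prot, MAP_PHYS | MAP_ANON, NOFD, 0);
    return (buf == MAP_FAILED) ? NULL : buf;
}
```

Timing a simple copy loop over both kinds of buffer should show how much of the observed degradation really comes from the uncached mapping.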

I don’t think that a DMA operation disables CPU access to the RAM under all circumstances. Chipsets/memory controllers are getting quite intelligent, but I’m just guesstimating here ;-)

Mario, any chunk of RAM (almost) anywhere in the address space can be declared a DMA buffer. Does that mean all the RAM in a modern PC is (expensive) dual-port memory? Actually, in a neighbouring thread there is a discussion of system time - you might try that example with and without some DMA-active hardware :-)
What I wrote was completely true (i.e. no CPU activity during DMA transactions) in the classical PC/AT architecture. It is still true for some modern “rugged and bullet proof” computers. Because the CPU idles during DMA transactions, DMA controllers usually have a few modes of operation, which differ mostly in how many bytes can be transferred in one DMA transaction. I.e. normally, if you care about the real-time capabilities of your system, you should not program the DMA controller to send the DMA buffer in big chunks. And anyway, DMA affects performance much less than using the CPU for receiving/sending data, because the CPU doesn’t need to switch context or anything while granting the bus to the DMA controller. Having fewer IRQs is better, and with DMA you just have to switch the buffer for the DMA controller in the ISR; the data are already in memory.
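
A rough sketch of that ISR-side buffer switch is below; dma_set_buffer() is a placeholder for whatever programming interface your DMA controller actually has, not a real driver API.

```c
/* Double-buffering sketch: the ISR only flips which buffer the controller
 * fills next; the completed buffer is already sitting in memory, so the CPU
 * never copies data in byte by byte. */
#include <stdint.h>

#define DMA_CHUNK 512

static uint8_t buffers[2][DMA_CHUNK];
static volatile int filling = 0;          /* buffer the controller is writing now */

extern void dma_set_buffer(void *buf, unsigned len);   /* assumed hardware hook */

/* Called when the controller raises its "transfer complete" interrupt. */
void dma_isr(void)
{
    int done = filling;                   /* this buffer just filled up */
    filling ^= 1;                         /* hand the other one to the controller */
    dma_set_buffer(buffers[filling], DMA_CHUNK);

    /* buffers[done] can now be passed to a worker thread for processing. */
    (void)done;
}
```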

I don’t know this for sure, but I imagine that any “dual-ported” memory must arbitrate access. If access is cycled back and forth, then unlike DMA, there is no cycle stealing. ed1k is quite right about DMA (at least the original technology) stealing bus access from the CPU, and at least potentially slowing things down. This is of course mitigated by CPU caches and optimized code.

You are right, indeed it’s not dual-port memory, but there is the concept of banks; there is also the NUMA architecture (which all AMD systems use).