I/O vs Memory Access

I have a PCI device whose registers can be accessed both through I/O
(inpd(), outpd()) and through memory (shm_open(), mmap()) reads and writes.
What is the practical difference between those two methods?
Thank you in advance!

Akis <romeoita@yahoo.com> wrote:

I have a PCI device whose registers can be accessed both through I/O
(inpd(), outpd()) and through memory (shm_open(), mmap()) reads and writes.
What is the practical difference between those two methods?

in/out are x86-only instructions, and access to them is enabled by
ThreadCtl(_NTO_TCTL_IO, 0); whereas memory-mapped registers are, as you
noted, accessed via mmap().

There are going to be caching issues with the mmap() approach, but you can
turn caching off for the mapping. Also, mmap()'d regions can be marked
“read only”, whereas I/O ports are enabled bidirectionally when you do the
ThreadCtl() above.

Mapping memory happens in page-sized chunks (e.g., 4k on x86).
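
To illustrate the page-granularity point, here is a minimal sketch in plain C of how one might work out the page-aligned window to map for a register block that does not start on a page boundary. The type map_window, the function page_window() and the example addresses are hypothetical names invented for this illustration; nothing here is a QNX API, and the actual mapping call is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the page-aligned window that has to be mapped
 * in order to reach a register block starting at an arbitrary address. */
typedef struct {
    uint64_t map_base;   /* physical address rounded down to a page boundary */
    size_t   map_len;    /* length rounded up to a whole number of pages     */
    size_t   reg_offset; /* offset of the first register inside the mapping  */
} map_window;

/* page_size must be a power of two (4096 on x86 with 4k pages). */
static map_window page_window(uint64_t reg_phys, size_t reg_len,
                              size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(reg_len > 0);

    map_window w;
    uint64_t mask = (uint64_t)page_size - 1;

    w.map_base   = reg_phys & ~mask;          /* round the base down */
    w.reg_offset = (size_t)(reg_phys & mask); /* where the registers sit */
    /* round (offset + length) up to the next multiple of page_size */
    w.map_len    = (w.reg_offset + reg_len + page_size - 1)
                   & ~(page_size - 1);
    return w;
}
```

With 4k pages, a 0x20-byte register block at the made-up address 0xFEB01010 yields a base of 0xFEB01000, an offset of 0x10 and a length of one page. The same block at 0xFEB01FF0 straddles a page boundary and needs two pages.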

What kinds of differences were you looking for?

Cheers,
-RK


Robert Krten, PARSE Software Devices +1 613 599 8316.
Realtime Systems Architecture, Books, Video-based and Instructor-led
Training and Consulting at www.parse.com.
Email my initials at parse dot com.

Memory-mapped I/O should be much more efficient. Is that what you’re looking
for?


Bill Caroselli – 1(626) 824-7983
Q-TPS Consulting
QTPS@EarthLink.net



Why do they have to be more efficient?



DMitri <ivdal@yahoo.com> wrote:

Why do they have to be more efficient?

I’d say “may be more efficient” rather than “should”. It depends on the hardware.
However, if the cache and the phase of the moon cooperate :-) you could
conceivably read memory at DMA speeds, and if the hardware and Jupiter are in
the wrong state then I/O could be much slower, even when using the x86 “rep”
prefix or the ins/outs instructions…

It really depends.

Cheers,
-RK


“Robert Krten” <nospam90@parse.com> wrote in message
news:a40mi8$ks9$1@inn.qnx.com

DMitri <ivdal@yahoo.com> wrote:
Why do they have to be more efficient?

I’d say “may be more efficient” rather than “should”. It depends on the
hardware.

Well, “should” if both the hardware and software are designed properly.

Short story. Long ago, when I worked for a former employer, we contracted
with a San Francisco firm to design and build a multi-channel PCI audio
card. I won’t give that firm’s name because they still claim to be PCI
experts AND have a very aggressive legal department.

They designed and built a working prototype of this card and then gave me
the specs to program for it. For each 32-bit sample to read or write I had
to have a loop that looked something like this:

  1. write a 16-bit command to get status
  2. read the 16-bit status
  3. check bits in the status to determine whether the buffer was capable of
    transferring the next 32-bit sample
  4. if not ready, loop back to step #1
  5. write a 16-bit command to get/put the sample
  6. read/write the 32-bit sample
  7. write a 16-bit command to say I’m done transferring the sample
  8. loop back to step #1.
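
For readers who want to see the cost of that protocol, the steps above can be sketched in C against a simulated card. This is only an illustration under stated assumptions: the command codes, the status bit, the sim_card type and the card_* functions are all made up (the real card's register layout is not given in this thread), only the read direction is shown, and the simulation just counts bus accesses instead of touching hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All command codes and status bits are invented for illustration. */
enum { CMD_GET_STATUS = 1, CMD_GET_SAMPLE = 2, CMD_DONE = 3 };
enum { STATUS_READY = 0x0001 };

/* Simulated card: stands in for real register accesses and counts them. */
typedef struct {
    uint32_t      next_sample;  /* value the fake card hands out next       */
    unsigned      busy_polls;   /* "not ready" answers before each sample   */
    unsigned      polls_left;   /* "not ready" answers still to be given    */
    unsigned long bus_accesses; /* every register read or write so far      */
} sim_card;

static void card_write_cmd(sim_card *c, uint16_t cmd)
{
    c->bus_accesses++;
    if (cmd == CMD_DONE)
        c->polls_left = c->busy_polls;  /* card goes busy again */
}

static uint16_t card_read_status(sim_card *c)
{
    c->bus_accesses++;
    if (c->polls_left > 0) {
        c->polls_left--;
        return 0;                       /* not ready yet */
    }
    return STATUS_READY;
}

static uint32_t card_read_sample(sim_card *c)
{
    c->bus_accesses++;
    return c->next_sample++;
}

/* Read `count` samples using the eight-step polling protocol. */
static void read_samples(sim_card *c, uint32_t *buf, size_t count)
{
    assert(c != NULL && buf != NULL);
    for (size_t i = 0; i < count; i++) {
        uint16_t status;
        do {
            card_write_cmd(c, CMD_GET_STATUS);  /* step 1 */
            status = card_read_status(c);       /* step 2 */
        } while (!(status & STATUS_READY));     /* steps 3 and 4 */
        card_write_cmd(c, CMD_GET_SAMPLE);      /* step 5 */
        buf[i] = card_read_sample(c);           /* step 6 */
        card_write_cmd(c, CMD_DONE);            /* step 7 */
    }                                           /* step 8: next sample */
}
```

Even when the simulated card is always ready, each 32-bit sample costs five bus accesses, and every "not ready" poll adds two more. That per-sample overhead is the point of the story, whichever access method the bus uses.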

So why didn’t the card use interrupts? It did! It generated an interrupt
every 1 ms (I think) regardless of the state (or state change) of the
device.

We paid them $400,000 up front to develop this device. My accusation was
that they developed an ISA device that happened to sit on the PCI bus. We
wanted a PCI device because we needed PCI throughput. Needless to say, a
very nasty lawsuit resulted.

I am not a hardware engineer. But when we contracted for the next PCI audio
card to be developed by someone else, we were VERY SPECIFIC about how we
wanted the card to behave. The result was 17 stereo channels (34 mono
channels) of concurrent PCM-16 audio per CPU box, via 3 x 8-channel cards.


Bill Caroselli – 1(626) 824-7983
Q-TPS Consulting
QTPS@EarthLink.net