Huuuge delay with memory mapped hw registers (PPC)

Hi all,

we have mapped a few hw registers of a PPC device using the NOCACHE
option. When the ISR sets a bit to 1 in such a mapped register and
triggers then an interrupt event for the interrupt thread … we
have to do an active wait in the interrupt thread until we see the
expected status of that bit!

What are the reasons for such an impossible behavior ??
Writing to a mapped register can’t take 20-30us …

Is the ‘sync pipeline’ instruction included in the QNX ISR ??
Is there a bug in mmap … ignoring the NOCACHE option??

Armin

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

Hi all,

we have mapped a few hw registers of a PPC device using the NOCACHE
option. When the ISR sets a bit to 1 in such a mapped register and
triggers then an interrupt event for the interrupt thread … we
have to do an active wait in the interrupt thread until we see the
expected status of that bit!

What are the reasons for such an impossible behavior ??
Writing to a mapped register can’t take 20-30us …

Is the ‘sync pipeline’ instruction included in the QNX ISR ??
Is there a bug in mmap … ignoring the NOCACHE option??

If you take a quick look at the /usr/nto/include/sys/ppc/inout.h,
you will see that the in*/out* family of functions always seem to
issue an eieio() when accessing the memory mapped registers. You
might need to do so as well.

-David

QNX Training Services
dagibbs@qnx.com

David Gibbs wrote:

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

Hi all,

we have mapped a few hw registers of a PPC device using the NOCACHE
option. When the ISR sets a bit to 1 in such a mapped register and
triggers then an interrupt event for the interrupt thread … we
have to do an active wait in the interrupt thread until we see the
expected status of that bit!

What are the reasons for such an impossible behavior ??
Writing to a mapped register can’t take 20-30us …

Is the ‘sync pipeline’ instruction included in the QNX ISR ??
Is there a bug in mmap … ignoring the NOCACHE option??

If you take a quick look at the /usr/nto/include/sys/ppc/inout.h,
you will see that the in*/out* family of functions always seem to
issue an eieio() when accessing the memory mapped registers. You
might need to do so as well.

Thanks, it’s now working with the correct timing behavior :-)

The remaining problem is that there are no hints in the
documentation how mapped hardware registers (or IO addresses) must
be handled if you deal with a PPC.

I’m sure nobody would get the idea to use inp/outp calls for reading
and writing to memory mapped device registers. It’s also not very
intuitive to submit an “eieio” instruction after reading an IO
address … like

status = *INTMASK;
eieio();
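
A minimal sketch of the pattern above, wrapped in helpers so the barrier cannot be forgotten. All names here (`io_barrier()`, `reg_read32()`, `reg_write32()`, `fake_intmask`, `set_status_bit()`) are hypothetical illustration names, not QNX APIs. The barrier expands to the PowerPC `eieio` instruction only when the compiler predefines a PowerPC macro (an assumption about a GCC-style toolchain); elsewhere it is just a compiler barrier so the sketch still builds. An ordinary variable stands in for the mapped register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical barrier helper: real eieio on PowerPC, compiler barrier elsewhere. */
#if defined(__PPC__) || defined(__powerpc__)
#define io_barrier() __asm__ __volatile__("eieio" ::: "memory")
#else
#define io_barrier() __asm__ __volatile__("" ::: "memory")
#endif

/* Read a 32-bit register, then order the read against later accesses. */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
    assert(reg != NULL);
    uint32_t value = *reg;
    io_barrier();
    return value;
}

/* Write a 32-bit register, then order the write against later accesses. */
static inline void reg_write32(volatile uint32_t *reg, uint32_t value)
{
    assert(reg != NULL);
    *reg = value;
    io_barrier();
}

/* Stand-in for a mapped device register (assumption for illustration). */
static volatile uint32_t fake_intmask;

/* Set one bit, as the ISR in this thread does, and return the value read back. */
static inline uint32_t set_status_bit(unsigned bit)
{
    uint32_t current = reg_read32(&fake_intmask);
    reg_write32(&fake_intmask, current | (UINT32_C(1) << bit));
    return reg_read32(&fake_intmask);
}
```

In a real driver the pointer would come from the mapping call rather than from a static variable; the helpers themselves would stay the same.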

The docs for the inp/outp calls refer only to hw/inout.h … there are no
hints about the PPC or the PPC-specific IO calls included in
/usr/include/ppc.

IMHO … at least one, or better two, big ‘pointing hands’ must be
included in the resource manager library. Regarding the
resource manager library … there are also a lot of undocumented
library calls like:

int resmgr_open_bind(…);
int resmgr_msg_again(…);
void *_resmgr_ocb(…);
const resmgr_io_funcs_t *_resmgr_iofuncs(…);
int _resmgr_handle_grow(…);
void *_resmgr_handle(…);
int _resmgr_unbind(…);
int _resmgr_pathname_attach(…);
int _resmgr_pathname_detach(…);
int _resmgr_start(…);
int _resmgr_pulse_attach(…);
int _resmgr_pulse_detach(int code);

You will also find a lot of other calls mentioned in the
include files, e.g.:

int procmgr_session(…);

int InterruptHookTrace(…);
int InterruptHookIdle(void (…);

int SyncCreate(…);

int __Ring0(…);

int NetCred(…);
int NetVtid(…);
int NetUnblock(…);
int NetInfoscoid(…);

int TraceEvent(…);

It would be nice, and would save much time, to see first-class
documentation … just as compensation for the non-existent ‘open
sources’ :-)

Armin

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

[Interesting PPC notes that people should be aware of snipped ]

IMHO … at least one, or better two, big ‘pointing hands’ must be
included in the resource manager library. Regarding the
resource manager library … there are also a lot of undocumented
library calls like:

[A number of function calls recently “exposed” by QNX’s
opening of some of its sources to the rest of the world snipped ]

It would be nice, and would save much time, to see first-class
documentation … just as compensation for the non-existent ‘open
sources’ :-)

Absolutely! Shall we sign you up as a technical writer =;-). You
could produce one of those “Secrets of Neutrino Revealed” books
and be a hero for the open-source community!

On a more serious note, as we (QNX) open up more and more of our
sources you are guaranteed to find interesting functions that seem
to provide some neato kind of functionality. Yes, there are times
when the technical writers are lagging behind the developers,
even times when they get ahead of us, but for the most part any
function that is not documented has been left that way for a reason.
Possible reasons may include:

  • It is a deprecated API
  • It is a non-standard function
  • It is an incomplete/growing function
  • It is a temporary function to get functionality now but
    there are plans to supersede it later (i.e. a crutch)
  • Documenting it would entail much more work than the benefit
    provided to the user, compared to documenting another function.

Of these the last few are more than likely the reasons for lack
of documentation for functions that people will discover in the
QNX sources. Once something is documented, we are locked into
providing that behaviour (and compatibility) for the remaining
life of QNX Neutrino. Considering that Neutrino is still fairly
young we may find better or more flexible ways to take care of
some internal implementation details, and want to pass those
benefits on to our developers. The resource manager calls
will be documented someday, I have no doubt. Right now, however,
most of them are considered to be “internal library”
functions, even though we export their symbols globally.

Trust us a bit … we do have the best of intentions.

Thomas

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

David Gibbs wrote:

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

Hi all,

we have mapped a few hw registers of a PPC device using the NOCACHE
option. When the ISR sets a bit to 1 in such a mapped register and
triggers then an interrupt event for the interrupt thread … we
have to do an active wait in the interrupt thread until we see the
expected status of that bit!

What are the reasons for such an impossible behavior ??
Writing to a mapped register can’t take 20-30us …

Is the ‘sync pipeline’ instruction included in the QNX ISR ??
Is there a bug in mmap … ignoring the NOCACHE option??

If you take a quick look at the /usr/nto/include/sys/ppc/inout.h,
you will see that the in*/out* family of functions always seem to
issue an eieio() when accessing the memory mapped registers. You
might need to do so as well.

Thanks, it’s now working with the correct timing behavior :-)

The remaining problem is that there are no hints in the
documentation how mapped hardware registers (or IO addresses) must
be handled if you deal with a PPC.

I had thought that the in8/in16/in32 functions, combined with
mmap_device_io(), were intended for accessing hardware registers – ones
that would be mmap()ed on a PPC or ioports on an x86.

I’m sure nobody would get the idea to use inp/outp calls for reading
and writing to memory mapped device registers. It’s also not very
intuitive to submit an “eieio” instruction after reading an IO
address … like

status = *INTMASK;
eieio();

I think that may be an oddity of the PPC. If you came from a PPC
background – the “special” oddities of the x86 memory map would
probably be non-intuitive as well. Should we document them as an
OS company? I don’t know – we do try to provide a generic abstraction
for accessing control registers on all the platforms – the in*/out*
family of functions.

The docs for the inp/outp calls refer only to hw/inout.h … there are no
hints about the PPC or the PPC-specific IO calls included in
/usr/include/ppc.

True, and if you look at hw/inout.h it does:

#if defined(__X86__)
#ifndef _X86_INOUT_INCLUDED
#include <x86/inout.h>
#endif
#elif defined(__PPC__)
#ifndef _PPC_INOUT_INCLUDED
#include <ppc/inout.h>
#endif
#elif defined(__MIPS__)
#ifndef _MIPS_INOUT_INCLUDED
#include <mips/inout.h>
#endif

(And it actually has no definition or prototype for any of the in*()
functions.)

That is, it pretty clearly pulls in the header file specific to the
platform you are compiling for to get those inline macros. In the docs we
document the platform-independent abstraction that you include and call.
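
The idea of one platform-neutral call with the platform details chosen at compile time can be sketched roughly like this. It is only an illustration of the pattern: `ARCH_NAME`, `arch_io_sync()`, `my_in8()` and `my_out8()` are made-up names, the macro tests assume a GCC-style compiler, and none of it is the actual content of the QNX headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time selection, in the spirit of hw/inout.h. The macro tests and
   names below are assumptions for illustration only. */
#if defined(__PPC__) || defined(__powerpc__)
#define ARCH_NAME "ppc"
#define arch_io_sync() __asm__ __volatile__("eieio" ::: "memory")
#else
#define ARCH_NAME "other"
#define arch_io_sync() __asm__ __volatile__("" ::: "memory") /* compiler barrier only */
#endif

/* Hypothetical platform-neutral accessors; callers never see the #if above. */
static inline uint8_t my_in8(const volatile uint8_t *addr)
{
    assert(addr != NULL);
    uint8_t value = *addr;
    arch_io_sync();
    return value;
}

static inline void my_out8(volatile uint8_t *addr, uint8_t value)
{
    assert(addr != NULL);
    *addr = value;
    arch_io_sync();
}
```

Driver code written against `my_in8()`/`my_out8()` would then compile unchanged on either branch, which is the point of hiding the barrier behind the accessor.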

-David

QNX Training Services
dagibbs@qnx.com

David Gibbs wrote:

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

David Gibbs wrote:

[ clip … ]

I had thought that the in8/in16/in32 functions, combined with
mmap_device_io(), were intended for accessing hardware registers – ones
that would be mmap()ed on a PPC or ioports on an x86.

in8/in16/in32 functions are primarily designed for accessing
hardware resources located in the IO address space … at least
for the Intel platform.

IMHO … using these functions for memory mapped hardware registers
is a weird idea to me. So a pointer in the docs would
really be very helpful …

I’m sure nobody would get the idea to use inp/outp calls for reading
and writing to memory mapped device registers. It’s also not very
intuitive to submit an “eieio” instruction after reading an IO
address … like

status = *INTMASK;
eieio();

I think that may be an oddity of the PPC. If you came from a PPC
background – the “special” oddities of the x86 memory map would
probably be non-intuitive as well. Should we document them as an
OS company?

Yes … because I believe that these oddities are backed up by
PPC-specific code generators.

Armin

Thomas Fletcher wrote:

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

[Interesting PPC notes that people should be aware of snipped ]

IMHO … at least one, or better two, big ‘pointing hands’ must be
included in the resource manager library. Regarding the
resource manager library … there are also a lot of undocumented
library calls like:

[A number of function calls recently “exposed” by QNX’s
opening of some of its sources to the rest of the world snipped ]

It would be nice, and would save much time, to see first-class
documentation … just as compensation for the non-existent ‘open
sources’ :-)

Absolutely! Shall we sign you up as a technical writer =;-).

Hmm… if I didn’t have to spend so much time because of missing
documentation I could think about it… but unfortunately writing
isn’t one of my preferred skills ;-)

You
could produce one of those “Secrets of Neutrino Revealed” books
and be a hero for the open-source community!

I’m not an ‘open-source’ freak :-) … although I have published
ported open-source apps and will go on publishing them; it’s just for
FUN!

BTW, do you think QSSL is really interested in so-called ‘competing
books’ :-) ??

On a more serious note, as we (QNX) open up more and more of our
sources you are guaranteed to find interesting functions that seem
to provide some neato kind of functionality.

These sources are only more or less useful as long as they include
undocumented calls …

Yes there are times
where the technical writers are lagging behind the developers,
even times when they get ahead of us, but for the most part any
function that is not documented has been left that way for a reason.
Possible reasons may include:

  • It is a deprecated API
  • It is a non-standard function
  • It is an incomplete/growing function
  • It is a temporary function to get functionality now but
    there are plans to supersede it later (i.e. a crutch)
  • Documenting it would entail much more work than the benefit
    provided to the user, compared to documenting another function.

Of these the last few are more than likely the reasons for lack
of documentation for functions that people will discover in the
QNX sources. Once something is documented, we are locked into
providing that behaviour (and compatibility) for the remaining
life of QNX Neutrino. Considering that Neutrino is still fairly
young we may find better or more flexible ways to take care of
some internal implementation details, and want to pass those
benefits on to our developers.

It would be OK if these developers are only active in the OS
development to speed up completing the system …

The resource manager calls
will be documented someday, I have no doubt. Right now, however,
most of them are considered as being “internal library”
functions, even though we export their symbols globally.

Trust us a bit … we do have the best of intentions.

Take care about the ‘open-source’ community, they have the best
intentions, too … you are playing with fire.

Armin


Thomas

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

I had thought that the in8/in16/in32 functions, combined with
mmap_device_io(), were intended for accessing hardware registers – ones
that would be mmap()ed on a PPC or ioports on an x86.

in8/in16/in32 functions are primarily designed for accessing
hardware resources located in the IO address space … at least
for the Intel platform.

IMHO … using these functions for memory mapped hardware registers
is a weird idea to me. So a pointer in the docs would
really be very helpful …

Some of the non-x86 platforms actually support I/O cycles through
various oddball mechanisms.

I think they were forced into it because of the stupid way most
video cards have to be initialized… you need to do at least a few
I/O operations before you can get the card to the point where only
memory mapped accesses are needed.

pete@qnx.com wrote:

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

I had thought that the in8/in16/in32 functions, combined with
mmap_device_io(), were intended for accessing hardware registers – ones
that would be mmap()ed on a PPC or ioports on an x86.

in8/in16/in32 functions are primarily designed for accessing
hardware resources located in the IO address space … at least
for the Intel platform.

IMHO … using these functions for memory mapped hardware registers
is a weird idea to me. So a pointer in the docs would
really be very helpful …

It would also be very useful if the article
“Talking to hardware under QNX Neutrino”
could be corrected:


  1. Memory-mapped resources

For some devices, registers are accessed via regular memory
operations. To gain access to a device’s registers, you need to map
them to a pointer in the driver’s virtual address space. This can be
done by calling mmap_device_memory().

volatile uint32_t *regbase; /* device has 32-bit registers */

regbase = mmap_device_memory(NULL, info.BaseAddressSize[0],
PROT_READ|PROT_WRITE|PROT_NOCACHE,0,
info.CpuBaseAddress[0]);
[ clip … ]

Now you may access the device’s memory using the regbase pointer.
For example:

regbase[SHUTDOWN_REGISTER] = 0xdeadbeef;

But this is only half the truth if you deal with a PPC:

—> an eieio() must be added after each use of regbase[] !!

Please check whether the semantics of the PROT_NOCACHE option are
correctly explained … it seems to promise to do the job of eieio()
??

Armin


Some of the non-x86 platforms actually support I/O cycles through
various oddball mechanisms.

I think they were forced into it because of the stupid way most
video cards have to be initialized… you need to do at least a few
I/O operations before you can get the card to the point where only
memory mapped accesses are needed.

Armin Steinhoff <A-Steinhoff@web_.de> wrote:

It would also be very useful if the article
“Talking to hardware under QNX Neutrino”
could be corrected:

But this is only half the truth if you deal with a PPC:

—> an eieio() must be added after each use of regbase[] !!

Please check whether the semantics of the PROT_NOCACHE option are
correctly explained … it seems to promise to do the job of eieio()
??

PROT_NOCACHE is correctly explained. I have submitted a docs bug
report to add a pointing finger to the mmap_device_memory section
so that readers will know that they may have to take additional
measures to ensure correct operation on various platforms.

FYI - on PPC 800 variants, even though you turn caching off, the
processor may still execute writes to memory out of order.

If PROT_NOCACHE is not specified, then writes to memory may sit
in the cache forever and never even actually make it to the memory
you’re trying to talk to. With PROT_NOCACHE, the writes are
guaranteed to go straight to the intended memory area, but not
necessarily in the order you specified.

eieio() ensures that these writes do occur in the order specified.
The mnemonic stands for “Enforce In-order Execution of I/O”. To use it,
you should execute the instruction once before you start touching the
device, then issue another one after each memory mapped access.

If you know that you’re doing a sequence that the hardware
doesn’t need to have done in a specific order, you don’t need
to use eieio() for that sequence.
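
The ordering rule above can be sketched as follows, assuming a hypothetical device with an index register that must be written before its data register, plus a few independent scratch registers. The struct layout, the function names and the `io_barrier()` macro are all illustration-only assumptions; the barrier becomes a real `eieio` only when compiled for PowerPC with a GCC-style compiler, and a plain struct stands in for the mapped register block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__PPC__) || defined(__powerpc__)
#define io_barrier() __asm__ __volatile__("eieio" ::: "memory")
#else
#define io_barrier() __asm__ __volatile__("" ::: "memory") /* compiler barrier only */
#endif

/* Hypothetical register layout; a real driver would get this pointer from
   mmap_device_memory() with PROT_NOCACHE. */
struct fake_dev {
    volatile uint32_t index;      /* selects an internal register */
    volatile uint32_t data;       /* value for the selected register */
    volatile uint32_t scratch[4]; /* independent registers, order irrelevant */
};

/* Order matters here: the device must see the index before the data. */
static inline void dev_write_indexed(struct fake_dev *dev, uint32_t idx, uint32_t val)
{
    assert(dev != NULL);
    io_barrier();      /* order against whatever was accessed before */
    dev->index = idx;
    io_barrier();      /* index is ordered before data */
    dev->data = val;
    io_barrier();      /* data is ordered before whatever follows */
}

/* Order among these writes does not matter, so one barrier at the end is enough. */
static inline void dev_fill_scratch(struct fake_dev *dev, uint32_t val)
{
    assert(dev != NULL);
    for (size_t i = 0; i < 4; i++) {
        dev->scratch[i] = val;
    }
    io_barrier();
}
```

Whether a given sequence really is order-independent depends on the hardware, so the second function is only safe for registers that the device's documentation describes as independent.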