SH4 Solution Engine cache

Is there a way to turn the SH4 775x CPU cache on and off
once the OS has booted?
Is it possible or will the OS not like this?
Thanks

Using Opera’s revolutionary e-mail client: http://www.opera.com/m2/

On Wed, 22 Sep 2004 13:09:24 +0200, Alex/Systems 104
<acellarius@yah0o.lsd.com> wrote:

Is there a way to turn the SH4 775x CPU cache on and off
once the OS has booted?
Is it possible or will the OS not like this?
Thanks

I guess I always ask the questions that have no answers…



I’d say give it a try. It’ll either go splat or not. The CPU
programming manual should have instructions on how.

This function may help too: ThreadCtl(_NTO_TCTL_IO, 0)
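A minimal sketch of how that call is typically wrapped, assuming a QNX Neutrino build (detected here via the `__QNXNTO__` macro). The helper name `request_io_privileges` is hypothetical. Whether I/O privileges alone are enough to touch the SH4 cache control registers is not established in this thread.

```c
#include <stdio.h>
#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* ThreadCtl(), _NTO_TCTL_IO */
#endif

/* Hypothetical helper: ask the kernel for I/O privileges for the calling
 * thread. Returns 0 on success, -1 on failure or when not built for
 * QNX Neutrino. On QNX the caller normally has to run as root. */
int request_io_privileges(void)
{
#ifdef __QNXNTO__
    if (ThreadCtl(_NTO_TCTL_IO, 0) == -1) {
        perror("ThreadCtl(_NTO_TCTL_IO)");
        return -1;
    }
    return 0;
#else
    /* Not a QNX build: nothing to request. */
    return -1;
#endif
}
```

Checking the return value matters here: continuing without the privileges and then executing a privileged instruction would fault rather than fail gracefully.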

On Fri, 24 Sep 2004 07:43:21 +1200, Evan Hillas <blarg@blarg.blarg> wrote:

I’d say give it a try. It’ll either go splat or not. The CPU
programming manual should have instructions on how.

This function may help too: ThreadCtl(_NTO_TCTL_IO, 0)

It hangs the CPU after a while, presumably when it tries to
access something in the cache.

The SH4 CPU manual shows some instruction sequences,
but it would appear Proc’s help will be needed.



On Mon, 27 Sep 2004 19:16:04 +0200, Alex/Systems 104
<acellarius@yah0o.lsd.com> wrote:

It hangs the CPU after a while, presumably when it tries to
access something in the cache.

The SH4 CPU manual shows some instruction sequences,
but it would appear Proc’s help will be needed.

The eCos source shows functions doing this.
The application is one for space (a microsatellite),
and cache memory is highly susceptible to radiation-induced
faults.
So the idea is to normally run with the cache turned off,
and only turn it on when faster processing is required:

FUNC_START(cyg_hal_dcache_enable)
FUNC_START(cyg_hal_dcache_disable)
FUNC_START(cyg_hal_dcache_invalidate_all)
FUNC_START(cyg_hal_dcache_sync)
FUNC_START(cyg_hal_dcache_sync_region)
FUNC_START(cyg_hal_dcache_write_mode)
FUNC_START(cyg_hal_icache_enable)
FUNC_START(cyg_hal_icache_disable)
FUNC_START(cyg_hal_icache_invalidate_all)

code is in variant.S

What will it take to get similar support for SH4
in QNX 6.3/4 or even a custom version?



Anyone from QNX willing & able to comment on this?
Attempts to do this from a user application
fail (probably because at some stage
code execution is attempted out of a now
non-existent cache…)

On Tue, 28 Sep 2004 22:42:51 +0200, Alex/Systems 104
<acellarius@yah0o.lsd.com> wrote:

On Mon, 27 Sep 2004 19:16:04 +0200, Alex/Systems 104
<acellarius@yah0o.lsd.com> wrote:

It hangs the CPU after a while, presumably when it tries to
access something in the cache.

The SH4 CPU manual shows some instruction sequences,
but it would appear Proc’s help will be needed.



The eCos source shows functions doing this.
The application is one for space (a microsatellite),
and cache memory is highly susceptible to radiation-induced
faults.
So the idea is to normally run with the cache turned off,
and only turn it on when faster processing is required:

FUNC_START(cyg_hal_dcache_enable)
FUNC_START(cyg_hal_dcache_disable)
FUNC_START(cyg_hal_dcache_invalidate_all)
FUNC_START(cyg_hal_dcache_sync)
FUNC_START(cyg_hal_dcache_sync_region)
FUNC_START(cyg_hal_dcache_write_mode)
FUNC_START(cyg_hal_icache_enable)
FUNC_START(cyg_hal_icache_disable)
FUNC_START(cyg_hal_icache_invalidate_all)

code is in variant.S

What will it take to get similar support for SH4
in QNX 6.3/4 or even a custom version?



Alex/Systems 104 wrote:

Anyone from QNX willing & able to comment on this?
Attempts to do this from a user application
fail (probably because at some stage
code execution is attempted out of a now
non-existent cache…)

Firstly, you need CPU privilege to do cache operations. Also, to
turn off the cache, you need to flush whatever was in the cache first
before yanking the plug. If you manage to turn off the cache, you can’t
turn it back on without first invalidating whatever is inside it (since you
shouldn’t make assumptions about the cache state). So it’s not quite
the simple ‘on/off’ switch you might think it is. Additionally, IIRC,
the SH variant of the kernel does turn the cache on, so you’ll need a custom
version to keep the cache turned off all the time (your sales rep can
help you with that aspect).
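The ordering described above (flush before disabling, invalidate before enabling) can be illustrated with a toy model. This is only a sketch: a tiny direct-mapped write-back cache simulated in plain C, with every name (`toy_cache`, `toy_disable`, `toy_enable`, and so on) made up for illustration. It is not SH4 register code and not the eCos or QNX implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 8   /* toy backing memory, in words */
#define LINES     4   /* toy direct-mapped cache, one word per line */

typedef struct {
    bool     enabled;
    bool     valid[LINES];
    bool     dirty[LINES];
    unsigned tag[LINES];    /* full address of the cached word (simplified) */
    uint32_t data[LINES];
    uint32_t mem[MEM_WORDS];
} toy_cache;

/* Write one line back to memory if it holds modified data. */
static void write_back_line(toy_cache *c, unsigned i)
{
    if (c->valid[i] && c->dirty[i]) {
        c->mem[c->tag[i]] = c->data[i];
        c->dirty[i] = false;
    }
}

uint32_t toy_read(toy_cache *c, unsigned addr)
{
    assert(addr < MEM_WORDS);
    if (!c->enabled)
        return c->mem[addr];
    unsigned i = addr % LINES;
    if (!(c->valid[i] && c->tag[i] == addr)) {   /* miss: evict and refill */
        write_back_line(c, i);
        c->tag[i]   = addr;
        c->data[i]  = c->mem[addr];
        c->valid[i] = true;
        c->dirty[i] = false;
    }
    return c->data[i];
}

void toy_write(toy_cache *c, unsigned addr, uint32_t value)
{
    assert(addr < MEM_WORDS);
    if (!c->enabled) {
        c->mem[addr] = value;
        return;
    }
    unsigned i = addr % LINES;
    if (!(c->valid[i] && c->tag[i] == addr)) {   /* miss: evict old line */
        write_back_line(c, i);
        c->tag[i]   = addr;
        c->valid[i] = true;
    }
    c->data[i]  = value;
    c->dirty[i] = true;                          /* memory is now stale */
}

/* Flush: push all modified lines out to memory. */
void toy_flush(toy_cache *c)
{
    for (unsigned i = 0; i < LINES; i++)
        write_back_line(c, i);
}

/* Invalidate: forget everything the cache holds. */
void toy_invalidate(toy_cache *c)
{
    for (unsigned i = 0; i < LINES; i++) {
        c->valid[i] = false;
        c->dirty[i] = false;
    }
}

/* Safe ordering: flush first, then switch off. */
void toy_disable(toy_cache *c) { toy_flush(c); c->enabled = false; }

/* Safe ordering: invalidate first, then switch on. */
void toy_enable(toy_cache *c)  { toy_invalidate(c); c->enabled = true; }
```

In this model, skipping the flush in `toy_disable` loses any value that was written while the cache was on, and skipping the invalidate in `toy_enable` lets a later read hit a line that memory has since overwritten. Those are the two failure modes the ordering guards against.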

That said, the cache is on chip. If it’s “highly susceptible to
radiation induced faults”, how are you sure that the rest of what’s on chip
(including the core) isn’t also unstable? Is the SH variant you’re
using hardened for this particular application (which raises the question
of why there is a cache in the first place if it’s so unstable)?

Finally, you mentioned that you would turn the cache on for when “faster
processing is required”. IMHO, if the environment is that harsh, you
can’t turn the cache on for any period of time and expect anything sane
to occur or even recover in any sane manner (since the recovery is
subject to the same issue).


Cheers,
Adam

QNX Software Systems Ltd.
[ amallory@qnx.com ]

With a PC, I always felt limited by the software available.
On Unix, I am limited only by my knowledge.
–Peter J. Schoenster <pschon@baste.magibox.net>

On Wed, 13 Oct 2004 16:41:02 -0400, Adam Mallory <amallory@qnx.com> wrote:

Firstly, you need CPU privilege to do cache operations. Also, to
turn off the cache, you need to flush whatever was in the cache first
before yanking the plug. If you manage to turn off the cache, you can’t
turn it on without first invalidating whatever is inside it (since you
shouldn’t make assumptions about the cache state). So it’s not quite
the simple ‘on/off’ switch you might think it is. Additionally, IIRC,

No, and the eCos headers I mentioned earlier in the thread showed that too.
It is not difficult, but you have to take care of everything,
in the proper order.

the SH variant of the kernel does turn the cache on, so you’ll need a custom
version to keep the cache turned off all the time (your sales rep can
help you with that aspect).

That said, the cache is on chip. If it’s “highly susceptible to
radiation induced faults”, how are you sure that the rest of what’s on chip
(including the core) isn’t also unstable? Is the SH variant you’re
using hardened for this particular application (which raises the question
of why there is a cache in the first place if it’s so unstable)?

Apparently cache RAM is more susceptible.

Finally, you mentioned that you would turn the cache on for when “faster
processing is required”. IMHO, if the environment is that harsh, you
can’t turn the cache on for any period of time and expect anything sane
to occur or even recover in any sane manner (since the recovery is
subject to the same issue).

Hopefully the customer will comment some more on these issues.



Alex/Systems 104 wrote:

Apparently cache RAM is more susceptible.

True - but it’s not the only part. All the different memory
technologies in the core (and other parts perhaps) are susceptible to
SEU. This would include the pipeline, and the register file possibly as
well (both for the main core and the FPU).


Cheers,
Adam


It is true that the cache is not the only part susceptible to upsets but the
configuration of the transistors makes a difference (some memory
technologies are more susceptible than others). The cache occupies a large
area on the die and contains many flip-flops (susceptible to SEUs) so the
chance of an upset occurring there is greater. By disabling the cache it
might be possible to reduce the processor failure rate significantly. There
is radiation shielding of the processor to reduce the number of upsets and
other measures to reduce the impact of the upsets that do occur. It is very
difficult to do accurate simulations of the environment in which the
processor will operate so it is important that we can switch the cache on or
off. It might turn out that it is not necessary at all to switch the cache
off if the number of failures is low enough.
In essence we just want to know if it is possible to switch the cache on or
off while QNX is running by following the correct sequence of commands.
Also would it be possible to start QNX with the cache disabled and then
switch it on afterwards?

Regards
Philip


“Adam Mallory” <amallory@qnx.com> wrote in message
news:cklqqi$fkf$1@inn.qnx.com

Alex/Systems 104 wrote:

Apparently cache RAM is more susceptible.

True - but it’s not the only part. All the different memory technologies
in the core (and other parts perhaps) are susceptible to SEU. This would
include the pipeline, and the register file possibly as well (both for the
main core and the FPU).


Cheers,
Adam


Philip La Grange wrote:

It is true that the cache is not the only part susceptible to upsets but the
configuration of the transistors makes a difference (some memory
technologies are more susceptible than others). The cache occupies a large
area on the die and contains many flip-flops (susceptible to SEUs) so the
chance of an upset occurring there is greater. By disabling the cache it
might be possible to reduce the processor failure rate significantly.

That was always my understanding. I just wanted to make sure it was also
Alex’s (i.e. that the cache isn’t the only part susceptible).

There
is radiation shielding of the processor to reduce the number of upsets and
other measures to reduce the impact of the upsets that do occur. It is very
difficult to do accurate simulations of the environment in which the
processor will operate so it is important that we can switch the cache on or
off. It might turn out that it is not necessary at all to switch the cache
off if the number of failures is low enough.

So when you say ‘switch the cache on or off’ do you mean in terms of
testing trials to see the failure rates on or off? If so, that would go
against what Alex said. Alex mentioned that turning the cache on/off
was for speed advantages at runtime. If through testing, the failure
rates are high enough to warrant turning off the cache, I don’t really
understand how you could turn it on without stability and correctness
issues. Once the cache is on, you can’t stop the OS from switching out
your task (without disabling interrupts) which means that an error could
occur during the execution of the OS, or any other code. Not to mention
that turning on a cold cache and then executing a code path is going to
get you slower than normal execution/data access at first.

Additionally, if you’re in a code path which will benefit from the cache
after the initial cost; interrupts will be off for some seriously long
periods of time.

In essence we just want to know if it is possible to switch the cache on or
off while QNX is running by following the correct sequence of commands.
Also would it be possible to start QNX with the cache disabled and then
switch it on afterwards?

At the moment no. You should speak to your sales rep for custom
engineering options.


Cheers,
Adam


“Adam Mallory” <amallory@qnx.com> wrote in message
news:ckojd7$l2i$1@inn.qnx.com

Philip La Grange wrote:
It is true that the cache is not the only part susceptible to upsets but
the configuration of the transistors makes a difference (some memory
technologies are more susceptible than others). The cache occupies a
large area on the die and contains many flip-flops (susceptible to SEUs)
so the chance of an upset occurring there is greater. By disabling the
cache it might be possible to reduce the processor failure rate
significantly.

That was always my understanding. I just wanted to make sure it was also
Alex’s (i.e. that the cache isn’t the only part susceptible).

There is radiation shielding of the processor to reduce the number of
upsets and other measures to reduce the impact of the upsets that do
occur. It is very difficult to do accurate simulations of the environment
in which the processor will operate so it is important that we can switch
the cache on or off. It might turn out that it is not necessary at all to
switch the cache off if the number of failures is low enough.

So when you say ‘switch the cache on or off’ do you mean in terms of
testing trials to see the failure rates on or off? If so, that would go
against what Alex said. Alex mentioned that turning the cache on/off was
for speed advantages at runtime. If through testing, the failure rates
are high enough to warrant turning off the cache, I don’t really
understand how you could turn it on without stability and correctness
issues. Once the cache is on, you can’t stop the OS from switching out
your task (without disabling interrupts) which means that an error could
occur during the execution of the OS, or any other code.

What exactly do you mean by “stop the OS from switching out your task
(without disabling interrupts)”?

Not to mention that turning on a cold cache and then executing a code
path is going to get you slower than normal execution/data access at
first.

Switching the cache on or off wouldn’t be for testing trials, it would be in
operation when more processing power is required for one specific intensive
task.
Typically the processor will run with its cache off, as most satellite
processors do. If the failure rate with the cache on is low enough, then
execution with the cache on will be an option. The cache will then only be
switched on once for a few minutes every few hours. To put things into
perspective, the failure rates we are talking about are somewhere in the range
of 1 per hour to 1 per week or so with the cache on. If it is worse than
that, the cache will probably never be used.
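As a rough illustration of what those rates mean for a single cache-on window, here is a back-of-the-envelope estimate. It assumes upsets arrive as a Poisson process with a constant rate λ and takes a hypothetical 5-minute window; neither assumption comes from the thread.

```latex
P(\text{at least one upset in } t) = 1 - e^{-\lambda t}

\lambda = 1/\text{hour},\quad t = \tfrac{1}{12}\,\text{hour}:
\qquad P = 1 - e^{-1/12} \approx 0.080

\lambda = 1/\text{week} = \tfrac{1}{168}\,/\text{hour},\quad t = \tfrac{1}{12}\,\text{hour}:
\qquad P = 1 - e^{-1/2016} \approx 0.0005
```

Under these assumptions, the chance of an upset during one window is about 8% at the pessimistic end of the range and about 0.05% at the optimistic end.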


Additionally, if you’re in a code path which will benefit from the cache
after the initial cost; interrupts will be off for some seriously long
periods of time.

What do you mean by “interrupts will be off for some seriously long periods
of time.”?

In essence we just want to know if it is possible to switch the cache on
or off while QNX is running by following the correct sequence of
commands. Also would it be possible to start QNX with the cache disabled
and then switch it on afterwards?

At the moment no. You should speak to your sales rep for custom
engineering options.

Will do that.

Cheers,
Adam


Philip La Grange wrote:

What exactly do you mean by “stop the OS from switching out your task
(without disabling interrupts)”?

If you turn on the cache in your userland application and a context
switch then occurs (due to an interrupt causing a scheduling change), guess
which software is running with the cache on: both the OS and
the target software being brought into context will be running with the
cache on. Evaluation errors or access errors are possible during this
time, and the stability of the OS is brought into question.

The only way to avoid context changes behind your back while the cache
is on, is to disable interrupts during your use of the cache.

Switching the cache on or off wouldn’t be for testing trials, it would be in
operation when more processing power is required for one specific intensive
task.

But that’s the rub - constantly coming into the cache cold and paying
the extra cost of bringing data into the cache (and the stalls etc) and
then flushing it all out before turning it off could very well outweigh
its benefit.
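The trade-off can be put into numbers with a simple break-even estimate. This is a sketch under a deliberately crude linear cost model; the struct, the function names and any figures plugged in are hypothetical, and real warm-up and flush costs would have to be measured on the target.

```c
#include <stdint.h>

/* All figures are in one arbitrary time unit (e.g. microseconds). */
typedef struct {
    uint64_t warmup_cost;        /* extra time lost filling a cold cache */
    uint64_t flush_cost;         /* time to write back before switching off */
    uint64_t uncached_per_item;  /* cost of one work item with the cache off */
    uint64_t cached_per_item;    /* cost of one work item with a warm cache */
} cache_costs;

/* Total time to process `items` work items with the cache left off. */
uint64_t time_uncached(const cache_costs *k, uint64_t items)
{
    return items * k->uncached_per_item;
}

/* Total time when the cache is enabled, used, flushed and disabled. */
uint64_t time_cached(const cache_costs *k, uint64_t items)
{
    return k->warmup_cost + k->flush_cost + items * k->cached_per_item;
}

/* Smallest number of items for which the cached run is strictly faster,
 * or 0 if the cache never pays off under this model. */
uint64_t break_even_items(const cache_costs *k)
{
    if (k->cached_per_item >= k->uncached_per_item)
        return 0;
    uint64_t gain     = k->uncached_per_item - k->cached_per_item;
    uint64_t overhead = k->warmup_cost + k->flush_cost;
    return overhead / gain + 1;
}
```

If the intensive task processes fewer items than `break_even_items` reports, the enable/flush/disable cycle costs more than it saves under this model.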

What do you mean by “interrupts will be off for some seriously long periods
of time.”?

I mean that to have some semblance of sanity, you’ll have to disable
interrupts around the code you enabled the cache on (to avoid an SEU in
code that cannot tolerate it). If the code can take advantage of the
cache (ie. pay the price for a cold cache and benefit from what gets
brought into the cache) then IMHO, it’s probable that you’ll have
interrupts disabled for extended periods of time (a ‘bad thing’ generally).


Cheers,
Adam
