nanospin doc?

The documentation for nanospin says:

“Busy-wait without blocking for a period of time”

And then

“The nanospin function busy-waits for the amount of time specified in when
without blocking the calling thread”

I’m confused about “without blocking”. How can you busy-wait without blocking?

  • Mario

In article <9gvq2r$1r7$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

“The nanospin function busy-waits for the amount of time specified in when
without blocking the calling thread”

I’m confused about “without blocking”. How can you busy-wait without blocking?

Traditionally, there are two ways for a thread to wait for some event
before proceeding. One is called busy-waiting and usually involves
spinning in a tight loop:

while (!eventOccurred)
{
    eventOccurred = PollForEventCompletion();
}

In this case, the CPU is consumed polling for the event result and
spinning in the tight loop until the event occurs. This makes the thread
CPU-bound and is usually a waste of CPU resources.

The other mechanism is called blocking and usually involves waiting on a
kernel routine (synchronization primitive or other).

WaitForEvent();

In this case, the calling thread is suspended by the kernel until the
event occurs. The kernel will probably schedule other threads to execute
until the event occurs, at which time the original thread may get
scheduled.
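
For concreteness, a minimal sketch of the blocking style, here using POSIX
nanosleep() with an arbitrary 2 ms timeout, might look like this:

#include <time.h>

/* Block the calling thread for ~2 ms.  The kernel marks the thread
 * blocked and may run other threads until the timeout expires (rounded
 * up to the resolution of the system timer). */
struct timespec ts = { 0, 2000000 };   /* 0 s, 2,000,000 ns */
nanosleep(&ts, NULL);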

The problem with busy-waiting is that the busy-waiting thread is
consuming the CPU instead of allowing other threads to run, which is
generally considered impolite. The problem with blocking is that you’re
succumbing to the scheduler.

This is particularly problematic when the event you’re waiting for is a
timer. Consider the case where I want to wait for 800ns. This may be
necessary because I am writing a driver which needs to wait 800ns
between successive writes to a given device. If I block on a timer, I
will probably sleep for at least 1ms because that is the resolution of
most common timers. Even with high resolution timers, I’m unlikely to
find one that will reliably give me 800ns resolution. Thus, if I want to
block myself, my throughput is going to drop like a rock.

However, once I’ve calibrated a spin loop routine on the CPU on which
I’m running, I could easily go into that loop for 800ns. This is what
nanospin does for me. QNX will take care of the calibration and
maintenance for me. I just need to tell it how long I want to spin.
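
For example, a driver fragment that needs 800 ns between successive writes
might look roughly like this (nanospin_ns() and nanospin_calibrate() are the
QNX Neutrino calls declared in <time.h>; the out8() call, port address, and
I/O-privilege handling are illustrative placeholders only):

#include <stdint.h>
#include <time.h>        /* nanospin_ns(), nanospin_calibrate() */
#include <hw/inout.h>    /* out8(); the thread needs I/O privileges first */

#define DEV_PORT 0x300   /* hypothetical device register */

void write_two_values(uint8_t a, uint8_t b)
{
    out8(DEV_PORT, a);
    nanospin_ns(800);    /* busy-wait ~800 ns without blocking in the kernel */
    out8(DEV_PORT, b);
}

/* Optionally, call nanospin_calibrate() once at startup to control when
 * the timing-loop calibration happens. */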

The downside of spinning is that I’m consuming the CPU with no real work
being done. I am therefore degrading system performance. When I’m a low
priority driver spinning for several hundred nanoseconds, this is
generally not a problem. Eventually, the scheduler will kick in and
allow another runnable thread to take time.

It’s all a matter of why you want to wait, how long you want to wait,
and how often the thread will otherwise block on a kernel primitive.

“Eric Berdahl” <berdahl@intelligentparadigm.com> wrote in message
news:berdahl-E074EE.17140122062001@nntp.qnx.com

In article <9gvq2r$1r7$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

“The nanospin function busy-waits for the amount of time specified in when
without blocking the calling thread”

I’m confused about “without blocking”. How can you busy-wait without blocking?

Traditionally, there are two ways for a thread to wait for some event
before proceding. One is called busy-waiting and usually involves
spinning in a tight loop:

[cut]

I think you misunderstood my question ( or I misphrased it )

Whether you wait via delay (which doesn’t use CPU) or nanospin (which
wastes CPU), the idea is that both functions block the calling thread.

How can a function that blocks (doesn’t return until the specified timeout
elapses) do this “without blocking”? The phrasing doesn’t make sense to me
(I’m French, which could explain it :wink: )

Maybe the term “without blocking” means without the use of the kernel.
My original post was mostly targeted at QNX people, to suggest that
the phrasing could be improved.

  • Mario

In article <9h1cic$39$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

I think you misunderstood my question ( or I misphrased it )

Yes, I think I misunderstood the question.

Whether you wait via delay (which doesn’t use CPU) or nanospin (which
wastes CPU), the idea is that both functions block the calling thread.

How can a function that blocks (doesn’t return until the specified timeout
elapses) do this “without blocking”? The phrasing doesn’t make sense to me
(I’m French, which could explain it :wink: )

Maybe the term “without blocking” means without the use of the kernel.
My original post was mostly targeted at QNX people, to suggest that
the phrasing could be improved.

It’s the use of the term “blocking” which appears to be confusing you.
The use of the term “block” does not merely convey that the function
does not return until an event transpires. Instead, “block” implies that
the thread enters a state whereby it notifies a scheduling entity
(usually a kernel) that the thread should not be scheduled again until
such time as a given event occurs (e.g. a message is received or a timer
goes off). This allows the scheduling entity to let a different thread
utilize the CPU in the interim.

The nanospin functions do not do this. Instead, they physically consume
the CPU. The net effect is similar to “blocking” on a timer, but is a
different mechanism under the covers. The fact that it is a different
mechanism is very important because it means that the nanospin function
is inappropriate to use as a generic “wait for this much time” function.

Does that help clear it up?

Eric

“Eric Berdahl” <berdahl@intelligentparadigm.com> wrote in message
news:berdahl-B65CE0.23551922062001@nntp.qnx.com

In article <9h1cic$39$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

I think you misunderstood my question ( or I misphrased it )

Yes, I think I misunderstood the question.

Whether you wait via delay (which doesn’t use CPU) or nanospin (which
wastes CPU), the idea is that both functions block the calling thread.

How can a function that blocks (doesn’t return until the specified timeout
elapses) do this “without blocking”? The phrasing doesn’t make sense to me
(I’m French, which could explain it :wink: )

Maybe the term “without blocking” means without the use of the kernel.
My original post was mostly targeted at QNX people, to suggest that
the phrasing could be improved.

It’s the use of the term “blocking” which appears to be confusing you.
The use of the term “block” does not merely convey that the function
does not return until an event transpires. Instead, “block” implies that
the thread enters a state whereby it notifies a scheduling entity
(usually a kernel) that the thread should not be scheduled again until
such time as a given event occurs (e.g. a message is received or a timer
goes off). This allows the scheduling entity to let a different thread
utilize the CPU in the interim.

The nanospin functions do not do this. Instead, they physically consume
the CPU. The net effect is similar to “blocking” on a timer, but is a
different mechanism under the covers. The fact that it is a different
mechanism is very important because it means that the nanospin function
is inappropriate to use as a generic “wait for this much time” function.

Does that help clear it up?

I didn’t need the doc to tell me nanospin was busy-waiting; I knew that
already :wink:

I guess what I’m trying to say is: do you think the doc is fine as it is, or
that it could be clarified?






Mario Charest <mcharest@deletezinformatic.com> wrote:


I guess what I’m trying to say is: do you think the doc is fine as it is, or
that it could be clarified?

There’s definitely room for improvement… we’ll look into it.
-Donna

Mario Charest <mcharest@deletezinformatic.com> wrote:

“Eric Berdahl” <berdahl@intelligentparadigm.com> wrote in message
news:berdahl-B65CE0.23551922062001@nntp.qnx.com…
In article <9h1cic$39$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

I think you misunderstood my question ( or I misphrased it )

Yes, I think I misunderstood the question.

Whether you wait via delay (which doesn’t use CPU) or nanospin (which
wastes CPU), the idea is that both functions block the calling thread.

How can a function that blocks (doesn’t return until the specified timeout
elapses) do this “without blocking”? The phrasing doesn’t make sense to me
(I’m French, which could explain it :wink: )

Maybe the term “without blocking” means without the use of the kernel.
My original post was mostly targeted at QNX people, to suggest that
the phrasing could be improved.

It’s the use of the term “blocking” which appears to be confusing you.
The use of the term “block” does not merely convey that the function
does not return until an event transpires. Instead, “block” implies that
the thread enters a state whereby it notifies a scheduling entity
(usually a kernel) that the thread should not be scheduled again until
such time as a given event occurs (e.g. a message is received or a timer
goes off). This allows the scheduling entity to let a different thread
utilize the CPU in the interim.

The nanospin functions do not do this. Instead, they physically consume
the CPU. The net effect is similar to “blocking” on a timer, but is a
different mechanism under the covers. The fact that it is a different
mechanism is very important because it means that the nanospin function
is inappropriate to use as a generic “wait for this much time” function.

Does that help clear it up?

I didn’t need the doc to tell me nanospin was busy-waiting; I knew that
already :wink:

I guess what I’m trying to say is: do you think the doc is fine as it is, or
that it could be clarified?

I think that as long as people understand ‘not blocking’ to mean ‘not giving
up the CPU to the scheduler’, the doc is fine.

Kris




Kris Warkentin
kewarken@qnx.com
(613)591-0836 x9368
“You’re bound to be unhappy if you optimize everything” - Donald Knuth

In article <9h7iek$deg$2@nntp.qnx.com>,
Kris Warkentin <kewarken@qnx.com> wrote:

Mario Charest <mcharest@deletezinformatic.com> wrote:

I didn’t need the doc to tell me nanospin was busy-waiting; I knew that
already :wink:

I guess what I’m trying to say is: do you think the doc is fine as it is, or
that it could be clarified?

I think that as long as people understand ‘not blocking’ to mean ‘not giving
up the CPU to the scheduler’, the doc is fine.

I concur. The term “blocking” is used commonly in multi-threading
terminology to mean “giving up the CPU to the scheduler”. “non-blocking”
is commonly used to have the meaning you might derive from “blocking”.
If QNX wanted to modify the documentation, they might include a glossary
with such commonly used terms, just to help level the playing field for
people who are not familiar with multi-threading, but I’m ambivalent
about it.

Regards,
Eric

Previously, Eric Berdahl wrote in qdn.public.qnxrtp.os:

I concur. The term “blocking” is used commonly in multi-threading
terminology to mean “giving up the CPU to the scheduler”. “non-blocking”
is commonly used to have the meaning you might derive from “blocking”.
If QNX wanted to modify the documentation, they might include a glossary
with such commonly used terms, just to help level the playing field for
people who are not familiar with multi-threading, but I’m ambivalent
about it.

I side with you on this Eric, but Mario is not your typical
newbie QNX user, and it obviously confused him. Another
reference to the term might be the O_NONBLOCK bit passed to
open() and fcntl(), which causes I/O not to block. This usage
curiously means that you don’t wait for some event, but in fact
you do BLOCK in your sense of the word, because there is a Send()
involved.
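
A minimal sketch of that usage (the device path here is just a placeholder):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];
    int fd = open("/dev/ser1", O_RDONLY | O_NONBLOCK);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    ssize_t n = read(fd, buf, sizeof buf);
    if (n == -1 && errno == EAGAIN) {
        /* No data available: read() returns immediately instead of
         * waiting for input, even though the Send() to the resource
         * manager still happened under the covers. */
        printf("no data yet\n");
    }
    close(fd);
    return 0;
}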


Mitchell Schoenbrun --------- maschoen@pobox.com

Eric Berdahl <berdahl@intelligentparadigm.com> wrote:

In article <9gvq2r$1r7$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

The downside of spinning is that I’m consuming the CPU with no real work
being done. I am therefore degrading system performance. When I’m a low
priority driver spinning for several hundred nanoseconds, this is
generally not a problem. Eventually, the scheduler will kick in and
allow another runnable thread to take time.

Of course, if a higher-priority thread does pre-empt you, then
your nanospin count will stop counting, and you’ll wait
longer than you wanted.

In fact, something like nanospin() is only accurate if it is the
highest priority thing around.

-David

QNX Training Services
dagibbs@qnx.com

In article <9hakmh$an6$3@nntp.qnx.com>, David Gibbs <dagibbs@qnx.com>
wrote:

Eric Berdahl <berdahl@intelligentparadigm.com> wrote:
In article <9gvq2r$1r7$1@inn.qnx.com>,
“Mario Charest” <mcharest@deletezinformatic.com> wrote:

The downside of spinning is that I’m consuming the CPU with no real work
being done. I am therefore degrading system performance. When I’m a low
priority driver spinning for several hundred nanoseconds, this is
generally not a problem. Eventually, the scheduler will kick in and
allow another runnable thread to take time.

Of course, if a higher-priority thread does pre-empt you, then
your nanospin count will stop counting, and you’ll wait
longer than you wanted.

In fact, something like nanospin() is only accurate if it is the
highest priority thing around.

Correct. Which is one reason I don’t advocate using nanospin for
“accurate” timing. IMHO, nanospin is really only useful for clients
accessing hardware at a very, very low level.

For example, one of my PCI cards has a serial EEPROM chip where startup
data is stored. The chip is read by the card at startup time to
initialize itself. My driver can write to the chip at other times to
change the settings which should be used at the next startup.
Unfortunately, the software interface to the chip involves writing a
register which maps directly to the serial bus feeding the EEPROM. So, I
have to toggle the select, data, clock, and R/W lines directly. It’s a
bit of a pain, but not unreasonable. The big catch is that I have to
obey the timing requirements of the bus (e.g. setup and hold
restrictions).

If I use timers, delays, sleeps, or anything involving the scheduler,
the shortest interval I can practically get is on the order of
milliseconds (1-5), but the timing requirements of the bus are on the
order of microseconds (10).

Thus, if I use usleep, sleep, delay, or the like, my transfer speed is
degraded by a couple orders of magnitude. Yuck.

If I use nanospin and I’m pre-empted, I wait a while for that one spin
and continue, which is just fine. Overall, my transfer speed is
reasonable given the priority of the thread which is doing the transfer.

If I use nanospin and I’m not pre-empted, my transfer speed is as good
as I can get it and overall system performance is not negatively
affected (unless I run this in a high-priority thread, which would be a
silly thing for me to do and would be my own damn fault if I did).

This gets back to the whole “Why would I spin when I could block?” and
“Why would I block when I could spin?” questions. The ultimate answer
involves understanding your tools and your system design, and then
deciding which is the best tool for the job.
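
To make the timing issue concrete, here is a rough sketch of the bit-banging
pattern described above (the register layout, bit masks, and exact timings are
all hypothetical; the point is only the structure: set a line, then
nanospin_ns() to satisfy the setup/hold requirements without dropping to the
scheduler):

#include <stdint.h>
#include <time.h>                    /* nanospin_ns() */

#define EE_CLK   0x01                /* hypothetical bit positions */
#define EE_DATA  0x02
#define EE_CS    0x04

extern volatile uint32_t *ee_ctrl;   /* memory-mapped control register */

/* Shift one bit out to the serial EEPROM. */
static void ee_clock_bit(int bit)
{
    uint32_t v = EE_CS | (bit ? EE_DATA : 0);

    *ee_ctrl = v;                    /* present the data bit, clock low */
    nanospin_ns(10000);              /* setup time, ~10 us */
    *ee_ctrl = v | EE_CLK;           /* raise the clock */
    nanospin_ns(10000);              /* hold time, ~10 us */
    *ee_ctrl = v;                    /* clock back low */
}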

For example, one of my PCI cards has a serial EEPROM chip where startup
data is stored. The chip is read by the card at startup time to
initialize itself. My driver can write to the chip at other times to
change the settings which should be used at the next startup.
Unfortunately, the software interface to the chip involves writing a
register which maps directly to the serial bus feeding the EEPROM. So, I
have to toggle the select, data, clock, and R/W lines directly. It’s a
bit of a pain, but not unreasonable.

Having to bit-bang to communicate with a peripheral from a 32-bit
processor is not unreasonable?

Don’t encourage the HW designers with this kind of talk :wink:

Hey Rennie

Isn’t that what SMP is for?

One Pentium to communicate with the winModem, one Pentium to drive the
winPrinter. And oh yeah, another Pentium to run any actual applications
that you want.

And that seems unreasonable to you? You wouldn’t want them to have to use a
$3.00 UART instead of the CPU chip that already exists, would you? That
would be extravagant!

Bill Caroselli


“Rennie Allen” <RAllen@csical.com> wrote in message
news:D4907B331846D31198090050046F80C9058E77@exchangecal.hq.csical.com

For example, one of my PCI cards has a serial EEPROM chip where startup
data is stored. The chip is read by the card at startup time to
initialize itself. My driver can write to the chip at other times to
change the settings which should be used at the next startup.
Unfortunately, the software interface to the chip involves writing a
register which maps directly to the serial bus feeding the EEPROM. So, I
have to toggle the select, data, clock, and R/W lines directly. It’s a
bit of a pain, but not unreasonable.

Having to bit-bang to communicate with a peripheral from a 32-bit
processor is not unreasonable?

Don’t encourage the HW designers with this kind of talk :wink:

“Bill Caroselli @ Q-TPS” <BillCaroselli@Q-TPS.com> wrote in message
news:9hd6io$3mj$1@inn.qnx.com

Hey Rennie

Isn’t that what SMP is for?

I’ve actually used a busy loop in an SMP design to decrease latency. It worked
great! It also helped keep the cache intact, but that’s a subject for a whole
new thread :wink:

One Pentium to communicate with the winModem, one Pentium to drive the
winPrinter. And oh yeah, another Pentium to run any actual applications
that you want.

And that seems unreasonable to you? You wouldn’t want them to have to use a
$3.00 UART instead of the CPU chip that already exists, would you? That
would be extravagant!

Bill Caroselli


“Rennie Allen” <RAllen@csical.com> wrote in message
news:D4907B331846D31198090050046F80C9058E77@exchangecal.hq.csical.com…

For example, one of my PCI cards has a serial EEPROM chip where startup
data is stored. The chip is read by the card at startup time to
initialize itself. My driver can write to the chip at other times to
change the settings which should be used at the next startup.
Unfortunately, the software interface to the chip involves writing a
register which maps directly to the serial bus feeding the EEPROM. So, I
have to toggle the select, data, clock, and R/W lines directly. It’s a
bit of a pain, but not unreasonable.

Having to bit-bang to communicate with a peripheral from a 32-bit
processor is not unreasonable?

Don’t encourage the HW designers with this kind of talk :wink: