sched_yield vs. sleep(0)

I need to implement a busy-wait loop that checks for a condition to become true. Which kernel function should be used to keep the process from taking up all the CPU time, sched_yield() or sleep(0)? sched_yield() moves the calling thread to the end of the ready queue for its priority, while sleep(0) suspends it for 0 seconds. Which one is better? The code is very simple:

while (!condition)
    sched_yield();    /* or sleep(0) */

The condition is an external resource like system clock or a driver register, so it’s impossible to use pthread_cond_wait() here.

I know this is very common in real-time programming. What is the classic solution in QNX?

I don’t know what the result of sleep(0) is in QNX 6. At best it would do the same as sched_yield(). At worst it will do nothing. But you have a bigger problem. Neither of these solutions is likely to get the result that you want. Assume for a moment that you only have one cpu available and consider three questions about the threads that this thread will be competing with.

  1. What priority will they run at?
  2. What scheduling method do they use?
  3. What is their cpu usage profile?

Case 1:
If you are only running against a thread at a higher priority you can safely change your code by removing the sched_yield()/sleep(). The higher priority thread will take control whenever it wants to. But if that thread is also in a busy loop, your thread will get no cpu regardless of what your code looks like. So in this case calling yield just wastes time with re-scheduling.

Case 2:
If you are running against a thread at a lower priority, your thread will never give up the cpu to that thread. Calling sched_yield() will cause the kernel to reschedule, but since your thread is still ready and at the top of the queue, it will continue to run. This of course makes no sense.

Case 3:
Your thread runs at the same priority as another thread, either one runs using sched_rr scheduling and the other thread hogs the cpu.
In this case, the following behavior occurs.

Your thread runs for one loop
The other thread runs for one time slice
REPEAT

This at least has some identifiable purpose, but can be implemented in a much more elegant way.
Run your thread at a higher priority, and use a timer to wake it up at whatever period you want to check the condition.
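A minimal sketch of the "wake up periodically and check" idea, in plain POSIX C. It uses clock_nanosleep() with absolute deadlines instead of a QNX timer and pulse, which is a substitution on my part, and it does not show raising the thread priority. The names poll_periodically, condition_fn and true_on_third_call are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the condition holds. */
typedef bool (*condition_fn)(void *arg);

/* Advance an absolute timespec by period_ns nanoseconds. */
static void timespec_add_ns(struct timespec *t, long period_ns)
{
    t->tv_nsec += period_ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec += 1;
    }
}

/*
 * Check cond(arg) once per period_ns, sleeping in between.
 * Returns the number of checks made when the condition became true,
 * or -1 if max_checks checks passed without success.
 */
static int poll_periodically(condition_fn cond, void *arg,
                             long period_ns, int max_checks)
{
    struct timespec next;

    assert(cond != NULL && period_ns > 0);
    clock_gettime(CLOCK_MONOTONIC, &next);

    for (int checks = 1; checks <= max_checks; checks++) {
        if (cond(arg))
            return checks;
        /* Absolute deadlines keep the period from drifting. */
        timespec_add_ns(&next, period_ns);
        while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                               &next, NULL) == EINTR)
            ;   /* interrupted by a signal: resume the same sleep */
    }
    return -1;
}

/* Hypothetical condition for demonstration: true on the third call. */
static bool true_on_third_call(void *arg)
{
    int *calls = arg;
    return ++*calls >= 3;
}
```

Calling poll_periodically(true_on_third_call, &calls, 1000000L, 10) with calls starting at 0 checks once per millisecond and returns after the third check. Between checks the thread is blocked, so other threads get the CPU regardless of their priority.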

Case 4:
Your thread runs at the same priority as another thread, either one runs using sched_rr scheduling and the other thread does not hog the cpu.

Your thread runs for one loop
If READY the other thread runs for one time slice or until it is done.
REPEAT

This at least might make some sense if you want your thread to be checking the condition unless the other thread is running, but also guarantee at least one check per time slice. A better solution is for you to run two threads. One lower priority thread checks the condition in a busy loop, but also resets a timer each time it checks. The other higher priority thread waits for the timer to fire, checks the condition and also resets the timer. This way, you decide what the minimum periodicity of the check is. If you think about this carefully, it isn’t the exact same behavior, but that too could be implemented if you really want it.

Case 5:
Your thread runs at the same priority as another thread, both run using sched_fifo scheduling and the other thread hogs the cpu.
Now you get this obviously useless behavior.

Your thread runs for one loop.
The other thread runs forever.

Case 6:
Your thread runs at the same priority as another thread, both run using sched_fifo scheduling and the other thread does not hog the cpu.
If the other thread never uses a whole slice, this case is the same as case 4. If it does, then it is the same as case 3.

This is all complicated enough, but there are other parameters that you can consider. Do you have more than one cpu? Are you running AP?
Even so, I can’t think of any situation in which you would want to do what you are doing.

Wow, Mitchell shared a wealth of good information.

We’ve had a few cases where busy wait is needed (usually delays of a microsecond or so while waiting for a register bit to set or clear). That’s not a good thing to do but sometimes the hardware is just plain dumb. On longer delays you can implement a periodic interrupt to sample a condition rather than busy-wait. There’s no real kernel call solution AFAIK.

What’s your worst case busy-wait condition?

I had never thought about it so thoroughly :)

My thread is running at a normal priority, so of course there are other threads running at the same level. All the threads use the default scheduling algorithm (is it sched_rr?).

My code is running in an SMP environment. This piece of code could be used by multiple threads, so simply boosting the thread priority doesn’t solve the problem.
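One way to answer the "is it sched_rr?" question is simply to ask the system. This is a small sketch using the POSIX call sched_getscheduler(); the helper names policy_name and current_policy_name are made up, and I have not checked what a given QNX release reports by default.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>

/* Map a POSIX scheduling policy constant to a printable name. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}

/* Name of the policy the caller is currently scheduled under. */
static const char *current_policy_name(void)
{
    int policy = sched_getscheduler(0);   /* 0 means "the caller" */

    assert(policy >= 0);   /* querying the caller itself should not fail */
    return policy_name(policy);
}
```

Printing current_policy_name() at start-up tells you which of the cases above actually applies to your threads.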

So I think the elegant way is setting a timer and waking the thread up periodically to check the condition, no matter what the priority is.

Also, as you said, I should consider the SMP case. I want to know how to dynamically set the process CPU affinity when the code is running in multiple threads. ThreadCtl(_NTO_TCTL_RUNMASK, mask) only sets the process affinity to a specified CPU, but how can I pin the thread to the CPU it is currently running on to prevent it from migrating? I need to use the CPU-specific function ClockPeriod() in my thread.

Are the two threads only used to measure the minimum wake-up time? In a real situation, it seems redundant to use two threads to check the same condition.

Any other special concern about SMP?

I haven’t measured it yet. My guess is that it should be a couple of microseconds. It could wait forever if a hardware error occurs, so I need a timer to enforce a timeout.
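For a wait of a few microseconds that must not hang on a hardware fault, a bounded spin might look like the sketch below. It assumes the register is memory-mapped and readable through a volatile pointer and that CLOCK_MONOTONIC is available; spin_until_bits and monotonic_ns are illustrative names, not an existing API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic clock reading in nanoseconds. */
static int64_t monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Spin until (*reg & mask) == expected, or until timeout_ns has elapsed.
 * Returns true if the bits reached the expected value, false on timeout.
 */
static bool spin_until_bits(const volatile uint32_t *reg, uint32_t mask,
                            uint32_t expected, int64_t timeout_ns)
{
    assert(reg != NULL && timeout_ns >= 0);
    const int64_t deadline = monotonic_ns() + timeout_ns;

    for (;;) {
        if ((*reg & mask) == expected)
            return true;
        if (monotonic_ns() >= deadline)
            /* One last read so a late success is not reported as a timeout. */
            return (*reg & mask) == expected;
    }
}
```

The timeout is only as precise as the clock behind clock_gettime(), so on a tick-based clock a very short timeout may effectively round up to a tick.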

Well, SMP changes your options. You could run your thread at high priority in a busy loop. That will take up a whole processor, but if the remaining processor(s) are enough, there is no problem. It doesn’t change the fact that yielding to a lower priority process doesn’t do anything.

You may also be able to control things by processor affinity.

I posted this SMP question in my previous message.

“I want to know how to dynamically set the process CPU affinity when the code is running in multiple threads. ThreadCtl(_NTO_TCTL_RUNMASK, mask) only sets the process affinity to a specified CPU, but how can I pin the thread to the CPU it is currently running on to prevent it from migrating? I need to use the CPU-specific function ClockPeriod() in my thread.

My code is running in an SMP environment. This piece of code could be used by multiple threads.”

Using affinity is usually not a very good idea. If you want to use ClockPeriod(), that could be a good reason ;-) However, check your processor model: some have a synchronised clock and some won’t be affected by CPU frequency throttling.

If you can, avoid ClockPeriod() on SMP.

Generally speaking, which time functions are considered SMP safe if I need to measure the clock tick?

I probably don’t understand the discussion, but I think the issue has to do with using the processor cycle counter to time something in an SMP system? I’m guessing that if you have a dual or quad core processor the cycle counters will all be in sync, but I don’t know that. Otherwise, shouldn’t any system timer be SMP safe?

All time functions except ClockPeriod() are safe. What do you mean by “measure the clock tick”?

The processor cycle counter on x86 can be nasty. I’ve seen some on AMD go totally independent; it even screws up the Eclipse profiler.

I’ve heard that on the latest Intel processors they are in sync. Even better, if the core frequency is reduced to save power, the cycle counter register is no longer affected, which used to be the case. On AMD Phenom I believe this is a problem, since each of the 4 cores can have its own frequency.

ClockPeriod() sets the clock period; every period is a tick and triggers a system 0 interrupt on x86.

If the timer is not synchronized between multiple CPUs, then every CPU has its own clock period. To measure the ticks precisely, I think I have to assign a counter to each CPU and hook it up to that CPU’s system 0 interrupt.

The general method I can think of is setting a timer to trigger an event at a specific interval, since using a timer is considered to be SMP safe.

I think I confused you ;-) ClockPeriod() defines the tick, but the tick is the same for every core. As a matter of fact, there is only one tick. So getting the time with clock_gettime() will be OK whichever core it’s running on.

What I was referring to is ClockCycles(); that one is independent on each core.
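To make that concrete, here is a rough sketch of timing a piece of code with clock_gettime() rather than a per-core cycle counter. The choice of CLOCK_MONOTONIC and the names timespec_diff_ns, time_call_ns and do_nothing are my own assumptions for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from start to end; end must not be earlier than start. */
static int64_t timespec_diff_ns(const struct timespec *start,
                                const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/* Time one call of fn() using the system clock. */
static int64_t time_call_ns(void (*fn)(void))
{
    struct timespec t0, t1;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return timespec_diff_ns(&t0, &t1);
}

/* Hypothetical workload for demonstration. */
static void do_nothing(void) { }
```

Because both readings come from the same system-wide clock, it does not matter if the thread migrates to another core between them. The trade-off is resolution: where the clock is driven by the tick, the result can be no finer than the tick.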
