Hyperthreading support in QNX 6+?

I just noticed a document at qnx.com extolling the wonders of the Intel-QNX relationship. At the end of the document it indicated that Hyperthreaded CPUs were supported. Hmmmm, does that by any chance mean that there is an update in which Hyperthreading is supported correctly?
I know a hyperthreaded CPU will appear as two CPUs, but of course one of them is significantly slower than the other, and previously the scheduler knew nothing of this.

What do you think the scheduler would need to do differently in order to “support” one “CPU” running slower than another “CPU” in an SMP system?

Let me invent some terminology to answer this. Each HT processor has two processors, a fast one and a slow one. When scheduling, the highest priority processes get to run on the fast processors, and the next in line get put on the slow ones.

That sounds simple enough, but I admit that there are two problems.

  1. What happens, for example, when the highest priority process becomes blocked? Instead of just starting the next waiting process for that processor, a process already running on a slow processor should be moved to the available fast one.
  2. Scheduler attempts to keep a process on a single processor get more complicated. I’m not sure about this, but I’ll bet that a pair of fast/slow processors share a cache, so if the fast processor becomes open, it makes sense (if otherwise ok) to try to move the process from the slow to fast processor. Of course scheduling rules may dictate otherwise.
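
To make the fast/slow idea concrete, here is a toy sketch in Python. It only illustrates the policy described above and says nothing about how the QNX scheduler actually works. The function name assign, the CPU labels, and the convention that a larger number means higher priority are all assumptions made up for the example.

```python
def assign(ready, fast_cpus, slow_cpus):
    """Map runnable threads to logical CPUs, best priority first.

    ready: list of (name, priority) tuples; larger priority wins (assumption).
    fast_cpus / slow_cpus: hypothetical labels for the two kinds of logical CPU.
    Returns {cpu_label: thread_name}; threads that do not fit stay queued.
    """
    ordered = sorted(ready, key=lambda t: t[1], reverse=True)
    cpus = list(fast_cpus) + list(slow_cpus)  # fast CPUs are filled first
    return {cpu: name for cpu, (name, _prio) in zip(cpus, ordered)}


# Three runnable threads on one hypothetical HT package.
ready = [("logger", 5), ("control", 20), ("ui", 10)]
before = assign(ready, ["fast0"], ["slow0"])

# Problem 1: "control" blocks. Re-running the assignment promotes "ui"
# from slow0 to fast0 instead of starting "logger" on the fast CPU.
ready_after_block = [t for t in ready if t[0] != "control"]
after = assign(ready_after_block, ["fast0"], ["slow0"])
```

Note that this sketch ignores problem 2 entirely: it moves "ui" without asking what the move costs.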

Hi,

I’m not a hardware guru, but my understanding is that a hyperthreaded processor is symmetric: some resources are really duplicated, while others are shared, and it appears to the OS as two separate processors of equal abilities. I’m curious to learn what makes one CPU fast and the other slow.

Of course, if one CPU uses a lot of the shared resources then the other CPU is not as useful, but I think that is a dynamic issue.

Thanks,
Albrecht

I’ll accept the concept of asymmetry. Whether or not it exists in this case, it is a definite possibility in other cases.

Yes, and this would be bad (cache coherency updates required).

Yeah, there are lots of issues, but I can’t get a good feel for what the overall goal would be (is it simply minimal time-to-completion for the highest priority thread?). Certainly the magnitude of the delta between processors is a concern also. I can easily see a cache update resulting in a longer time-to-completion for a high priority thread that switched to a faster core (unless the new core is significantly faster).
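
As a rough sketch of that trade-off (Python, with made-up numbers): moving only helps if the time saved on the faster core exceeds the time lost refilling the cache. The function name, the idea of a single fixed cache penalty, and the speeds below are assumptions for illustration, not measurements.

```python
def migration_pays_off(remaining_work, current_speed, target_speed, cache_penalty):
    """Return True if moving to the target core finishes the thread sooner.

    remaining_work: work left, in arbitrary units.
    current_speed / target_speed: units of work per second on each core.
    cache_penalty: seconds lost refilling the cache after the move
                   (assumed to be one fixed cost, which is a simplification).
    """
    stay = remaining_work / current_speed
    move = cache_penalty + remaining_work / target_speed
    return move < stay


# Hypothetical case: 10 units of work left, 2 seconds of cache penalty.
slightly_faster = migration_pays_off(10, 1.0, 1.1, cache_penalty=2.0)
much_faster = migration_pays_off(10, 1.0, 10.0, cache_penalty=2.0)
```

With these numbers the move to a core that is only 10% faster loses (staying takes 10 s, moving takes about 11.1 s), while the move to a core that is ten times faster wins (3 s).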

Determinism is still available when the scheduler pays no attention to which core is faster (you simply calculate worst case time-to-completion based on the slow core). If you don’t like the numbers for the slow core, then use thread affinity to lock the thread to the fast core. That’s the official story at this point AFAIK.
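
A minimal sketch of that reasoning in Python, assuming each core can be described by a single relative speed (a simplification, and the 0.5 figure below is invented). The names worst_case_time and placement are hypothetical.

```python
def worst_case_time(work, core_speeds):
    """Completion time if the thread only ever runs on the slowest core."""
    return work / min(core_speeds)


def placement(work, core_speeds, deadline):
    """Say whether a deadline holds on any core, only on the fastest, or never."""
    if worst_case_time(work, core_speeds) <= deadline:
        return "any core"
    if work / max(core_speeds) <= deadline:
        return "pin to fastest core"  # i.e. use thread affinity
    return "infeasible"


speeds = [1.0, 0.5]  # hypothetical: one full-speed core, one half-speed core
loose = placement(8, speeds, deadline=20)
tight = placement(8, speeds, deadline=10)
hopeless = placement(8, speeds, deadline=5)
```

Here the 20-second deadline holds on either core, the 10-second deadline holds only when pinned to the fast core, and the 5-second deadline cannot be met at all.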

I don’t think that there is any question of this. HT processors are not dual core. There is one processor available, but during idle periods, for example while waiting for memory accesses, a second thread gets attention. The number 10% is just something I’ve seen in various places. I’m sure that the actual amount varies considerably with the code being executed.

Here is my simplistic viewpoint. There are two competing issues.
First, there is an additional resource, the slow processor, that should be made available. This is accomplished currently.

Next, there is the possibility for this additional processor to cause a violation of the priority scheme: a higher priority process on the slow processor, competing with a lower priority process on the fast processor.

The affinity solution creates another violation, a high priority process, locked out of the fast processor by another higher priority process, but a lower priority process getting cycles in the slow processor.

I admit that there may be no good solution in an RTOS, given the complications of cache coherency. At the very least, it would be nice to be able to know which processors are fast and which are slow.

Well certainly 10/1 is significantly faster. How this plays out with the hit on a cache, I don’t know.

Maybe this is just a temporary, ignorable glitch in technology. With dual cores coming on line, the issue may go away unless each of the dual cores itself is HT.

With hyperthreading, there is no fast and no slow processor. Where did you get this information from? To actually use hyperthreading under QNX, you would have to use the SMP kernel anyway, and that one is only available in the SMP/Multicore-TDK. And that one costs… something. :-) Anyway, Hyperthreading is what the name implies: hype. Only with very intensive math calculations do you get a small (yes, 10%) performance increase. On average, you get maybe a 3% performance increase.

Hello Thunderblade,

this is precisely what I tried to express in my earlier statement - HT processors are symmetric. As to SMP: Yes that is expensive, but the x86-SMP kernel is also in the x86 Runtime Kit Extended Bundle, which is only slightly more expensive than the Standard Bundle.

I have seen such processors, so it is an issue.

Regards,
Albrecht

I agree with Thunderblade, HT is mostly hype (in some cases it can slow down a machine).

Having a dual-core machine, I can tell you that I will never go back to a single-core machine ;-)

From non-technical literature, so I have no idea if it is correct.
Wherever I’ve seen anything written I’ve gotten the following
information.

  1. An HT processor appears as 2 processors. This is verifiably true.
  2. The processors are single threaded, that is the 2nd is only doing something when the 1st is idle or waiting.

So the question is, is either processor dominant? If one is not, then the hardware must switch between them on a regular basis. This would be indistinguishable from (a slower) dual core.

Please note that while the above argument sounds logical to me, I have no factual basis for it.

OK, I believe there are a few misconceptions about hyper-threading. This should clear a few of them up for you:
intel.com/business/bss/produ … erview.htm

Well, I read the article, and it did not clear up anything for me. The article seems to indicate that HT technology shares the resources of the processor. OK, so how does it share them? Equally? As I pointed out, if it is symmetric, then it is not very distinguishable from a slower dual core.
That’s fine if it is true; it also explains why a program would run slower with HT than without. If it’s competing with anything else, even an IDLE loop, it should run slower. Throughput should be greater, and the article suggests a maximum of 25 or 30%, which means the number 15% is probably realistic.
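
To put numbers on that, here is a small Python sketch. It assumes the sharing really is symmetric, so the combined throughput (100% plus the HT gain) is split equally between the two logical processors. That equal split is my assumption, not something the article states, and per_thread_speed is just a name I made up.

```python
def per_thread_speed(throughput_gain, threads=2):
    """Speed of each thread relative to having the whole core to itself.

    throughput_gain: extra total throughput from HT, e.g. 0.15 for 15%.
    Assumes the combined throughput (1 + gain) is divided equally
    among the threads, which is the symmetric-sharing hypothesis.
    """
    return (1.0 + throughput_gain) / threads


typical = per_thread_speed(0.15)  # the 15% guess from above
best = per_thread_speed(0.30)     # the article's suggested maximum
```

Under that assumption each of two busy threads runs at about 57.5% of full speed with a 15% gain, and at about 65% with a 30% gain, so any single program does run slower even though total throughput goes up.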