Bug 71035
Summary: | Scheduler puts computationally bound processes on same die with hyperthreading | | |
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Ben Woodard <woodard> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 7.3 | CC: | djuran, joshua.bakerlepain |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | i686 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2004-02-10 23:33:29 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
Ben Woodard
2002-08-08 01:37:11 UTC
Please sit on this issue for a little bit. We may have uncovered some additional facts which may shed light on the issue.

Created attachment 69824 [details]
computationally bound process which is useful in illustrating the problem

Created attachment 69825 [details]
data file needed with test program

It sort of seems like the kernel hyperthreading capability may have been damaged by the linux-2.4.17-selected-ac-bits.patch. I noticed that in the stock 2.4.18 there are two places where the variable cpu_sibling_map is used: one is in smpboot.c and the other is in sched.c. smpboot.c seems to store the fact that two CPUs are actually related in the variable, and then sched.c makes use of that piece of information when selecting an idle CPU. In our kernel, however, smpboot.c still stores the fact that the two CPUs are related, but the scheduler doesn't seem to make use of that fact.

Caveat: I currently do not have root access to a machine that has hyperthreading capability, and so I am unable to verify that the way the 2.4.18 kernel uses cpu_sibling_map yields the correct behavior. I expect to have root access on the needed machine within a day or so.

The place in the stock 2.4.18 kernel that I'm referring to is:

    /*
     * We use the first available idle CPU. This creates
     * a priority list between idle CPUs, but this is not
     * a problem.
     */
    if (tsk == idle_task(cpu)) {
    #if defined(__i386__) && defined(CONFIG_SMP)
            /*
             * Check if two siblings are idle in the same
             * physical package. Use them if found.
             */
            if (smp_num_siblings == 2) {
                    if (cpu_curr(cpu_sibling_map[cpu]) ==
                        idle_task(cpu_sibling_map[cpu])) {
                            oldest_idle = last_schedule(cpu);
                            target_tsk = tsk;
                            break;
                    }

            }
    #endif

Because sched.c is so different in our kernels, I was unable to quickly find the analogous location.

OK, we can unsuspend this one. I think I've gathered as much information about the problem as I can at this particular time.

The problem is still present in RH9's 2.4.20 kernel, pretty much as Ben describes in his most excellent bug report. Fixing this has apparently been done by Ingo's O(1) scheduler in the 2.5 series kernel. Chunks of this scheduler appear to have already been backported to 2.4. How easy would it be for the hyperthreading fixes to also be backported? As more and more hyperthread-enabled CPUs hit the market, this will become a bigger issue. Particularly since the Big New Feature of RH9 is the glibc threading stuff, it seems that RH would be quite interested in getting the most out of this new hardware gizmo.

Seems to be fixed in at least kernel-smp-2.4.21-9.EL
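
For anyone following along, the sibling check quoted from stock 2.4.18 can be modelled in user space roughly as below. This is only a simplified sketch and not kernel code: the four-CPU topology, the busy[] array, and the pick_idle_cpu() helper are made up for illustration, and only the cpu_sibling_map name is borrowed from the kernel. It assumes two logical CPUs per physical package.

```c
#define NR_CPUS 4

/*
 * Hypothetical topology (an assumption for this sketch): logical CPUs
 * 0 and 1 share one physical package, 2 and 3 share the other.
 * cpu_sibling_map[cpu] gives the other logical CPU in the same package.
 */
static const int cpu_sibling_map[NR_CPUS] = { 1, 0, 3, 2 };

/*
 * Pick a CPU for a newly runnable task.
 * busy[cpu] is nonzero when that logical CPU is running a real task.
 *
 * Prefer an idle CPU whose sibling is idle too, so that the task gets a
 * whole physical package to itself. Fall back to the first idle CPU if
 * no fully idle package exists. Returns -1 when nothing is idle.
 */
static int pick_idle_cpu(const int busy[NR_CPUS])
{
    int first_idle = -1;
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (busy[cpu])
            continue;
        if (!busy[cpu_sibling_map[cpu]])
            return cpu;           /* both siblings idle: use this package */
        if (first_idle < 0)
            first_idle = cpu;     /* remember it, but keep looking */
    }
    return first_idle;
}
```

With CPU 0 busy and the rest idle, pick_idle_cpu() returns 2 rather than 1. A scheduler that simply took the first idle CPU would return 1 and put both compute-bound tasks on the same package, which is the behavior this bug describes.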