Bug 183364
| Summary: | CONFIG_DEBUG_SPINLOCK causes massive increase in kernel buildtimes | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Prarit Bhargava <prarit> |
| Component: | kernel | Assignee: | Ingo Molnar <mingo> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5 | CC: | davej, dchapman, kmurray, konradr, maurizio.antillon, miyer, wtogami |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2006-08-08 14:12:05 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 163350 | | |
Description
Prarit Bhargava
2006-02-28 17:11:16 UTC
Some data from my HP 64p (ia64) Integrity 1TB system. The command in both cases was: time make -j 70

Compiling a kernel under RHEL4U3:
real 1m3.786s
user 26m38.878s
sys 2m46.416s

Compiling under upstream 2.6.16-rc5 w/CONFIG_DEBUG_SPINLOCK:
real 1m12.617s
user 26m3.748s
sys 9m18.908s

It is slower, but not as bad as you were seeing...
- Doug

After a discussion with Kimball, he pointed out that the issue is probably not
any sort of cacheline ping-pong issue, but rather with the __delay(1) used
in lib/spinlock_debug.c: __spin_lock_debug():

    for (i = 0; i < loops_per_jiffy * HZ; i++) {
            if (__raw_spin_trylock(&lock->raw_lock))
                    return;
            __delay(1);
    }
Removing this __delay cuts down the time drastically. Ingo, I'm probably
missing something basic here -- but is it really necessary to do a __delay(1)?
Why not make the "loops_per_jiffy * HZ" value tunable at boot time and allow
users to set a value -- wouldn't that be the same as attempting to
get close to one second?
Just some thoughts,
P.
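To illustrate the suggestion above, here is a minimal sketch of what a boot-time tunable iteration count could look like in lib/spinlock_debug.c. This is not the actual upstream change; the variable spin_debug_loops, the "spin_debug_loops=" boot option, and the elided warning path are all hypothetical.

    /* Hypothetical tunable: 0 means fall back to loops_per_jiffy * HZ. */
    static unsigned long spin_debug_loops;

    static int __init spin_debug_loops_setup(char *str)
    {
            spin_debug_loops = simple_strtoul(str, NULL, 0);
            return 1;
    }
    __setup("spin_debug_loops=", spin_debug_loops_setup);

    static void __spin_lock_debug(spinlock_t *lock)
    {
            u64 loops = spin_debug_loops ?
                        spin_debug_loops : (u64)loops_per_jiffy * HZ;
            u64 i;

            for (i = 0; i < loops; i++) {
                    if (__raw_spin_trylock(&lock->raw_lock))
                            return;
                    /* no __delay(1) here: just retry the trylock */
            }
            /* ... "lockup suspected" printk path elided ... */
    }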
With the __delay(1) removed, and building a kernel:
real 2m44.371s
user 38m0.008s
sys 51m19.492s

We're still seeing an impact; however, that may be due to other CONFIG options, etc.
P.

The more I think about this, the more I believe the correct solution is to
remove the __delay(1).
The loop

    for (i = 0; i < loops_per_jiffy * HZ; i++)

runs for approximately a second. I do not see any advantage to delaying
for a single clock tick on ia64.
Manoj -- what is the ramification of removing the __delay on ppc & ppc64?
P.
Prarit, have you seen this LKML discussion? http://lkml.org/lkml/2006/2/6/381
Hmmm ... interesting. Ingo is saying that loops_per_jiffy is the wrong
metric to use -- and I agree.
I think the correct metric to use here is the TSC. IIRC most architectures have a
value for TSC ticks per second that can be determined by querying the CPU ...
P.
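As a rough illustration of that idea (my own assumption, not a change that went into any build), the debug spin could be bounded by elapsed cycles from get_cycles() instead of an iteration count. The cycles_per_sec symbol below stands in for a per-arch "ticks per second" value and does not exist in the kernel.

    /*
     * Hypothetical sketch: spin for roughly one second of wall time as
     * measured by the cycle counter, then fall through to the warning path.
     */
    static void __spin_lock_debug(spinlock_t *lock)
    {
            cycles_t start = get_cycles();

            while (get_cycles() - start < cycles_per_sec) {
                    if (__raw_spin_trylock(&lock->raw_lock))
                            return;
                    cpu_relax();
            }
            /* ... report the suspected lockup here ... */
    }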
Changed the __delay(1) to a cpu_relax() in the latest builds. Let me know how it works out.

The most recent rawhide kernel with cpu_relax() instead of __delay(1) on the 64p ... is actually worse, not better :/
real 21m32.134s
user 56m47.696s
sys 1016m2.616s
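For reference, my reading of that rawhide change (a reconstruction, not the exact patch) is simply the loop quoted earlier with the delay swapped for a relax hint:

    for (i = 0; i < loops_per_jiffy * HZ; i++) {
            if (__raw_spin_trylock(&lock->raw_lock))
                    return;
            cpu_relax();    /* was: __delay(1) */
    }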