Bug 183364
| Field | Value |
| --- | --- |
| Summary | CONFIG_DEBUG_SPINLOCK causes massive increase in kernel buildtimes |
| Product | Fedora |
| Component | kernel |
| Version | 5 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED NEXTRELEASE |
| Severity | high |
| Priority | medium |
| Reporter | Prarit Bhargava <prarit> |
| Assignee | Ingo Molnar <mingo> |
| QA Contact | Brian Brock <bbrock> |
| CC | davej, dchapman, kmurray, konradr, maurizio.antillon, miyer, wtogami |
| Doc Type | Bug Fix |
| Last Closed | 2006-08-08 14:12:05 UTC |
| Bug Blocks | 163350 |
Description
Prarit Bhargava
2006-02-28 17:11:16 UTC
Some data from my HP 64p (ia64) Integrity 1TB system. The command in both cases was:

```
time make -j 70
```

Compiling a kernel under RHEL4U3:

```
real    1m3.786s
user    26m38.878s
sys     2m46.416s
```

Compiling under upstream 2.6.16-rc5 w/ CONFIG_DEBUG_SPINLOCK:

```
real    1m12.617s
user    26m3.748s
sys     9m18.908s
```

It is slower, but not as bad as you were seeing...

- Doug

After a discussion with Kimball, he pointed out that the issue is probably not any sort of cacheline ping-pong, but rather the __delay(1) used in lib/spinlock_debug.c:__spin_lock_debug():

```c
for (i = 0; i < loops_per_jiffy * HZ; i++) {
	if (__raw_spin_trylock(&lock->raw_lock))
		return;
	__delay(1);
}
```

Removing this __delay() cuts the time down drastically.

Ingo, I'm probably missing something basic here -- but is it really necessary to do a __delay(1)? Why not make the "loops_per_jiffy * HZ" value tunable at boot time and allow users to set it -- wouldn't that be the same as attempting to get close to one second? Just some thoughts,

P.

With the __delay(1) removed, and building a kernel:

```
real    2m44.371s
user    38m0.008s
sys     51m19.492s
```

We're still seeing an impact; however, that may be due to other CONFIG options, etc.

P.

The more I think about this, the more I believe the correct solution is to remove the __delay(1). The `for (i = 0; i < loops_per_jiffy * HZ; i++)` loop already runs for approximately a second, and I do not see any advantage to delaying for a single clock tick on ia64. Manoj -- what is the ramification of removing the __delay() on ppc & ppc64?

P.

Prarit, have you seen this LKML discussion? http://lkml.org/lkml/2006/2/6/381
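As an aside on the boot-time tunable idea floated above: such a knob could look roughly as follows. This is a minimal, hypothetical sketch, not a patch from this bug; the `spinlock_debug_loops=` parameter name and the surrounding wiring are invented for illustration.

```c
#include <linux/init.h>
#include <linux/kernel.h>

/* 0 means "fall back to the default loops_per_jiffy * HZ bound". */
static unsigned long spinlock_debug_loops;

/* Parse "spinlock_debug_loops=N" from the kernel command line. */
static int __init spinlock_debug_loops_setup(char *str)
{
	spinlock_debug_loops = simple_strtoul(str, NULL, 0);
	return 1;
}
__setup("spinlock_debug_loops=", spinlock_debug_loops_setup);

/*
 * __spin_lock_debug() would then bound its trylock loop with
 * something like:
 *
 *	u64 loops = spinlock_debug_loops ?: loops_per_jiffy * HZ;
 */
```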
> http://lkml.org/lkml/2006/2/6/381

Hmmm ... interesting. Ingo is saying that loops_per_jiffy is the wrong metric to use -- and I agree. I think the correct metric to use here is the TSC? IIRC most arches have a TSC-ticks-per-second value that can be determined by querying the CPU ...

P.
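To make the proposed metric concrete: bounding the wait by cycle-counter ticks rather than loop iterations might look roughly like this. A sketch only, under stated assumptions: `tsc_khz` is the x86 cycles-per-millisecond value (ia64 would use its ITC frequency instead), and `spin_bug()` is the reporting helper in lib/spinlock_debug.c. None of this is code from the bug.

```c
#include <linux/spinlock.h>
#include <linux/timex.h>	/* get_cycles(), cycles_t */

static void __spin_lock_debug(spinlock_t *lock)
{
	cycles_t start = get_cycles();
	/* Assumption: tsc_khz * 1000 approximates one second of cycles. */
	cycles_t timeout = (cycles_t)tsc_khz * 1000;

	for (;;) {
		if (__raw_spin_trylock(&lock->raw_lock))
			return;
		/* Complain roughly once per second of real time. */
		if (get_cycles() - start > timeout) {
			spin_bug(lock, "lockup");
			start = get_cycles();
		}
		cpu_relax();
	}
}
```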
Changed the __delay(1) to a cpu_relax() in the latest builds. Let me know how it works out.

The most recent rawhide kernel, with cpu_relax() instead of __delay(1), on the 64p ... is actually worse, not better :/

```
real    21m32.134s
user    56m47.696s
sys     1016m2.616s
```
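For reference, the rawhide change described above presumably amounts to something like the following in lib/spinlock_debug.c (reconstructed from the discussion, not copied from the actual build):

```c
static void __spin_lock_debug(spinlock_t *lock)
{
	int i;

	for (i = 0; i < loops_per_jiffy * HZ; i++) {
		if (__raw_spin_trylock(&lock->raw_lock))
			return;
		cpu_relax();	/* was: __delay(1) */
	}
	/* ... lockup reporting as before ... */
}
```

One possible reading of the worse numbers: without the __delay(), loops_per_jiffy * HZ iterations no longer approximate one second, and each pass retries the trylock almost immediately, so on a 64-way box the waiters may pound the contended lock's cache line much harder than the backed-off __delay(1) loop did.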