Bug 480844
Summary: | CONFIG_X86_BIGSMP not set causes hang on Nehalem DP systems | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | John Villalovos <jvillalo> |
Component: | kernel | Assignee: | Kyle McMartin <kmcmartin> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 10 | CC: | johnsonm+rhbugz, kernel-maint, kmcmartin, peterm, quintela, suresh.b.siddha |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-02-09 21:19:20 UTC | Type: | --- |
Description
John Villalovos
2009-01-20 20:43:49 UTC
Can you provide a console log? Even without BIGSMP, the machine should not hang; it should just be limited to 8 CPUs (quite likely 4 real and 4 HT...). Can you try booting with maxcpus set to less than 8? Say, maxcpus=4? If that doesn't succeed either, how about nosmp?

I realize some of these boxes are likely pre-release, so feel free to take this bug private if need be.

(That said, is it intentional that you're going i386? F-11 will likely be defaulting to x86_64 kernels in compat mode even for i386 installs where appropriate.)

regards, Kyle

Suresh, can you answer the question in Comment 1?

I don't have the console logs in hand. But I believe Red Hat already has the systems (and this can happen on any old x86 platform with more than 8 logical CPUs), so it should be very easy to collect the console log. Perhaps John can help.

The maxcpus=8 boot option is one of the workarounds for this problem. If BIGSMP is not set and we find more than 8 CPUs, we will still be able to bring them online, but during their APIC initialization multiple CPUs will end up configured with the same APIC LDR values. This confuses interrupt handling and causes weird hangs, etc. Perhaps we can add more defensive code to either panic with an appropriate message or just ignore any CPUs beyond 8. But that's a different thing.

I'm more than happy to turn this on in Fedora, but it sounds like a pretty serious upstream bug to not do the right thing when we overflow. I'll commit this to Fedora this afternoon so it should be in the next builds.

cheers, Kyle

Ok, it's been committed to rawhide, F-10 and F-10-2_6_27, after I checked that it wouldn't have any adverse effects (hopefully...). We had NR_CPUS set to 32 on i386 anyway, so it was likely just an oversight. I'll try to see if I can sort out access to such a machine internally to look into why it doesn't gracefully handle BIGSMP being unset.

cheers, Kyle

Trivial patch limiting the configuration to what is bootable sent upstream, see: http://thread.gmane.org/gmane.linux.kernel/825782

That does not preclude a better patch that prevents the boot from hanging, but it at least avoids the misconfiguration.
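
For anyone reading this later, the LDR collision described in the thread can be illustrated with a small model. This is only a sketch, not the kernel's code: it assumes the flat logical APIC mode, where each CPU's logical ID is a single bit in an 8-bit field, and it assumes that IDs for CPUs past the eighth simply truncate to zero. The helper names flat_logical_id and count_conflicts are made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model of a one-hot logical APIC ID in an 8-bit field.
 * Assumption: the value for CPU n is (1 << n) truncated to 8 bits,
 * so only CPUs 0..7 get a usable, unique ID.
 */
static uint8_t flat_logical_id(unsigned int cpu)
{
    if (cpu >= 32)              /* avoid an undefined shift width */
        return 0;
    return (uint8_t)(1u << cpu);
}

/*
 * Count the CPUs whose modeled logical ID is either zero (not
 * addressable) or already claimed by a lower-numbered CPU.
 */
static unsigned int count_conflicts(unsigned int ncpus)
{
    unsigned int seen = 0;      /* bitmask of IDs handed out so far */
    unsigned int conflicts = 0;
    unsigned int cpu;

    for (cpu = 0; cpu < ncpus; cpu++) {
        uint8_t id = flat_logical_id(cpu);

        if (id == 0 || (seen & id))
            conflicts++;
        else
            seen |= id;
    }
    return conflicts;
}
```

In this model, count_conflicts(8) reports no conflicts, while count_conflicts(16), the logical CPU count of a two-socket box with four cores per socket and HT, reports eight CPUs without a unique ID. That is consistent with maxcpus=8 working as a workaround, though the real register layout and kernel behavior may differ in detail.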