Bug 87550
Summary: | Intermittent lockups during kernel load with SuperMicro P4DPE systemboard, Intel E7500 chipset | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Jesse Keating <jkeating> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 8.0 | CC: | nudea, pfrields |
Hardware: | i686 | ||
OS: | Linux | ||
Last Closed: | 2004-09-30 15:40:42 UTC | Type: | --- |
Description
Jesse Keating
2003-03-28 17:54:52 UTC
Well, I've been able to duplicate the problem on another Supermicro board, this one using the E7501 chipset. I've tried with Hyperthreading both off and on; both result in lockups. With Hyperthreading disabled, it will stop just after this message (or something close to it, pulled from another system):

    apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16)

I do not get this problem if I use the uniprocessor kernel. It also duplicates in Red Hat 7.3 with SMP kernels.

Further info. An Intel tech claims to have seen this on a Supermicro board he had, but "fixed" it by disabling Hyperthreading. They've been unable to duplicate the problem on any board they have there. We got in an Intel SE 7500CW2 board, and after 240 reboots it locked up once, but never again (200+ reboots after). We've also got yet another Gigabyte dual Xeon board with the E7500 chipset; no lockups for it either after 100 or so reboots. Supermicro is playing dumb, saying they have never heard of it, nor can they duplicate it, so we are going to build a system for them on which we can reliably duplicate the problem (roughly every 10~15 reboots) and ship it to them. This should happen later this week. I'll update the bug if I get any more info.

Ok, after further (much further) investigation, this seems to be an issue with 3ware cards and flashing their BIOS. If we flash the 3ware card BIOS, the system will then randomly lock up at this point. The resolution is to flash the card, remove the card from the PCI bus, boot the system and shut it down, then re-install the card; after that everything works. Talk about a weird situation....

This seemed to re-appear, and I think a co-worker of mine, Robin Battey, has tracked down the error. In drivers/char/pc_keyb.c, line 1216, there is a statement:

    kbd_write_command(KBD_CCMD_MOUSE_DISABLE); /* Disable aux device. */

And if we change this to say:

    kbd_write_command_w(KBD_CCMD_MOUSE_DISABLE); /* Disable aux device. */

We don't get the error. Perhaps this is only a workaround, or maybe it was a typo in the source; either way this resolves our issue. Please advise.

Dave, which errata fixes this? We're now seeing this on some X5DPR-8 boards without any extra PCI cards in them, using the latest Fedora Core 1 kernel. We also continue to see it on the X5DPE-G2 boards, with the latest RHL9 and FC1 kernels. This has become a very urgent issue.

The latest RHL9 errata definitely has this fix. It didn't get into the FC1 tree though. So, if you're running the latest RHL9 tree, then you're hitting a different issue.

Can you tell me which patch involved the fix for this? It's not immediately obvious when looking at the changelog and the patch list. Thanks!

Never mind, I found the patch: linux-2.4.21-wait-kbd-disable.patch

I seem to remember still having this lockup a week or so ago, on a fully updated RHL9 system. It triggered immediately after flashing the BIOS on some 3ware cards in an X5DPE-G2. I will be applying this patch to the Fedora Core 1 kernel to see if I can duplicate it on a system we have here that reproduces the lockup.

I just talked to Robin, our tech who found the bug, and he says that he's tried using the _w version of the command, and it did not solve the issue. So this is not fixed. Thanks!

I am currently running RH 9 with kernel 2.4.20-20.9 on a "home-brew" computer system. This is an upgrade from the first RH 9 kernel that I installed. Currently I have this boot issue every time the system reboots, whether by my hand or by itself (power outages, etc.). The hang continues on every boot, even when I use a second boot path of RH 7.1. It is correctable only by leaving the system "off" for 20 minutes or more. After this it comes up cleanly. Will the current "fix" correct this issue?

I think we have stumbled across a solution. Try disabling 'USB Legacy Support' in the BIOS. We did this to resolve a PS/2 issue, but it seems to have resolved this issue as well. Please let us know if it helps.
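For anyone following the kbd_write_command versus kbd_write_command_w discussion above, here is a minimal user-space sketch of the difference between writing a command to the keyboard controller unconditionally and waiting for the controller's input buffer to drain first. It is not the kernel source: struct fake_kbc, kbc_read_status, sim_write_command and sim_write_command_w are hypothetical names made up for illustration, and the idea that a busy controller simply drops the command is a modeling assumption (real hardware behavior may differ). Only the two constants are meant to match the 2.4 keyboard headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KBD_STAT_IBF           0x02 /* status bit: input buffer full (controller busy) */
#define KBD_CCMD_MOUSE_DISABLE 0xA7 /* controller command: disable aux (mouse) device */

/* Hypothetical stand-in for the keyboard controller; not a kernel structure. */
struct fake_kbc {
    uint8_t status;          /* simulated status register */
    int     busy_polls_left; /* status reads remaining until IBF clears */
    uint8_t last_command;    /* last command the controller accepted */
    int     dropped;         /* commands written while the controller was busy */
};

/* Each status read moves the simulated controller one step closer to ready. */
static uint8_t kbc_read_status(struct fake_kbc *c)
{
    if (c->busy_polls_left > 0 && --c->busy_polls_left == 0)
        c->status &= (uint8_t)~KBD_STAT_IBF;
    return c->status;
}

/* Unconditional write, analogous to the plain kbd_write_command(). */
static void sim_write_command(struct fake_kbc *c, uint8_t cmd)
{
    if (c->status & KBD_STAT_IBF) {
        c->dropped++; /* assumption: a busy controller loses the command */
        return;
    }
    c->last_command = cmd;
}

/* Waiting write, analogous to kbd_write_command_w(): poll until ready, then write. */
static bool sim_write_command_w(struct fake_kbc *c, uint8_t cmd, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        if (!(kbc_read_status(c) & KBD_STAT_IBF)) {
            sim_write_command(c, cmd);
            return true;
        }
    }
    return false; /* controller never became ready within max_polls reads */
}
```

In this model, calling sim_write_command on a controller that still has KBD_STAT_IBF set loses the command, while sim_write_command_w delivers it once the bit clears or reports a timeout. That matches the shape of the proposed one-line change, but as noted in the comments above, the waiting variant did not turn out to cure every lockup.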
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/