Bug 214526
Summary: | sporadic panic in bnx2 module | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Lars Damerow <lars>
Component: | kernel | Assignee: | Andy Gospodarek <agospoda>
Status: | CLOSED UPSTREAM | QA Contact: | Brian Brock <bbrock>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 5 | CC: | davej, linville, peterm, wtogami
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2007-02-02 20:13:48 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Lars Damerow
2006-11-07 23:22:55 UTC
I've been seeing this recently on some bnx2 hardware. Can you please attach `lspci -vvv` output so I can understand which bnx2 hardware is on the system?

Created attachment 141795 [details]
lspci -vv output for the machine suffering from bnx2 segfaults
Here you go. Thankfully we haven't seen one of these panics since submitting
the bug report, but we haven't changed anything that would have fixed them. I'd
still like to find a cause if we can.
thanks,
lars
Thanks for sending that output. I've been investigating panics like these on other kernels and will let you know when we come up with a solution there since it should apply here as well. Please let me know if you continue to see this panic or if you come up with a reliable way to reproduce it.

No problem. I was incorrect about not having seen it since reporting the bug; we actually catch seven or eight of them a day. The admins responsible for the farm have just been rebooting the machines and not telling me about it. :) So, if there's any other information I can provide, please let me know! So far we've found no pattern to the panics.

thanks,
lars

Created attachment 142215 [details]
bnx2-txdebug2.diff
Currently we are still collecting data for the bnx2 crash and using the
attached patch.
Do you need me to roll a test kernel with this patch or would you be willing to
build one yourself?
I'm happy to build it myself. Thanks, though! It'll probably be a couple of days before we can install it on a significant number of machines, but I'll get the process going.

Hi Andy,

We finally had a panic on a machine with this patch installed. I don't see any output from the patch in the messages file from before the crash; would it have been logged to disk anywhere else before the machine froze up? I'm hoping a serial console wouldn't have been required to catch the message; we have hundreds of these machines, and attaching serial consoles to a number of them large enough to catch a panic soon would be pretty difficult.

thanks,
lars

Lars,

The output probably did go to the serial port, but that's OK. I've been working this issue with some others on a different release and arch and the following patch has produced good results:

http://people.redhat.com/agospoda/rhel4/gtest/bnx2-poll-fix2.patch

This came as a suggestion from the upstream maintainer based on the output from the patch in Comment #7. Based on the other feedback I've gotten it seems this should probably resolve your issue. I realize that installing yet another kernel on that many machines is non-trivial, but based on the results from others it seems like a good candidate to resolve the panics. Please let me know if this resolves your issue.

-andy

This patch looks like the final one that will resolve your issue:

http://people.redhat.com/agospoda/rhel4/gtest/bnx2-txdesc-error.patch

Lars,

Any chance you were able to verify the patch in comment #11? Thanks!

Hello Andy,

We have the patch active on a test group of render machines, and so far things are looking good. We're going to increase the number of machines using it soon, so I should be able to have a more definitive answer soon. Thanks for checking in! I'll update again when I have more info.

-lars

Sounds good, Lars. The patch for this will appear in 2.6.20.
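
Since the fix is slated for 2.6.20, one rough way to triage a large farm is to compare each machine's kernel release string against that version. The following is a minimal sketch, not anything used in this report: the helper names `parse_kernel_version` and `has_upstream_fix` are hypothetical, and it assumes the release string starts with a dotted numeric version, as `uname -r` output normally does. A distribution kernel with an older version number may still carry a backported fix, so a negative result is only a hint.

```python
import platform
import re

# Upstream release named in this report as the one that will contain the fix.
FIX_VERSION = (2, 6, 20)


def parse_kernel_version(release):
    """Return the leading numeric version of a release string as a tuple of ints.

    Only the first two or three dotted numbers are used; any distribution
    suffix after them is ignored. A missing third number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def has_upstream_fix(release):
    """True if the numeric version is at or past FIX_VERSION.

    Assumption: version number alone decides. Backported fixes in older
    distribution kernels are not detected.
    """
    return parse_kernel_version(release) >= FIX_VERSION


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` reports on Linux
    print(release, "->", has_upstream_fix(release))
```

Comparing integer tuples rather than raw strings matters here: a release beginning with 2.6.9 sorts below 2.6.20 numerically, which a plain string comparison would get wrong.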