Bug 445073
| Summary: | Kernel loses secondary channels of Promise PATA Controller | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Bevis King <brwk> | ||||
| Component: | kernel | Assignee: | Alan Cox <alan> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 8 | CC: | kernel-maint | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2008-05-06 11:20:19 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Bevis King
2008-05-03 07:54:39 UTC
MDADM notification of 04:16:44 this morning:
--------------------------------------------
A Fail event had been detected on md device /dev/md2.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4]
md2 : active raid5 sdb1[0] sde1[4](F) sdd1[2] sdc1[5](F)
1465151808 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]
md0 : active raid5 sdf1[0] sdi1[3] sdh1[2] sdg1[1]
879100608 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
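For readers unfamiliar with the mdstat format quoted above: `(F)` after a member marks it as failed, `[4/2]` reads as four members expected and two working, and `[U_U_]` shows which slots are up. A minimal sketch of reading those fields follows. The function name `parse_mdstat` and the `SAMPLE` text are illustrative assumptions, not part of mdadm or the kernel, and the regular expressions only cover the layout shown in this report.

```python
import re

# Sample text copied from the notification above (indentation assumed).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md2 : active raid5 sdb1[0] sde1[4](F) sdd1[2] sdc1[5](F)
      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]
md0 : active raid5 sdf1[0] sdi1[3] sdh1[2] sdg1[1]
      879100608 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
"""


def parse_mdstat(text):
    """Hypothetical helper: return a dict of array name -> status fields."""
    arrays = {}
    current = None
    for line in text.splitlines():
        # Array header line, e.g. "md2 : active raid5 sdb1[0] sde1[4](F) ..."
        header = re.match(r"^(md\d+)\s*:\s*(\S+)\s+(\S+)\s+(.*)$", line)
        if header:
            name, state, level, members = header.groups()
            failed = [m.split("[")[0] for m in members.split() if m.endswith("(F)")]
            current = {"state": state, "level": level, "failed": failed}
            arrays[name] = current
            continue
        # Status line, e.g. "... algorithm 2 [4/2] [U_U_]"
        counts = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if counts and current is not None:
            current["expected"] = int(counts.group(1))
            current["working"] = int(counts.group(2))
            current["slots"] = counts.group(3)
            current["degraded"] = current["working"] < current["expected"]
    return arrays


if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info)
```

Applied to the sample, this reports `sde1` and `sdc1` as the failed members of `md2`, while `md0` has all four members working.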
Created attachment 304444 [details]
/var/log/messages from the period of the problem
Upgrading kernel to 2.6.24.5-85.fc8 before attempting to recover software RAID.

The trace shows the drive going busy and never coming back. At that point it jams the entire channel, so both disks on the channel will be lost. Linux is correctly trying to recover by resetting the device, but to no effect. At first glance that looks like a failing drive.

Understood, except that this happens almost simultaneously with two different controllers, and in both cases the second channel dies. This is why I reported it as a bug rather than just assuming it was a hardware fault.

Update: since going to the 2.6.24.5-85.fc8 kernel, disabling unused devices in the BIOS (USB controller, serial ports, AC97 audio card, IEEE1394 controller) to free IRQs, and reseating all the PCI boards, I've had the system under fairly heavy load for 48 hours without a single reported disc error or problem. The area that was previously under heavy load was put under similar stress as before, with no adverse effects. No components have been changed. When the array was brought back on-line, e2fsck -f reported no disc errors. No corruption was found in the file system or files (several hundred in the affected areas were checked).

Regards, Bevis.

OK, I'll close this bug for now. If it happens again, please re-open the bug and we can dig deeper.