Bug 114425
| Field | Value |
|---|---|
| Summary | boot hangs when loading aic7xx module with device attached to card Adaptec 29160N Ultra160 |
| Product | Red Hat Enterprise Linux 3 |
| Component | kernel |
| Version | 3.0 |
| Hardware | i686 |
| OS | Linux |
| Status | CLOSED WORKSFORME |
| Severity | high |
| Priority | high |
| Reporter | Need Real Name <irina> |
| Assignee | Tom Coughlan <coughlan> |
| QA Contact | Brian Brock <bbrock> |
| CC | bugzilla, k.georgiou, petrides, riel |
| Doc Type | Bug Fix |
| Last Closed | 2005-09-19 18:43:38 UTC |
Description
Need Real Name
2004-01-27 23:16:14 UTC
Sorry for the delay in looking at this. We will not use the aic7xxx Rev 6.3.4 driver (from /people.freebsd.org/~gibbs/linux/) because it contains a number of changes that are not acceptable. We will need to identify the specific fix for this problem. Have you tried to reproduce this problem on a recent RHEL 3 kernel? If so, please post the console messages leading up to the hang. If you are still willing to pursue this, I will give you a debug driver to try to identify the problem.

Tom - I've seen this problem, or something very similar, on all RHEL3 SMP i686 kernels. My hardware is an IBM x236 with an Adaptec 29160B Ultra160 SCSI adapter. The only device attached to the adapter is an IBM tape library:

    Attached devices:
    Host: scsi0 Channel: 00 Id: 01 Lun: 00
      Vendor: IBM Model: ULTRIUM-TD2 Rev: 3AY4
      Type: Sequential-Access ANSI SCSI revision: 03
    Host: scsi0 Channel: 00 Id: 02 Lun: 00
      Vendor: IBM Model: ULTRIUM-TD2 Rev: 3AY4
      Type: Sequential-Access ANSI SCSI revision: 03
    Host: scsi0 Channel: 00 Id: 03 Lun: 00
      Vendor: IBM Model: ULTRIUM-TD2 Rev: 3AY4
      Type: Sequential-Access ANSI SCSI revision: 03
    Host: scsi0 Channel: 00 Id: 04 Lun: 00
      Vendor: IBM Model: ULTRIUM-TD2 Rev: 3AY4
      Type: Sequential-Access ANSI SCSI revision: 03
    Host: scsi0 Channel: 00 Id: 06 Lun: 00
      Vendor: IBM Model: 4560SLX Rev: 0425
      Type: Medium Changer ANSI SCSI revision: 02

I'm more than happy to test using the current RHEL3 AS U5 errata kernel kernel-smp-2.4.21-32.0.1.EL (or newer) and would like to get hold of the debug module you mention above. Are you still willing to supply the debug aic7xxx module and assist in resolving this issue?

Yes, I would like to get this fixed. Please post /var/log/messages, showing the aic driver being loaded, and any other messages up to the time of the hang. Does the system hang right after the driver loads, or is there some I/O involved? Try booting the SMP kernel with the NOAPIC kernel parameter.
Capturing the boot log messages will require a serial terminal, since once it hangs the system tends not to boot and the forced reset loses the boot log. In other words, give me a day or two to recreate the hang and provide what you want. Can you explain why you think the NOAPIC kernel parameter has relevance? I know that everything boots fine if the tape devices are turned off or disconnected from the SCSI chain. I'm not sure I see how the NOAPIC option would make a difference. Thanks for the quick response.

APIC is one of the things that is implicated when there is a hard hang like this. It is also one of the things that behaves differently UP vs. SMP.

While you have the serial console, please try to get some alt-sysrq output. Before the hang:

    echo 1 > /proc/sys/kernel/sysrq

Then, after the hang, try alt-sysrq-t and alt-sysrq-m. Also, try turning on the NMI watchdog timer. It will hopefully cause a panic after the hang. The console output from that would be a big help. On the kernel command line, add: "nmi_watchdog=1".

This has been in NEEDINFO for nearly three months. We will assume the problem was not reproducible or has been fixed in a later RHEL 3 update. If this problem still exists, please reopen and provide the requested info.
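
For anyone reopening this, the diagnostics requested in this thread (serial console, NOAPIC, NMI watchdog, SysRq) can be collected roughly as sketched below. This is only an illustration, not something posted by the participants: the serial port and baud rate (ttyS0 at 9600), the base command line, and the helper names `with_debug` and `enable_sysrq` are all assumptions. The kernel spells the APIC option in lower case, `noapic`. The two experiments are kept as separate boots, since `nmi_watchdog=1` relies on the I/O APIC that `noapic` turns off.

```shell
#!/bin/sh
# Sketch only. Assumed, not from the report: serial console on ttyS0 at
# 9600 baud, and the example base command line used at the bottom.

# Print a kernel command line with a serial console plus extra debug
# parameters appended. $1 is the existing command line; any remaining
# arguments are the parameters to add.
with_debug() {
    base=$1
    shift
    printf '%s console=ttyS0,9600 console=tty0 %s\n' "$base" "$*"
}

# Enable the magic SysRq key before the hang (requires root), as asked
# for in the thread. Defined here but not called by this sketch.
enable_sysrq() {
    echo 1 > /proc/sys/kernel/sysrq
}

# Hypothetical base command line; substitute the one from the boot loader.
BASE='ro root=LABEL=/'

# Experiment 1: boot the SMP kernel with the I/O APIC disabled.
with_debug "$BASE" noapic

# Experiment 2: boot with the NMI watchdog enabled.
with_debug "$BASE" nmi_watchdog=1
```

Each printed line would be pasted into the boot loader entry for the SMP kernel, one experiment per boot. Once the machine is up and before loading the driver, run `enable_sysrq` as root so that alt-sysrq-t and alt-sysrq-m still produce output on the serial console after the hang.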