Bug 569668
| Summary: | [RHEL4] boot hangs if scsi read capacity fails on faulty non system drive |
|---|---|
| Product: | Red Hat Enterprise Linux 4 |
| Reporter: | Mark Goodwin <mgoodwin> |
| Component: | kernel |
| Assignee: | David Milburn <dmilburn> |
| Status: | CLOSED ERRATA |
| QA Contact: | Gris Ge <fge> |
| Severity: | high |
| Priority: | high |
| Version: | 4.8 |
| CC: | emcnabb, fge, jwest, moshiro, tao, vgoyal |
| Target Milestone: | rc |
| Hardware: | All |
| OS: | Linux |
| Doc Type: | Bug Fix |
| Clone Of: | 569654 |
| Last Closed: | 2011-02-16 15:27:58 UTC |
| Bug Depends On: | 569654 |
| Bug Blocks: | 485811, 583726, 589295 |

Description
Mark Goodwin
2010-03-02 00:18:50 UTC
Created attachment 397224
upstream patch to set default capacity to zero on faulty scsi drive

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in 89.25.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Is it possible for us to simulate this kind of faulty disk?

(In reply to comment #13)
> Is it possible for us to simulate this kind of faulty disk?

You might be able to use a scsi_debug module in the initrd with the "every_nth" parameter set to 1 to force/inject I/O errors during boot. There are pre-built scsi_debug modules for RHEL at http://people.redhat.com/mgoodwin/scsi_debug/
See http://sg.danny.cz/sg/sdebug26.html for the scsi_debug documentation.

But really, I seem to remember the customer (Fujitsu) reported it was tested and fixed in the GSS support issue tracking tool, as per comment #8. All the patch does is prevent a partition scan on faulty drives from causing the boot to hang. It's pretty simple.

Regards
-- Mark Goodwin
GSS/SEG

Mark,

With kernel -89, I was not able to reproduce the problem. I built a new initrd.img with the module you provided and added these lines to init:

echo "Loading scsi-debug.ko module"
insmod /lib/scsi_mod.ko
insmod /lib/sd_mod.ko
insmod /lib/scsi-debug.ko every_nth=1 opts=4
===========================================
I have tested these opts:

opts=4 causes the system to hang for about 2 minutes, and then I got this:
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
Then the system boots up normally, and no /dev file is created for the scsi_debug device.
====================
opts=8: the system gets the correct disk size, and it does not hang.

Am I missing anything?

Code reviewed.
Patch linux-2.6.9-scsi-fixup-size-on-read-capacity-failure.patch was applied in kernel-2.6.9-95.EL. The customer (Fujitsu) reported that the fix works. No hardware is available to reproduce the fault, so this is a sanity-only verification.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html
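
For anyone revisiting the scsi_debug reproduction attempt discussed in the thread, the init-script lines can be generated for each `opts` value with a small dry-run helper. This is only a sketch: it prints the lines and loads nothing, the module paths and the `scsi-debug.ko` filename are copied from the comment above, the function names `describe_opts` and `print_init_lines` are hypothetical, and the flag meanings are taken from the scsi_debug documentation (1 = noise, 2 = medium error, 4 = timeout, 8 = recovered error).

```shell
#!/bin/sh
# Dry-run sketch: prints the init-script lines, loads nothing.
# Module paths and the scsi-debug.ko filename are copied from the thread;
# the function names are hypothetical.

# Map a single scsi_debug "opts" flag to its meaning (per the scsi_debug docs).
describe_opts() {
    case "$1" in
        1) echo "noise" ;;
        2) echo "medium error" ;;
        4) echo "timeout" ;;
        8) echo "recovered error" ;;
        *) echo "unknown or combined flags" ;;
    esac
}

# Print the lines to add to the initrd's init script.
# $1 = opts value, $2 = every_nth (defaults to 1, i.e. every command).
print_init_lines() {
    inject_opts="$1"
    inject_nth="${2:-1}"
    printf '%s\n' 'echo "Loading scsi-debug.ko module"'
    printf '%s\n' 'insmod /lib/scsi_mod.ko'
    printf '%s\n' 'insmod /lib/sd_mod.ko'
    printf 'insmod /lib/scsi-debug.ko every_nth=%s opts=%s\n' "$inject_nth" "$inject_opts"
}

# Example: the two settings tried in the thread.
for o in 4 8; do
    echo "# opts=$o: $(describe_opts "$o")"
    print_init_lines "$o"
done
```

As reported in the thread, opts=4 (timeout) stalled the boot for about two minutes before the device was offlined, while opts=8 (recovered error) let the capacity read succeed, so neither setting reproduced the permanent hang on kernel -89.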