Created attachment 350025
scsi timeout injection module

Description of problem:

A storage device can fail in such a way that it does not respond to SCSI
commands such as read/write, while it still responds successfully to
commands such as Test Unit Ready. (This may depend on the implementation
of the storage.)

When this type of failure happens, the SCSI mid-layer detects a timeout
for the device and tries to recover from the error. The mid-layer then
concludes from the result of Test Unit Ready that the device has
recovered. Therefore, the state of the device is not changed to offline,
and user applications can continue to issue I/Os to the device. This may
cause repeated timeout errors on the same device, and applications
cannot take proper action quickly.

In addition, this issue seriously affects system boot time. During
device scanning in the SCSI mid-layer, read I/Os are issued to each
recognized device to get its partition table in check_partition().
Usually, many types of filesystems are registered, and the partition
check is executed for every one of them. This is a very long process
because every read I/O ends in a timeout. Moreover, the SCSI device scan
is done sequentially, so other devices have to wait to be scanned. In
some Linux distributions, the boot process goes forward before valid
devices are recognised, and the system cannot start correctly even if
the devices are fully redundant through mirroring.

Version-Release number of selected component (if applicable):

All RHEL5 kernels

How reproducible:

See below

Steps to Reproduce:

0. Environment

   kernel       ... 2.6.18-128.el5
   scsi LLD     ... qla2xxx
   devices      ... /dev/sdc (2:0:0:0)
   scsi timeout ... 3 seconds

1. Getting the address of the scsi_host_template for the LLD

   Get the address of the scsi_host_template table specific to the LLD.
   For the qla2xxx driver, the table name is "qla2x00_driver_template".

   # grep qla2x00_driver_template /proc/kallsyms
   f8a323c0 d qla2x00_driver_template      [qla2xxx]

2. Building and loading the scsi timeout injection module

   Load the scsi timeout injection module with a "param" option, which
   consists of two parameters: the scsi_host_template address obtained
   in step 1 and the SCSI device target on which a timeout error is to
   be injected. Here is an example that injects a SCSI timeout on the
   SCSI device 2:0:0:0.

   # insmod scsi_timeout.ko param=0xf8a323c0,2:0:0:0

3. Issuing I/Os to the device (/dev/sdc)

   Issue I/Os to the device several times. Each I/O takes about 36
   seconds.

   # dd if=/dev/sdc of=/dev/null bs=4096 count=1
   dd: reading `/dev/sdc': Input/output error
   0+0 records in
   0+0 records out
   0 bytes (0 B) copied, 36.0002 seconds, 0.0 kB/s

Actual results:

A timeout happens every time a process issues I/O to the broken device,
and the process has to wait for a long time. This is because the SCSI
layer does not change the device to the offline state for this kind of
device failure.

Expected results:

The SCSI layer changes the state of the device to offline once timeouts
have occurred the number of times specified by the user, and the process
that issued the I/O receives -EIO without significant delay.

Additional info:

- A patch that adds a parameter to limit the timeout count per device
  and changes a broken device to the offline state is under discussion
  on linux-scsi.

  Introduce the parameter to limit scsi timeout count
  http://www.spinics.net/lists/linux-scsi/msg36406.html

  Introduce the parameter to limit scsi timeout count (take 2)
  http://www.spinics.net/lists/linux-scsi/msg36954.html
Hitachi can close this RHEL5 bug. We will check whether this issue is
reproducible in RHEL6. If so, we will open a new bug.

Seiji