Description of problem:
dm-raid1: dmsetup gets stuck when suspending a failed mirror device

Version-Release number of selected component (if applicable):
2.6.18-182.el5

How reproducible:
See the following steps.

Steps to Reproduce:
1. Create a two-way mirror
   # dmsetup ls
   vg00-lv00_mimage_1      (253, 2)
   vg00-lv00_mimage_0      (253, 1)
   vg00-lv00_mlog  (253, 0)
   vg00-lv00       (253, 3)

2. Suspend the mirror device
   # dmsetup suspend vg00-lv00

3. Disable one of the legs
   # echo offline > /sys/block/<dev>/device/state

4. Kill dmeventd
   # ps -ef | grep dmeventd
   root      3378     1  0 11:49 ?        00:00:00 [dmeventd]
   # kill -9 3378

5. Reset the sync bits of the log disk
   # dd if=/dev/zero of=/dev/mapper/vg00-lv00_mlog seek=1024 bs=1 count=16

6. Write I/O to a region whose sync bit was reset in step 5
   # dd if=/dev/zero of=/dev/vg00/lv00 seek=1024 bs=4096 count=1
   *** This command doesn't return. ***

7. Resume the mirror device
   # dmsetup resume vg00-lv00

8. Wait a few seconds so that recovery handles the region to which the
   write I/O was issued in step 6.

9. Suspend the mirror device
   # dmsetup suspend vg00-lv00
   *** stuck ***

Actual results:
The dmsetup suspend command gets stuck.
Here is the call trace from when this issue happened.

crash> bt 4650
PID: 4650   TASK: f71a5aa0  CPU: 1   COMMAND: "dmsetup"
 #0 [f7fa7cb8] schedule at c061c078
 #1 [f7fa7d30] __down at c061d715
 #2 [f7fa7d58] __sched_text_start at c061b682
 #3 [f7fa7d64] .text.lock.dm_raid1 (via mirror_presuspend) at f8d833c0
 #4 [f7fa7d8c] suspend_targets at f88c59e2
 #5 [f7fa7d9c] dm_suspend at f88c54d2
 #6 [f7fa7dc8] dev_suspend at f88c7e95
 #7 [f7fa7ddc] ctl_ioctl at f88c879c
 #8 [f7fa7f2c] do_ioctl at c0485ed9
 #9 [f7fa7f44] vfs_ioctl at c0486440
#10 [f7fa7fa0] sys_ioctl at c04864e0
#11 [f7fa7fb8] system_call at c0404f10
    EAX: 00000036  EBX: 00000003  ECX: c138fd06  EDX: 0904fcb8
    DS:  007b      ESI: 0904fc40  ES:  007b      EDI: 00314ca0
    SS:  007b      ESP: bfea9d84  EBP: bfea9f28
    CS:  0073      EIP: 00ce1402  ERR: 00000036  EFLAGS: 00000246

Expected results:
The dmsetup suspend command finishes successfully.
Additional info:
This issue was reported on dm-devel:
https://www.redhat.com/archives/dm-devel/2010-January/msg00035.html
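
Because the commands in steps 6 and 9 never return, a scripted reproduction cannot simply run them in the foreground. The following is a minimal sketch of one way to detect the hang, not part of the original report. The helper name still_running_after and the 30-second wait in the usage note are hypothetical. It deliberately does not try to kill the command, since a task blocked inside the kernel (as in the __down frame of the trace) may not respond to signals.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and report whether it
# is still running after the given number of seconds.
# Returns 0 if the command is still running (suspected hang), 1 if it finished.
still_running_after() {
    sra_secs=$1
    shift
    # Marker file that the background subshell fills in once the command returns.
    sra_flag=$(mktemp) || return 2
    ( "$@"; echo done > "$sra_flag" ) &
    sleep "$sra_secs"
    if [ -s "$sra_flag" ]; then
        rm -f "$sra_flag"
        echo "finished"
        return 1
    fi
    # The marker file is left in place: the command may still complete later
    # and write to it.
    echo "still running after ${sra_secs}s: $*"
    return 0
}
```

For step 9 this could be invoked as `still_running_after 30 dmsetup suspend vg00-lv00`; a return status of 0 would then indicate the stuck suspend described above.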
Created attachment 383531
Patch for 2.6.18-182.el5 kernel

This is a patch for the 2.6.18-182.el5 kernel.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I verified this fix on 2.6.18-187.el5.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2010-0178.html