Red Hat Bugzilla – Bug 825126
vgs hangs due to a failing mpath device
Last modified: 2015-01-26 19:10:30 EST
Description of problem:
If the multipathed devices are failing, the LVM command 'vgs' hangs on the failed multipath devices.
Version-Release number of selected component (if applicable):
2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
Steps to Reproduce:
1. configure the iSCSI initiator and log in to the iSCSI target
2. enable multipath and create a partition on the iSCSI LUN
>>lvm.conf, scan only the multipathed devices
filter = [ "a/mpath/" "r/.*/" ]
# multipath -ll
mpathbq (360060e801047103004f2c4b30000001f) dm-2 HITACHI,DF600F
size=200G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 2:0:0:1 sda 8:0 active ready running
# vgs -o+devices
VG #PV #LV #SN Attr VSize VFree Devices
iscsi_vg 1 3 0 wz--n- 199.80g 78.80g /dev/mapper/mpathbqp2(0)
iscsi_vg 1 3 0 wz--n- 199.80g 78.80g /dev/mapper/mpathbqp2(5120)
iscsi_vg 1 3 0 wz--n- 199.80g 78.80g /dev/mapper/mpathbqp2(30720)
3. sysctl kernel.hung_task_timeout_secs=10 (so the hung task is reported quickly and is easy to reproduce)
4. ifdown the NIC used by the default iscsi iface
5. execute 'vgs'
INFO: task vgs:1781 blocked for more than 10 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Expected results:
vgs should report an error for iscsi_vg, still display any other VGs, and exit gracefully.
Created attachment 586798
queue_if_no_path is set, so is this not correct behaviour?
- LVM needs to probe the device. (But see also lvmetad as future alternative.)
- You configured your system to say: if this device is unavailable and something tries to access it, wait indefinitely.
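To check which multipath maps are configured to wait indefinitely, the features string printed by 'multipath -ll' can be filtered. The following is only a minimal sketch: the function name queueing_maps is hypothetical, and it assumes the output layout shown in the description (a map line containing " dm-N ", followed by a line containing features='...').

```shell
# queueing_maps (hypothetical helper): read 'multipath -ll' output on stdin
# and print the name of every map whose features include queue_if_no_path.
queueing_maps() {
    awk '
        # Map header line, e.g. "mpathbq (3600...) dm-2 HITACHI,DF600F"
        / dm-[0-9]+ / { name = $1 }
        # Feature line of the map seen most recently
        /features=/ && /queue_if_no_path/ { print name }
    '
}
```

Usage, assuming multipath is installed: multipath -ll | queueing_maps
Any map it prints will queue I/O rather than return an error once all of its paths are down, which is why a probe by vgs blocks.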
(In reply to comment #0)
> Expected results:
> vgs can report some error about the iscsi_vg but can display other VG(if
> have) and exit gracefully.
I'd say this is just a misconfiguration - if "error" is expected instead, you should consider using the "error if no path" policy that does exactly that ("no_path_retry=fail" setting).
If you encounter this problem while doing a system shutdown, we already track that by bug #800801 (see also original bug #672530 comment #15).
Thanks for the clarification. I got the expected behavior after setting no_path_retry=fail, so I am closing this bug.
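For reference, the setting discussed above could look like the following multipath.conf fragment. This is a sketch, not the reporter's actual configuration: placing it in the defaults section is an assumption, and a value set for the same device in a devices or multipaths section would take precedence over it.

```
# /etc/multipath.conf (fragment; placement in "defaults" is an assumption)
defaults {
    # Return an I/O error as soon as all paths are down,
    # instead of queueing I/O indefinitely (queue_if_no_path).
    no_path_retry fail
}
```

multipathd has to re-read its configuration before the change applies to existing maps. Afterwards the features string shown by 'multipath -ll' for the map should no longer contain queue_if_no_path.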