| Summary: | I/O stuck on dm-mpath device even when physical paths are recovered | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | shivamerla1 <shiva.krishna> |
| Component: | device-mapper-multipath | Assignee: | Ben Marzinski <bmarzins> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Lin Li <lilin> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.7 | CC: | agk, bmarzins, heinzm, jbrassow, lilin, msnitzer, prajnoha, rbalakri, shiva.krishna, zkabelac |
| Target Milestone: | rc | Flags: | bmarzins: needinfo? (shiva.krishna) |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-09-29 22:46:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Description of problem:

During controller failover tests with a Nimble array, we have seen that I/O on one of the dm devices did not resume after the stand-by controller takeover. The physical paths were recovered, but I/O was not retried on those paths. While debugging, we issued a few reads manually on the device, and all of the stuck I/Os were then flushed.

Version-Release number of selected component (if applicable):

2.6.32-573.el6.x86_64

How reproducible:

Seen a few times.

Steps to Reproduce:
1. Reboot or fail the active controller.
2. The stand-by controller takes over and all iSCSI sessions are redirected to it.
3. Takeover completes, but I/O is hung on the dm device even after the physical paths are recovered.

Actual results:

I/O hung on dm-13, even after the paths are recovered.

Expected results:

I/O resumes after controller failover.

Additional info:

mpathbx (28ed72d3288ecea7c6c9ce900d2416567) dm-13 Nimble,Server
size=117G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 25:0:0:0 sdz 65:144 active ready running
  `- 24:0:0:0 sdy 65:128 active ready running

No pending I/O on the physical paths:

[root@rtp-smc-qa24-vm2 ~]# cat /sys/block/sdz/inflight
0 0
[root@rtp-smc-qa24-vm2 ~]# cat /sys/block/sdy/inflight
0 0

The dm device has I/O stuck:

[root@rtp-smc-qa24-vm2 ~]# cat /sys/block/dm-13/inflight
4 1

[root@rtp-smc-qa24-vm2 ~]# ls -l /sys/block/dm-13/slaves/
total 0
lrwxrwxrwx. 1 root root 0 Nov 23 13:02 sdy -> ../../../../platform/host24/session25/target24:0:0/24:0:0:0/block/sdy
lrwxrwxrwx. 1 root root 0 Nov 23 13:02 sdz -> ../../../../platform/host25/session26/target25:0:0/25:0:0:0/block/sdz

dmsetup status:

mpathbx: 0 245760000 multipath 2 0 1 0 1 1 A 0 2 0 65:144 A 1 65:128 A 0

After manually issuing a few reads, the stuck I/Os were flushed:

[root@rtp-smc-qa24-vm2 ~]# dd if=/dev/dm-13 of=/dev/null bs=512 count=10 iflag=direct
10+0 records in
10+0 records out
5120 bytes (5.1 kB) copied, 0.852584 s, 6.0 kB/s
[root@rtp-smc-qa24-vm2 ~]# cat /sys/block/dm-13/inflight
0 0
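
For reference, the pending-I/O comparison shown above can be scripted. The following is a minimal shell sketch assuming the dm-13 map from this report; the loop itself is illustrative and not part of the original report:

# Minimal sketch: compare pending I/O on the multipath device with its
# slave paths. DM is assumed to be the map from this report (dm-13).
DM=dm-13
echo "$DM inflight (reads writes): $(cat /sys/block/$DM/inflight)"
for p in /sys/block/$DM/slaves/*; do
    # each slave entry is a symlink to the underlying SCSI block device
    echo "$(basename "$p") inflight (reads writes): $(cat "$p/inflight")"
done

A non-zero count on the dm device with all-zero counts on every slave, as seen here, indicates the requests are being held in the multipath layer rather than outstanding on any physical path.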
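
Since the question is whether multipathd noticed the recovered paths after the takeover, a sketch for capturing the path and iSCSI session state at the time of the hang is given below. These are standard iscsiadm/multipathd invocations; the output was not collected on this system:

# Sketch: record iSCSI session and multipath checker state during the hang.
iscsiadm -m session -P 3            # per-session/connection state for each portal
multipathd -k'show paths'           # path checker state as multipathd sees it
multipathd -k'show maps status'     # queueing and path-group status per map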
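
The map is configured with features='1 queue_if_no_path', so I/O is queued while no path is usable and should be reissued once a path returns. As a workaround sketch only (not a fix for the underlying problem), queueing can be disabled briefly to flush I/O stuck at the multipath layer. The mpathbx map name is taken from this report; note that any I/O still queued when queueing is disabled will be returned with an error if no path is usable:

# Sketch: flush I/O held by queue_if_no_path, then restore queueing.
dmsetup message mpathbx 0 "fail_if_no_path"    # stop queueing on this map
dmsetup message mpathbx 0 "queue_if_no_path"   # re-enable queueing afterwards
# equivalent multipathd interactive commands, if available in this version:
multipathd -k'disablequeueing map mpathbx'
multipathd -k'restorequeueing map mpathbx'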