Bug 509396
Field | Value
---|---
Summary: | kpartx hangs when virtual disks are created/destroyed on a Dell MD3000i
Product: | Red Hat Enterprise Linux 5
Component: | device-mapper-multipath
Version: | 5.3
Hardware: | x86_64
OS: | Linux
Status: | CLOSED INSUFFICIENT_DATA
Severity: | medium
Priority: | low
Target Milestone: | rc
Reporter: | Adam Huffman <bloch>
Assignee: | Ben Marzinski <bmarzins>
QA Contact: | Cluster QE <mspqa-list>
CC: | agk, bmarzins, bmr, christophe.varoqui, dwysocha, egoggin, eric, flakrat, heinzm, iannis, junichi.nomura, kueda, lmb, prockai, tranlan
Doc Type: | Bug Fix
Last Closed: | 2014-02-06 17:19:40 UTC
Description
Adam Huffman
2009-07-02 15:29:27 UTC
Did you have iscsid rescan the array (from the console: iscsiadm -m node -R)? You have to make iscsid update its device nodes before you reload multipath. Finally, I must say multipath seems to go berserk when a device is removed while it still has commands in the queue. Perhaps someone should look into that.

Could you also run udevmonitor, to see if the kernel is really throwing out all those uevents?

Created attachment 358541 [details]
Udevmonitor dump
Udevmonitor dump while removing and re-adding a disk over iscsi.
Created attachment 358542 [details]
Message log
Partial message log while removing and re-adding a disk over iscsi
It seems much happier now, running kernel 2.6.18-164.el5. I deleted a virtual disk yesterday, then added a new one. When I rescanned the array and then rebuilt the multipath device map, the new disk appeared and there was no kpartx hang. Can't play around with this particular device any more as it's going into production. However, another one will be installed fairly soon and I can run more testing on that.

Above is the requested udevmonitor dump. Please note that the addition and removal of the disk is done on the iSCSI target side. I also noticed that iscsid did not remove the disk device nodes (in use by multipath?). Then again, I should check whether iscsid still does that properly when not using multipath.

Finally, toying around with the iSCSI disks really screwed up multipath again:

36001c23000dd034d000009ca4795fe16 dm-6 DELL,MD3000i
[size=1.0G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][enabled]
 \_ 2:0:0:10 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=100][enabled]
 \_ 5:0:0:10 sdk 8:160 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:10 sdr 65:16 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:10 sds 65:32 [active][ghost]
36001c23000dd030e000007534795b6af dm-4 DELL,MD3000i
[size=50G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=0][enabled]
 \_ 2:0:0:7 sdf 8:80 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 5:0:0:7 sdg 8:96 [active][ghost]
\_ round-robin 0 [prio=100][active]
 \_ 3:0:0:7 sdn 8:208 [active][ready]
\_ round-robin 0 [prio=100][enabled]
 \_ 4:0:0:7 sdo 8:224 [active][ready]
1_ dm-19 DELL,MD3000i
[size=1.0G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=0][enabled]
 \_ 2:0:0:1 sdb 8:16 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 5:0:0:1 sdc 8:32 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:1 sdd 8:48 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:1 sde 8:64 [active][ghost]
\_ round-robin 0 [prio=100][active]
 \_ 2:0:0:10 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=100][enabled]
 \_ 5:0:0:10 sdk 8:160 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:10 sdr 65:16 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:10 sds 65:32 [active][ghost]
36001c23000dd034d0000098a4795ecb8 dm-5 DELL,MD3000i
[size=50G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
 \_ 2:0:0:8 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=100][enabled]
 \_ 5:0:0:8 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:8 sdp 8:240 [active][ghost]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:8 sdq 65:0 [active][ghost]

Note the designation "1_ dm-19 DELL,MD3000i". The paths of 2 disks that have been removed are grouped there. After re-adding one of those disks, "36001c23000dd034d000009ca4795fe16 dm-6 DELL,MD3000i" appears using the same paths as before. So now they're listed twice by multipath.

PS. Maybe worth another bug report or a manual change for the MD3000i (from Red Hat): I don't like the fact that the kernel tries to read the iSCSI disk partition tables, as the specific path might not be accessible. It causes a lot of read errors and, I assume, slows down boot dramatically.

Having a device change WWIDs can really mess with multipath. It's quite possible that some of the recent iSCSI changes have fixed this. Are you able to reproduce this on a recent version?

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

I've changed jobs since I reported this and no longer have access to this hardware.
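For anyone retesting on similar hardware, the condition reported earlier in this thread (the same paths listed under both the "1_" map and a WWID-named map) could be checked mechanically. The following is only a rough sketch, not part of multipath-tools: the function name find_duplicate_paths and both regular expressions are hypothetical, and they assume the legacy output layout shown in this report. Other multipath-tools releases may print a different layout, in which case the patterns would need adjusting. The sample text is a trimmed excerpt of the output pasted above.

```python
import re
from collections import defaultdict

# Map header in the layout shown in this report, e.g.
# "36001c23000dd034d000009ca4795fe16 dm-6 DELL,MD3000i"
MAP_RE = re.compile(r"^(?P<name>\S+)\s+(?P<dm>dm-\d+)\s+\S+")
# Path line, e.g. " \_ 2:0:0:10 sdj 8:144 [active][ready]"
PATH_RE = re.compile(r"\\_\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\w+)\s+\d+:\d+")

# Trimmed excerpt of the output from this bug (assumed format).
SAMPLE = r"""36001c23000dd034d000009ca4795fe16 dm-6 DELL,MD3000i
[size=1.0G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][enabled]
 \_ 2:0:0:10 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:10 sdr 65:16 [active][ghost]
1_ dm-19 DELL,MD3000i
[size=1.0G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=0][enabled]
 \_ 2:0:0:1 sdb 8:16 [active][ghost]
\_ round-robin 0 [prio=100][active]
 \_ 2:0:0:10 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 3:0:0:10 sdr 65:16 [active][ghost]
"""


def find_duplicate_paths(text):
    """Return {H:C:T:L: [map names]} for paths listed under more than one map."""
    owners = defaultdict(list)
    current = None  # name of the map whose paths are being read
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group("name")
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            owners[path.group("hctl")].append(current)
    return {hctl: maps for hctl, maps in owners.items() if len(maps) > 1}


if __name__ == "__main__":
    for hctl, maps in sorted(find_duplicate_paths(SAMPLE).items()):
        print(hctl, "is listed under:", ", ".join(maps))
```

In practice the text would come from saved multipath listing output rather than the embedded sample. An empty result would suggest that no path is claimed by two maps at once.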