Bug 1069597
| Summary: | blivet.reset() activates MD RAID on multipath devices | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jan Safranek <jsafrane> |
| Component: | python-blivet | Assignee: | David Lehman <dlehman> |
| Status: | CLOSED ERRATA | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | high | Priority: | unspecified |
| Version: | 7.0 | CC: | jstodola |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | python-blivet-0.61.15.9-1 | Doc Type: | Bug Fix |
| Clone Of: | | : | 1070062 (view as bug list) |
| Last Closed: | 2015-11-19 08:44:59 UTC | Type: | Bug |
| Bug Blocks: | 1070062 | | |
Description (Jan Safranek, 2014-02-25 11:18:54 UTC)
Something is wrong in blivet or below. A simple blivet.reset() activates MD RAID on multipath devices, while I do not see any `mdadm --create` or `--assemble` call in its log (nor in strace). I _guess_ something triggers multipathd, which triggers MD RAID activation, just as if the multipath devices had only just been assembled.

Reproducer:

1) get two multipath devices
2) create a raid: `mdadm -C -l 0 -n 2 /dev/md127 /dev/mapper/mpath{a,b}`
3) stop the raid: `mdadm -S /dev/md127`
4) initialize blivet: `python blivet_init.py`
5) see whether the raid is active: `mdadm -D /dev/md127`
6) go to 4) until the bug is reproduced (= MD RAID gets magically activated); usually one or two rounds are needed

In `udevadm monitor` I can see:

- b.reset() triggers change events on my iSCSI devices:

```
KERNEL[12499.159123] change /devices/platform/host3/session1/target3:0:0/3:0:0:1/block/sdg (block)
UDEV  [12499.182739] change /devices/platform/host3/session1/target3:0:0/3:0:0:1/block/sdg (block)
KERNEL[12499.184022] change /devices/platform/host3/session1/target3:0:0/3:0:0:2/block/sdh (block)
UDEV  [12499.199825] change /devices/platform/host3/session1/target3:0:0/3:0:0:2/block/sdh (block)
KERNEL[12499.213178] change /devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sde (block)
UDEV  [12499.229363] change /devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sde (block)
KERNEL[12499.254153] change /devices/platform/host4/session2/target4:0:0/4:0:0:2/block/sdf (block)
UDEV  [12499.269382] change /devices/platform/host4/session2/target4:0:0/4:0:0:2/block/sdf (block)
```

- md127 gets added and removed:

```
KERNEL[12499.359647] add    /devices/virtual/bdi/9:127 (bdi)
UDEV  [12499.360280] add    /devices/virtual/bdi/9:127 (bdi)
KERNEL[12499.360497] add    /devices/virtual/block/md127 (block)
UDEV  [12499.362262] change /devices/virtual/block/dm-0 (block)
KERNEL[12499.374738] change /devices/virtual/block/md127 (block)
UDEV  [12499.408290] add    /devices/virtual/block/md127 (block)
UDEV  [12499.420455] change /devices/virtual/block/md127 (block)
UDEV  [12499.575393] change /devices/virtual/block/dm-1 (block)
KERNEL[12499.593315] change /devices/virtual/block/md127 (block)
KERNEL[12499.615377] change /devices/virtual/block/md127 (block)
KERNEL[12499.615392] change /devices/virtual/block/md127 (block)
KERNEL[12499.628675] remove /devices/virtual/bdi/9:127 (bdi)
UDEV  [12499.628689] remove /devices/virtual/bdi/9:127 (bdi)
KERNEL[12499.628734] remove /devices/virtual/block/md127 (block)
UDEV  [12499.652990] change /devices/virtual/block/md127 (block)
UDEV  [12499.664355] change /devices/virtual/block/md127 (block)
UDEV  [12499.684930] change /devices/virtual/block/md127 (block)
UDEV  [12499.688831] remove /devices/virtual/block/md127 (block)
```

- and sometimes this md127 is only added and not removed:

```
KERNEL[13003.512605] change /devices/virtual/block/dm-0 (block)
KERNEL[13003.542639] change /devices/virtual/block/dm-1 (block)
KERNEL[13003.569595] add    /devices/virtual/bdi/9:127 (bdi)
UDEV  [13003.570605] add    /devices/virtual/bdi/9:127 (bdi)
KERNEL[13003.570834] add    /devices/virtual/block/md127 (block)
UDEV  [13003.573644] change /devices/virtual/block/dm-0 (block)
KERNEL[13003.598806] change /devices/virtual/block/md127 (block)
UDEV  [13003.602232] add    /devices/virtual/block/md127 (block)
UDEV  [13003.639707] change /devices/virtual/block/md127 (block)
UDEV  [13003.799472] change /devices/virtual/block/dm-1 (block)
```

Now, what creates this md127 device?

Created attachment 867383 [details]
reproducer (basically blivet.reset() caller)
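
The attachment itself is not included in this export; a minimal sketch of what such a reproducer likely looks like, assuming only what the attachment description states ("basically a blivet.reset() caller"):

```python
# blivet_init.py: hypothetical sketch of the attached reproducer.
# It does nothing but instantiate Blivet and rescan storage; per the
# comments below, the rescan closes libparted's read-write file
# descriptors, which is what makes udev emit the change events.
import blivet

b = blivet.Blivet()  # top-level blivet storage object
b.reset()            # rescan all block devices
```

After running it, the array state can be checked with `mdadm -D /dev/md127` as in step 5 of the reproducer.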
Created attachment 867384 [details]
good blivet log (md127 is not created)
Created attachment 867400 [details]
bad blivet log (md127 is created)
Those change events are generated any time a read-write file descriptor to the mpath device is closed, which happens every time you reset a Blivet instance (which deletes/closes a libparted reference to the disk). I have no idea how this can be avoided. Something that opens the device read-only until it is asked to write to it would have to happen in libparted, and I think it would be a contentious topic.

Ok, so let's accept for now that the RAID gets activated. I'll leave this bug open so libparted can eventually be fixed, and I'm going to clone this bug to fix OpenLMI: it should not throw a traceback when this happens.

Wait, why is blivet/libparted opening the iSCSI drives /dev/sdg - sdf? They are multipath members; any write operation on them can be harmful. It should scan only the multipath device. And if they weren't scanned, maybe the raid wouldn't be activated.

(In reply to Jan Safranek from comment #8)
> Wait, why is blivet/libparted opening iscsi drives /dev/sdg - sdf? They are
> multipath members, any write operation on them can be harmful. It should
> scan the multipath device only. And if they weren't scanned, maybe the raid
> wouldn't be activated.

Before the days of auto-activation via udev, blivet began using parted.Device instances to provide several bits of information about block devices; some examples are size, read/write status, vendor, and model. On blivet's master branch I have changed this as part of the preparation for uevent handling.

(In reply to Jan Safranek from comment #8)
> Wait, why is blivet/libparted opening iscsi drives /dev/sdg - sdf? They are
> multipath members, any write operation on them can be harmful. It should
> scan the multipath device only. And if they weren't scanned, maybe the raid
> wouldn't be activated.

I am fairly certain that udev synthesizing change events on the mpath devices themselves is what is causing the md arrays to get activated. There is no reason to believe otherwise unless /proc/mdstat shows the members to be on SCSI instead of device-mapper.

Retested with python-blivet-0.61.15.22-1.el7; this issue is no longer reproducible and the RAID array is not activated. Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2232.html
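
The mechanism discussed in the comments above can be observed directly: udev keeps an inotify watch on block devices and synthesizes a "change" uevent whenever a file descriptor that was opened read-write is closed, and the rules run for that event can in turn re-assemble an md array on top of the device. A minimal sketch of watching this happen, assuming the pyudev bindings are available and using /dev/mapper/mpatha as an illustrative device path:

```python
# Demonstrates the fd-close behavior described above: opening a block
# device read-write and closing it (without writing anything) is enough
# to make udev synthesize a "change" event for it.
# Assumptions: pyudev is installed; /dev/mapper/mpatha is an example path.
import os
import pyudev

context = pyudev.Context()
monitor = pyudev.Monitor.from_netlink(context)
monitor.filter_by(subsystem="block")
monitor.start()

fd = os.open("/dev/mapper/mpatha", os.O_RDWR)  # read-write open
os.close(fd)  # closing the rw descriptor triggers the synthetic event

device = monitor.poll(timeout=5)  # wait for the next block uevent
if device is not None:
    print(device.action, device.device_node)  # expected: "change" on the mpath node
```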