Bug 1070062

Summary: Traceback when getting instances of LMI_MDRAIDFormat of deactivated RAID on multipath devices.
Product: Red Hat Enterprise Linux 7
Component: openlmi-storage
Version: 7.0
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Jan Safranek <jsafrane>
Assignee: Jan Safranek <jsafrane>
QA Contact: Robin Hack <rhack>
CC: lmiksik, qcai, rhack
Type: Bug
Doc Type: Bug Fix
Clone Of: 1069597
Bug Depends On: 1069597
Last Closed: 2014-06-13 12:25:28 UTC

Description Jan Safranek 2014-02-26 07:57:02 UTC
+++ This bug was initially created as a clone of Bug #1069597 +++

Description of problem:
On a machine with a deactivated MD RAID array on multipath devices (see below),
'lmi storage list' finishes with a traceback:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/cmpi_pywbem_bindings.py", line 82, in __call__
    return self.meth(*args, **kwds)
  File "/usr/lib64/python2.7/site-packages/cmpi_pywbem_bindings.py", line 507, in get_instance
    pinst = self.proxy.MI_getInstance(env, op, plist)
  File "/usr/lib/python2.7/site-packages/pywbem/cim_provider2.py", line 1802, in MI_getInstance
    propertyList)
  File "/usr/lib/python2.7/site-packages/pywbem/cim_provider2.py", line 551, in MI_getInstance
    rval = self.get_instance(env=env, model=model)
  File "/usr/lib/python2.7/site-packages/lmi/providers/cmpi_logging.py", line 266, in _wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/lmi/storage/LMI_MDRAIDFormatProvider.py", line 60, in get_instance
    model['MDUUID'] = fmt.mdUuid
AttributeError: 'MultipathMember' object has no attribute 'mdUuid'

'lmi storage list' calls GetInstance() on LMI_MDRAIDFormat internally.
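
The crash happens because the provider dereferences fmt.mdUuid unconditionally, while blivet here reports the member's format as a MultipathMember, which carries no such attribute. A minimal sketch of a defensive guard in LMI_MDRAIDFormatProvider.get_instance(); the _get_format() helper and the exact error raised are assumptions for illustration, not the shipped fix:

import pywbem

def get_instance(self, env, model):
    # The real provider resolves the format from the device named in
    # the CIM object path; simplified here as a hypothetical helper.
    fmt = self._get_format(model)

    # On a deactivated RAID built on multipath devices, blivet can
    # report the member format as a MultipathMember, which has no
    # 'mdUuid' attribute -- guard before dereferencing it.
    if getattr(fmt, 'mdUuid', None) is None:
        raise pywbem.CIMError(
            pywbem.CIM_ERR_NOT_FOUND,
            "The device is not a MD RAID member.")

    model['MDUUID'] = fmt.mdUuid
    return model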



Version-Release number of selected component (if applicable):
openlmi-storage-0.7.1-5.el7.noarch
python-blivet-0.18.27-1.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. create two multipath devices
2. using lmi, create a raid with these two multipath devices:
   lmi storage raid create 0 mpatha mpathb

3. deactivate the array *outside OpenLMI*; don't remove its metadata:
   mdadm -S /dev/md127

4. restart OpenLMI provider:
   service tog-pegasus restart

5. notice that the OpenLMI provider *activated* the array:
   lmi storage list
   mdadm -D  /dev/md127

6. go to 3 and repeat

Actual results:
step 5 for the first time:
- lmi storage list does not list the MD RAID
- but mdadm -D shows it

step 5 for the second time:
- lmi storage list throws a traceback


Expected results:
- the RAID is *not* activated at step 5 
- no traceback

--- Additional comment from Jan Safranek on 2014-02-25 07:44:45 EST ---

Something is wrong in blivet or below. A simple blivet.reset() activates the MD RAID on multipath devices, while I do not see any mdadm --create or --assemble call in its log (nor in strace).

I _guess_ something triggers multipathd, which triggers MD RAID creation, just as if the multipath devices had just been assembled.

Reproducer:

1) get two multipath devices

2) create raid:
   mdadm -C -l 0 -n 2 /dev/md127 /dev/mapper/mpath{a,b}

3) stop the raid:
   mdadm -S /dev/md127

4) initialize blivet (a sketch of blivet_init.py is below, after step 6)
   python blivet_init.py

5) see that the raid is active
   mdadm -D /dev/md127

6) go to 4) until the bug is reproduced (= the MD RAID gets magically activated); usually one or two rounds are needed.
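
blivet_init.py is not attached to the bug; a minimal sketch of what it has to do, assuming the python-blivet 0.18 API from the versions above, is:

import blivet

# Build a storage model and scan the system; reset() alone is enough
# to trigger the spurious MD RAID activation described above.
b = blivet.Blivet()
b.reset()

# Show what blivet found (Python 2.7, as in the traceback).
for device in b.devices:
    print device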

In 'udevadm monitor' I can see:

- b.reset() triggers change events on my iSCSI devices:
KERNEL[12499.159123] change   /devices/platform/host3/session1/target3:0:0/3:0:0:1/block/sdg (block)
UDEV  [12499.182739] change   /devices/platform/host3/session1/target3:0:0/3:0:0:1/block/sdg (block)
KERNEL[12499.184022] change   /devices/platform/host3/session1/target3:0:0/3:0:0:2/block/sdh (block)
UDEV  [12499.199825] change   /devices/platform/host3/session1/target3:0:0/3:0:0:2/block/sdh (block)
KERNEL[12499.213178] change   /devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sde (block)
UDEV  [12499.229363] change   /devices/platform/host4/session2/target4:0:0/4:0:0:1/block/sde (block)
KERNEL[12499.254153] change   /devices/platform/host4/session2/target4:0:0/4:0:0:2/block/sdf (block)
UDEV  [12499.269382] change   /devices/platform/host4/session2/target4:0:0/4:0:0:2/block/sdf (block)

- md127 gets added and removed:
KERNEL[12499.359647] add      /devices/virtual/bdi/9:127 (bdi)
UDEV  [12499.360280] add      /devices/virtual/bdi/9:127 (bdi)
KERNEL[12499.360497] add      /devices/virtual/block/md127 (block)
UDEV  [12499.362262] change   /devices/virtual/block/dm-0 (block)
KERNEL[12499.374738] change   /devices/virtual/block/md127 (block)
UDEV  [12499.408290] add      /devices/virtual/block/md127 (block)
UDEV  [12499.420455] change   /devices/virtual/block/md127 (block)
UDEV  [12499.575393] change   /devices/virtual/block/dm-1 (block)
KERNEL[12499.593315] change   /devices/virtual/block/md127 (block)
KERNEL[12499.615377] change   /devices/virtual/block/md127 (block)
KERNEL[12499.615392] change   /devices/virtual/block/md127 (block)
KERNEL[12499.628675] remove   /devices/virtual/bdi/9:127 (bdi)
UDEV  [12499.628689] remove   /devices/virtual/bdi/9:127 (bdi)
KERNEL[12499.628734] remove   /devices/virtual/block/md127 (block)
UDEV  [12499.652990] change   /devices/virtual/block/md127 (block)
UDEV  [12499.664355] change   /devices/virtual/block/md127 (block)
UDEV  [12499.684930] change   /devices/virtual/block/md127 (block)
UDEV  [12499.688831] remove   /devices/virtual/block/md127 (block)


- and sometimes this md127 is only added and not removed:
KERNEL[13003.512605] change   /devices/virtual/block/dm-0 (block)
KERNEL[13003.542639] change   /devices/virtual/block/dm-1 (block)
KERNEL[13003.569595] add      /devices/virtual/bdi/9:127 (bdi)
UDEV  [13003.570605] add      /devices/virtual/bdi/9:127 (bdi)
KERNEL[13003.570834] add      /devices/virtual/block/md127 (block)
UDEV  [13003.573644] change   /devices/virtual/block/dm-0 (block)
KERNEL[13003.598806] change   /devices/virtual/block/md127 (block)
UDEV  [13003.602232] add      /devices/virtual/block/md127 (block)
UDEV  [13003.639707] change   /devices/virtual/block/md127 (block)
UDEV  [13003.799472] change   /devices/virtual/block/dm-1 (block)

Now, what creates this md127 device?

--- Additional comment from David Lehman on 2014-02-25 12:35:56 EST ---

Those change events are generated any time a read-write file descriptor to the mpath device is closed, which happens every time you reset a Blivet instance (which deletes/closes a libparted reference to the disk). I have no idea how this can be avoided. Something that opens the device read-only until it is asked to write to it would have to happen in libparted, and I think it would be a contentious topic.
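
That behaviour can be checked by hand; a small sketch, assuming a test box with a multipath device at the hypothetical path /dev/mapper/mpatha, run as root while 'udevadm monitor' is watching:

import os

path = '/dev/mapper/mpatha'  # hypothetical multipath device node

# Read-only open and close: no 'change' uevent is expected.
fd = os.open(path, os.O_RDONLY)
os.close(fd)

# Read-write open and close: udev's inotify watch on the block device
# sees the close-after-write and synthesizes a 'change' event.
fd = os.open(path, os.O_RDWR)
os.close(fd)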

Comment 2 Jan Safranek 2014-02-26 08:02:16 UTC
The RAID gets activated and there is not much we can do about it for now (but watch bug #1069597), so let's prevent the traceback on the OpenLMI side.

Comment 4 Robin Hack 2014-02-26 15:20:07 UTC
QA_ACK+ for rhel7.

Comment 6 Ludek Smid 2014-06-13 12:25:28 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.