Bug 1003649 - It is impossible to hotplug a disk if the previous hotplug failed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Sergey Gotliv
QA Contact: Katarzyna Jachim
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2013-09-02 10:40 EDT by Katarzyna Jachim
Modified: 2016-02-10 12:18 EST
CC: 12 users

See Also:
Fixed In Version: is17
Doc Type: Bug Fix
Doc Text:
Previously, HotPlugDiskToVmCommand updated the 'isPlugged' and 'bootOrder' properties of all devices attached to the virtual machine. This could race with another thread handling the hot plug of another disk for the same virtual machine; as a result, it was not possible to hotplug a disk if the previous hotplug had failed. Now, HotPlugDiskToVmCommand updates 'isPlugged' only for the device plugged by the command, so the race no longer occurs.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 11:14:48 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
abaron: Triaged+


Attachments
test logs (vdsm, engine, server etc.) (8.45 MB, application/x-bzip2)
2013-09-02 10:40 EDT, Katarzyna Jachim


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 19311 None None None Never
oVirt gerrit 19521 None None None Never
Red Hat Product Errata RHBA-2014:0040 normal SHIPPED_LIVE vdsm bug fix and enhancement update 2014-01-21 15:26:21 EST

Description Katarzyna Jachim 2013-09-02 10:40:06 EDT
Created attachment 792887
test logs (vdsm, engine, server etc.)

Description of problem:
When we try to hotplug 10 disks at once, sometimes all the calls finish with status 'complete' and there is no error in the vdsm or engine logs, but RHEV-M reports some of the disks as unplugged. However, if we try to plug such a disk again, the operation fails with the following error:

Thread-770::ERROR::2013-08-30 01:08:22,232::vm::3252::vm.Vm::(hotplugDisk) vmId=`3758e0a3-e715-4e82-8a3f-187fd3a4f6f8`::Hotplug failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 3250, in hotplugDisk
    self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/vm.py", line 824, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 399, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: internal error unable to reserve PCI address 0:0:15.0
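
For context, the call that fails is libvirt's attachDevice() with the drive XML built for the disk; the slot below matches the 0:0:15.0 address from the error. A minimal stand-alone sketch of that call path, using a hypothetical VM name and disk path (not vdsm's actual code):

import libvirt

# Hypothetical values for illustration only; vdsm builds the real drive XML.
VM_NAME = "example-vm"
DISK_PATH = "/var/lib/libvirt/images/example-disk.raw"

# Drive XML carrying an explicit PCI address. If libvirt still considers the
# address reserved, attachDevice() raises the "unable to reserve PCI address"
# error shown in the traceback above.
DRIVE_XML = """
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='%s'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x15' function='0x0'/>
</disk>
""" % DISK_PATH

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName(VM_NAME)
try:
    dom.attachDevice(DRIVE_XML)  # same call vm.py makes in hotplugDisk()
except libvirt.libvirtError as e:
    print("hotplug failed: %s" % e)
finally:
    conn.close()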


Version-Release number of selected component (if applicable): is12


How reproducible:
Happens sometimes in the following automated tests:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-iscsi-sdk
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-iscsi-rest
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-sdk
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-rest

Failed job: http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-rest/44/

id of problematic disk: b9bdb478-5e64-4e50-a2db-6a4077c6fb6a


Steps to Reproduce:
1. hotplug 10 disks at once (see the concurrency sketch below)
2. if one of them is not reported as plugged, try to plug it again
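
The race only shows up when the plug calls overlap, so step 1 has to issue them concurrently. A minimal sketch of that shape, using a hypothetical plug_disk() helper in place of whatever REST/SDK call the automated tests actually make:

import threading

def plug_disk(vm_id, disk_id):
    # Hypothetical stand-in for the REST/SDK call that hotplugs (activates)
    # one disk on the given VM; wire this to the real client.
    raise NotImplementedError("replace with the RHEV-M REST API or SDK call")

def hotplug_all(vm_id, disk_ids):
    # Step 1: fire all plug requests at once so they overlap inside the engine.
    threads = [threading.Thread(target=plug_disk, args=(vm_id, disk_id))
               for disk_id in disk_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Step 2 is then manual: any disk still reported as unplugged is plugged
    # again; before the fix, that second attempt failed with the libvirt error.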


Actual results:
* it is impossible to plug a disk if the previous hotplug failed


Expected results:
* if a hotplug fails, it should be rolled back so that a subsequent hotplug attempt can succeed
Comment 1 Ayal Baron 2013-09-12 06:12:47 EDT
Aharon, seems to me like there should be 2 tests here (after we figure out what the issue is), to make sure we have deterministic results.
Comment 2 Ayal Baron 2013-09-16 07:20:21 EDT
The issue is that attaching a device changes the boot order of the VM's devices, and since we stopped taking an exclusive lock at the VM level (in 3.1), there is a race window. As a result, attaching multiple devices to the same VM concurrently is racy (sketched below).
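
To make the race concrete, here is a toy sketch (Python, not the engine's actual Java code) of the difference between the old whole-device-list update and the narrower update described in the Doc Text. Without a VM-level lock, two overlapping read-modify-write cycles over the same records can silently overwrite each other:

# Toy model of the engine's per-VM device records: device id -> properties.
devices = {
    "disk-a": {"is_plugged": False, "boot_order": 1},
    "disk-b": {"is_plugged": False, "boot_order": 2},
}

def racy_plug(disk_id):
    # Old behaviour (simplified): read every device, recompute boot order,
    # and write the whole set back. Two threads doing this concurrently can
    # clobber each other's updates.
    snapshot = dict((d, dict(p)) for d, p in devices.items())  # read all
    snapshot[disk_id]["is_plugged"] = True                     # modify one
    for order, d in enumerate(sorted(snapshot), 1):            # recompute all
        snapshot[d]["boot_order"] = order
    devices.clear()                                            # write all back
    devices.update(snapshot)

def fixed_plug(disk_id):
    # Fixed behaviour (simplified): update only the device this command
    # actually plugged, leaving the other devices' records untouched.
    devices[disk_id]["is_plugged"] = True

Two overlapping racy_plug() calls can interleave so that one write is lost, leaving the engine's view of a disk out of sync with libvirt; fixed_plug() narrows the write and avoids the conflict.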
Comment 9 Charlie 2013-11-27 19:32:44 EST
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata, please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.
Comment 10 errata-xmlrpc 2014-01-21 11:14:48 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html
