Red Hat Bugzilla – Bug 1003649
It is impossible to hotplug a disk if the previous hotplug failed
Last modified: 2016-02-10 12:18:48 EST
Created attachment 792887 [details]
test logs (vdsm, engine, server etc.)

Description of problem:
When we try to hotplug 10 disks at once, sometimes all the calls finish with status 'complete' and there is no error in the vdsm/engine logs, but some of the disks are reported by RHEV-M as unplugged. However, if we try to plug such a disk again, the operation fails with the following error:

Thread-770::ERROR::2013-08-30 01:08:22,232::vm::3252::vm.Vm::(hotplugDisk) vmId=`3758e0a3-e715-4e82-8a3f-187fd3a4f6f8`::Hotplug failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 3250, in hotplugDisk
    self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/vm.py", line 824, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 399, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: internal error unable to reserve PCI address 0:0:15.0

Version-Release number of selected component (if applicable):
is12

How reproducible:
Happens sometimes in the following automated tests:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-iscsi-sdk
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-iscsi-rest
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-sdk
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-rest

Failed job: http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_hotplug_disk_hooks-nfs-rest/44/
Id of the problematic disk: b9bdb478-5e64-4e50-a2db-6a4077c6fb6a

Steps to Reproduce:
1. Hotplug 10 disks at once.
2. If one of them is not reported as plugged, try to plug it again.

Actual results:
* It is impossible to plug a disk if the previous hotplug failed.

Expected results:
* If a hotplug fails, it should be rolled back so that the next hotplug is possible.
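For context, a minimal standalone sketch of step 1 might look like the following. The engine URL, credentials and disk UUIDs are placeholders, and it assumes the 3.3-era REST "activate" action on already-attached disks; it is not the code used by the Jenkins jobs above:

import concurrent.futures
import requests

ENGINE = "https://rhevm.example.com/api"            # placeholder engine URL
AUTH = ("admin@internal", "password")               # placeholder credentials
VM_ID = "3758e0a3-e715-4e82-8a3f-187fd3a4f6f8"      # VM id from the traceback above
DISK_IDS = ["disk-uuid-%d" % i for i in range(10)]  # placeholder disk UUIDs

def activate(disk_id):
    # POST .../vms/{vm}/disks/{disk}/activate with an empty <action/> body
    url = "%s/vms/%s/disks/%s/activate" % (ENGINE, VM_ID, disk_id)
    resp = requests.post(url, data="<action/>", auth=AUTH, verify=False,
                         headers={"Content-Type": "application/xml"})
    return disk_id, resp.status_code

# Fire all 10 hotplug requests at roughly the same time.
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
    for disk_id, status in pool.map(activate, DISK_IDS):
        print(disk_id, status)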
Aharon, it seems to me that there should be two tests here (after we figure out what the issue is), to make sure we get deterministic results.
The issue is that attaching a device changes the VM's boot order, and since we stopped taking an exclusive lock at the VM level (in 3.1), a race window exists. Attaching multiple devices to the same VM concurrently is therefore racy.
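In other words, two concurrent attachDevice calls for the same VM can both try to claim the same free slot. A minimal sketch of the serialization idea, using hypothetical names rather than the actual vdsm code:

import threading

_vm_locks = {}                      # hypothetical per-VM lock registry
_registry_lock = threading.Lock()

def _lock_for(vm_id):
    # Lazily create exactly one lock per VM id.
    with _registry_lock:
        return _vm_locks.setdefault(vm_id, threading.Lock())

def hotplug_disk(vm_id, dom, drive_xml):
    # Serialize attachDevice per VM so that concurrent hotplugs cannot race
    # on boot-order / PCI-address bookkeeping.
    with _lock_for(vm_id):
        dom.attachDevice(drive_xml)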
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata, please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise, to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore'.)

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:
https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html