Bug 1037640

Summary: [vdsm] second disk hotplug fails in case the VM is paused
Product: Red Hat Enterprise Virtualization Manager
Reporter: Elad <ebenahar>
Component: vdsm
Assignee: Allon Mureinik <amureini>
Status: CLOSED CANTFIX
QA Contact: Aharon Canan <acanan>
Severity: high
Priority: unspecified
Version: 3.3.0
CC: amureini, bazulay, dallan, ebenahar, iheim, lpeer, mkletzan, pbonzini, scohen, yeylon
Target Milestone: ---
Keywords: Triaged
Target Release: 3.5.0
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2014-05-06 16:24:27 UTC
Type: Bug
Regression: ---
oVirt Team: Storage
Attachments: logs; relevant libvirt log

Description Elad 2013-12-03 14:21:11 UTC
Created attachment 832121 [details]
logs

Description of problem:
I have a VM that is in a paused state due to lost connectivity to its disk's storage domain. When I try to attach and activate a newly created disk (located on an active SD), it succeeds; but if I then hot unplug it and hot plug it again, the operation fails in vdsm.

Version-Release number of selected component (if applicable):
vdsm-4.13.0-0.10.beta1.el6ev.x86_64
rhevm-3.3.0-0.37.beta1.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
Have a DC with 1 host and 2 SDs.
1. Create a VM with its disk on the non-master domain and an OS installed on it
2. Block connectivity from the host to the non-master domain (where the VM disk is located) and wait for the VM to enter the 'paused' state
3. Add a new disk to the VM and activate it
4. Deactivate the disk and activate it again


Actual results:
The second activation fails on vdsm with this libvirt error:

Thread-11156::ERROR::2013-12-03 15:59:35,298::vm::3441::vm.Vm::(hotplugDisk) vmId=`077b08be-0274-4e54-af4d-e73b63fc2b63`::Hotplug failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 3439, in hotplugDisk
    self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/vm.py", line 839, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 399, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized


Expected results:
I'm not sure whether HotPlugDiskVDSCommand should be sent to vdsm at all while the VM is paused. If it should, then the second activation should work just like the first one.
If it shouldn't, then the first disk activation should not be sent with HotPlugDiskVDSCommand either.
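For illustration, the "don't send while paused" option amounts to a state guard on the engine side before issuing the hotplug call. The names below (`should_send_hotplug`, `activate_disk`, the status strings) are hypothetical, not actual RHEV engine or vdsm APIs — this is only a sketch of the guard:

```python
# Hypothetical sketch: skip HotPlugDiskVDSCommand while the VM is paused.
# None of these names are real RHEV/vdsm identifiers.

HOTPLUG_ALLOWED_STATES = {"Up"}  # states in which a hotplug may be sent

def should_send_hotplug(vm_status):
    """Return True only when the VM is in a state that permits hotplug."""
    return vm_status in HOTPLUG_ALLOWED_STATES

def activate_disk(vm_status):
    """Send the hotplug command, or defer it until the VM is running."""
    if should_send_hotplug(vm_status):
        return "sent HotPlugDiskVDSCommand"
    return "deferred: VM is " + vm_status
```

With such a guard, both the first and the second activation would behave the same way on a paused VM, which is the consistency the expected-results section asks for.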


Additional info:
logs

Comment 1 Elad 2013-12-03 14:29:54 UTC
Created attachment 832125 [details]
relevant libvirt log

Attaching the relevant libvirt log

Comment 2 Allon Mureinik 2013-12-04 06:36:24 UTC
Elad, 

Besides the failed hotplugging, is there any other side effect?
If you later try to plug the disk, is it successful?

Comment 3 Elad 2013-12-04 07:41:48 UTC
(In reply to Allon Mureinik from comment #2)
> Elad, 
> 
> Besides the failed hotplugging, is there any other side effect?
> If you later try to plug the disk, is it successful?

Only the hotplug fails; it works again once the VM comes back up.

Comment 4 Xavi Francisco 2014-05-02 14:04:04 UTC
The problem seems related to libvirt: after the virtual machine enters the paused state, the second activation fails even for VMs with no disk on the unavailable storage. If you add a second disk to any VM and repeat the process explained in the bug description, the same error appears.

Dave, can any of the engineers in the libvirt team take a look into this issue? I'll gladly help them with any questions they may have.

Comment 5 Dave Allan 2014-05-02 14:27:40 UTC
Martin, can you talk to Xavi?

Comment 6 Martin Kletzander 2014-05-02 16:59:55 UTC
Actually, from what I see in the logs, the error message comes from QEMU and I see nothing we could do differently.  The relevant communication part is:

libvirt => QEMU:

{"execute":"device_add","arguments":{"driver":"virtio-blk-pci","scsi":"off","bus":"pci.0","addr":"0xb","drive":"drive-virtio-disk9","id":"virtio-disk9"},"id":"libvirt-2504"}

QEMU => libvirt:

{"id": "libvirt-2504", "error": {"class": "DeviceInitFailed", "desc": "Device 'virtio-blk-pci' could not be initialized", "data": {"device": "virtio-blk-pci"}}}

@Paolo: Could someone from QEMU shed some light upon this issue, please?  I'm not sure who'd be the best one to ask, so feel free to delegate my question :-)  Thank you.

Comment 7 Paolo Bonzini 2014-05-06 16:24:27 UTC
There's nothing that can be fixed here.

PCI device hot-unplug requires guest cooperation (QEMU just asks the guest to unplug, the guest accepts the request and ejects the device, then QEMU finishes the unplug), so it won't proceed while the guest is paused.
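The handshake Paolo describes can be modeled as a toy state machine (this is an illustration, not QEMU code): the middle step belongs to the guest, so a paused guest leaves the unplug stuck after the first step.

```python
def unplug_steps(guest_running):
    """Toy model of the PCI hot-unplug handshake (not QEMU code)."""
    steps = ["qemu: request unplug"]        # QEMU asks the guest to eject
    if not guest_running:
        return steps                        # a paused guest never responds
    steps.append("guest: eject device")     # guest accepts and ejects
    steps.append("qemu: complete unplug")   # QEMU tears the device down
    return steps
```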