Bug 844408 - after failed hotplug qemu keeps the file descriptor open

Product:           Red Hat Enterprise Linux 6
Component:         libvirt
Version:           6.3
Hardware:          x86_64
OS:                Linux
Status:            CLOSED ERRATA
Severity:          urgent
Priority:          urgent
Keywords:          ZStream
Whiteboard:        storage
Target Milestone:  rc
Target Release:    ---
Fixed In Version:  libvirt-0.10.2-1.el6
Reporter:          Haim <hateya>
Assignee:          Jiri Denemark <jdenemar>
QA Contact:        Virtualization Bugs <virt-bugs>
CC:                abaron, acathrow, areis, bazulay, bsarathy, chayang, cpelland, dallan, dyasny, dyuan, iheim, jdenemar, juzhang, lpeer, mjenner, mkenneth, mzhan, rwu, sgrinber, shuang, virt-maint, weizhan, ydu, yeylon, ykaul
Bug Blocks:        859376
Type:              Bug
Last Closed:       2013-02-21 07:20:29 UTC

Doc Type: Bug Fix
Doc Text:
Disk hot plug is a two-part action: the qemuMonitorAddDrive() call is followed by the qemuMonitorAddDevice() call. When the first part succeeded but the second one failed, libvirt failed to roll back the first part and the device remained in use even though the disk hot plug failed. With this update, the rollback for the drive addition is properly performed in the described scenario and disk hot plug now works as expected. (A simplified sketch of this rollback pattern is shown below, after the header.)
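A minimal sketch of the rollback pattern described in the Doc Text above, for orientation. This is not the patch itself (the actual fix from src/qemu/qemu_hotplug.c is quoted in the verification comment near the end of this report); the qemuMonitorAddDrive(), qemuMonitorAddDevice() and qemuMonitorDriveDel() calls are the real libvirt monitor helpers named in the fix, while the wrapper function, its parameters, and the error handling here are simplified assumptions for illustration.

    /* Simplified sketch (not the actual libvirt code): roll back drive_add
     * when device_add fails, so qemu closes the backing file / LV again.
     * Assumes libvirt's internal qemu monitor helpers are available. */
    static int
    hotplugDiskWithRollback(qemuMonitorPtr mon,
                            const char *drivestr,   /* backend ("drive") spec */
                            const char *devstr)     /* frontend ("device") spec */
    {
        if (qemuMonitorAddDrive(mon, drivestr) < 0)
            return -1;                      /* step 1 failed, nothing to undo */

        if (qemuMonitorAddDevice(mon, devstr) < 0) {
            /* Step 2 failed. Without this rollback the backend stays open and
             * the image or LV remains busy, which is the bug reported here. */
            if (qemuMonitorDriveDel(mon, drivestr) < 0)
                VIR_WARN("Unable to remove drive %s after failed device_add",
                         drivestr);         /* best effort; keep the original error */
            return -1;
        }
        return 0;                           /* both backend and frontend created */
    }

The real code additionally saves and restores the original error with virSaveLastError()/virSetError(), as shown in the quoted hunk at the end of this report.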
Description    Haim    2012-07-30 14:00:20 UTC
Created attachment 601283 [details]
vdsm log
Haim, can you check which process is holding the device? Could be that qemu just didn't close the FD or something. Which version of qemu-kvm are you using?

The disk is locked by the qemu process:

[root@nott-vds2 ~]# ls -l /rhev/data-center/44de127a-01b2-4295-8959-4733109c0f93/ce59f7e8-d3c6-49a5-a4f4-d55586863278/images/bf46d4bc-4588-40ac-a08c-f8f710c85718/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e
lrwxrwxrwx. 1 vdsm kvm 78 Aug 2 13:35 /rhev/data-center/44de127a-01b2-4295-8959-4733109c0f93/ce59f7e8-d3c6-49a5-a4f4-d55586863278/images/bf46d4bc-4588-40ac-a08c-f8f710c85718/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e -> /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e

[root@nott-vds2 ~]# lvs /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e
  LV                                   VG                                   Attr     LSize Pool Origin Data% Move Log Copy% Convert
  e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e ce59f7e8-d3c6-49a5-a4f4-d55586863278 -wi-ao-- 1.00g

[root@nott-vds2 ~]# lvchange -a n lvs /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e
  Volume group "lvs" not found
  Skipping volume group lvs
device-mapper: remove ioctl on failed: Device or resource busy
device-mapper: remove ioctl on failed: Device or resource busy
device-mapper: remove ioctl on failed: Device or resource busy
(the same message is repeated many more times)
  Unable to deactivate ce59f7e8--d3c6--49a5--a4f4--d55586863278-e44b57d4--1dc2--4fc4--8ece--c3eedbf9b92e (253:14)

[root@nott-vds2 ~]# lsof | /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e
-bash: /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e: Permission denied
[root@nott-vds2 ~]# lsof /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e
COMMAND    PID   USER  FD  TYPE DEVICE SIZE/OFF   NODE   NAME
qemu-kvm   17755 qemu  10u BLK  253,14 0x40000000 993119 /dev/ce59f7e8-d3c6-49a5-a4f4-d55586863278/../dm-14

Additional info: the libvirt XML lacks the disk, indicating that it was indeed unplugged (or at least that libvirt believes it was unplugged), but the qemu process shows it still plugged:

[root@nott-vds2 ~]# virsh -r dumpxml 1
<domain type='kvm' id='1'>
  <name>new_vm</name>
  <uuid>d7810094-bf60-4651-9388-5c3f2a9eae4f</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <cputune>
    <shares>1020</shares>
  </cputune>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>Red Hat</entry>
      <entry name='product'>RHEV Hypervisor</entry>
      <entry name='version'>6Server-6.3.0.3.el6</entry>
      <entry name='serial'>38373035-3536-4247-3830-333334344130_78:E7:D1:E4:8C:58</entry>
      <entry name='uuid'>d7810094-bf60-4651-9388-5c3f2a9eae4f</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='rhel6.3.0'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Conroe</model>
    <topology sockets='1' cores='1' threads='1'/>
  </cpu>
  <clock offset='variable' adjustment='0'>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source startupPolicy='optional'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <serial></serial>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:23:61:5d'/>
      <source bridge='rhevm'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='virtio' port='0'/>
      <alias name='console0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/new_vm.com.redhat.rhevm.vdsm'/>
      <target type='virtio' name='com.redhat.rhevm.vdsm'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' port='5900' tlsPort='5901' autoport='yes' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='10.35.115.11' network='vdsm-rhevm'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
    </graphics>
    <video>
      <model type='qxl' vram='65536' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c403,c698</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c403,c698</imagelabel>
  </seclabel>
</domain>

[root@nott-vds2 ~]# virsh -r dumpxml 1 | less
[root@nott-vds2 ~]# pgrep qemu
17755
[root@nott-vds2 ~]# ps -ww `pgrep qemu`
  PID TTY      STAT   TIME COMMAND
17755 ?        Sl     0:38 /usr/libexec/qemu-kvm -S -M rhel6.3.0 -cpu Conroe -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name new_vm -uuid d7810094-bf60-4651-9388-5c3f2a9eae4f -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=6Server-6.3.0.3.el6,serial=38373035-3536-4247-3830-333334344130_78:E7:D1:E4:8C:58,uuid=d7810094-bf60-4651-9388-5c3f2a9eae4f -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/new_vm.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2012-08-02T10:35:20,driftfix=slew -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/44de127a-01b2-4295-8959-4733109c0f93/ce59f7e8-d3c6-49a5-a4f4-d55586863278/images/bf46d4bc-4588-40ac-a08c-f8f710c85718/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e,if=none,id=drive-virtio-disk0,format=qcow2,serial=bf46d4bc-4588-40ac-a08c-f8f710c85718,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:61:5d,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/new_vm.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -spice port=5900,tls-port=5901,addr=10.35.115.11,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x

The hot plug failed with:

libvirtError: internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized

The engine then tries teardownImage and fails: lvchange -a n fails because the device remains open.
Thread-13746::INFO::2012-07-30 20:04:12,532::logUtils::37::dispatcher::(wrapper) Run and protect: teardownImage(sdUUID='4f38a996-dbeb-4981-885b-742b46a4714f', spUUID='b8423a1f-8889-469a-b2fa-39ab78ac3a57', imgUUID='74512746-e10f-4416-a384-baa29d92cde5', volUUID=None)
Thread-13746::ERROR::2012-07-30 20:04:17,921::task::853::TaskManager.Task::(_setError) Task=`8a565abc-6792-42f1-b326-d2f4d2ce03d1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2778, in teardownImage
    img.teardown(sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/image.py", line 358, in teardown
    volUUIDs=[vol.volUUID for vol in chain])
  File "/usr/share/vdsm/storage/blockSD.py", line 847, in deactivateVolumes
    lvm.deactivateLVs(self.sdUUID, volUUIDs)
  File "/usr/share/vdsm/storage/lvm.py", line 1034, in deactivateLVs
    _setLVAvailability(vgName, toDeactivate, "n")
  File "/usr/share/vdsm/storage/lvm.py", line 719, in _setLVAvailability
    raise error(str(e))
CannotDeactivateLogicalVolume: <snip> Device or resource busy\n Unable to deactivate 4f38a996--dbeb--4981--885b--742b46a4714f-5c99c73c--0b11--4213--9ff5--20817d201f1e (253:41)\n'; <rc> = 5

Created attachment 601910 [details]
failed attach: vdsm, libvirt, and qemu logs
Reproduced: the hotplug fails, then teardownImage fails too.

From the vdsm log:

Thread-1549::DEBUG::2012-08-02 13:36:41,612::libvirtvm::1472::vm.Vm::(hotplugDisk) vmId=`d7810094-bf60-4651-9388-5c3f2a9eae4f`::Hotplug disk xml: <disk device="disk" snapshot="no" type="block">
  <address domain="0x0000" function="0x0" slot="0x05" type="pci" bus="0x00"/>
  <source dev="/rhev/data-center/44de127a-01b2-4295-8959-4733109c0f93/ce59f7e8-d3c6-49a5-a4f4-d55586863278/images/bf46d4bc-4588-40ac-a08c-f8f710c85718/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e"/>
  <target bus="virtio" dev="vdb"/>
  <serial>bf46d4bc-4588-40ac-a08c-f8f710c85718</serial>
  <driver cache="none" error_policy="stop" io="native" name="qemu" type="qcow2"/>
</disk>
Thread-1549::ERROR::2012-08-02 13:36:41,869::libvirtvm::1477::vm.Vm::(hotplugDisk) vmId=`d7810094-bf60-4651-9388-5c3f2a9eae4f`::Hotplug failed
Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1475, in hotplugDisk
    self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 400, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized
Thread-1549::INFO::2012-08-02 13:36:41,874::logUtils::37::dispatcher::(wrapper) Run and protect: teardownImage(sdUUID='ce59f7e8-d3c6-49a5-a4f4-d55586863278', spUUID='44de127a-01b2-4295-8959-4733109c0f93', imgUUID='bf46d4bc-4588-40ac-a08c-f8f710c85718', volUUID=None)
Thread-1549::ERROR::2012-08-02 13:36:47,181::task::853::TaskManager.Task::(_setError) Task=`71a55d03-c328-4276-a611-bd393f325ef7`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2778, in teardownImage
    img.teardown(sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/image.py", line 358, in teardown
    volUUIDs=[vol.volUUID for vol in chain])
  File "/usr/share/vdsm/storage/blockSD.py", line 847, in deactivateVolumes
    lvm.deactivateLVs(self.sdUUID, volUUIDs)
  File "/usr/share/vdsm/storage/lvm.py", line 1034, in deactivateLVs
    _setLVAvailability(vgName, toDeactivate, "n")
  File "/usr/share/vdsm/storage/lvm.py", line 719, in _setLVAvailability
    raise error(str(e))
CannotDeactivateLogicalVolume: <snip> device-mapper: remove ioctl on failed: Device or resource busy\n Unable to deactivate ce59f7e8--d3c6--49a5--a4f4--d55586863278-e44b57d4--1dc2--4fc4--8ece--c3eedbf9b92e (253:14)\n'; <rc> = 5

From the libvirt log:

2012-08-02 10:36:41.612+0000: 17908: debug : virDomainAttachDevice:9185 : dom=0x7fea1402c200, (VM: name=new_vm, uuid=d7810094-bf60-4651-9388-5c3f2a9eae4f), xml=<disk device="disk" snapshot="no" type="block">
  <address domain="0x0000" function="0x0" slot="0x05" type="pci" bus="0x00"/>
  <source dev="/rhev/data-center/44de127a-01b2-4295-8959-4733109c0f93/ce59f7e8-d3c6-49a5-a4f4-d55586863278/images/bf46d4bc-4588-40ac-a08c-f8f710c85718/e44b57d4-1dc2-4fc4-8ece-c3eedbf9b92e"/>
  <target bus="virtio" dev="vdb"/>
  <serial>bf46d4bc-4588-40ac-a08c-f8f710c85718</serial>
  <driver cache="none" error_policy="stop" io="native" name="qemu" type="qcow2"/>
</disk>
2012-08-02 10:36:41.869+0000: 17908: error : virNetClientProgramDispatchError:174 : internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized

This only happens on VMs without an OS; this is not a beta blocker.

It was verified by QE that the hotplug always fails for VMs without an OS. Passing to qemu.

Can reproduce this issue with libvirt + qemu-kvm-rhev:
libvirt-0.9.10-21.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.304.el6.x86_64

Steps:
1. Start a VM (without an OS installed) with a disk. (The dumpxml of the test domain is shown below.)
virsh # start test
Domain test started

2. Hot unplug this disk using chayang-hotunplug.xml:
virsh # detach-device test /root/chayang-hotunplug.xml
Device detached successfully

3. Hot plug this disk using chayang-hotplug.xml:
virsh # attach-device test /root/chayang-hotplug.xml
error: Failed to attach device from /root/chayang-hotplug.xml
error: internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized

xml used to start the test domain:
----------------------------------
virsh # dumpxml test
...
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/home/test-1.raw'/>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
...

xml used to hotunplug and hotplug the disk:
-------------------------------------------
# cat /root/chayang-hotunplug.xml
<disk device="disk" snapshot="no" type="block">
  <address bus="0x00" domain="0x0000" function="0x0" slot="0x06" type="pci"/>
  <source dev="/home/test-1.raw"/>
  <target bus="virtio" dev="vda"/>
  <serial></serial>
  <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>

# diff chayang-hotunplug.xml chayang-hotplug.xml
4c4
< <target bus="virtio" dev="vda"/>
---
> <target bus="virtio" dev="vdb"/>

Additional info:
----------------
Changing slot="0x06" to slot="0x07" in chayang-hotplug.xml makes this failure disappear.
# diff chayang-hotunplug.xml chayang-hotplug.xml
2c2
< <address bus="0x00" domain="0x0000" function="0x0" slot="0x06" type="pci"/>
---
> <address bus="0x00" domain="0x0000" function="0x0" slot="0x07" type="pci"/>
4c4
< <target bus="virtio" dev="vda"/>
---
> <target bus="virtio" dev="vdb"/>

This time, trying to reproduce with qemu-kvm-rhev directly:
qemu-kvm-rhev-0.12.1.2-2.304.el6.x86_64

Steps:
1. Start a VM with 2 disks attached (no OS installed).
CLI:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -enable-kvm -m 1024 -smp 1 -name test -monitor stdio -boot menu=on -drive file=removable.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -net none -vnc :10 -drive file=removable-1.raw,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1

2. Hot unplug the 2nd disk:
(qemu) device_del virtio-disk1
(qemu) __com.redhat_drive_del drive-virtio-disk1

3. Hot plug this disk again:
(qemu) __com.redhat_drive_add file=removable-1.raw,id=drive-virtio-disk2,format=raw,cache=none
(qemu) device_add virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk2,id=virtio-disk2
PCI: devfn 56 not available for virtio-blk-pci, in use by virtio-blk-pci
Device 'virtio-blk-pci' could not be initialized
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

4. Retry steps 1~3 with a disk that has an OS (rhel6.3) installed, using the same CLI.

Actual result: after step 4, the failure does not happen:
(qemu) __com.redhat_drive_add file=removable-1.raw,id=drive-virtio-disk2,format=raw,cache=none
(qemu) device_add virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk2,id=virtio-disk2

So this issue:
1. only happens without an OS installed when hot plugging a disk with the same addr as the one used before the hot unplug;
2. does not happen without an OS installed when hot plugging a disk with a *different* addr;
3. does not happen if an OS is installed, whether the addr is the same as before the hot unplug or different.

(In reply to comment #15)
> This time just try to reproduce with qemu-kvm-rhev directly.
> qemu-kvm-rhev-0.12.1.2-2.304.el6.x86_64
>
> Steps:
> 1. start a VM with 2 disks attached (no os installed)
> CLI:
> # /usr/libexec/qemu-kvm -M rhel6.3.0 -enable-kvm -m 1024 -smp 1 -name test -monitor stdio -boot menu=on -drive file=removable.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -net none -vnc :10 -drive file=removable-1.raw,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1
>
> 2. hot unplug the 2nd disk by:
> (qemu) device_del virtio-disk1
> (qemu) __com.redhat_drive_del drive-virtio-disk1
>
> 3. hot plug again this disk:
> (qemu) __com.redhat_drive_add file=removable-1.raw,id=drive-virtio-disk2,format=raw,cache=none
> (qemu) device_add virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk2,id=virtio-disk2
> PCI: devfn 56 not available for virtio-blk-pci, in use by virtio-blk-pci
> Device 'virtio-blk-pci' could not be initialized

It seems the key reason is that qemu cannot delete the block device cleanly without an OS installation.

Hi Markus, do we support device hotplug without an OS installed?

Unless I'm missing something, this is NOTABUG.

PCI devices cannot be hot unplugged without guest cooperation. I don't think a guest without an OS can cooperate.

Even when the guest cooperates, unplug is not instantaneous. device_del merely initiates the unplug. It completes some unpredictable time later. Your recipe fails unless the unplug initiated in step 2 completes before step 3. Race condition. Libvirt should encapsulate this problematic behavior of device_del. Bug 807023.

You can't reuse the same PCI slot until unplug completes. If unplug doesn't complete, you can't reuse it, period.

You can force qemu-kvm to give up the image even when unplug doesn't work: use __com.redhat_drive_del. It looks like a bad disk failure to the guest, but it lets you reuse the same image safely with a different device, or a different guest.

Let me try to summarize what this bug is about now.

Context: when unplug of a block device completes, its backend is destroyed automatically. This is a misfeature, but it's part of the API.

Bug: when plug of a block device fails, its backend isn't destroyed automatically. This bug claims it should be, for consistency with unplug.

Is this correct?

(In reply to comment #21)
> Let me try to summarize what this bug is about now.
>
> Context: when unplug of a block device completes, its backend is destroyed
> automatically. This is a misfeature, but it's part of the API.
>
> Bug: when plug of a block device fails, its backend isn't destroyed
> automatically. This bug claims it should be, for consistency with unplug.
>
> Is this correct?

Actually I don't care about unplug; I want to plug a device into a VM. It fails. I want to be able to plug it into a different VM, but I can't, because qemu doesn't release the resource even though it's not using it and never did.

TL;DR: This is not a qemu-kvm bug, but it might be a libvirt bug.

Attaching a block device takes two steps: the first step creates the backend with the QMP command __com.redhat_drive_add, the second step creates the frontend with device_add. Each of the two QMP commands either creates what it's supposed to create and succeeds, or doesn't create anything and fails.

The commands are clearly visible in log_libvirtd.log attached in comment #9. The relevant part starts at 2012-08-02 10:36:41.613+0000. The first step creates backend "drive-virtio-disk1" successfully. The second step tries to create frontend "virtio-disk1", but fails. The failure does *not* destroy the backend. This is expected qemu-kvm behavior.

Perhaps libvirt should clean up the backend created in the first step if the second step fails. If you think it should, please change the bug's component to libvirt. Else, let's close it NOTABUG.

Possible workaround: clean up the backend manually with __com.redhat_drive_del.

Jiri, can you weigh in here?

Yeah, I think libvirt should really undo the first part when the second part fails for any reason.

Ok, I've changed the component to libvirt.

Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2012-September/msg01498.html

Fixed upstream by v0.10.2-rc2-5-g8125113:

commit 8125113cdb61bb4352af8e80e66573282be9cf83
Author: Jiri Denemark <jdenemar>
Date:   Thu Sep 20 22:28:35 2012 +0200

    qemu: Fix failure path in disk hotplug

    Disk hotplug is a two phase action: qemuMonitorAddDrive followed by
    qemuMonitorAddDevice. When the first part succeeds but the second one
    fails, we need to rollback the drive addition.

Tested on:
kernel-2.6.32-309.el6.x86_64
qemu-kvm-0.12.1.2-2.314.el6.x86_64
libvirt-0.10.2-1.el6.x86_64

# virsh attach-disk demo1 /var/lib/libvirt/images/disk.img vdb
Disk attached successfully
# virsh detach-disk demo1 vdb
Disk detached successfully
# virsh attach-disk demo1 /var/lib/libvirt/images/disk.img vdb
error: Failed to attach disk
error: internal error unable to execute QEMU command '__com.redhat_drive_add': Duplicate ID 'drive-virtio-disk1' for drive

It still reports an error.

You were actually testing bug 807023, which is related but still present. This bugzilla deals with a bug in the failure path: the final hotplug phase fails but libvirt does not correctly undo the part of the hotplug process that succeeded. You can see the hotplug failure in comment 10: unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized. However, I'm not sure how to easily achieve that; perhaps the guys who started this bugzilla could help you with the steps to verify this bug.

Retest on:
kernel-2.6.32-309.el6.x86_64
qemu-kvm-0.12.1.2-2.314.el6.x86_64
libvirt-0.10.2-1.el6.x86_64

with these steps:

1. Start a guest with an additional disk and without an OS in it:
..
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/disk.img'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
..

2. Start the guest.

3. Do detach-device for that disk:
# cat disk.xml
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/disk.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
# virsh detach-device demo1 disk2.xml
Device detached successfully
4. Attach-device with this xml:
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/disk.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>
**note** the difference is dev='vdb' instead of dev='vda'

# virsh attach-device demo1 disk2.xml
error: Failed to attach device from disk2.xml
error: internal error unable to execute QEMU command 'device_add': Device 'virtio-blk-pci' could not be initialized

Do it again:
# virsh attach-device demo1 disk2.xml
error: Failed to attach device from disk2.xml
error: internal error unable to execute QEMU command '__com.redhat_drive_add': Duplicate ID 'drive-virtio-disk1' for drive

Hi Jiri, is that expected? I don't see any difference from the old version.

Could you set up libvirtd to print debug logs from qemu, redo the test, and attach the generated debug logs?

Created attachment 617362 [details]
libvirtd.log with the debug log filtered to qemu
Oh, I guess I previously misread the steps from comment #33. Bug 807023 is still playing a key role there. The problem is that once the disk is successfully attached (no matter whether it was attached before starting the domain or hotplugged while it is running), unplugging the disk is a risky business. The qemu monitor command to detach a disk is asynchronous (similar to a shutdown request) and reports success after sending a detach request to the guest OS. If there is no OS, or the OS just ignores the detach request, the disk will not be detached although libvirt will think the operation succeeded. Thus attaching a new disk will fail. The target dev (vdb vs. vda) doesn't really matter since it is just an ordering hint. And because the disk will always be the only one presented to the domain, libvirt will use the same drive-virtio-disk1 alias for it.

In other words, what you need to do is to hotplug a new disk to a domain and somehow make the second part of this hotplug operation (device_add) fail. However, I have no idea how to achieve that (except for hacking qemu).

(In reply to comment #37)
> Oh, I guess I previously misread the steps from comment #33. Bug 807023 is
> still playing a key role there. The problem is that once the disk is
> successfully attached (no matter if it's been attached before starting the
> domain or hotplugged when it's running), unplugging the disk is a risky
> business. Qemu monitor command to detach a disk is asynchronous (similarly to
> shutdown request) and reports success after sending a detach request to a
> guest OS. If there's no OS or the OS just ignores this detach request, the
> disk will not be detached although libvirt will think the operation
> succeeded. Thus attaching a new disk will fail. The target dev (vdb vs. vda)
> doesn't really matter since it's just an ordering hint. And because the disk
> will always be the only one presented to the domain, libvirt will use the
> same drive-virtio-disk1 alias for it.
>
> In other words, what you need to do is to hotplug a new disk to a domain and
> somehow make the second part of this hotplug operation (device_add) fail.
> However, I have no idea how to achieve that (except for hacking qemu).

Hi Jiri, what about running a VM without virtio drivers installed and then trying to attach a virtio disk? Would that fail in the 'right' place?

Ayal, unfortunately attaching a virtio disk succeeds even without virtio drivers. It looks like support for PCI hotplug is enough (and necessary) for virtio disk hotplug support. The disk can even be unplugged and plugged in again with no issues.

Anyway, one would think that since this bug was requested to be fixed in z-stream, it's a serious one, which means it should also be easy to hit it...

I must admit the bug was so clear and the fix so straightforward that I didn't really try to reproduce it.

(In reply to comment #40)
> Ayal, unfortunately attaching a virtio disk succeeds even without virtio
> drivers. It looks like support for PCI hotplug is enough (and necessary) for
> virtio disk hotplug support. The disk can even be unplugged and plugged in
> again with no issues.
>
> Anyway, one would think that since this bug was requested to be fixed in
> z-stream, it's a serious one, which means it should also be easy to hit it...

I agree, I was just trying to help while Haim wasn't available...

> I must admit the bug was so clear and fix so straightforward that I didn't
> really try to reproduce it.

Haim?
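The asynchronous detach behavior described above (bug 807023) is the reason the retest kept hitting the "Duplicate ID" error: libvirt reports success as soon as the unplug request has been sent to the guest. As a rough illustration only (not something taken from this bug or its fix), a management client using the public libvirt C API could poll the live domain XML after virDomainDetachDevice() to confirm the disk is really gone before reusing its address or alias; the domain name "demo1", the disk XML, and the "vdb" target below are invented example values.

    /* Hypothetical illustration only: poll the live XML after an async detach.
     * Uses only public libvirt API calls; the names and paths are made up. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <libvirt/libvirt.h>

    static int wait_for_disk_gone(virDomainPtr dom, const char *target, int tries)
    {
        for (int i = 0; i < tries; i++) {
            char *xml = virDomainGetXMLDesc(dom, 0);    /* live configuration */
            if (!xml)
                return -1;
            int present = strstr(xml, target) != NULL;  /* crude check for the <target dev=...> */
            free(xml);
            if (!present)
                return 0;                               /* unplug really finished */
            sleep(1);                                   /* the guest may take a while, or never comply */
        }
        return -1;                                      /* still attached: do not reuse the slot/alias */
    }

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        if (!conn)
            return 1;
        virDomainPtr dom = virDomainLookupByName(conn, "demo1");
        const char *disk_xml =
            "<disk type='file' device='disk'>"
            "  <driver name='qemu' type='raw' cache='none'/>"
            "  <source file='/var/lib/libvirt/images/disk.img'/>"
            "  <target dev='vdb' bus='virtio'/>"
            "</disk>";

        /* A zero return from virDomainDetachDevice() only means the request was sent. */
        if (dom && virDomainDetachDevice(dom, disk_xml) == 0 &&
            wait_for_disk_gone(dom, "dev='vdb'", 30) == 0)
            printf("disk is gone; safe to attach it elsewhere\n");

        if (dom)
            virDomainFree(dom);
        virConnectClose(conn);
        return 0;
    }

The point is only that a successful return from virDomainDetachDevice() cannot be taken as proof that the guest actually released the disk; whether to keep waiting or give up is a policy decision for the client.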
After talking with Jiri, we found it is better to verify this bug through code inspection. Download and install libvirt-0.10.2-1.el6.src.rpm; the following code can be found at src/qemu/qemu_hotplug.c line 250:

    if (qemuCapsGet(priv->caps, QEMU_CAPS_DEVICE)) {
        ret = qemuMonitorAddDrive(priv->mon, drivestr);
        if (ret == 0) {
            ret = qemuMonitorAddDevice(priv->mon, devstr);
            if (ret < 0) {
                /* device_add failed: save the original error, then roll back
                 * the drive added in the first step */
                virErrorPtr orig_err = virSaveLastError();
                if (qemuMonitorDriveDel(priv->mon, drivestr) < 0) {
                    VIR_WARN("Unable to remove drive %s (%s) after failed "
                             "qemuMonitorAddDevice",
                             drivestr, devstr);
                }
                /* restore the device_add error so it is the one reported */
                if (orig_err) {
                    virSetError(orig_err);
                    virFreeError(orig_err);
                }
            }
        }
    } else {

So verification passes.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html