Description of problem:

Live storage migration fails.

In vdsm.log:

jsonrpc.Executor/2::DEBUG::2016-04-11 10:15:35,946::__init__::511::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.diskReplicateFinish' in bridge with {u'dstDisk': {u'device': u'disk', u'poolID': u'591475db-6fa9-455d-9c05-7f6e30fb06d5', u'volumeID': u'ef2f51d9-2f06-4ff2-816b-85611b68863a', u'domainID': u'f283d64e-b27c-414f-871b-a7a24ca608cb', u'imageID': u'feebf8ab-cc21-497d-903a-be59b79105a5'}, u'vmID': u'2bc754e2-1896-4193-a4ae-b9096caa935a', u'srcDisk': {u'device': u'disk', u'poolID': u'591475db-6fa9-455d-9c05-7f6e30fb06d5', u'volumeID': u'ef2f51d9-2f06-4ff2-816b-85611b68863a', u'domainID': u'f283d64e-b27c-414f-871b-a7a24ca608cb', u'imageID': u'feebf8ab-cc21-497d-903a-be59b79105a5'}}

(No more logs in vdsm - this is another issue)

In engine log:

2016-04-11 10:12:32,998 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (org.ovirt.thread.pool-8-thread-7) [220a7abc] Command 'VmReplicateDiskFinishVDSCommand(HostName = voodoo6, VmReplicateDiskParameters:{runAsync='true', hostId='89887385-48b3-4648-b397-f2078519a61e', vmId='2bc754e2-1896-4193-a4ae-b9096caa935a'})' execution failed: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Resource unavailable, code = 40

In libvirt we see:

virsh # list
Please enter your authentication name: vdsm@ovirt
Please enter your password:
 Id    Name                           State
----------------------------------------------------
 4     from-temp-iscsi                running

virsh # dumpxml 4
<domain type='kvm' id='4'>
  <name>from-temp-iscsi</name>
  <uuid>2bc754e2-1896-4193-a4ae-b9096caa935a</uuid>
  <metadata xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
    <ovirt:qos/>
  </metadata>
  <maxMemory slots='16' unit='KiB'>4294967296</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <cputune>
    <shares>1020</shares>
  </cputune>
  <numatune>
    <memory mode='interleave' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>oVirt</entry>
      <entry name='product'>oVirt Node</entry>
      <entry name='version'>7.2-9.el7</entry>
      <entry name='serial'>77A14806-F954-4D2A-AEA6-C098A4469927</entry>
      <entry name='uuid'>2bc754e2-1896-4193-a4ae-b9096caa935a</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>SandyBridge</model>
    <topology sockets='16' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='variable' adjustment='0' basis='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source startupPolicy='optional'/>
      <backingStore/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <serial></serial>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <disk type='block' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/data-center/591475db-6fa9-455d-9c05-7f6e30fb06d5/f283d64e-b27c-414f-871b-a7a24ca608cb/images/feebf8ab-cc21-497d-903a-be59b79105a5/ef2f51d9-2f06-4ff2-816b-85611b68863a'/>
      <backingStore type='block' index='1'>
        <format type='qcow2'/>
        <source dev='/rhev/data-center/591475db-6fa9-455d-9c05-7f6e30fb06d5/f283d64e-b27c-414f-871b-a7a24ca608cb/images/feebf8ab-cc21-497d-903a-be59b79105a5/../feebf8ab-cc21-497d-903a-be59b79105a5/900b3b77-fb1e-4b9d-8688-2678a00c3444'/>
        <backingStore/>
      </backingStore>
      <mirror type='file' file='/rhev/data-center/591475db-6fa9-455d-9c05-7f6e30fb06d5/54546265-cd38-476d-95f2-3a8b274ed152/images/feebf8ab-cc21-497d-903a-be59b79105a5/ef2f51d9-2f06-4ff2-816b-85611b68863a' format='qcow2' job='copy' ready='yes'>
        <format type='qcow2'/>
        <source file='/rhev/data-center/591475db-6fa9-455d-9c05-7f6e30fb06d5/54546265-cd38-476d-95f2-3a8b274ed152/images/feebf8ab-cc21-497d-903a-be59b79105a5/ef2f51d9-2f06-4ff2-816b-85611b68863a'>
          <seclabel model='selinux' labelskip='yes'/>
        </source>
      </mirror>
      <target dev='vda' bus='virtio'/>
      <serial>feebf8ab-cc21-497d-903a-be59b79105a5</serial>
      <boot order='1'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='virtio-serial' index='0' ports='16'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:16:01:55'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/2bc754e2-1896-4193-a4ae-b9096caa935a.com.redhat.rhevm.vdsm'/>
      <target type='virtio' name='com.redhat.rhevm.vdsm' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/2bc754e2-1896-4193-a4ae-b9096caa935a.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
      <alias name='channel2'/>
      <address type='virtio-serial' controller='0' bus='0' port='3'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' port='5900' tlsPort='5901' autoport='yes' listen='10.35.0.110' passwdValidTo='2016-04-11T07:09:36' connected='disconnect'>
      <listen type='network' address='10.35.0.110' network='vdsm-ovirtmgmt'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='none'>
      <alias name='balloon0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c363,c924</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c363,c924</imagelabel>
  </seclabel>
</domain>

virsh # blockjob 4 vda --raw
 type=Block Copy
 bandwidth=0
 cur=1478950912
 end=1478950912

This is the failing check in vdsm/virt/vm.py:

    # Checking if we reached the replication mode ("mirroring" in libvirt
    # and qemu terms)
    if blkJobInfo['cur'] != blkJobInfo['end']:
        return response.error('unavail')

May be:
- libvirt bug (changed API that vdsm depends on)
- engine bug (not handling ResourceUnavailable)

Version-Release number of selected component (if applicable):
- ovirt-engine-3.6.0.3-1.fc22.noarch
- vdsm master 770f6d251616f3e0d18242a48f286b1d37d68149
- libvirt-daemon-1.2.17-13.el7_2.4.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Perform live storage migration

Actual results:
Migration fails, libvirt block jobs left running

Expected results:
Block job aborted, migration succeeds

Additional info:
Issue appeared after adding the rhel-7.z repo to the host and updating the host.

Todo:
- add logs showing the libvirt response
- check with a more recent engine
- check with an older libvirt
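Note that the counters virsh reports above are equal (cur == end), so by the quoted vdsm check the job should already count as ready. A minimal model of that check (an illustrative standalone sketch, not vdsm's actual code path; the function name is hypothetical) shows the values from this report passing it:

```python
# Illustrative model of the readiness check quoted above from vdsm/virt/vm.py.
# Replication ("mirroring") is considered reached when the bytes processed
# ('cur') equal the job length ('end'), as reported by libvirt's block job info.

def replication_ready(blk_job_info):
    """Return True when the copy block job has reached mirroring."""
    return blk_job_info['cur'] == blk_job_info['end']

# The counters reported by `virsh blockjob 4 vda --raw` in this report:
info = {'bandwidth': 0, 'cur': 1478950912, 'end': 1478950912}
print(replication_ready(info))  # → True
```

Since virsh sees cur == end while vdsm still returned 'unavail', the failure presumably happened at an earlier poll, or libvirt returned something vdsm did not expect, which is consistent with the "libvirt bug (changed API)" suspicion and the Todo item about logging the libvirt response.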
Reproducing is tricky. I improved the relevant logging and tried again to understand what we get from libvirt, but I cannot reproduce it.
Nir, what are you doing with this bug? You wrote it's always reproducible, and then you could not reproduce it.
(In reply to Tal Nisan from comment #2)
> Nir, what are you doing with this bug? You wrote it's always reproducible
> and then you could not reproduce

I could reproduce it 2-3 times, and then I could not. We need more time to investigate this.
Tested again with a fresh install of ovirt-engine (3.6.4.1-1.el7.centos) and updated RHEL 7.2 hosts running vdsm master + https://gerrit.ovirt.org/56110. Cannot reproduce. The attached patch improves logging, so if this happens again we will have more info on the failure.
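The actual change is the gerrit patch linked above. Purely as an illustration of the kind of logging that helps here (a hedged sketch with hypothetical names, not the patch's code), one would record libvirt's raw counters before returning the error, so a future failure leaves evidence of what libvirt reported at the moment of the check:

```python
import logging

log = logging.getLogger("virt.vm")

def check_replication_finished(blk_job_info):
    # Log the raw counters unconditionally, so that when the check fails
    # we can see exactly what libvirt reported (the original error path
    # returned 'unavail' without recording the job info).
    log.info("block job info: cur=%(cur)s end=%(end)s", blk_job_info)
    if blk_job_info['cur'] != blk_job_info['end']:
        log.error("replication not ready yet: %s", blk_job_info)
        return 'unavail'
    return 'done'
```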
Closing as works for me; please reopen if this happens again.