Bug 1904202 - Create backup failed on vm after hot plugging a disk
Summary: Create backup failed on vm after hot plugging a disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backup-Restore.VMs
Version: 4.4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.4
Target Release: 4.4.4.5
Assignee: Eyal Shenitzky
QA Contact: Ilan Zuckerman
bugs@ovirt.org
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-12-03 19:46 UTC by Yury.Panchenko
Modified: 2021-01-12 16:23 UTC (History)
7 users

Fixed In Version: ovirt-engine-4.4.4.5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-12 16:23:48 UTC
oVirt Team: Storage
Embargoed:


Attachments
logs from kvm (2.10 MB, application/zip)
2020-12-03 19:46 UTC, Yury.Panchenko
vm xml before hot plugging disk (13.79 KB, application/xml)
2020-12-09 17:06 UTC, Nir Soffer
vm xml after hot plugging disk (15.73 KB, application/xml)
2020-12-09 17:07 UTC, Nir Soffer
vm with hot plugged disk using virtio interface (15.86 KB, text/plain)
2020-12-09 18:05 UTC, Nir Soffer
Disk statuses (30.73 KB, image/png)
2020-12-09 18:16 UTC, Yury.Panchenko


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 112590 0 master MERGED core: add validation for disks to be active before starting a backup 2021-02-21 16:40:29 UTC

Description Yury.Panchenko 2020-12-03 19:46:06 UTC
Created attachment 1736164 [details]
logs from kvm

Version:
Version 4.4.4.2-0.0.master.20201121131637.git6b197466559.el8
vdsm-4.40.37-9.git2611c054f.el8.x86_64
ovirt-imageio-daemon-2.2.0-0.202011111341.git5931b13.el8.x86_64
libvirt-daemon-6.6.0-7.module+el8.3.0+8424+5ea525c5.x86_64
qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64


Description of problem:
I have a regular vm with one virtio-scsi disk. This vm has been backed up many times in full and incremental mode.
After that, I added a second disk with a SATA interface.
Now the vm backup cannot be started and vdsm logs an exception.
I tried to reproduce this issue on another vm, but the problem occurs only with that vm.

In the vdsm log:
2020-12-03 19:41:19,257+0100 INFO  (jsonrpc/3) [api.virt] START start_backup(config={'backup_id': '1ab13a69-d08a-453a-acf4-e056a6443418', 'disks': [{'checkpoint': True, 'imageID': '8ee4a0db-4363-4b8b-ae11-0ce762df76bb', 'volumeID': '421c5948-1d34-4dd6-9df3-5560fceae467', 'domainID': 'c30031cf-2660-46a5-9fcb-0f92817a6718'}, {'checkpoint': True, 'imageID': '2c768d24-a5c2-4799-a249-e16db3b4722e', 'volumeID': 'e66c5210-d935-42cb-aaa1-c70addee069f', 'domainID': 'c30031cf-2660-46a5-9fcb-0f92817a6718'}], 'from_checkpoint_id': None, 'parent_checkpoint_id': '3d8045e3-f3bd-42a8-89c8-b5656e8887ab', 'to_checkpoint_id': '68c1f8fe-258f-4e12-8e51-2f000e023227'}) from=::ffff:172.25.16.43,59102, flow_id=d5067bfd-9d98-4133-90b0-fd890c2c4eb9, vmId=8bfe0842-feea-4085-bbb7-8fb1e85c8d38 (api:48)
2020-12-03 19:41:19,258+0100 ERROR (jsonrpc/3) [api] FINISH start_backup error=Backup Error: {'vm_id': '8bfe0842-feea-4085-bbb7-8fb1e85c8d38', 'backup': <vdsm.virt.backup.BackupConfig object at 0x7f66e8254748>, 'reason': "Failed to find one of the backup disks: No such drive: '{'domainID': 'c30031cf-2660-46a5-9fcb-0f92817a6718', 'imageID': '8ee4a0db-4363-4b8b-ae11-0ce762df76bb', 'volumeID': '421c5948-1d34-4dd6-9df3-5560fceae467'}'"} (api:131)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 160, in start_backup
    drives = _get_disks_drives(vm, backup_cfg.disks)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 364, in _get_disks_drives
    'volumeID': disk.vol_id})
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 3898, in findDriveByUUIDs
    raise LookupError("No such drive: '%s'" % drive)
LookupError: No such drive: '{'domainID': 'c30031cf-2660-46a5-9fcb-0f92817a6718', 'imageID': '8ee4a0db-4363-4b8b-ae11-0ce762df76bb', 'volumeID': '421c5948-1d34-4dd6-9df3-5560fceae467'}'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 666, in start_backup
    return self.vm.start_backup(config)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 70, in wrapper
    return f(*a, **kw)
  File "<decorator-gen-263>", line 2, in start_backup
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 101, in method
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4035, in start_backup
    return backup.start_backup(self, dom, config)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 165, in start_backup

Comment 1 Nir Soffer 2020-12-03 21:54:12 UTC
Yuri, can you add the vm xml?

Best done using:

    virsh -r dumpxml vm-name

Comment 2 Yury.Panchenko 2020-12-09 16:04:20 UTC
Hello, Nir.

<domain type='kvm' id='22' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>kvm-cent</name>
  <uuid>8bfe0842-feea-4085-bbb7-8fb1e85c8d38</uuid>
  <metadata xmlns:ns1="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ns1:qos/>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ovirt-vm:balloonTarget type="int">1048576</ovirt-vm:balloonTarget>
    <ovirt-vm:clusterVersion>4.4</ovirt-vm:clusterVersion>
    <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot>
    <ovirt-vm:launchPaused>false</ovirt-vm:launchPaused>
    <ovirt-vm:memGuaranteedSize type="int">1024</ovirt-vm:memGuaranteedSize>
    <ovirt-vm:minGuaranteedMemoryMb type="int">1024</ovirt-vm:minGuaranteedMemoryMb>
    <ovirt-vm:resumeBehavior>auto_resume</ovirt-vm:resumeBehavior>
    <ovirt-vm:startTime type="float">1607020775.3071415</ovirt-vm:startTime>
    <ovirt-vm:device mac_address="56:6f:2d:cf:00:06">
        <ovirt-vm:network>ovirtmgmt</ovirt-vm:network>
    </ovirt-vm:device>
    <ovirt-vm:device mac_address="56:6f:2d:cf:00:00">
        <ovirt-vm:network>ovirtmgmt</ovirt-vm:network>
    </ovirt-vm:device>
    <ovirt-vm:device devtype="disk" name="sda">
        <ovirt-vm:domainID>c30031cf-2660-46a5-9fcb-0f92817a6718</ovirt-vm:domainID>
        <ovirt-vm:guestName>/dev/sda</ovirt-vm:guestName>
        <ovirt-vm:imageID>2c768d24-a5c2-4799-a249-e16db3b4722e</ovirt-vm:imageID>
        <ovirt-vm:poolID>b024005c-2e45-11eb-a07e-00163e4b112f</ovirt-vm:poolID>
        <ovirt-vm:volumeID>e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:volumeID>
        <ovirt-vm:volumeChain>
            <ovirt-vm:volumeChainNode>
                <ovirt-vm:domainID>c30031cf-2660-46a5-9fcb-0f92817a6718</ovirt-vm:domainID>
                <ovirt-vm:imageID>2c768d24-a5c2-4799-a249-e16db3b4722e</ovirt-vm:imageID>
                <ovirt-vm:leaseOffset type="int">0</ovirt-vm:leaseOffset>
                <ovirt-vm:leasePath>/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f.lease</ovirt-vm:leasePath>
                <ovirt-vm:path>/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:path>
                <ovirt-vm:volumeID>e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:volumeID>
            </ovirt-vm:volumeChainNode>
        </ovirt-vm:volumeChain>
    </ovirt-vm:device>
    <ovirt-vm:device devtype="disk" name="sdc"/>
</ovirt-vm:vm>
  </metadata>
  <maxMemory slots='16' unit='KiB'>4194304</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <iothreads>1</iothreads>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>oVirt</entry>
      <entry name='product'>RHEL</entry>
      <entry name='version'>8.3-1.0.el8</entry>
      <entry name='serial'>9be32c42-cfc3-dceb-044a-53e2a04b451e</entry>
      <entry name='uuid'>8bfe0842-feea-4085-bbb7-8fb1e85c8d38</entry>
      <entry name='family'>oVirt</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <vmcoreinfo state='on'/>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <topology sockets='16' dies='1' cores='1' threads='1'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='spec-ctrl'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='xsaveopt'/>
    <numa>
      <cell id='0' cpus='0-15' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='variable' adjustment='0' basis='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' error_policy='report'/>
      <source file='/rhev/data-center/mnt/KVMC-SD.robofish.local:_ISO/52859891-62c3-4903-97fc-fbccee0f7b89/images/11111111-1111-1111-1111-111111111111/CentOS-8.2.2004-x86_64-dvd1.iso' startupPolicy='optional' index='2'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sdc' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <alias name='ua-b56a330d-257d-4db1-ba9f-2505db6a8824'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads' discard='unmap'/>
      <source file='/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f' index='1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>2c768d24-a5c2-4799-a249-e16db3b4722e</serial>
      <boot order='1'/>
      <alias name='ua-2c768d24-a5c2-4799-a249-e16db3b4722e'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x17'/>
      <alias name='pci.8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x18'/>
      <alias name='pci.9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x19'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='11' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='11' port='0x1a'/>
      <alias name='pci.11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1b'/>
      <alias name='pci.12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1c'/>
      <alias name='pci.13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x4'/>
    </controller>
    <controller type='pci' index='14' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='14' port='0x1d'/>
      <alias name='pci.14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x5'/>
    </controller>
    <controller type='pci' index='15' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='15' port='0x1e'/>
      <alias name='pci.15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x6'/>
    </controller>
    <controller type='pci' index='16' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='16' port='0x1f'/>
      <alias name='pci.16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x7'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='virtio-serial' index='0' ports='16'>
      <alias name='ua-330794d0-f086-42d6-a939-0ae78730bb41'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci' ports='8'>
      <alias name='ua-64eb5f60-3ff7-4b1a-964a-f620939b3a65'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver iothread='1'/>
      <alias name='ua-8b2eb04c-5e84-4d6f-92a1-aa2b8d3025ba'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='56:6f:2d:cf:00:00'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet23'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-e819885e-761e-4e83-aace-42b9ea04afa0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='56:6f:2d:cf:00:06'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet24'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-a82abd01-0b86-452a-abeb-c798a559c07b'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/8bfe0842-feea-4085-bbb7-8fb1e85c8d38.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='spice' port='5903' tlsPort='5904' autoport='yes' listen='172.25.16.33' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='172.25.16.33' network='vdsm-ovirtmgmt'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
      <channel name='smartcard' mode='secure'/>
      <channel name='usbredir' mode='secure'/>
    </graphics>
    <graphics type='vnc' port='5905' autoport='yes' listen='172.25.16.33' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='172.25.16.33' network='vdsm-ovirtmgmt'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-cbd0abe0-5982-481a-8525-fe4e85413613'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='5'/>
      <alias name='ua-b7688ad7-65cf-4936-9f77-a071a3989fd7'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='ua-68d8e882-0e5d-4279-85cc-e57a63cc10a2'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c842,c918</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c842,c918</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
  <qemu:capabilities>
    <qemu:add capability='blockdev'/>
    <qemu:add capability='incremental-backup'/>
  </qemu:capabilities>
</domain>

Comment 3 Nir Soffer 2020-12-09 16:20:32 UTC
(In reply to Yury.Panchenko from comment #2)
> Hello, Nir.

Thanks, let's make this part of every bug report in future bugs.

> <domain type='kvm' id='22' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
>   <name>kvm-cent</name>

Here we have metadata for disk sda:

>     <ovirt-vm:device devtype="disk" name="sda">
>        <ovirt-vm:domainID>c30031cf-2660-46a5-9fcb-0f92817a6718</ovirt-vm:domainID>
>         <ovirt-vm:guestName>/dev/sda</ovirt-vm:guestName>
>         <ovirt-vm:imageID>2c768d24-a5c2-4799-a249-e16db3b4722e</ovirt-vm:imageID>
>         <ovirt-vm:poolID>b024005c-2e45-11eb-a07e-00163e4b112f</ovirt-vm:poolID>
>         <ovirt-vm:volumeID>e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:volumeID>
>      ...
>     </ovirt-vm:device>

Disk sdc is a cdrom on file storage; we don't keep metadata for that.

>     <ovirt-vm:device devtype="disk" name="sdc"/>
>     </ovirt-vm:vm>
>   </metadata>

Indeed sdc is a cdrom:

>     <disk type='file' device='cdrom'>
>       <driver name='qemu' type='raw' error_policy='report'/>
...
>       <target dev='sdc' bus='sata'/>
...
>     </disk>

This is disk sda:

>     <disk type='file' device='disk' snapshot='no'>
>       <driver name='qemu' type='qcow2' cache='none' error_policy='stop'
> io='threads' discard='unmap'/>
>       <source
> file='/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-
> 9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-
> 42cb-aaa1-c70addee069f' index='1'>
>         <seclabel model='dac' relabel='no'/>
>       </source>
>       <backingStore/>
>       <target dev='sda' bus='scsi'/>
>       <serial>2c768d24-a5c2-4799-a249-e16db3b4722e</serial>
>       <boot order='1'/>
>       <alias name='ua-2c768d24-a5c2-4799-a249-e16db3b4722e'/>
>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>     </disk>

This vm looks normal, but we don't see the hot plugged disk. Looking at 
the xml after hot plugging the disk may reveal the issue.

If you can reproduce this, please add the xml of the vm after hot plugging
the disk, before starting the backup.

Comment 4 Nir Soffer 2020-12-09 16:23:55 UTC
Avihai, can we reproduce this in the lab?

This should be pretty easy:
1. Start a vm with one disk
2. Hot plug another disk
3. Verify that the hot plug succeeded
4. Try to back up both disks or only the new disk (see the sketch below)
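For reference, a minimal ovirt-engine-sdk (Python) sketch of steps 2-4, assuming placeholder connection details and ids (not taken from this bug) and an existing floating disk; the polling follows the usual SDK pattern of waiting for the disk status to become OK before starting the backup:

    import time

    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    # Placeholder connection details and ids -- substitute your own.
    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',
        username='admin@internal',
        password='password',
        ca_file='ca.pem',
    )
    vm_id = 'REPLACE-WITH-VM-ID'
    disk_id = 'REPLACE-WITH-FLOATING-DISK-ID'

    system_service = connection.system_service()
    vm_service = system_service.vms_service().vm_service(vm_id)

    # 2. Hot plug the floating disk into the running vm.
    vm_service.disk_attachments_service().add(
        types.DiskAttachment(
            disk=types.Disk(id=disk_id),
            interface=types.DiskInterface.VIRTIO_SCSI,
            active=True,
        )
    )

    # 3. Verify the hot plug succeeded: poll until the disk status is OK.
    disk_service = system_service.disks_service().disk_service(disk_id)
    while disk_service.get().status != types.DiskStatus.OK:
        time.sleep(1)

    # 4. Start a backup that includes the hot plugged disk.
    backup = vm_service.backups_service().add(
        types.Backup(disks=[types.Disk(id=disk_id)])
    )
    print('Started backup %s' % backup.id)

    connection.close()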

Comment 5 Yury.Panchenko 2020-12-09 17:00:30 UTC
Nir.
This issue is confirmed for SATA and VIRTIO, but not for the default VIRTIO-SCSI.
If the disk is hot plugged, the backup always fails; rebooting the vm does not solve the problem.

> This vm looks normal, but we don't see the hot plugged disk. Looking at 
> the xml after hot plugging the disk may reveal the issue.
Checked again: after the hot plug there is no new disk in the disks node.

[root@KVMC-RHEL83 ~]# virsh -r dumpxml kvm-cent
<domain type='kvm' id='30' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>kvm-cent</name>
  <uuid>8bfe0842-feea-4085-bbb7-8fb1e85c8d38</uuid>
  <metadata xmlns:ns1="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ns1:qos/>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ovirt-vm:balloonTarget type="int">1048576</ovirt-vm:balloonTarget>
    <ovirt-vm:clusterVersion>4.4</ovirt-vm:clusterVersion>
    <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot>
    <ovirt-vm:launchPaused>false</ovirt-vm:launchPaused>
    <ovirt-vm:memGuaranteedSize type="int">1024</ovirt-vm:memGuaranteedSize>
    <ovirt-vm:minGuaranteedMemoryMb type="int">1024</ovirt-vm:minGuaranteedMemoryMb>
    <ovirt-vm:resumeBehavior>auto_resume</ovirt-vm:resumeBehavior>
    <ovirt-vm:startTime type="float">1607532863.6684952</ovirt-vm:startTime>
    <ovirt-vm:device mac_address="56:6f:2d:cf:00:06">
        <ovirt-vm:network>ovirtmgmt</ovirt-vm:network>
    </ovirt-vm:device>
    <ovirt-vm:device mac_address="56:6f:2d:cf:00:00">
        <ovirt-vm:network>ovirtmgmt</ovirt-vm:network>
    </ovirt-vm:device>
    <ovirt-vm:device devtype="disk" name="sda">
        <ovirt-vm:domainID>c30031cf-2660-46a5-9fcb-0f92817a6718</ovirt-vm:domainID>
        <ovirt-vm:guestName>/dev/sda</ovirt-vm:guestName>
        <ovirt-vm:imageID>2c768d24-a5c2-4799-a249-e16db3b4722e</ovirt-vm:imageID>
        <ovirt-vm:poolID>b024005c-2e45-11eb-a07e-00163e4b112f</ovirt-vm:poolID>
        <ovirt-vm:volumeID>e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:volumeID>
        <ovirt-vm:volumeChain>
            <ovirt-vm:volumeChainNode>
                <ovirt-vm:domainID>c30031cf-2660-46a5-9fcb-0f92817a6718</ovirt-vm:domainID>
                <ovirt-vm:imageID>2c768d24-a5c2-4799-a249-e16db3b4722e</ovirt-vm:imageID>
                <ovirt-vm:leaseOffset type="int">0</ovirt-vm:leaseOffset>
                <ovirt-vm:leasePath>/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f.lease</ovirt-vm:leasePath>
                <ovirt-vm:path>/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:path>
                <ovirt-vm:volumeID>e66c5210-d935-42cb-aaa1-c70addee069f</ovirt-vm:volumeID>
            </ovirt-vm:volumeChainNode>
        </ovirt-vm:volumeChain>
    </ovirt-vm:device>
    <ovirt-vm:device devtype="disk" name="sdc"/>
</ovirt-vm:vm>
  </metadata>
  <maxMemory slots='16' unit='KiB'>4194304</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <iothreads>1</iothreads>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>oVirt</entry>
      <entry name='product'>RHEL</entry>
      <entry name='version'>8.3-1.0.el8</entry>
      <entry name='serial'>9be32c42-cfc3-dceb-044a-53e2a04b451e</entry>
      <entry name='uuid'>8bfe0842-feea-4085-bbb7-8fb1e85c8d38</entry>
      <entry name='family'>oVirt</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <vmcoreinfo state='on'/>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <topology sockets='16' dies='1' cores='1' threads='1'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='spec-ctrl'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='xsaveopt'/>
    <numa>
      <cell id='0' cpus='0-15' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='variable' adjustment='0' basis='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' error_policy='report'/>
      <source file='/rhev/data-center/mnt/KVMC-SD.robofish.local:_ISO/52859891-62c3-4903-97fc-fbccee0f7b89/images/11111111-1111-1111-1111-111111111111/CentOS-8.2.2004-x86_64-dvd1.iso' startupPolicy='optional' index='2'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sdc' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <alias name='ua-b56a330d-257d-4db1-ba9f-2505db6a8824'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='threads' discard='unmap'/>
      <source file='/rhev/data-center/mnt/KVMC-SD.robofish.local:_NFS/c30031cf-2660-46a5-9fcb-0f92817a6718/images/2c768d24-a5c2-4799-a249-e16db3b4722e/e66c5210-d935-42cb-aaa1-c70addee069f' index='1'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>2c768d24-a5c2-4799-a249-e16db3b4722e</serial>
      <boot order='1'/>
      <alias name='ua-2c768d24-a5c2-4799-a249-e16db3b4722e'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver iothread='1'/>
      <alias name='ua-16568ed3-8450-4acc-bb0d-bb36f184f34b'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x17'/>
      <alias name='pci.8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x18'/>
      <alias name='pci.9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x19'/>
      <alias name='pci.10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='11' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='11' port='0x1a'/>
      <alias name='pci.11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1b'/>
      <alias name='pci.12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1c'/>
      <alias name='pci.13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x4'/>
    </controller>
    <controller type='pci' index='14' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='14' port='0x1d'/>
      <alias name='pci.14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x5'/>
    </controller>
    <controller type='pci' index='15' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='15' port='0x1e'/>
      <alias name='pci.15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x6'/>
    </controller>
    <controller type='pci' index='16' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='16' port='0x1f'/>
      <alias name='pci.16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x7'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='virtio-serial' index='0' ports='16'>
      <alias name='ua-330794d0-f086-42d6-a939-0ae78730bb41'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci' ports='8'>
      <alias name='ua-64eb5f60-3ff7-4b1a-964a-f620939b3a65'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='56:6f:2d:cf:00:00'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet34'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-e819885e-761e-4e83-aace-42b9ea04afa0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='56:6f:2d:cf:00:06'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet35'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-a82abd01-0b86-452a-abeb-c798a559c07b'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/8bfe0842-feea-4085-bbb7-8fb1e85c8d38.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='spice' port='5903' tlsPort='5904' autoport='yes' listen='172.25.16.33' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='172.25.16.33' network='vdsm-ovirtmgmt'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
      <channel name='smartcard' mode='secure'/>
      <channel name='usbredir' mode='secure'/>
    </graphics>
    <graphics type='vnc' port='5905' autoport='yes' listen='172.25.16.33' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='172.25.16.33' network='vdsm-ovirtmgmt'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-cbd0abe0-5982-481a-8525-fe4e85413613'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='5'/>
      <alias name='ua-b7688ad7-65cf-4936-9f77-a071a3989fd7'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='ua-68d8e882-0e5d-4279-85cc-e57a63cc10a2'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c46,c741</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c46,c741</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
  <qemu:capabilities>
    <qemu:add capability='blockdev'/>
    <qemu:add capability='incremental-backup'/>
  </qemu:capabilities>
</domain>

Comment 6 Nir Soffer 2020-12-09 17:06:41 UTC
Created attachment 1737967 [details]
vm xml before hot plugging disk

Comment 7 Nir Soffer 2020-12-09 17:07:11 UTC
Created attachment 1737968 [details]
vm xml after hot plugging disk

Comment 8 Nir Soffer 2020-12-09 17:08:37 UTC
I tried to reproduce with a similar vm on nfs, but could not reproduce.

1. I started a vm with one disk on nfs
   See attachment 1737967 [details]

2. Did a full backup with all disks - ok

3. I created another disk and attached it to the vm.

4. Waited until the disk status changed to OK
   See attachment 1737968 [details]

5. Ran a full backup again with the new disk - ok

$ ./backup_vm.py -c engine-dev full 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0 --disk-uuid 2d53b21e-183a-46c6-a3be-3a588c3a0418
[   0.0 ] Starting full backup for VM 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
[   0.3 ] Waiting until backup 57f91cc6-50a0-43b8-a26c-f6075778e3a5 is ready
[   2.3 ] Created checkpoint '65f6e057-72cb-4e76-9732-bfca812ad928' (to use in --from-checkpoint-uuid for the next incremental backup)
[   2.3 ] Creating image transfer for disk 2d53b21e-183a-46c6-a3be-3a588c3a0418
[   3.4 ] Image transfer 7febb461-a8ce-4981-8dd1-5a2283bae077 is ready
[ 100.00% ] 10.00 GiB, 0.12 seconds, 84.45 GiB/s                               
[   3.5 ] Finalizing image transfer
[   5.6 ] Finalizing backup
[   5.6 ] Waiting until backup is finalized
[   5.7 ] Full backup completed successfully

6. Ran a full backup again with all disks:

[nsoffer@sparse examples (reduce-disk)]$ ./backup_vm.py -c engine-dev full 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
[   0.0 ] Starting full backup for VM 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
[   0.2 ] Waiting until backup c56b3047-0364-46e2-ac58-78a5614d526b is ready
[   1.2 ] Created checkpoint '33e5bd22-d8dd-4c55-8c3b-674d06950a31' (to use in --from-checkpoint-uuid for the next incremental backup)
[   1.2 ] Creating image transfer for disk 126eea31-c5a2-4c01-a18d-9822b0c05c2a
[   2.3 ] Image transfer eacb3f84-c5a4-4a2e-8e9a-e06873918542 is ready
[ 100.00% ] 6.00 GiB, 24.96 seconds, 246.16 MiB/s                              
[  27.3 ] Finalizing image transfer
[  36.4 ] Creating image transfer for disk 2d53b21e-183a-46c6-a3be-3a588c3a0418
[  37.5 ] Image transfer 4c5e14fc-47be-47db-8a2f-9e27b94d39d8 is ready
[ 100.00% ] 10.00 GiB, 0.06 seconds, 156.03 GiB/s                              
[  37.6 ] Finalizing image transfer
[  39.6 ] Finalizing backup
[  39.7 ] Waiting until backup is finalized
[  39.7 ] Full backup completed successfully

Yuri, is it possible that you did not wait until the hot plug finished?

When using the UI, you need to wait until the disk moves to the Up state (green arrow).
In the SDK, you need to poll the disk status until it is OK.

See this example:
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/add_vm_disk.py

Comment 9 Nir Soffer 2020-12-09 17:55:23 UTC
I did another test with the same vm:

1. Deactivate the second disk
2. Detach the disk
3. Attach the disk - it became active in less than a second
4. Run full backup for this disk - ok

In this test the hot plug was probably too fast to reproduce the issue, but I think it
may still be possible.

Then I did another test:

1. Deactivate the second disk
2. Remove it permanently
3. Create new disk on nfs and attach to the vm (using the UI)
4. Immediately start full backup for all disks

$ ./backup_vm.py -c engine-dev full 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
[   0.0 ] Starting full backup for VM 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
Traceback (most recent call last):
  File "./backup_vm.py", line 431, in <module>
    main()
  File "./backup_vm.py", line 161, in main
    args.command(args)
  File "./backup_vm.py", line 175, in cmd_full
    backup = start_backup(connection, args)
  File "./backup_vm.py", line 298, in start_backup
    backup = backups_service.add(
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/services.py", line 33572, in add
    return self._internal_add(backup, headers, query, wait)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot backup VM: The following disks are locked: backup-raw_plugged-disk. Please try again in a few minutes.]". HTTP response code is 409.

This reproduces the issue, but the error is pretty clear: the disk is locked and the
user should try again later.

A few seconds after the failure I ran another full backup and it succeeded.

So this looks like incorrect usage, not a bug in the system.

Regardless, I think we have an issue with this engine response - we get a generic
error (HTTP 409) that is not specific to backup, so the user does not have any
way to handle it. We need to report a specific error code that can be
handled by a program using this API. I'll file another bug for this.
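Until then, a backup client can work around this by retrying when the engine reports that the disks are locked. A rough sketch, assuming the SDK raises ovirtsdk4.Error for this fault as in the traceback above; matching on the message text and the retry parameters are assumptions, since there is no dedicated error code yet:

    import time

    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types


    def start_backup_with_retry(backups_service, disk_ids, retries=5, delay=10):
        """Start a backup, retrying while the disks are still locked."""
        for attempt in range(retries):
            try:
                return backups_service.add(
                    types.Backup(disks=[types.Disk(id=d) for d in disk_ids])
                )
            except sdk.Error as e:
                # The engine currently returns a generic fault (HTTP 409) with a
                # "disks are locked" detail, so match on the message text.
                if 'locked' in str(e) and attempt < retries - 1:
                    time.sleep(delay)
                else:
                    raise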

Comment 10 Nir Soffer 2020-12-09 17:59:54 UTC
(In reply to Yury.Panchenko from comment #5)
> Nir.
> This issue confirmed for SATA and VIRTIO but not for default VIRTIO-SCSI
> If disk is hotplugged backup always failed, vm reboot not solve that problem.
> 
> > This vm looks normal, but we don't see the hot plugged disk. Looking at 
> > the xml after hot plugging the disk may reveal the issue.
> Checked again, after hoplung no new disk is disks node

This means the hot plug did not finish, or failed. Did you wait until the
hot plug completed successfully?

I will check again with virtio and sata.

Comment 11 Nir Soffer 2020-12-09 18:05:15 UTC
Created attachment 1737978 [details]
vm with hot plugged disk using virtio interface

Comment 12 Yury.Panchenko 2020-12-09 18:16:52 UTC
Created attachment 1737979 [details]
Disk statuses

So, I waited more than 5 minutes.
The second disk has a red arrow but an OK status,
and the backup still doesn't work.

Comment 13 Yury.Panchenko 2020-12-09 18:31:28 UTC
Nir.
You were right.
If a disk is marked as inactive (red arrow), the backup fails.
In my lab, hot plugging a SATA disk always failed.

But I have a different question: if we don't attach the disk during its creation, why do we leave it in the VM configuration?

Also, if a disk is inactive it can't be used by the VM, so we could probably just skip such disks in the backup process.

Third thing: after a vm shutdown and power-on cycle, these disks are still left in the inactive state.

Comment 14 Nir Soffer 2020-12-09 18:36:52 UTC
I could not reproduce with a disk hot plugged using virtio,
see attachment 1737978 [details]. Backing up the new disk succeeded after the hot plug.

I could not hot plug a disk using SATA; the attach operation fails with:

    Error while executing action:

    backup-raw:
    Cannot attach Virtual Disk: The disk interface is not supported by the VM OS:
    Red Hat Enterprise Linux 8.x x64.

It is possible to attach the disk using "Virtio", and then deactivate the disk, and
change the interface to "SATA". In this case the disk remains deactivated.

In the API the disk looks like this:

GET https://engine-dev:8443/ovirt-engine/api/vms/4dc3bb16-f8d1-4f59-9388-a93f68da7cf0/diskattachments

<disk_attachments>
  <disk_attachment href="/ovirt-engine/api/vms/4dc3bb16-f8d1-4f59-9388-a93f68da7cf0/diskattachments/126eea31-c5a2-4c01-a18d-9822b0c05c2a" id="126eea31-c5a2-4c01-a18d-9822b0c05c2a">
    <active>true</active>
    <bootable>true</bootable>
    <interface>virtio_scsi</interface>
    <logical_name>/dev/sda</logical_name>
    <pass_discard>true</pass_discard>
    <read_only>false</read_only>
    <uses_scsi_reservation>false</uses_scsi_reservation>
    <disk href="/ovirt-engine/api/disks/126eea31-c5a2-4c01-a18d-9822b0c05c2a" id="126eea31-c5a2-4c01-a18d-9822b0c05c2a"/>
    <vm href="/ovirt-engine/api/vms/4dc3bb16-f8d1-4f59-9388-a93f68da7cf0" id="4dc3bb16-f8d1-4f59-9388-a93f68da7cf0"/>
  </disk_attachment>
  <disk_attachment href="/ovirt-engine/api/vms/4dc3bb16-f8d1-4f59-9388-a93f68da7cf0/diskattachments/b74df755-e0e6-4c93-b0fc-5c61934c0511" id="b74df755-e0e6-4c93-b0fc-5c61934c0511">
    <active>false</active>
    <bootable>false</bootable>
    <interface>sata</interface>
    <pass_discard>false</pass_discard>
    <read_only>false</read_only>
    <uses_scsi_reservation>false</uses_scsi_reservation>
    <disk href="/ovirt-engine/api/disks/b74df755-e0e6-4c93-b0fc-5c61934c0511" id="b74df755-e0e6-4c93-b0fc-5c61934c0511"/>
    <vm href="/ovirt-engine/api/vms/4dc3bb16-f8d1-4f59-9388-a93f68da7cf0" id="4dc3bb16-f8d1-4f59-9388-a93f68da7cf0"/>
  </disk_attachment>
</disk_attachments>

I think the issue is that SATA does not support hot plug.

Starting backup in this state:

$ ./backup_vm.py -c engine-dev full 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0 --disk-uuid b74df755-e0e6-4c93-b0fc-5c61934c0511
[   0.0 ] Starting full backup for VM 4dc3bb16-f8d1-4f59-9388-a93f68da7cf0
[   0.3 ] Waiting until backup a24a84c4-77d6-43e5-9d76-cc7c3cce6258 is ready
Traceback (most recent call last):
  File "./backup_vm.py", line 312, in start_backup
    backup = backup_service.get()
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/services.py", line 33342, in get
    return self._internal_get(headers, query, wait)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 211, in _internal_get
    return future.wait() if wait else future
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 208, in callback
    self._check_fault(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 130, in _check_fault
    body = self._internal_read_body(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 312, in _internal_read_body
    self._raise_error(response)
  File "/home/nsoffer/.local/lib/python3.8/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.NotFoundError: HTTP response code is 404.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./backup_vm.py", line 431, in <module>
    main()
  File "./backup_vm.py", line 161, in main
    args.command(args)
  File "./backup_vm.py", line 175, in cmd_full
    backup = start_backup(connection, args)
  File "./backup_vm.py", line 315, in start_backup
    raise RuntimeError(
RuntimeError: Backup a24a84c4-77d6-43e5-9d76-cc7c3cce6258 failed: {'code': 10792, 'description': 'Backup a24a84c4-77d6-43e5-9d76-cc7c3cce6258 for VM backup-raw failed (User: admin@internal-authz).'}

And on vdsm side we see:

2020-12-09 20:21:25,547+0200 ERROR (jsonrpc/5) [api] FINISH start_backup error=Backup Error: {'vm_id': '4dc3bb16-f8d1-4f59-9388-a93f68da7cf0', 'backup': <vdsm.virt.backup.BackupConfig object at 0x7ff112757da0>, 'reason': "Failed to find one of the backup disks: No such drive: '{'domainID': '839db77f-fde3-4e13-bfb6-56be604631ed', 'imageID': 'b74df755-e0e6-4c93-b0fc-5c61934c0511', 'volumeID': 'a3c31488-3425-4c9a-b4d8-91fe99814bf8'}'"} (api:131)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 162, in start_backup
    drives = _get_disks_drives(vm, backup_cfg.disks)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/backup.py", line 364, in _get_disks_drives
    'volumeID': disk.vol_id})
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 3903, in findDriveByUUIDs
    raise LookupError("No such drive: '%s'" % drive)
LookupError: No such drive: '{'domainID': '839db77f-fde3-4e13-bfb6-56be604631ed', 'imageID': 'b74df755-e0e6-4c93-b0fc-5c61934c0511', 'volumeID': 'a3c31488-3425-4c9a-b4d8-91fe99814bf8'}'

So we reproduce the issue with SATA disk.

The root cause is that you tried to back up an inactive disk, which is not
possible when the vm is running because the disk is not attached to the vm.

We can support backing up an inactive disk not connected to a vm using offline
backup, but I'm not sure that it makes sense. If the disk is not connected
to the vm, it has no data anyway.

I think we need to:

- improve the backup_vm.py example to skip inactive disks,
  and fail early with a clear error if the user asks to back up an inactive disk.

- improve the error code returned by engine

  {
    'code': 10792,
    'description': 'Backup a24a84c4-77d6-43e5-9d76-cc7c3cce6258 for VM backup-raw failed (User: admin@internal-authz).'
  }

  This error is useless. The engine should be able to fail the backup early by validating
  that all disks are active, and return a specific error code and message.
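For the first point, a possible sketch of skipping inactive disks in a client such as the backup_vm.py example, based on the disk_attachments listing shown above (function names are illustrative, not part of the example script):

    import ovirtsdk4.types as types


    def active_disk_ids(vm_service):
        """Return the ids of disks that are attached and active on the vm."""
        attachments = vm_service.disk_attachments_service().list()
        for attachment in attachments:
            if not attachment.active:
                print('Skipping inactive disk %s' % attachment.disk.id)
        return [a.disk.id for a in attachments if a.active]


    def start_backup_active_only(vm_service):
        disk_ids = active_disk_ids(vm_service)
        if not disk_ids:
            raise RuntimeError('No active disks to back up')
        return vm_service.backups_service().add(
            types.Backup(disks=[types.Disk(id=d) for d in disk_ids])
        )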

Comment 15 Nir Soffer 2020-12-09 18:46:38 UTC
(In reply to Yury.Panchenko from comment #13)
> But i have different question: if we don't attach disk during its creation.
> why we left them in VM configuration?

The disk can be activated later after shutting down the vm.

I think this feature - having an inactive disk - may be useful if you want to
deactivate a disk temporarily, but prevent attaching this disk to another
vm.

> Also, if disk inactive it cant use by VM. So, probably we can just skip such
> disks, from backup process.

Right, with a log about skipping inactive disks, in case a user forgot
to activate a disk after it was deactivated.

> Third thing: After vm shutdown and poweron cycle, this disks still left in
> inactive state

You need to activate the disk; it remains inactive after the hot plug. I think
this is caused by using SATA: the system does not try to activate the disk,
so it is left inactive.

After activating the disk and starting the vm, backup completed successfully
for the disk.

Comment 16 Ilan Zuckerman 2021-01-07 06:05:59 UTC
Hi Eyal \ Nir, please describe what there is to verify here.
I can see that the added patch includes a check that the disks are active prior to making a full backup.

So the verification flow would be:

1. Hotplug disk
2. immediately issue a full backup

Expected:
The backup fails.
A graceful error is thrown in the engine log saying that backup is not possible while the disk is locked.

Is that correct?
Does the same apply to an incremental backup attempt as well?

Comment 17 Eyal Shenitzky 2021-01-07 07:26:51 UTC
Steps to reproduce:

1. Create a VM with two disks - one active and the other not active
2. Run the VM
3. Start a backup for the VM

Expected results:
Backup failed with the following error - ".... The following disks are not active on VM <vm-name>: <disks-ids>"

If the disk is activated, the backup should succeed.
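In SDK terms, this verification boils down to starting a backup that includes the inactive disk and expecting the engine to reject it up front. A hedged sketch reusing the vm and disk ids from comment 14 (the exact error wording may differ between versions):

    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    # Placeholder connection details -- substitute your own.
    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',
        username='admin@internal',
        password='password',
        ca_file='ca.pem',
    )
    vm_id = '4dc3bb16-f8d1-4f59-9388-a93f68da7cf0'
    inactive_disk_id = 'b74df755-e0e6-4c93-b0fc-5c61934c0511'

    backups_service = (
        connection.system_service().vms_service().vm_service(vm_id).backups_service()
    )

    try:
        backups_service.add(types.Backup(disks=[types.Disk(id=inactive_disk_id)]))
        print('Unexpected: backup started with an inactive disk')
    except sdk.Error as e:
        # Expected with the fix: the engine fails fast with a validation error like
        # "The following disks are not active on VM <vm-name>: <disk-ids>"
        print('Backup rejected as expected: %s' % e)
    finally:
        connection.close()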

Comment 18 Ilan Zuckerman 2021-01-07 07:52:43 UTC
Verified on:
rhv-4.4.4-7
ovirt-engine-4.4.4.6-0.1.el8ev.noarch

1. Create a VM with two disks - one active and the other not active
2. Run the VM
3. Start a backup for the VM

Expected results:
Backup failed with the following error - ".... The following disks are not active on VM <vm-name>: <disks-ids>"

Actual: as expected
2021-01-07 09:47:35,624+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-16) [] Operation Failed: [Cannot backup VM. The following disks are not active on VM 26769: f882d751-a6f0-42e6-b4d2-8cd4968bdcfd]


4. Activate the disk
5. Start backup for both of the disks

Expected:
Backup succeeds.

Actual:
As expected.

Comment 19 Sandro Bonazzola 2021-01-12 16:23:48 UTC
This bugzilla is included in the oVirt 4.4.4 release, published on December 21st 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.4 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

