Bug 1850220 - Backward compatibility of vm devices monitoring
Summary: Backward compatibility of vm devices monitoring
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.1
Target Release: 4.4.1.5
Assignee: Liran Rotenberg
QA Contact: Qin Yuan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-23 18:33 UTC by Arik
Modified: 2020-08-17 06:30 UTC
CC: 4 users

Fixed In Version: ovirt-engine-4.4.1.5
Doc Type: Bug Fix
Doc Text:
Old virtual machines that have not been restarted since user aliases were introduced in RHV version 4.2 still use the old device aliases created by libvirt. The current release adds support for those old device aliases and links them to the new user-aliases, preventing correlation failures and devices from being unplugged.
Clone Of:
Environment:
Last Closed: 2020-08-05 06:25:06 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 109867 0 master MERGED core: backward compatbility to old devices alias 2020-08-17 06:30:08 UTC

Description Arik 2020-06-23 18:33:42 UTC
Engine 4.4 monitors devices only by their user-alias (VmDevicesMonitoring::correlateWithUserAlias).
This is problematic because if we have VMs that were started before user-aliases were introduced (in 4.2), we won't be able to correlate their devices, and they'll be unplugged.
In addition, devices may get unplugged after restoring memory state from a snapshot.
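
For context, a minimal sketch of what correlation by user-alias amounts to (hypothetical names and types, assuming the 4.2+ "ua-<deviceId>" alias format; not the actual VmDevicesMonitoring code): the UUID embedded in the alias maps the reported device back to its database row, so a device reported with an old libvirt alias such as "net0" or "scsi0-0-0-0" simply yields no match.

    // Sketch only - hypothetical names, not the actual engine code.
    import java.util.Map;
    import java.util.UUID;

    class UserAliasCorrelation {
        // 4.2+ engines set the device alias to "ua-" + deviceId in the domxml.
        private static final String USER_ALIAS_PREFIX = "ua-";

        /**
         * Returns the managed device matching a reported alias, or null when the
         * alias is an old libvirt-generated one (e.g. "net0", "scsi0-0-0-0"),
         * which is exactly the case where correlation breaks.
         */
        static VmDevice correlateWithUserAlias(String reportedAlias,
                                               Map<UUID, VmDevice> managedDevices) {
            if (reportedAlias == null || !reportedAlias.startsWith(USER_ALIAS_PREFIX)) {
                return null; // no match -> device is wrongly treated as unplugged
            }
            UUID deviceId = UUID.fromString(reportedAlias.substring(USER_ALIAS_PREFIX.length()));
            return managedDevices.get(deviceId);
        }

        record VmDevice(UUID deviceId, String device) {}
    }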

Comment 1 Michal Skrivanek 2020-06-24 05:20:04 UTC
Where/how do you get to monitor a <4.2 VM when we do not support <4.2 cluster levels?

Comment 2 Arik 2020-06-24 05:44:17 UTC
Let's say a VM was started in oVirt 4.1 and then you upgrade to 4.2 -
now you have a VM that was started in 4.1, without user-aliases, running in a 4.1 cluster managed by engine 4.2 (and set with a next-run configuration).
Then you upgrade to engine 4.4 - this engine will inherit that running VM.

As long as we don't require users to restart their VMs during the upgrade process, we cannot rely on engine >= 4.2 having set their device aliases to user-aliases.

Comment 3 Liran Rotenberg 2020-06-24 08:36:04 UTC
Yes, but when the engine creates the domxml for the VM, it takes the device id and sets it as the alias of that device.
The correlation happens when the VM is reported by libvirt.

I think the problem will only affect unmanaged VMs, or VMs whose machine type/chipset we update (UpdateVmCommand::updateDeviceAddresses) or whose NIC we activate (ActivateDeactivateVmNicCommand::updateDevice).

We may change VmDevicesMonitoring::correlateWithUserAlias to also check the device ids in case there is no user-alias, as sketched below.
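
A rough sketch of that fallback (hypothetical names and types, not the merged patch): correlate by user-alias when one is present, and otherwise fall back to the device id that VDSM reports alongside the device (visible as deviceId in the FullList output below), so old aliases like "net0" or "scsi0-0-0-0" still resolve to their managed devices.

    // Sketch only - hypothetical names, not the merged patch.
    import java.util.Map;
    import java.util.UUID;

    class FallbackCorrelation {
        record ReportedDevice(String alias, UUID deviceId) {}
        record VmDevice(UUID deviceId, String device) {}

        static VmDevice correlate(ReportedDevice reported, Map<UUID, VmDevice> managed) {
            String alias = reported.alias();
            if (alias != null && alias.startsWith("ua-")) {
                // 4.2+ style alias: the device UUID is embedded in the alias itself.
                return managed.get(UUID.fromString(alias.substring("ua-".length())));
            }
            // Old libvirt alias (e.g. "net0"): match by the reported device id instead,
            // so the device is not mistakenly treated as unplugged.
            return managed.get(reported.deviceId());
        }
    }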

Comment 4 Qin Yuan 2020-07-13 01:10:56 UTC
Verified with:
ovirt-engine-4.4.1.7-0.3.el8ev.noarch

Steps:
1. Create and run a VM with old device aliases generated by libvirt:
   1) prepare a 4.3 engine
   2) create a data center, set compatibility version to 4.1
   3) add a cluster, set compatibility version to 4.1
   4) add a 4.3 host to the cluster
   5) add an NFS storage domain
   6) create a VM, setting custom compatibility version to 4.3
   7) add two NICs and two disks to the VM
   8) run the VM

2. Upgrade the engine to 4.4:
   1) change the cluster compatibility version from 4.1 to 4.3
   2) change the data center compatibility version from 4.1 to 4.3
   3) back up engine 4.3
   4) install a fresh engine 4.4 on a different machine
   5) copy the backup file to the engine 4.4 machine and restore it
   6) stop engine 4.3
   7) run engine-setup on engine 4.4
   8) run /usr/share/ovirt-engine/setup/bin/ovirt-engine-rename
   9) log in to engine 4.4 and check the VM

Results:
1. The VM created in step 1 has the old device aliases generated by libvirt. See the engine log:

2020-07-13 05:51:21,386+08 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (EE-ManagedThreadFactory-engine-Thread-235) [e3180550-60a2-436a-8f23-f402f5b50160] VM {memGuaranteedSize=1024, smpThreadsPerCore=1, cpuType=Opteron_G5, vmId=171586de-8e03-4e80-9f0b-d98049d3dd07, acpiEnable=true, vmType=kvm, smp=1, smpCoresPerSocket=1, emulatedMachine=pc-i440fx-rhel7.6.0, smartcardEnable=false, guestNumaNodes=[{cpus=0, nodeIndex=0, memory=1024}], transparentHugePages=true, numOfIoThreads=1, displayNetwork=ovirtmgmt, vmName=vm_41, maxVCpus=16, kvmEnable=true, devices=[{type=video, specParams={vgamem=16384, heads=1, vram=8192, ram=65536}, device=qxl, deviceId=30329518-6f44-4342-ad68-a757ce40dea1}, {type=graphics, specParams={keyMap=en-us, fileTransferEnable=true, spiceSecureChannels=smain,sinputs,scursor,splayback,srecord,sdisplay,ssmartcard,susbredir, copyPasteEnable=true}, device=vnc, deviceId=b9118f2f-361a-4552-ad86-c845a8d2fece}, {type=graphics, specParams={keyMap=en-us, fileTransferEnable=true, spiceSecureChannels=smain,sinputs,scursor,splayback,srecord,sdisplay,ssmartcard,susbredir, copyPasteEnable=true}, device=spice, deviceId=723c39ee-be2f-467a-86be-8989a0980744}, {iface=ide, shared=false, path=, readonly=true, index=2, type=disk, specParams={path=}, device=cdrom, deviceId=1e48ef64-15de-44fa-95c0-d8e98d43101b}, {discard=false, shared=false, address={bus=0, controller=0, unit=0, type=drive, target=0}, imageID=77ab7183-dd2f-47ed-a051-cc7406eaf589, format=raw, index=0, optional=false, type=disk, deviceId=77ab7183-dd2f-47ed-a051-cc7406eaf589, domainID=fcd24f19-27a5-45ac-9197-a337b48ff3dd, propagateErrors=off, iface=scsi, readonly=false, bootOrder=1, poolID=63d19353-8342-4b63-b52f-bd6331548834, volumeID=d646b47c-ea17-4376-b53e-59b31720dbe9, diskType=file, specParams={}, device=disk}, {discard=false, shared=false, address={bus=0, controller=0, unit=1, type=drive, target=0}, imageID=cf7e4b84-ad54-4038-b04c-0cbc4f5abbde, format=raw, optional=false, type=disk, deviceId=cf7e4b84-ad54-4038-b04c-0cbc4f5abbde, domainID=fcd24f19-27a5-45ac-9197-a337b48ff3dd, propagateErrors=off, iface=scsi, readonly=false, poolID=63d19353-8342-4b63-b52f-bd6331548834, volumeID=32c927d6-39aa-471d-b916-223c8cc72ebf, diskType=file, specParams={}, device=disk}, {filter=vdsm-no-mac-spoofing, nicModel=pv, filterParameters=[], type=interface, specParams={inbound={}, outbound={}}, device=bridge, linkActive=true, deviceId=8e5c02c2-0945-4172-8d29-a1ac160f43c0, macAddr=56:6f:8d:b5:00:01, network=ovirtmgmt}, {filter=vdsm-no-mac-spoofing, nicModel=pv, filterParameters=[], type=interface, specParams={inbound={}, outbound={}}, device=bridge, linkActive=true, deviceId=401091d6-cce7-4987-b834-2811a74e13b7, macAddr=56:6f:8d:b5:00:02, network=ovirtmgmt}, {index=0, model=piix3-uhci, type=controller, specParams={}, device=usb, deviceId=0c2aacc1-762d-4f3b-a572-2fb839323621}, {type=balloon, specParams={model=virtio}, device=memballoon, deviceId=68582f62-688b-4d5b-86fc-0584b06beb7f}, {index=0, model=virtio-scsi, type=controller, specParams={ioThreadId=1}, device=scsi, deviceId=cdf511ef-3948-4635-8868-dbc87fb879b3}, {type=controller, specParams={}, device=virtio-serial, deviceId=85e1c29e-4999-4cc2-bd63-47d7c93bfed5}, {model=virtio, type=rng, specParams={source=urandom}, device=virtio, deviceId=c8ceb23f-ea99-4a93-b81b-a529207bb27b}], custom={}, timeOffset=0, nice=0, maxMemSize=4096, maxMemSlots=16, bootMenuEnable=false, memSize=1024, agentChannelName=ovirt-guest-agent.0}
2020-07-13 05:51:25,728+08 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-10) [] VM '171586de-8e03-4e80-9f0b-d98049d3dd07'(vm_41) moved from 'WaitForLaunch' --> 'PoweringUp'
2020-07-13 05:51:25,770+08 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVDSCommand] (ForkJoinPool-1-worker-10) [] FINISH, FullListVDSCommand, return: [{acpiEnable=true, emulatedMachine=pc-i440fx-rhel7.6.0, vmId=171586de-8e03-4e80-9f0b-d98049d3dd07, guestDiskMapping={}, transparentHugePages=true, timeOffset=0, cpuType=Opteron_G5, smp=1, guestNumaNodes=[{nodeIndex=0, cpus=0, memory=1024}], smartcardEnable=false, custom={}, vmType=kvm, memSize=1024, smpCoresPerSocket=1, vmName=vm_41, nice=0, status=Up...{nicModel=pv, macAddr=56:6f:8d:b5:00:01, linkActive=true, network=ovirtmgmt, filterParameters=[], specParams={inbound={}, outbound={}}, vmid=171586de-8e03-4e80-9f0b-d98049d3dd07, filter=vdsm-no-mac-spoofing, alias=net0, deviceId=8e5c02c2-0945-4172-8d29-a1ac160f43c0, address={slot=0x03, bus=0x00, domain=0x0000, type=pci, function=0x0}, device=bridge, type=interface, vm_custom={}, name=vnet0}, {nicModel=pv, macAddr=56:6f:8d:b5:00:02, linkActive=true, network=ovirtmgmt, filterParameters=[], specParams={inbound={}, outbound={}}, vmid=171586de-8e03-4e80-9f0b-d98049d3dd07, filter=vdsm-no-mac-spoofing, alias=net1, deviceId=401091d6-cce7-4987-b834-2811a74e13b7, address={slot=0x04, bus=0x00, domain=0x0000, type=pci, function=0x0}, device=bridge, type=interface, vm_custom={}, name=vnet1},...format=raw, deviceId=77ab7183-dd2f-47ed-a051-cc7406eaf589, address={bus=0, controller=0, type=drive, target=0, unit=0}, device=disk, path=/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/77ab7183-dd2f-47ed-a051-cc7406eaf589/d646b47c-ea17-4376-b53e-59b31720dbe9, propagateErrors=off, optional=false, name=sda, vm_custom={}, bootOrder=1, vmid=171586de-8e03-4e80-9f0b-d98049d3dd07, volumeID=d646b47c-ea17-4376-b53e-59b31720dbe9, diskType=file, alias=scsi0-0-0-0, discard=false, volumeChain=[{domainID=fcd24f19-27a5-45ac-9197-a337b48ff3dd, leaseOffset=0, volumeID=d646b47c-ea17-4376-b53e-59b31720dbe9, leasePath=/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/77ab7183-dd2f-47ed-a051-cc7406eaf589/d646b47c-ea17-4376-b53e-59b31720dbe9.lease, imageID=77ab7183-dd2f-47ed-a051-cc7406eaf589, path=/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/77ab7183-dd2f-47ed-a051-cc7406eaf589/d646b47c-ea17-4376-b53e-59b31720dbe9}]}, {poolID=63d19353-8342-4b63-b52f-bd6331548834, reqsize=0, index=1, iface=scsi, apparentsize=5368709120, specParams={}, imageID=cf7e4b84-ad54-4038-b04c-0cbc4f5abbde, readonly=False, shared=false, truesize=0, type=disk, domainID=fcd24f19-27a5-45ac-9197-a337b48ff3dd, volumeInfo={path=/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/cf7e4b84-ad54-4038-b04c-0cbc4f5abbde/32c927d6-39aa-471d-b916-223c8cc72ebf, type=file}, format=raw, deviceId=cf7e4b84-ad54-4038-b04c-0cbc4f5abbde, address={bus=0, controller=0, type=drive, target=0, unit=1}, device=disk, path=/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/cf7e4b84-ad54-4038-b04c-0cbc4f5abbde/32c927d6-39aa-471d-b916-223c8cc72ebf, propagateErrors=off, optional=false, name=sdb, vm_custom={}, vmid=171586de-8e03-4e80-9f0b-d98049d3dd07, volumeID=32c927d6-39aa-471d-b916-223c8cc72ebf, diskType=file, alias=scsi0-0-0-1...


2. After upgrading the engine to 4.4, the VM runs normally; NICs and disks are not unplugged.
   The aliases in the VM dumpxml are the same as the aliases before the engine upgrade to 4.4:

    ...
    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source file='/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/77ab7183-dd2f-47ed-a051-cc7406eaf589/d646b47c-ea17-4376-b53e-59b31720dbe9'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>77ab7183-dd2f-47ed-a051-cc7406eaf589</serial>
      <boot order='1'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source file='/rhev/data-center/mnt/x.x.x.x:_home_nfs_data1/fcd24f19-27a5-45ac-9197-a337b48ff3dd/images/cf7e4b84-ad54-4038-b04c-0cbc4f5abbde/32c927d6-39aa-471d-b916-223c8cc72ebf'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sdb' bus='scsi'/>
      <serial>cf7e4b84-ad54-4038-b04c-0cbc4f5abbde</serial>
      <alias name='scsi0-0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <interface type='bridge'>
      <mac address='56:6f:8d:b5:00:01'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='56:6f:8d:b5:00:02'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    ...

Comment 5 Sandro Bonazzola 2020-08-05 06:25:06 UTC
This bugzilla is included in the oVirt 4.4.1 release, published on July 8th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

