+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1595489 +++
======================================================================

Description of problem:

After a sequence of events, the VM is now started without a cdrom device.

Here is the sequence of events:

1. Upgrade CL from 4.1 to 4.2

2018-06-13 09:04:19,029+10 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] ....

2. Migrate VM to another Host (VM was not power cycled yet)

2018-06-13 09:06:51,626+10 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [499f4a18] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='5b0143c4-4a74-40af-91f9-2cbdf1372cef', vmId='33f4ee96-4863-48a8-ae9f-f67a341afc62', srcHost='192.168.18.113', dstVdsId='57f984fc-58b3-432e-b5db-61f0a9918a1e', dstHost='192.168.18.114:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='2500', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='192.168.18.114'}), log id: 53ab4a6b

2018-06-13 09:07:06,632+10 WARN [org.ovirt.engine.core.vdsbroker.libvirt.VmDevicesConverter] (EE-ManagedThreadFactory-engineScheduled-Thread-50) [] unmanaged disk with path '' is ignored

2018-06-13 09:07:06,638+10 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-50) [] VM '33f4ee96-4863-48a8-ae9f-f67a341afc62' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='923802b9-9b05-4a6b-b67b-f2caa1b65578', vmId='33f4ee96-4863-48a8-ae9f-f67a341afc62'}', device='cdrom', type='DISK', specParams='[path=rhel-server-7.4-x86_64-dvd.iso]', address='', managed='true', plugged='false', readOnly='true', deviceAlias='ide0-1-0', customProperties='[]', snapshotId='null', logicalName='null', hostDevice=''}'

3. Reboot the VM (now it's going to be power cycled due to the new cold reboot feature)

2018-06-15 10:24:25,149+10 INFO [org.ovirt.engine.core.bll.RebootVmCommand] (EE-ManagedThreadFactory-engine-Thread-820) [ff2390d2-0726-4946-b60a-5905b584dd62] VM 'NAS03' is performing cold reboot; run once: 'false', running as volatile: 'false', has next run configuration: 'true'

2018-06-15 10:26:01,202+10 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-15) [] add VM '33f4ee96-4863-48a8-ae9f-f67a341afc62'(NAS03) to cold reboot treatment

2018-06-15 10:26:01,308+10 INFO [org.ovirt.engine.core.bll.UpdateVmCommand] (EE-ManagedThreadFactory-engine-Thread-728) [49419eff] Running command: UpdateVmCommand internal: true. Entities affected : ID: 33f4ee96-4863-48a8-ae9f-f67a341afc62 Type: VMAction group EDIT_VM_PROPERTIES with role type USER

Note that the timestamp above matches the _update_date of the CDROM device below, and that the device address is now NULL:
engine=> select device_id,device,address,is_managed,is_plugged,_update_date,alias from vm_device WHERE vm_id ='33f4ee96-4863-48a8-ae9f-f67a341afc62' AND device ='cdrom';
              device_id               | device | address | is_managed | is_plugged |         _update_date          |  alias
--------------------------------------+--------+---------+------------+------------+-------------------------------+----------
 923802b9-9b05-4a6b-b67b-f2caa1b65578 | cdrom  |         | t          | f          | 2018-06-15 10:26:01.308944+00 | ide0-1-0

2018-06-15 10:26:01,399+10 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-728) [762c4295] EVENT_ID: SYSTEM_UPDATE_VM(253), VM NAS03 configuration was updated by system.

2018-06-15 10:26:01,997+10 INFO [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-41) [2dee0d82] START, CreateVDSCommand( CreateVDSCommandParameters:{hostId='7aadd959-0c4d-4c65-af6d-14c8c958cb55', vmId='33f4ee96-4863-48a8-ae9f-f67a341afc62', vm='VM [NAS03]'}), log id: 11a9d517

2018-06-15 10:26:02,009+10 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-41) [2dee0d82] VM <?xml version="1.0" encoding="UTF-8"?><domain type="kvm" xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
...
  <uuid>33f4ee96-4863-48a8-ae9f-f67a341afc62</uuid>
...

No cdrom device in the XML. Now the VM cannot attach CDROMs anymore.

Version-Release number of selected component (if applicable):
rhvm-4.2.3.8-0.1.el7.noarch
vdsm-4.20.27.2-1.el7ev

(Originally by Germano Veit Michel)
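For anyone checking whether other VMs were hit, a query along the following lines can list managed cdrom devices that ended up in the same state (unplugged, with a cleared address). This is only a sketch against the vm_device table shown above, not part of the original report; the empty-address predicate is an assumption about how the broken rows look:

-- Sketch (assumption, not from the original report): find managed cdrom
-- devices that are unplugged and have lost their address, like the VM above.
SELECT vm_id, device_id, is_plugged, _update_date, alias
FROM   vm_device
WHERE  device = 'cdrom'
  AND  is_managed = 't'
  AND  is_plugged = 'f'
  AND  (address IS NULL OR address = '');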
Not sure if related: https://gerrit.ovirt.org/#/c/92445/

Working on reproducing it.

(Originally by Germano Veit Michel)
Didn't reproduce following the same events in our labs. I'll attach vdsm logs.

(Originally by Germano Veit Michel)
(In reply to Germano Veit Michel from comment #3)
> Didn't reproduce following the same events in our labs. I'll attach vdsm
> logs.

Germano, note that this VM must have been started before the engine was updated to 4.2.3 (because the alias of the cd-rom is 'ide0-1-0' rather than a user-alias like 'ua-923802b9-9b05-4a6b-b67b-f2caa1b65578'). It is unlikely to happen for VMs that are started on 4.2.3 and above, since we now use user-aliases. This should be taken into account while trying to reproduce the issue.

(Originally by Arik Hadas)
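Based on the note above, a query like the one below can list cdrom devices that still carry a legacy alias, i.e. VMs that were last started on a version older than 4.2.3 and are therefore candidates for reproducing this. This is a sketch, not part of the original comment; the 'ua-' prefix convention is taken from the description above:

-- Sketch (assumption): cdrom devices whose alias does not follow the 4.2.3+
-- 'ua-<device id>' user-alias scheme.
SELECT vm_id, device_id, alias
FROM   vm_device
WHERE  device = 'cdrom'
  AND  alias IS NOT NULL
  AND  alias <> ''
  AND  alias NOT LIKE 'ua-%';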
(In reply to Germano Veit Michel from comment #0)
> 2018-06-13 09:07:06,632+10 WARN
> [org.ovirt.engine.core.vdsbroker.libvirt.VmDevicesConverter]
> (EE-ManagedThreadFactory-engineScheduled-Thread-50) [] unmanaged disk with
> path '' is ignored

This means that we got a device without an ID, or with a different ID than it had before.

> 2018-06-13 09:07:06,638+10 ERROR
> [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring]
> (EE-ManagedThreadFactory-engineScheduled-Thread-50) [] VM
> '33f4ee96-4863-48a8-ae9f-f67a341afc62' managed non pluggable device was
> removed unexpectedly from libvirt:
> 'VmDevice:{id='VmDeviceId:{deviceId='923802b9-9b05-4a6b-b67b-f2caa1b65578',
> vmId='33f4ee96-4863-48a8-ae9f-f67a341afc62'}', device='cdrom', type='DISK',
> specParams='[path=rhel-server-7.4-x86_64-dvd.iso]', address='',
> managed='true', plugged='false', readOnly='true', deviceAlias='ide0-1-0',
> customProperties='[]', snapshotId='null', logicalName='null', hostDevice=''}'

This means that no device was reported with the ID 923802b9-9b05-4a6b-b67b-f2caa1b65578, so we have no correlation for the cd-rom in the database and therefore unplug it.

This reminds me of previous bugs we had where the IDs of devices of VMs that were started in previous versions were not preserved when those VMs were migrated to a new version of VDSM.

Milan, were the fixes in that area already available in 4.2.3?

(Originally by Arik Hadas)
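To see what the engine expects to correlate against, a query like the following can dump the managed devices it has on record for the VM; those device_id values are what the reported devices are matched against, as described above. This is only a sketch based on the vm_device columns already shown in this bug, not an official diagnostic:

-- Sketch (assumption): managed devices the engine expects for the affected
-- VM; compare these device_id values with the IDs reported from libvirt/VDSM.
SELECT device_id, type, device, alias, is_managed, is_plugged
FROM   vm_device
WHERE  vm_id = '33f4ee96-4863-48a8-ae9f-f67a341afc62'
  AND  is_managed = 't'
ORDER  BY type, device;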
Germano, can you please check if the CDROM was ejected? I.e. bug 1593568.

(Originally by michal.skrivanek)
(In reply to Arik from comment #6)
> Germano, note that this VM must have been started before the engine was
> updated to 4.2.3 (because the alias of the cd-rom is 'ide0-1-0' rather than
> a user-alias like 'ua-923802b9-9b05-4a6b-b67b-f2caa1b65578'). It is unlikely
> to happen for VMs that are started on 4.2.3 and above, since we now use
> user-aliases. This should be taken into account while trying to reproduce
> the issue.

Ohh, I would never have noticed this! Thanks, I can try again tomorrow if you think it's necessary.

(In reply to Michal Skrivanek from comment #8)
> Germano, can you please check if the CDROM was ejected? I.e. bug 1593568.

Yes, it was ejected (no path).

(Originally by Germano Veit Michel)
(In reply to Arik from comment #7)
> This reminds me of previous bugs we had where the IDs of devices of VMs that
> were started in previous versions were not preserved when those VMs were
> migrated to a new version of VDSM.
>
> Milan, were the fixes in that area already available in 4.2.3?

Well, that theory is incorrect, since we're talking about cluster level 4.2, where the engine ignores the device IDs. We need to improve the correlation logic for cd-roms.

(Originally by Arik Hadas)
(In reply to Michal Skrivanek from comment #8)
> Germano, can you please check if the CDROM was ejected? I.e. bug 1593568.

This is not related to that bug -- the <source> element is present in the domain XML.

(Originally by Milan Zamazal)
Verified:
rhvm-4.2.5-0.1.el7ev
vdsm-4.20.33-1.el7ev.x86_64
libvirt-client-3.9.0-14.el7_5.6.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64
sanlock-3.6.0-1.el7.x86_64

Verification scenario:
1. Run a VM with a cdrom device.
2. Upgrade the CL from 4.1 to 4.2.
3. Verify the cdrom device is still available and accessible from the VM. Migrate the VM.
4. Verify the cdrom device is available and accessible from the VM. Reboot the VM.
5. After the VM reboot has completed, verify the cdrom device is available and accessible. Verify the cdrom device is listed in the engine db:

engine=# select device_id,device,address,is_managed,is_plugged,_update_date,alias from vm_device WHERE vm_id ='c4608698-f036-4345-b52a-c71c2cb4c00c' AND device ='cdrom';
              device_id               | device |                       address                        | is_managed | is_plugged |         _update_date          |  alias
--------------------------------------+--------+------------------------------------------------------+------------+------------+-------------------------------+----------
 be5e4018-4c12-4b4d-81d8-2309ab00cafa | cdrom  | {type=drive, bus=1, controller=0, target=0, unit=0}  | t          | t          | 2018-07-10 13:35:44.498705+03 | ide0-1-0
(1 row)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2318