Description of problem:
In a relatively recent change, the engine was modified to cache only the UIDs of the running VMs instead of all their properties, in order to reduce the allocated memory. The problem is that when a running VM stops being reported by the host it ran on, an NPE is thrown and the engine stops monitoring that host.
I set the severity to high rather than urgent because a VM that stops being reported is supposed to be really rare: typically the VM is reported as Down by the host and only later destroyed.

Version-Release number of selected component (if applicable):

How reproducible:
I ran into this when creating the state mentioned above manually, but I suppose it would also happen when restarting a host while there is a VM running on that host.

Steps to Reproduce:
1. Run a VM
2. Restart the host that the VM runs on
3. Run another VM on that host

Actual results:
The second VM will probably stay in WaitForLaunch state because the host is not monitored.

Expected results:
The second VM should eventually switch to UP state.

Additional info:
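To illustrate the failure mode described above, here is a minimal sketch. It is not the engine's actual code: the class name, method name, and the use of plain maps are all hypothetical. It assumes the monitoring cycle walks the cached IDs of running VMs and looks each one up in the latest host report; a VM that is no longer reported yields a null lookup, and dereferencing that result without a guard would throw the NPE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not taken from the oVirt engine sources.
class VmMonitoringSketch {

    /**
     * Walks the cached IDs of VMs believed to be running and returns those
     * that the host no longer reports at all.
     *
     * @param cachedRunningVmIds IDs the engine cached for running VMs (assumed shape)
     * @param reportedStatusById VM ID to status, as reported by the host (assumed shape)
     */
    static List<String> findUnreportedVms(Set<String> cachedRunningVmIds,
                                          Map<String, String> reportedStatusById) {
        List<String> unreported = new ArrayList<>();
        for (String vmId : cachedRunningVmIds) {
            String status = reportedStatusById.get(vmId);
            if (status == null) {
                // The host did not report this VM. Calling a method on
                // 'status' here without this check would throw an NPE and
                // abort the monitoring cycle for the whole host.
                unreported.add(vmId);
                continue;
            }
            // Normal path: the VM is still reported (for example as Up or Down),
            // so its status can be processed safely.
        }
        return unreported;
    }
}
```

With a cache containing two VM IDs and a host report that mentions only one of them, the method returns the other ID instead of failing, so the remaining VMs on the host can still be processed.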
want to backport this to 4.3?
(In reply to Michal Skrivanek from comment #1)
> want to backport this to 4.3?

Affirmative
Verification build:
rhvm-4.3.7.1-0.1.el7
vdsm-4.30.35-1.el7ev.x86_64
libvirt-4.5.0-23.el7_7.1.x86_64
qemu-kvm-rhev-2.12.0-33.el7_7.4.x86_64

Verification scenario:
Repeat the "Steps to Reproduce" from the bug description a few times.
This bugzilla is included in oVirt 4.3.7 release, published on November 21st 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.7 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.