Created attachment 965529
engine.log, vdsm.log and screenshot

Description of problem:
There is no event in the events log when a guest is resumed automatically from paused (where the pause was due to EIO or 'no space left' errors).

Version-Release number of selected component (if applicable):
rhev 3.5 vt13.1

How reproducible:
Always

Steps to Reproduce:
1. Start a VM, install OS
2. During VM installation, block connectivity to the storage server where the storage domain is located, wait for the VM to enter paused
3. Resume connectivity to the domain, VM should be resumed automatically

Actual results:
2014-12-07 11:23:14,713 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-7-thread-19) domain c13d6f46-8855-468b-8306-44beccd5f199:gluster7 in problem. vds: green-vdsa
2014-12-07 11:23:17,739 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-76) VM vm-2 bededb8e-ac34-48be-bea8-50b6caa833d5 moved from Up --> Paused
2014-12-07 11:23:17,762 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-76) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm-2 has paused due to unknown storage error.
2014-12-07 11:24:37,215 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-75) VM vm-2 bededb8e-ac34-48be-bea8-50b6caa833d5 moved from Paused --> Up
2014-12-07 11:24:46,384 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-7-thread-15) Domain c13d6f46-8855-468b-8306-44beccd5f199:gluster7 recovered from problem. vds: green-vdsa

Although the VM is resumed from paused, there is no event for this in the events tab.

Expected results:
There should be an event regarding the guest resume.

Additional info:
engine.log, vdsm.log and screenshot
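For anyone triaging a similar engine.log, the state changes can be pulled out with a small script. This is only a sketch: the regular expression is inferred from the two "moved from" lines quoted above, and the names `TRANSITION`, `vm_transitions` and `auto_resumes` are illustrative, not part of the engine or vdsm.

```python
import re

# Pattern inferred from the engine.log lines quoted in this report, e.g.
# "... VM vm-2 bededb8e-... moved from Up --> Paused". It is an assumption
# that other engine versions log transitions in the same shape.
TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"VM (?P<name>\S+) (?P<id>[0-9a-f-]{36}) "
    r"moved from (?P<old>\w+) --> (?P<new>\w+)"
)


def vm_transitions(lines):
    """Return (timestamp, vm_name, old_state, new_state) for each match."""
    found = []
    for line in lines:
        m = TRANSITION.search(line)
        if m:
            found.append((m["ts"], m["name"], m["old"], m["new"]))
    return found


def auto_resumes(lines):
    """Keep only the Paused --> Up transitions."""
    return [t for t in vm_transitions(lines) if t[2] == "Paused" and t[3] == "Up"]


# Sample input: three of the lines from "Actual results" above.
sample = [
    "2014-12-07 11:23:17,739 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] "
    "(DefaultQuartzScheduler_Worker-76) VM vm-2 bededb8e-ac34-48be-bea8-50b6caa833d5 "
    "moved from Up --> Paused",
    "2014-12-07 11:23:17,762 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(DefaultQuartzScheduler_Worker-76) Correlation ID: null, Call Stack: null, "
    "Custom Event ID: -1, Message: VM vm-2 has paused due to unknown storage error.",
    "2014-12-07 11:24:37,215 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] "
    "(DefaultQuartzScheduler_Worker-75) VM vm-2 bededb8e-ac34-48be-bea8-50b6caa833d5 "
    "moved from Paused --> Up",
]

for ts, name, old, new in auto_resumes(sample):
    print(f"{ts} {name}: {old} -> {new}")
```

Comparing the Paused --> Up timestamps against AuditLogDirector messages for the same VM shows whether the resume produced an event; in the log above there is an audit message for the pause but none for the resume.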
Nir - the resuming is done from vdsm's side - do we have a sensible way of reporting this to the engine?
We don't have any infrastructure for notifications from vdsm to engine in 3.5. In 3.6 we should have such infrastructure, so we can report resume events. This looks like a virt issue. Storage's job is finished once we provide events when storage issues are resolved and the VM can be resumed. Resuming and monitoring VMs is not related to storage.
Moving to virt for consideration. In any event, given the amount of work, this looks more like an RFE than a bug to me.
- We do have events now in 3.6.
- Automatic resume is specific to the storage flow. We don't want to fire a UI event whenever something happens at vdsm (when you "cont" in the UI, it's covered from the UI side).

It would probably need some special event... or maybe we can simply create this backend event explicitly whenever we have EIO->Up. Thoughts?
It should be easy to add an event (audit log) when the engine discovers that a VM has returned to Up from Paused.
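A rough sketch of that idea, written in Python for brevity rather than as engine code. Everything here is hypothetical: the `VmMonitor` class, the `STORAGE_PAUSE_REASONS` codes and the message text are placeholders for illustration and do not reflect the actual engine classes or audit log wording.

```python
# Pause reasons treated as storage-related. These codes are an assumption
# made for this sketch, not values taken from the engine or vdsm.
STORAGE_PAUSE_REASONS = {"EIO", "ENOSPC"}


class VmMonitor:
    """Remembers each VM's last reported status and logs storage resumes."""

    def __init__(self):
        self.audit_log = []   # stand-in for the engine audit log
        self._last = {}       # vm_name -> (status, pause_reason)

    def update(self, vm_name, status, pause_reason=None):
        """Record a freshly reported status; emit an event on Paused -> Up."""
        prev_status, prev_reason = self._last.get(vm_name, (None, None))
        resumed = prev_status == "Paused" and status == "Up"
        if resumed and prev_reason in STORAGE_PAUSE_REASONS:
            # Placeholder message text, not the real audit log message.
            self.audit_log.append(f"VM {vm_name} has resumed from paused state.")
        # Only keep a pause reason while the VM is actually paused.
        reason = pause_reason if status == "Paused" else None
        self._last[vm_name] = (status, reason)


monitor = VmMonitor()
monitor.update("vm-2", "Up")
monitor.update("vm-2", "Paused", pause_reason="EIO")
monitor.update("vm-2", "Up")             # storage pause ended: event logged
monitor.update("vm-2", "Paused")         # pause with no storage reason
monitor.update("vm-2", "Up")             # no event for this one
print(monitor.audit_log)
```

Filtering on the pause reason follows the EIO->Up suggestion, so a resume requested from the UI does not produce a second event. Dropping the filter would log every Paused to Up transition instead.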
Verified with setup:
RHEVM: 3.6.0.3-0.1.el6
vdsm: vdsm-4.17.10.1-0.el7ev
libvirt: libvirt-1.2.17-13.el7

Steps to Reproduce:
1. Start a VM, install OS
2. During VM installation, block connectivity to the storage server where the storage domain is located, wait for the VM to enter paused
3. Resume connectivity to the domain
4. Check VM status

Results:
After resuming the storage connection, the VM is back to Up and continues the installation.