Bug 1052024
Summary: | After a power outage two VMs marked as HA failed to start automatically, they were required to be started manually. | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Aval <avyadav> | |
Component: | ovirt-engine | Assignee: | Gilad Chaplik <gchaplik> | |
Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> | |
Severity: | high | Docs Contact: | ||
Priority: | urgent | |||
Version: | 3.2.0 | CC: | acathrow, avyadav, dfediuck, gchaplik, iheim, lpeer, mavital, pmukhedk, Rhev-m-bugs, sherold, sputhenp, tpoitras, vgaikwad, yeylon | |
Target Milestone: | --- | Keywords: | ZStream | |
Target Release: | 3.4.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | sla | |||
Fixed In Version: | av3 | Doc Type: | Bug Fix | |
Doc Text: | Previously, some virtual machines did not automatically restart after a power failure. As a result, they would have to be manually restarted. Now, the issue has been corrected and all virtual machines restart as expected. |
Story Points: | --- | |
Clone Of: | ||||
: | 1074478 (view as bug list) | Environment: | ||
Last Closed: | 2014-06-09 15:08:41 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1074478, 1078909, 1142926 |
Comment 2
Doron Fediuck
2014-01-13 15:25:58 UTC
Created attachment 849684 [details]
engine.log
(In reply to Doron Fediuck from comment #2)
> Hi, can you please provide the exact rhev version, and the relevant engine
> log files?

- Version-Release number of selected component (if applicable): rhevm-3.2.2-0.41.el6ev.noarch
- Attached engine.log file

Description of problem:

After a power outage, two of the VMs (out of 8) marked as HA failed to start automatically. The customer had to start them manually after waiting a few hours for RHEV-M to handle this automatically.

Environment details:

- 2 Hypervisors with 24GB RAM each
- 10 VMs
- Example of HA VM "mastro03srv" start failure (two VMs had already been started successfully before this VM). The customer later started it manually without any error.

~~~
2013-12-29 07:10:25,456 INFO [org.ovirt.engine.core.bll.VdsEventListener] (QuartzScheduler_Worker-13) [33217069] Failed to start Highly Available VM. Attempting to restart. VM Name: mastro03srv, VM Id:c52b7bdb-9c3a-4d76-9f06-42cbb7687a17
2013-12-29 07:10:25,468 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Lock Acquired to object EngineLock [exclusiveLocks= key: c52b7bdb-9c3a-4d76-9f06-42cbb7687a17 value: VM , sharedLocks= ]
2013-12-29 07:10:25,476 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (QuartzScheduler_Worker-13) [33217069] START, IsVmDuringInitiatingVDSCommand( vmId = c52b7bdb-9c3a-4d76-9f06-42cbb7687a17), log id: 583aa7de
2013-12-29 07:10:25,477 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (QuartzScheduler_Worker-13) [33217069] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 583aa7de
2013-12-29 07:10:25,490 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Running command: RunVmCommand internal: true. Entities affected : ID: c52b7bdb-9c3a-4d76-9f06-42cbb7687a17 Type: VM
2013-12-29 07:10:25,514 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Lock freed to object EngineLock [exclusiveLocks= key: c52b7bdb-9c3a-4d76-9f06-42cbb7687a17 value: VM , sharedLocks= ]
2013-12-29 07:10:25,514 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Failed to run desktop mastro03srv, rerun
2013-12-29 07:10:25,519 INFO [org.ovirt.engine.core.vdsbroker.UpdateVdsDynamicDataVDSCommand] (QuartzScheduler_Worker-13) [33217069] START, UpdateVdsDynamicDataVDSCommand(HostName = rhev-hv01.xxxxxx.com, HostId = 0a4f8d16-ed7e-4d54-8199-ccb3f5e31baf, vdsDynamic=org.ovirt.engine.core.common.businessentities.VdsDynamic@5bdabd1b), log id: 4e7d3ce3
2013-12-29 07:10:25,521 INFO [org.ovirt.engine.core.vdsbroker.UpdateVdsDynamicDataVDSCommand] (QuartzScheduler_Worker-13) [33217069] FINISH, UpdateVdsDynamicDataVDSCommand, log id: 4e7d3ce3
[...]
2013-12-29 07:10:25,549 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Lock Acquired to object EngineLock [exclusiveLocks= key: c52b7bdb-9c3a-4d76-9f06-42cbb7687a17 value: VM , sharedLocks= ]
2013-12-29 07:10:25,566 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (QuartzScheduler_Worker-13) [33217069] START, IsVmDuringInitiatingVDSCommand( vmId = c52b7bdb-9c3a-4d76-9f06-42cbb7687a17), log id: 474630fa
2013-12-29 07:10:25,566 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (QuartzScheduler_Worker-13) [33217069] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 474630fa
2013-12-29 07:10:25,576 INFO [org.ovirt.engine.core.bll.VdsSelector] (QuartzScheduler_Worker-13) [33217069] VDS rhev-hv01.xxxxxx.com 0a4f8d16-ed7e-4d54-8199-ccb3f5e31baf have failed running this VM in the current selection cycle VDS rhev-hv02.xxxxxx.com 4e942526-ac3a-4a46-b969-4bbe139c67d5 is not in up status or belongs to the VM's cluster
2013-12-29 07:10:25,577 WARN [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] CanDoAction of action RunVm failed. Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_VDS_VM_CLUSTER
2013-12-29 07:10:25,577 INFO [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] Lock freed to object EngineLock [exclusiveLocks= key: c52b7bdb-9c3a-4d76-9f06-42cbb7687a17 value: VM , sharedLocks= ]
~~~

After the above failure, a second HA VM, "mx", failed with the error:

~~~
2013-12-29 07:10:36,839 INFO [org.ovirt.engine.core.bll.VdsSelector] (QuartzScheduler_Worker-13) [33217069] VDS rhev-hv01.xxxxxx.com 0a4f8d16-ed7e-4d54-8199-ccb3f5e31baf has insufficient memory to run the VM VDS rhev-hv02.xxxxxx.com 4e942526-ac3a-4a46-b969-4bbe139c67d5 is not in up status or belongs to the VM's cluster
2013-12-29 07:10:36,839 WARN [org.ovirt.engine.core.bll.RunVmCommand] (QuartzScheduler_Worker-13) [33217069] CanDoAction of action RunVm failed. Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_VDS_VM_MEMORY
~~~

The customer disputes this, since the total memory required for all the VMs is 15GB and they have 2 hosts with 24GB each.

Version-Release number of selected component (if applicable):
rhevm-3.2.2-0.41.el6ev.noarch

How reproducible:
No consistent way; it happened after a power outage on one of the customer's setups.

Steps to Reproduce:
1.
2.
3.

Actual results:
2 out of 8 VMs marked as HA failed to start automatically after the power outage.

Expected results:
All VMs marked HA should be started automatically by RHEV-M.

Additional info:

Verified on av3.
Added a host with 16G and ran four HA VMs on it, three with 4096MB and one with 2048MB. Powered off the host, waited 5 minutes, and powered it back on; all VMs started fine.
Also tested under the 'None' cluster policy.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2014-0506.html
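
When triaging similar reports, the scheduling failure reasons can be pulled out of engine.log mechanically instead of by eye. The following is a minimal Python sketch, assuming the WARN line layout seen in the engine.log excerpts quoted in this report. The function name `failure_reasons` and the regular expression are illustrative only and are not part of ovirt-engine.

```python
import re

# Assumed layout, taken from the excerpts in this report:
#   <timestamp> WARN [<logger>] (<thread>) [<correlation id>]
#   CanDoAction of action <Action> failed. Reasons:<comma-separated codes>
CAN_DO_ACTION_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN\s+"
    r"\[(?P<logger>[^\]]+)\].*?"
    r"CanDoAction of action (?P<action>\w+) failed\. "
    r"Reasons:(?P<reasons>\S+)"
)


def failure_reasons(line):
    """Return (timestamp, action, failure codes) for a CanDoAction WARN line.

    Returns None for any other line. Only codes starting with
    ACTION_TYPE_FAILED_ are kept; the VAR__* entries are message
    placeholders rather than failure reasons.
    """
    match = CAN_DO_ACTION_RE.match(line)
    if match is None:
        return None
    codes = [
        code
        for code in match.group("reasons").split(",")
        if code.startswith("ACTION_TYPE_FAILED_")
    ]
    return match.group("ts"), match.group("action"), codes


if __name__ == "__main__":
    # The two WARN lines quoted in this report.
    sample = [
        "2013-12-29 07:10:25,577 WARN [org.ovirt.engine.core.bll.RunVmCommand] "
        "(QuartzScheduler_Worker-13) [33217069] CanDoAction of action RunVm failed. "
        "Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,"
        "ACTION_TYPE_FAILED_VDS_VM_CLUSTER",
        "2013-12-29 07:10:36,839 WARN [org.ovirt.engine.core.bll.RunVmCommand] "
        "(QuartzScheduler_Worker-13) [33217069] CanDoAction of action RunVm failed. "
        "Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_VDS_VM_MEMORY",
    ]
    for entry in sample:
        print(failure_reasons(entry))
```

Run against a full engine.log, the same function could be applied line by line to list every HA restart attempt rejected at the CanDoAction stage, together with the reason code the scheduler reported.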