Bug 1235822

Summary: Cannot run VM because it is in Powering Up status, encountered during phase autostart_destination
Product: Red Hat CloudForms Management Engine Reporter: Jared Deubel <jdeubel>
Component: ProvisioningAssignee: Brandon Dunne <bdunne>
Status: CLOSED ERRATA QA Contact: Taras Lehinevych <tlehinev>
Severity: high Docs Contact:
Priority: high    
Version: 5.3.0CC: bdunne, cpelland, dajohnso, gmccullo, jhardy, mfeifer, obarenbo, psavage, tlehinev
Target Milestone: GA   
Target Release: 5.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 5.5.0.1 Doc Type: Bug Fix
Doc Text:
Previously, virtual machines on Red Hat Enterprise Virtualization Manager started slowly, and as a result the virtual machines could not run because they were in Powering Up status during the autostart_destination phase. This has been fixed by adding a loop in the code for the host to requeue a virtual machine's boot request. The virtual machine will automatically raise an error to fail the provisioning if it has not started after 120 attempts.
Story Points: ---
Clone Of:
: 1240754 1240756 1240847 (view as bug list) Environment:
Last Closed: 2015-12-08 13:19:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1240754, 1240756, 1240847    
Attachments:
Description Flags
pxe_error none

Description Jared Deubel 2015-06-25 20:14:41 UTC
Created attachment 1043274 [details]
pxe_error

Description of problem:
Cannot run VM because it is in Powering Up status, encountered during phase autostart_destination. Using same kickstart, same profile while provisioning in RHEV but only works part of the time. 

Snippet from evm.log
================================================================================
[----] I, [2015-06-23T16:41:50.803534 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) Starting Phase <autostart_destination>
[----] I, [2015-06-23T16:41:50.820298 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) MIQ(MiqProvisionRedhatViaPxe#autostart_destination) Starting Vm id: [312000000000315], name: [dev52sa]
[----] I, [2015-06-23T16:41:50.865517 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) MIQ(Event.raise_evm_event): Event Raised [request_vm_start]
[----] I, [2015-06-23T16:41:50.909010 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) MIQ(Event.raise_evm_event): Alert for Event [request_vm_start]
[----] I, [2015-06-23T16:41:50.909138 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) MIQ(MiqAlert.evaluate_alerts) [request_vm_start] Target: VmRedhat Name: [dev52sa], Id: [312000000000315]
[----] I, [2015-06-23T16:41:50.929055 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) MIQ(EmsRedhat.with_provider_connection) Connecting through EmsRedhat: [devrhevm02]
[----] I, [2015-06-23T16:41:51.077389 #18969:107f804]  INFO -- : Q-task_id([miq_provision_312000000000699]) Starting Phase <provision_error>
[----] E, [2015-06-23T16:41:51.101456 #18969:107f804] ERROR -- : Q-task_id([miq_provision_312000000000699]) MIQ(MiqProvisionRedhatViaPxe#provision_error) [[RhevmApiError]: [Cannot run VM because it is in Powering Up status.]] encountered during phase [autostart_destination]
================================================================================






Version-Release number of selected component (if applicable):
version=5.3.3.2, build=20150217120931_a465215

How reproducible:
50%

Actual results:
Only able to provision 50% of the time.

Expected results:
Should be able to provision all of the time. 

Additional info:

Comment 4 Brandon Dunne 2015-06-30 17:57:27 UTC
Pete,

In order to reproduce the error, you'll need to somehow slow down your RHEVM.  I believe they are encountering this error because there is a delay of several seconds between the response of our call to power on the VM (which gets the correct response from RHEV) and the time that the VM changes from the off state in the provider.  The only thing I can think of to do to reliably reproduce this is to put a sleep in RHEVs code between replying to the caller of the power on and actually executing the power on.

Hope that helps,
Brandon

Comment 6 CFME Bot 2015-07-02 21:02:15 UTC
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/a8b2578b84088d1fb73fbcc94a566248d8057119

commit a8b2578b84088d1fb73fbcc94a566248d8057119
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:58:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Fri Jun 26 07:58:34 2015 -0400

    Handle cases where RHEV is slow to start VM's
    
    Create a loop in the state machine after requesting the VM boot.
    Requeue 120 times, if the VM still hasn't started raise an error to fail
    the provision.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 vmdb/app/models/miq_provision_redhat/state_machine.rb       | 13 +++++++++++++
 .../models/miq_provision_redhat_via_iso/state_machine.rb    |  5 +----
 .../models/miq_provision_redhat_via_pxe/state_machine.rb    |  5 +----
 3 files changed, 15 insertions(+), 8 deletions(-)

Comment 7 CFME Bot 2015-07-02 21:02:19 UTC
New commit detected on manageiq/master:
https://github.com/ManageIQ/manageiq/commit/e60987d7313e3cda02e5f1bae45375c84817416d

commit e60987d7313e3cda02e5f1bae45375c84817416d
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:59:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Fri Jun 26 07:59:34 2015 -0400

    Add specs for MiqProvisionRedhat#poll_destination_powered_on_in_provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 .../miq_provision_redhat/state_machine_spec.rb     | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comment 8 CFME Bot 2015-07-07 18:12:22 UTC
New commit detected on cfme/5.4.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=a58a44058c932f00f67d2a869e398f24303f6fe0

commit a58a44058c932f00f67d2a869e398f24303f6fe0
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:58:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Mon Jul 6 15:31:03 2015 -0400

    Handle cases where RHEV is slow to start VM's
    
    Create a loop in the state machine after requesting the VM boot.
    Requeue 120 times, if the VM still hasn't started raise an error to fail
    the provision.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 vmdb/app/models/miq_provision_redhat/state_machine.rb       | 13 +++++++++++++
 .../models/miq_provision_redhat_via_iso/state_machine.rb    |  5 +----
 .../models/miq_provision_redhat_via_pxe/state_machine.rb    |  5 +----
 3 files changed, 15 insertions(+), 8 deletions(-)

Comment 9 CFME Bot 2015-07-07 18:12:27 UTC
New commit detected on cfme/5.4.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=19cc069c4f1f59da303827e6985a42b5c5ca5b79

commit 19cc069c4f1f59da303827e6985a42b5c5ca5b79
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:59:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Mon Jul 6 15:31:30 2015 -0400

    Add specs for MiqProvisionRedhat#poll_destination_powered_on_in_provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 .../miq_provision_redhat/state_machine_spec.rb     | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comment 10 CFME Bot 2015-07-07 21:43:52 UTC
New commit detected on cfme/5.3.5:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=8f8ea73dd7b7604bb4b1ae50b7a62cc35e5fb4e9

commit 8f8ea73dd7b7604bb4b1ae50b7a62cc35e5fb4e9
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:58:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Tue Jul 7 17:01:21 2015 -0400

    Handle cases where RHEV is slow to start VM's
    
    Create a loop in the state machine after requesting the VM boot.
    Requeue 120 times, if the VM still hasn't started raise an error to fail
    the provision.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 vmdb/app/models/miq_provision_redhat/state_machine.rb       | 13 +++++++++++++
 .../models/miq_provision_redhat_via_iso/state_machine.rb    |  5 +----
 .../models/miq_provision_redhat_via_pxe/state_machine.rb    |  5 +----
 3 files changed, 15 insertions(+), 8 deletions(-)

Comment 11 CFME Bot 2015-07-07 21:43:55 UTC
New commit detected on cfme/5.3.5:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=278bf197341c1c10d054aaf16c843958f2d0baaa

commit 278bf197341c1c10d054aaf16c843958f2d0baaa
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Jul 7 16:05:27 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Tue Jul 7 17:01:34 2015 -0400

    Add specs for MiqProvisionRedhat#poll_destination_powered_on_in_provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 .../miq_provision_redhat/state_machine_spec.rb     | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comment 12 CFME Bot 2015-07-07 21:44:31 UTC
New commit detected on cfme/5.3.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=2dc3b2460c10e17db1c7558f2cfadf587913b7a4

commit 2dc3b2460c10e17db1c7558f2cfadf587913b7a4
Author:     Brandon Dunne <bdunne>
AuthorDate: Fri Jun 26 07:58:34 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Tue Jul 7 17:02:11 2015 -0400

    Handle cases where RHEV is slow to start VM's
    
    Create a loop in the state machine after requesting the VM boot.
    Requeue 120 times, if the VM still hasn't started raise an error to fail
    the provision.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 vmdb/app/models/miq_provision_redhat/state_machine.rb       | 13 +++++++++++++
 .../models/miq_provision_redhat_via_iso/state_machine.rb    |  5 +----
 .../models/miq_provision_redhat_via_pxe/state_machine.rb    |  5 +----
 3 files changed, 15 insertions(+), 8 deletions(-)

Comment 13 CFME Bot 2015-07-07 21:44:35 UTC
New commit detected on cfme/5.3.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=30b8d2e382a37e8cabe5947b3390d43f1195fd33

commit 30b8d2e382a37e8cabe5947b3390d43f1195fd33
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Jul 7 16:05:27 2015 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Tue Jul 7 17:02:14 2015 -0400

    Add specs for MiqProvisionRedhat#poll_destination_powered_on_in_provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1235822

 .../miq_provision_redhat/state_machine_spec.rb     | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comment 14 Brandon Dunne 2015-07-07 22:05:53 UTC
*** Bug 1240847 has been marked as a duplicate of this bug. ***

Comment 15 Taras Lehinevych 2015-11-18 12:17:14 UTC
According to comment #4 the way to reliably reproduce this bug is to put a sleep in RHEVs code between replying to the caller of the power on and actually executing the power on. 
Need information is about where exactly in RHEVs code should be added sleep.

Comment 16 Brandon Dunne 2015-11-18 15:17:05 UTC
Taras,  You would probably have to talk to someone on the RHEV team.  I don't know where that would be.

Comment 18 errata-xmlrpc 2015-12-08 13:19:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2551