Bug 1318019 - PXE provision booting up with missing pxe boot entry
Summary: PXE provision booting up with missing pxe boot entry
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Provisioning
Version: 5.5.0
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: GA
Target Release: 5.6.0
Assignee: Brandon Dunne
QA Contact: Shveta
URL:
Whiteboard:
Depends On:
Blocks: 1318436
Reported: 2016-03-15 18:55 UTC by Josh Carter
Modified: 2019-10-10 11:33 UTC

Fixed In Version: 5.6.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1318436 (view as bug list)
Environment:
Last Closed: 2016-06-29 15:42:55 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments
automation.log (11.71 MB, text/plain)
2016-04-17 21:28 UTC, Shveta


Links
System ID: Red Hat Product Errata RHBA-2016:1348
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: CFME 5.6.0 bug fixes and enhancement update
Last Updated: 2016-06-29 18:50:04 UTC

Description Josh Carter 2016-03-15 18:55:56 UTC
Description of problem:

Case Summary

"The PXE boot entry is deleted after 30 seconds, before the VM can even complete its boot (VMware is under high load when that happens, though not always).

PXE is configured from Infrastructure > VM.

There is a high probability that the problem lies in the provision checks that run once PXE has been started. After 30 seconds the kickstart cannot even have run and there is no OS installed on the system, so the system does not have time to get through the OS installation and reach the wget callback to CloudForms.

CloudForms keeps reporting that the service is awaiting provisioning and will "retry". If it is actually waiting for the kickstart callback, that would explain the problem. The email error is then just a side problem that we should not focus on in this case; it has no direct tie to what we are looking into.

The issue happens more often when HA is enabled in VMware, but it also happens when HA is not enabled.

Placement is not automatic; it is set to go to a host known not to be highly loaded in the current setup. When we try to reproduce the issue, it is moved to a highly used environment."

Version-Release number of selected component (if applicable): 5.5.2


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 5 CFME Bot 2016-03-16 21:10:58 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/568b380ff83ed31c80a5f137dba13bbe40fd6bd2

commit 568b380ff83ed31c80a5f137dba13bbe40fd6bd2
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 16:33:23 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 11:20:46 2016 -0400

    Extract #powered_on_in_provider? in preparation for refactoring
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../providers/redhat/infra_manager/provision/state_machine.rb       | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comment 6 CFME Bot 2016-03-16 21:11:02 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/5e5e3306bb1ed5ae7e5237b9a180a5d1506c245b

commit 5e5e3306bb1ed5ae7e5237b9a180a5d1506c245b
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 17:21:36 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 11:20:58 2016 -0400

    Wait for the VM to be powered on in the provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../redhat/infra_manager/provision/state_machine.rb         | 13 -------------
 .../vmware/infra_manager/provision/state_machine.rb         |  4 ++++
 .../vmware/infra_manager/provision_via_pxe/state_machine.rb |  5 +----
 app/models/miq_provision/state_machine.rb                   | 13 +++++++++++++
 4 files changed, 18 insertions(+), 17 deletions(-)
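
For readers following the commit above, here is a minimal sketch of what "wait for the VM to be powered on in the provider" could look like as a polling phase. It is not the code from the commit: FakeVm, ProvisionSketch, poll_destination_powered_on and the :requeue / :next_phase return values are hypothetical names made up for illustration. Only the predicate name powered_on_in_provider? is taken from the commit messages in this bug. The idea shown is that the phase advances only once the provider itself reports the VM as on, rather than after a fixed delay.

```ruby
# Hypothetical stand-in for a provider-side VM; not a ManageIQ class.
class FakeVm
  def initialize(polls_until_on)
    @polls_until_on = polls_until_on
  end

  # Reports "on" only after the given number of polls,
  # mimicking a provider that is slow to power the VM on.
  def power_state
    if @polls_until_on.zero?
      "on"
    else
      @polls_until_on -= 1
      "off"
    end
  end
end

# Hypothetical provisioning task illustrating a "wait for power on" phase.
class ProvisionSketch
  attr_reader :log

  def initialize(vm)
    @vm = vm
    @log = []
  end

  # Asks the provider (here: the fake VM) for the current power state.
  def powered_on_in_provider?
    @vm.power_state == "on"
  end

  # One pass of the phase: advance if the VM is on, otherwise ask to be re-run.
  def poll_destination_powered_on
    if powered_on_in_provider?
      @log << :powered_on
      :next_phase
    else
      @log << :requeue
      :requeue
    end
  end

  # Drives the phase until it advances or the poll budget is exhausted.
  def run(max_polls: 10)
    max_polls.times do
      return true if poll_destination_powered_on == :next_phase
    end
    false
  end
end

task = ProvisionSketch.new(FakeVm.new(3))
puts task.run
puts task.log.inspect
```

In a real state machine the loop in run would be replaced by whatever requeue mechanism the task framework provides; the loop is only there to keep the sketch self-contained.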

Comment 7 CFME Bot 2016-03-16 21:11:07 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/cf2266d14fff4cead426edb3fc4a9fb4d15d4cac

commit cf2266d14fff4cead426edb3fc4a9fb4d15d4cac
Author:     Brandon Dunne <bdunne>
AuthorDate: Wed Mar 16 11:20:01 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 11:21:04 2016 -0400

    Move polling destination power status specs to shared examples
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 spec/factories/miq_provision_vmware.rb             |  5 ++++
 .../infra_manager/provision/state_machine_spec.rb  | 35 +---------------------
 .../infra_manager/provision/state_machine_spec.rb  | 16 ++++++++++
 .../provision_via_pxe/state_machine_spec.rb        | 16 ++++++++++
 ...ared_examples_for_provisioning_state_machine.rb | 35 ++++++++++++++++++++++
 5 files changed, 73 insertions(+), 34 deletions(-)
 create mode 100644 spec/models/manageiq/providers/vmware/infra_manager/provision/state_machine_spec.rb
 create mode 100644 spec/models/manageiq/providers/vmware/infra_manager/provision_via_pxe/state_machine_spec.rb

Comment 8 CFME Bot 2016-03-16 21:11:11 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/1a6747c78e45a12b17f2fe88ab990e05c45e5859

commit 1a6747c78e45a12b17f2fe88ab990e05c45e5859
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 17:22:30 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 11:21:01 2016 -0400

    Wait for the VM to be powered off in the provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../providers/redhat/infra_manager/provision/state_machine.rb  | 10 ----------
 .../providers/vmware/infra_manager/provision/state_machine.rb  |  4 ++++
 app/models/miq_provision/state_machine.rb                      | 10 ++++++++++
 3 files changed, 14 insertions(+), 10 deletions(-)
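
A companion sketch for the "powered off" commit above, again under stated assumptions. PxeCleanupGate and observe are hypothetical names, and the ordering rule it encodes (the PXE boot entry may only be removed after the VM has been seen on and then off) is one plausible reading of how the two waits address the reported symptom, not a description of the committed code.

```ruby
# Hypothetical helper: decides when it is safe to remove a PXE boot entry,
# based on successive power states reported by the provider.
class PxeCleanupGate
  def initialize
    @seen_on = false
    @ready = false
  end

  # Feed one observed power state ("on" or "off").
  # Returns true once an "on" has been followed by an "off".
  def observe(state)
    @seen_on = true if state == "on"
    @ready = true if @seen_on && state == "off"
    @ready
  end
end

gate = PxeCleanupGate.new
%w[off off on on off].each do |state|
  puts "#{state}: cleanup allowed = #{gate.observe(state)}"
end
```

The point of the gate is that an initial "off" reading (VM not yet started) is not mistaken for "installation finished and VM shut down".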

Comment 11 CFME Bot 2016-03-17 16:01:25 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=dd89b827fae79e665d0ef4ce5803c9eecedaaa25

commit dd89b827fae79e665d0ef4ce5803c9eecedaaa25
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 17:21:36 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 17:32:11 2016 -0400

    Wait for the VM to be powered on in the provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../redhat/infra_manager/provision/state_machine.rb         | 13 -------------
 .../vmware/infra_manager/provision/state_machine.rb         |  4 ++++
 .../vmware/infra_manager/provision_via_pxe/state_machine.rb |  5 +----
 app/models/miq_provision/state_machine.rb                   | 13 +++++++++++++
 4 files changed, 18 insertions(+), 17 deletions(-)

Comment 12 CFME Bot 2016-03-17 16:01:29 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=97926ee7a08e40597980be2069da482803469912

commit 97926ee7a08e40597980be2069da482803469912
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 17:22:30 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 17:32:12 2016 -0400

    Wait for the VM to be powered off in the provider
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../providers/redhat/infra_manager/provision/state_machine.rb  | 10 ----------
 .../providers/vmware/infra_manager/provision/state_machine.rb  |  4 ++++
 app/models/miq_provision/state_machine.rb                      | 10 ++++++++++
 3 files changed, 14 insertions(+), 10 deletions(-)

Comment 13 CFME Bot 2016-03-17 16:01:34 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=c0fee3b009a9931ccd1451f93ee1d7d58bc59ac6

commit c0fee3b009a9931ccd1451f93ee1d7d58bc59ac6
Author:     Brandon Dunne <bdunne>
AuthorDate: Thu Mar 17 09:48:26 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Thu Mar 17 09:48:26 2016 -0400

    Move polling destination power status specs to shared examples
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 spec/factories/miq_provision_vmware.rb             |  5 +++
 .../infra_manager/provision/state_machine_spec.rb  | 47 ++--------------------
 .../infra_manager/provision/state_machine_spec.rb  | 18 +++++++++
 .../provision_via_pxe/state_machine_spec.rb        | 18 +++++++++
 ...ared_examples_for_provisioning_state_machine.rb | 46 +++++++++++++++++++++
 5 files changed, 90 insertions(+), 44 deletions(-)
 create mode 100644 spec/models/manageiq/providers/vmware/infra_manager/provision/state_machine_spec.rb
 create mode 100644 spec/models/manageiq/providers/vmware/infra_manager/provision_via_pxe/state_machine_spec.rb
 create mode 100644 spec/support/examples_group/shared_examples_for_provisioning_state_machine.rb

Comment 14 CFME Bot 2016-03-17 16:01:38 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=55924864b5bdc392112b681f21b6867fd6e16907

commit 55924864b5bdc392112b681f21b6867fd6e16907
Author:     Brandon Dunne <bdunne>
AuthorDate: Tue Mar 15 16:33:23 2016 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Mar 16 17:32:11 2016 -0400

    Extract #powered_on_in_provider? in preparation for refactoring
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1318019

 .../providers/redhat/infra_manager/provision/state_machine.rb       | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comment 17 Shveta 2016-04-17 21:28:22 UTC
Created attachment 1148082 [details]
automation.log

I tried to verify this bug by PXE provisioning, but it failed with the error: "The operation is not allowed in the current state.')]"
The automation logs are attached.

Comment 18 Brandon Dunne 2016-04-18 14:11:28 UTC
Shveta,

The automation log just tells us that something went wrong.  We would need the evm.log and possibly the provider log to understand what went wrong.

Comment 19 Shveta 2016-04-18 20:33:06 UTC
Tried on another system.
PXE provision succeeded.
Verified in 5.6.0.1-beta2.20160413141124_e25ac0e

Comment 21 errata-xmlrpc 2016-06-29 15:42:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1348

