Bug 1307072 - CFME deployment stuck retrying CloudForms::WaitForConsole after 2000+ attempts
Status: NEW
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - CloudForms
Assigned To: John Matthews
Dave Johnson
Dan Macpherson
: Triaged
Reported: 2016-02-12 10:51 EST by Landon LaSmith
Modified: 2016-09-23 14:50 EDT
Doc Type: Bug Fix
Type: Bug

Attachments
Truncated deployment.log but it's over 8000 lines of "No route to host - connect(2)" (40.61 KB, text/plain)
2016-02-12 10:51 EST, Landon LaSmith

Description Landon LaSmith 2016-02-12 10:51:33 EST
Created attachment 1123563
Truncated deployment.log but it's over 8000 lines of "No route to host - connect(2)"

Description of problem:
During an OSP+CFME deployment, the CloudForms install is stuck at 61.1% while retrying Actions::Fusor::Deployment::CloudForms::WaitForConsole over 2000 times. It is failing on the subtask Actions::Fusor::Deployment::CloudForms::UpdateAdminPassword with "No route to host - connect(2)".

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install Sat and OOO from ISO
2. Deploy OSP + CFME

Actual results:
OSP+CFME install stuck at 61.1% while failing to update the CFME admin password

Expected results:
The deployment should fail once Satellite has been unable to contact CFME for a maximum number of attempts.
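
To illustrate the expected behaviour, here is a minimal Ruby sketch of a bounded retry loop. It is not Fusor code: `wait_until_ready`, `MaxAttemptsExceeded`, and the `max_attempts`, `interval`, and `sleeper` parameters are hypothetical names chosen for this example, and how a real Dynflow action would enforce such a limit is not covered here. The sketch treats socket-level errors such as "No route to host" as a failed attempt and raises once the limit is reached.

```ruby
# Hypothetical sketch; none of these names come from Fusor or Dynflow.
class MaxAttemptsExceeded < StandardError; end

# Calls the block until it returns a truthy value and returns the number of
# attempts used. Connection errors (Errno::*, SocketError) count as a failed
# attempt. Raises MaxAttemptsExceeded after max_attempts unsuccessful tries.
def wait_until_ready(max_attempts:, interval:, sleeper: ->(s) { sleep(s) })
  last_error = nil
  max_attempts.times do |i|
    begin
      return i + 1 if yield
    rescue SystemCallError, SocketError => e
      last_error = e
    end
    # Pause between attempts, but not after the final one.
    sleeper.call(interval) if i + 1 < max_attempts
  end
  detail = last_error ? " (last error: #{last_error.message})" : ""
  raise MaxAttemptsExceeded, "not ready after #{max_attempts} attempts#{detail}"
end

# A check that never succeeds now fails after 3 attempts instead of looping.
begin
  wait_until_ready(max_attempts: 3, interval: 0) { false }
rescue MaxAttemptsExceeded => e
  warn e.message
end
```

The `sleeper` argument exists only so the delay can be replaced in tests; a caller would normally leave it at its default.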

Additional info:
ISO Installer Versions:

Network Info Used for install:
Sat Prov - - GW:
OSP Prov - - GW:
OSP Pub - - GW:
Comment 1 John Matthews 2016-02-12 11:08:01 EST

Thank you for the log file. Unfortunately, it is not enough for us to determine what went wrong; we need more information. A task failing at WaitForConsole is not a frequent error. I suspect something went wrong in CFME itself while starting the web service.

If you still have the deployment up, please debug this further so we can learn what is wrong and fix it.

The WaitForConsole task assumes the VM is up and that appliance-console-cli succeeded. The task should be waiting for the CFME web service itself to come up and be ready.

Are you able to access the CFME WebUI?
If not, please log into the CFME VM and see if you can find more information about what went wrong. Look for the CFME log files; I think they will be at /var/www/miq/vmdb/log
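
As a quick first check before logging into the VM, a small Ruby probe can show whether the CFME address is reachable at the TCP level at all. This is only a sketch: `probe_tcp` is a hypothetical helper, and the host and port to probe (for example the appliance address and 443 for HTTPS) are assumptions to adapt to the deployment. A `:no_route` result corresponds to the "No route to host - connect(2)" error in the attached log and points at the network or the VM rather than at the CFME web service.

```ruby
require "socket"

# Hypothetical helper: attempts a TCP connection and reports a coarse result.
# Name-resolution failures (SocketError) are not handled and will propagate.
def probe_tcp(host, port, timeout: 5)
  # Socket.tcp closes the socket after the block and returns the block's value.
  Socket.tcp(host, port, connect_timeout: timeout) { :open }
rescue Errno::EHOSTUNREACH, Errno::ENETUNREACH
  :no_route   # same condition as "No route to host - connect(2)"
rescue Errno::ECONNREFUSED
  :refused    # host is reachable but nothing is listening on the port
rescue Errno::ETIMEDOUT
  :timeout    # no answer within the connect timeout
end
```

Called as `probe_tcp(cfme_address, 443)`, where `cfme_address` stands in for the appliance's address, `:refused` would suggest the VM is up but the web service is not listening, while `:no_route` or `:timeout` would suggest the VM or its network is down.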
Comment 2 Landon LaSmith 2016-02-12 16:33:08 EST

I checked the CFME instance and it was unresponsive on the console. There was also an issue with the overcloud hypervisor. After restarting both, Satellite automatically contacted CFME and continued with the deployment successfully. I'm going to try to reproduce the issue, but it appears that Satellite would have kept trying to reach CFME indefinitely, without checking for a maximum number of failed attempts.
