Red Hat Bugzilla – Bug 1307072
CFME deployment stuck retrying CloudForms::WaitForConsole after 2000+ attempts
Last modified: 2016-09-23 14:50:23 EDT
Created attachment 1123563 [details]
Truncated deployment.log but it's over 8000 lines of "No route to host - connect(2)"
Description of problem:
During an OSP+CFME deployment, the CloudForms install is stuck at 61.1% and has retried Actions::Fusor::Deployment::CloudForms::WaitForConsole over 2000 times. It is failing on the subtask Actions::Fusor::Deployment::CloudForms::UpdateAdminPassword with "No route to host - connect(2)".
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install Satellite and TripleO (OOO) from ISO
2. Deploy OSP + CFME
Actual results:
OSP+CFME install is stuck at 61.1% while failing to update the CFME admin password.
Expected results:
Deployment should fail when Satellite is unable to contact CFME after a maximum number of attempts.
ISO Installer Versions:
Network Info Used for install:
Sat Prov - 192.168.155.0/24 - GW: 192.168.155.1
OSP Prov - 192.0.2.0/24 - GW: 192.0.2.1
OSP Pub - 192.168.156.0/24 - GW: 192.168.156.1
Thank you for the log file. Unfortunately, it is not enough for us to determine what went wrong, so we need more information. A task failing at WaitForConsole, as you saw, is not a frequent error. I suspect something went wrong in CFME itself while starting the web service.
If the deployment is still up, please debug this further so we can learn what needs to be fixed.
The WaitForConsole task assumes the VM is up and that appliance-console-cli succeeded; this task should be waiting for the CFME web service itself to come up and be ready.
Are you able to access the CFME WebUI?
If not, please log in to the CFME VM and see if you can find more information about what went wrong. Look for the CFME log files; I think they will be in /var/www/miq/vmdb/log.
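As a quick way to check from the Satellite side whether the CFME web service is reachable at all, something like the following Python sketch could be used. It is not part of Fusor; the function name console_port_open and the default port 443 are assumptions made for illustration. An unreachable host (the "No route to host" case) and a refused connection both show up as False.

```python
import socket


def console_port_open(host, port=443, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    Hypothetical helper: port 443 is an assumed default for the web UI.
    """
    try:
        # create_connection raises OSError on "No route to host",
        # connection refused, DNS failure, or timeout.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False
```

Note that an open port only shows that something is listening. It does not prove the web service has finished starting, so an HTTP request against the UI would be a stronger check.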
I checked the CFME instance and it was unresponsive on the console. There was also an issue with the overcloud hypervisor. After restarting both, Satellite automatically contacted CFME and continued with the deployment successfully. I'm going to try to reproduce the issue, but it appears that Satellite would have kept trying to reach CFME indefinitely, with no check for a maximum number of failed attempts.
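The behaviour requested in this bug, failing the deployment after a bounded number of attempts, could look roughly like the Python sketch below. This is an illustration of the idea only, not the Fusor implementation. The names wait_until_reachable and MaxAttemptsExceeded, the check callable, and the default limits are all hypothetical.

```python
import time


class MaxAttemptsExceeded(Exception):
    """Raised when the target never became reachable within the budget."""


def wait_until_reachable(check, max_attempts=30, delay_seconds=10.0,
                         sleep=time.sleep):
    """Call check() until it succeeds or max_attempts is exhausted.

    check is a hypothetical zero-argument callable that raises OSError
    (for example "No route to host") while the appliance is unreachable.
    Returns the number of attempts used on success.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            check()
            return attempt
        except OSError as error:
            last_error = error
            # Only pause if another attempt will follow.
            if attempt < max_attempts:
                sleep(delay_seconds)
    # Give up instead of retrying forever, keeping the last error as cause.
    raise MaxAttemptsExceeded(
        f"gave up after {max_attempts} attempts: {last_error}"
    ) from last_error
```

With a limit like this, the task would surface a clear failure after the last attempt instead of logging thousands of identical connection errors.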