Created attachment 1040639
logs from the undercloud

Description of problem:
The rdo-management delorean promotion job is failing w/ the following errors
https://ci.centos.org/view/rdo/job/rdo_manager-promote-build-delorean-rdo_management-kilo/5/consoleFull

overcloud deployment fails w/ "Message: No valid host was found. There are not enough hosts available., Code: 500"

18:40:13 2015-06-18 18:29:13.947 12244 TRACE heat.engine.resource ResourceInError: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
18:40:13 2015-06-18 18:29:13.961 12244 TRACE heat.engine.resource ResourceInError: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
18:40:13 2015-06-18 18:35:42.380 12244 ERROR heat.engine.resources.openstack.heat.software_deployment [-] Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1

18:40:12 "status": "FAILED",
18:40:12 "server_id": "d9a2a65c-b5bc-4b4d-b86e-894507934542",
18:40:12 "config_id": "b01c8c68-c066-45cb-8421-5e84909fbbee",
18:40:12 "output_values": {
18:40:12   "deploy_stdout": "",
18:40:12   "deploy_stderr": "\u001b[1;31mError: Invalid parameter service_manage on Class[Mongodb::Server] at /var/lib/heat-config/heat-config-puppet/b01c8c68-c066-45cb-8421-5e84909fbbee.pp:102 on node ov-c3uyjet2kzt-0-jcmm6xugu5vt-controller-qbjxjzy6hjsy.localdomain\nWrapped exception:\nInvalid parameter service_manage\u001b[0m\n\u001b[1;31mError: Invalid parameter service_manage on Class[Mongodb::Server] at /var/lib/heat-config/heat-config-puppet/b01c8c68-c066-45cb-8421-5e84909fbbee.pp:102 on node ov-c3uyjet2kzt-0-jcmm6xugu5vt-controller-qbjxjzy6hjsy.localdomain\u001b[0m\n",
18:40:12   "deploy_status_code": 1
18:40:12 },
18:40:12 "creation_time": "2015-06-18T18:35:00Z",
18:40:12 "updated_time": "2015-06-18T18:35:41Z",
18:40:12 "input_values": {},
18:40:12 "action": "CREATE",
18:40:12 "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
18:40:12 "id": "9b2d184d-c94f-4ce7-b3f2-c0a106c97263"
18:40:12 }
Unfortunately, the 'no valid hosts' error can be reached in many ways. In this case, it is a symptom rather than a cause of our woes. Fortunately, we have awesome output from Heat. I think this commit in openstack-puppet-modules is what is breaking us:
https://github.com/redhat-openstack/openstack-puppet-modules/commit/d1b3b94d750c6c08822d3af1dee0cb9d1c9a5b18
That is from 23 days ago, which seems a bit far back, but it is the commit that added the service_manage parameter that we are missing.
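For anyone triaging similar failures, here is a rough sketch of pulling the real Puppet error out of the Heat deployment output instead of reading the ANSI-coloured blob by hand. The function name and the sample dict are made up for illustration (the sample path /tmp/example.pp is not from the job); only the "Invalid parameter ... on Class[...]" message shape is taken from the log in the description.

```python
import json
import re

# Matches ANSI colour sequences such as "\x1b[1;31m" and "\x1b[0m".
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")
# Matches Puppet's "Invalid parameter <name> on Class[<class>]" message.
INVALID_PARAM_RE = re.compile(r"Invalid parameter (\w+) on (Class\[[^\]]+\])")


def invalid_parameters(deploy_stderr):
    """Return the unique (parameter, class) pairs reported in deploy_stderr."""
    clean = ANSI_RE.sub("", deploy_stderr)
    seen = []
    for match in INVALID_PARAM_RE.finditer(clean):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


# Hypothetical, abbreviated sample shaped like the output_values in the log.
sample = json.loads(
    '{"deploy_stderr": "\\u001b[1;31mError: Invalid parameter service_manage '
    'on Class[Mongodb::Server] at /tmp/example.pp:102\\u001b[0m\\n", '
    '"deploy_status_code": 1}'
)
print(invalid_parameters(sample["deploy_stderr"]))
```

Run against the deploy_stderr from this job, it should report service_manage on Class[Mongodb::Server] once, even though the message is repeated in the output.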
The commit in the puppet module is not what breaks us; the problem is that the images do not get a version of the module recent enough to support the service_manage parameter (which we use from the manifest). How do we get openstack-puppet-modules updated for RDO?
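To confirm that an image carries a module that is too old, one option is to check whether the class header in the module's server.pp declares the parameter at all. The following is only a sketch: declares_parameter is a hypothetical helper, the two manifests are invented stand-ins rather than the real puppet-mongodb source, and the regex is a simplification that will not handle every Puppet class header.

```python
import re


def declares_parameter(manifest_text, class_name, parameter):
    """Return True if the class header in manifest_text declares $parameter."""
    # Capture the parameter list between "class name (" and the closing ")"
    # that precedes the class body or an "inherits" clause.
    header = re.search(
        r"class\s+" + re.escape(class_name) + r"\s*\((.*?)\)\s*(?:inherits\b|\{)",
        manifest_text,
        re.DOTALL,
    )
    if header is None:
        return False
    pattern = r"\$" + re.escape(parameter) + r"\b"
    return re.search(pattern, header.group(1)) is not None


# Invented stand-ins for an old and a new manifests/server.pp.
old_manifest = "class mongodb::server (\n  $ensure = present,\n  $service_name = undef,\n) {\n}\n"
new_manifest = "class mongodb::server (\n  $ensure = present,\n  $service_manage = true,\n) {\n}\n"

print(declares_parameter(old_manifest, "mongodb::server", "service_manage"))
print(declares_parameter(new_manifest, "mongodb::server", "service_manage"))
```

On a real image, the text would come from wherever the mongodb module's manifests/server.pp is installed; a False result there would match the "Invalid parameter service_manage" error above.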
We also have troubleshooting docs to go through for the "No valid host found" error: https://repos.fedorapeople.org/repos/openstack-m/docs/master/troubleshooting/troubleshooting-overcloud.html#no-valid-host-found-error
This bug is against a Version which has reached End of Life. If it's still present in a supported release (http://releases.openstack.org), please update Version and reopen.