Description of problem:

Tried to deploy 7.2 on a clean virt env with 3 controllers, 1 compute and 1 ceph node, using the following command, without any changes to the yaml files or any additional config:

openstack overcloud deploy --templates --control-scale 3 --ceph-storage-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml --ntp-server clock.redhat.com --libvirt-type qemu

The deployment failed on the 3 controllers.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.6-87.el7ost.noarch

How reproducible:
100%; reproduced 3 times with this setup.

Steps to Reproduce:
1. Install the latest (Dec 4th) undercloud on RHEL 7.2.
2. Upload the pre-built latest images (from Dec 4th) and register the nodes (following the guide).
3. Deploy the overcloud with 3 controllers, 1 compute and 1 ceph node using the following command:

openstack overcloud deploy --templates --control-scale 3 --ceph-storage-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml --ntp-server clock.redhat.com --libvirt-type qemu

Actual results:
Deployment failed:

$ openstack overcloud deploy --templates --control-scale 3 --ceph-storage-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml --ntp-server clock.redhat.com --libvirt-type qemu
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
Stack failed with status: Resource CREATE failed: MessagingTimeout: resources.Controller.resources[2]: Timed out waiting for a reply to message ID 30aa11ea9d39478d9cefd9243a86f291
ERROR: openstack Heat Stack create failed.

Expected results:
Overcloud successfully deployed.

Additional info:

From heat-api.log on the instack machine:

2015-12-08 10:42:26.921 28372 INFO oslo_messaging._drivers.impl_rabbit [req-554a14d0-b2af-4ffb-b064-42c4d1e49d38 c6e5f48f6a6a4fdd8000ca2822088472 110b5499f44a48f19495ed8d9cc11ea9] Connected to AMQP server on 192.0.2.1:5672
2015-12-08 10:42:26.976 28372 DEBUG heat.common.serializers [req-554a14d0-b2af-4ffb-b064-42c4d1e49d38 c6e5f48f6a6a4fdd8000ca2822088472 110b5499f44a48f19495ed8d9cc11ea9] JSON response : {"explanation": "The resource could not be found.", "code": 404, "error": {"message": "The Stack (overcloud) could not be found.", "traceback": "Traceback (most recent call last):\n\n  File \"/usr/lib/python2.7/site-packages/heat/common/context.py\", line 300, in wrapped\n    return func(self, ctx, *args, **kwargs)\n\n  File \"/usr/lib/python2.7/site-packages/heat/engine/service.py\", line 434, in identify_stack\n    raise exception.StackNotFound(stack_name=stack_name)\n\nStackNotFound: The Stack (overcloud) could not be found.\n", "type": "StackNotFound"}, "title": "Not Found"} to_json /usr/lib/python2.7/site-packages/heat/common/serializers.py:42

From heat-engine.log on the instack machine:

2015-12-08 10:50:57.703 28339 INFO heat.engine.resource [-] CREATE: ResourceGroup "Controller" [802e0f08-a865-455b-a55a-27f08a97118b] Stack "overcloud" [cc05d2af-aa97-47be-bfa2-054e85172bde]
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource Traceback (most recent call last):
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 528, in _action_recorder
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     yield
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 598, in _do_action
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 313, in wrapper
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     step = next(subtask)
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 572, in action_handler_task
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     while not check(handler_data):
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 299, in check_create_complete
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     return self._check_status_complete(resource.Resource.CREATE)
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 340, in _check_status_complete
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource     action=action)
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource ResourceFailure: MessagingTimeout: resources.Controller.resources[2]: Timed out waiting for a reply to message ID 30aa11ea9d39478d9cefd9243a86f291
2015-12-08 10:50:57.703 28339 TRACE heat.engine.resource
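(Triage note, not from the original report: the logs quoted above live on the undercloud, and a minimal way to pull the failure out of them, assuming the default OSP 7 log locations, is:)

$ sudo grep -n 'MessagingTimeout' /var/log/heat/heat-engine.log
$ sudo tail -f /var/log/heat/heat-api.log /var/log/heat/heat-engine.log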
How much memory does your undercloud have? The product documentation states that 6GB RAM is the minimum, but I suspect there are still testers running with 4GB underclouds, which would definitely cause random undercloud failures.
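(For reference, a quick way to verify the undercloud's memory and CPU count from inside the VM - standard Linux commands, nothing director-specific:)

$ free -h   # total and available RAM on the undercloud
$ nproc     # number of CPUs visible to the undercloud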
(In reply to Steve Baker from comment #3)
> How much memory does your undercloud have? The product documentation states
> that 6GB RAM is the minimum, but I suspect there are still testers running
> with 4GB underclouds, which would definitely cause random undercloud
> failures.

The instack machine has 8GB and each VM 5GB. This setup worked for me with earlier OSP releases.
Reproduced for me with the latest rhel-osp-director-puddle-2015-12-16-1. It seems that at some point the deployment fails on the second controller node and doesn't try to deploy the third at all. Same env, but this time the undercloud VM has 7GB and each VM 5GB.
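(A sketch of how to see how far the deployment got, using the Kilo-era clients that ship with OSP 7; the command names are standard for that release, though output details will vary:)

$ nova list                      # which overcloud nodes were actually spawned
$ ironic node-list               # provisioning state of each registered node
$ heat resource-list overcloud   # status of each top-level stack resource
$ heat event-list overcloud      # recent stack events, including the failure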
This sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=1290949

In that case, the root cause wasn't memory, but attempting to run with a single-CPU undercloud - workarounds are noted in that bz.
(In reply to Steven Hardy from comment #7)
> This sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=1290949
>
> In that case, the root cause wasn't memory, but attempting to run with a
> single-CPU undercloud - workarounds are noted in that bz.

You're right: after changing to 4 CPUs it now works. Of course, I also added some workers on nova.
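(A sketch of the workaround described above, assuming a libvirt virtual undercloud whose domain is named instack; the domain name, vCPU count and worker values are illustrative - adjust to your environment. crudini is usually present on a director undercloud, but editing the conf files by hand works just as well:)

On the virt host - give the undercloud VM more vCPUs (the domain must be shut down):

$ virsh shutdown instack
$ virsh setvcpus instack 4 --maximum --config
$ virsh setvcpus instack 4 --config
$ virsh start instack

On the undercloud - raise the worker counts and restart the services:

$ sudo crudini --set /etc/heat/heat.conf DEFAULT num_engine_workers 4
$ sudo crudini --set /etc/nova/nova.conf DEFAULT osapi_compute_workers 4
$ sudo systemctl restart openstack-heat-engine openstack-nova-api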
Per comment #8, closing this as a duplicate of bz #1290949
*** This bug has been marked as a duplicate of bug 1290949 ***