Description of problem:
deployed overcloud with 1 controller. tried to scale to 1 compute, 3 controllers and 1 ceph and failed for Controller resource:
resource_status_reason | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "AttributeError: 'module' object has no attribute 'MessagingTimeout'"

Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.8-1.el7ost.noarch
rhos-release-0.62-1.noarch
openstack-heat-engine-2015.1.0-3.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-3.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-heat-api-2015.1.0-3.el7ost.noarch
heat-cfntools-1.2.8-2.el7.noarch
openstack-heat-common-2015.1.0-3.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-3.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-9.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. deploy overcloud with 1 compute and 1 controller
2. scale the overcloud using 'openstack overcloud deploy --control-scale 3 --ceph-storage-scale 1 --plan-uuid'

Actual results:
scale failed

Expected results:
overcloud was successfully scaled

Additional info:
[stack@instack ~]$ heat stack-list
+--------------------------------------+------------+---------------+----------------------+
| id                                   | stack_name | stack_status  | creation_time        |
+--------------------------------------+------------+---------------+----------------------+
| 2d463b67-856f-4985-84b5-dac704274803 | overcloud  | UPDATE_FAILED | 2015-06-21T17:38:57Z |
+--------------------------------------+------------+---------------+----------------------+

[stack@instack ~]$ heat resource-list 2d463b67-856f-4985-84b5-dac704274803
+-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+
| resource_name                     | physical_resource_id                          | resource_type                                     | resource_status | updated_time         |
+-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+
| BlockStorageAllNodesDeployment    | edd4e0e9-9c1c-4888-aeea-dc1658d0e57d          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| BlockStorageNodesPostDeployment   | ccf03844-1afb-4bf4-a876-0976dd2971f9          | OS::TripleO::BlockStoragePostDeployment           | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| CephClusterConfig                 | 52aec691-4fb2-4282-8900-46cf188c7946          | OS::TripleO::CephClusterConfig::SoftwareConfig    | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| CephStorageAllNodesDeployment     | 4b2db21f-3a53-4cac-9213-3c725a94dfdb          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| CephStorageCephDeployment         | e61f20c5-d0ad-4b34-a920-996d9ae920cc          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| CephStorageNodesPostDeployment    | 4ef49e76-0780-490a-96e5-75acde342489          | OS::TripleO::CephStoragePostDeployment            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ComputeAllNodesDeployment         | 409f0fd2-ead3-4833-a8aa-2fa26b3ef7d5          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ComputeCephDeployment             | 08b2da3a-7894-488b-857e-d3bf60a1fa8d          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ComputeNodesPostDeployment        | 4f62ba20-7ba1-4165-be8f-98d714ea247d          | OS::TripleO::ComputePostDeployment                | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControlVirtualIP                  | 903789a3-9ef5-4697-a6e2-c15ad9c79539          | OS::Neutron::Port                                 | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerAllNodesDeployment      | e15edc7d-03bb-4fcf-85f2-31ade7f540b9          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerBootstrapNodeConfig     | 723accfb-03bf-4e7b-bc90-3e6008dc53aa          | OS::TripleO::BootstrapNode::SoftwareConfig        | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerBootstrapNodeDeployment | f986daa9-eb49-47f0-bf67-62482aae9ce2          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerCephDeployment          | e129b764-f327-4555-9a2f-7cedcb9db989          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerClusterConfig           | 31e6e498-12e8-48c9-84d1-d53e5cb2494c          | OS::Heat::StructuredConfig                        | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerClusterDeployment       | 67e4b50d-73e8-4ee2-83a0-dd27b4d44ae8          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerIpListMap               | 2cdeae4c-f208-4acd-b457-61970d23dba0          | OS::TripleO::Network::Ports::NetIpListMap         | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerNodesPostDeployment     | eb272c31-c6c7-4dc8-bcd8-36129b2b7405          | OS::TripleO::ControllerPostDeployment             | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ControllerSwiftDeployment         | 6b1dfe3e-be77-4c18-9bf2-776509afda0e          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| HeatAuthEncryptionKey             | overcloud-HeatAuthEncryptionKey-x5l5jmvqeowp  | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| HorizonSecret                     | overcloud-HorizonSecret-pivvbmqbtgrn          | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| MysqlClusterUniquePart            | overcloud-MysqlClusterUniquePart-binf3mzcjnbs | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| MysqlRootPassword                 | overcloud-MysqlRootPassword-phobq6i6y7iy      | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ObjectStorageAllNodesDeployment   | 47839a59-7eb2-4777-9e86-024df964586f          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ObjectStorageNodesPostDeployment  | 9e43c9ce-e9fe-48ca-9d00-3e82630d0b14          | OS::TripleO::ObjectStoragePostDeployment          | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| ObjectStorageSwiftDeployment      | c501cb8c-04ad-41bf-9e44-f5be7e4d2377          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| PcsdPassword                      | overcloud-PcsdPassword-sgkx4az7hdek           | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| PublicVirtualIP                   | 6dd64bbc-7f0b-4f8b-b24c-1c7192121f37          | OS::Neutron::Port                                 | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| RabbitCookie                      | overcloud-RabbitCookie-jcz5yb27svc5           | OS::Heat::RandomString                            | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| SwiftDevicesAndProxyConfig        | 9c9dda32-a136-44c3-9d77-c9d938ed67b3          | OS::TripleO::SwiftDevicesAndProxy::SoftwareConfig | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| VipDeployment                     | ed555594-e54a-4cbf-90bf-275d8509553a          | OS::Heat::StructuredDeployments                   | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| allNodesConfig                    | 779a0c2e-9fd9-4196-8e1c-24752258d5ff          | OS::TripleO::AllNodes::SoftwareConfig             | CREATE_COMPLETE | 2015-06-21T17:38:57Z |
| VipConfig                         | 8f91fc88-5cd4-459d-aebf-5041ad1ee49d          | OS::TripleO::VipConfig                            | UPDATE_COMPLETE | 2015-06-21T18:17:27Z |
| Ceph-Storage                      | c1b1f54f-8021-42b6-a87c-85393a1fb3f1          | OS::Heat::ResourceGroup                           | UPDATE_COMPLETE | 2015-06-21T18:17:28Z |
| Networks                          | 39a6e787-b647-4731-aed9-c583dc996655          | OS::TripleO::Network                              | UPDATE_COMPLETE | 2015-06-21T18:17:30Z |
| Swift-Storage                     | 2392fa7e-4e67-44a0-aece-df794f394caf          | OS::Heat::ResourceGroup                           | UPDATE_COMPLETE | 2015-06-21T18:17:38Z |
| InternalApiVirtualIP              | caf88942-50c1-4ca2-b9c9-03308adbbf86          | OS::TripleO::Controller::Ports::InternalApiPort   | UPDATE_COMPLETE | 2015-06-21T18:18:00Z |
| StorageMgmtVirtualIP              | 9cd3edf0-94db-4e15-b162-29c0c4012eef          | OS::TripleO::Controller::Ports::StorageMgmtPort   | UPDATE_COMPLETE | 2015-06-21T18:18:05Z |
| RedisVirtualIP                    | d9d54244-5f76-4085-860c-3d8efc2d94d4          | OS::TripleO::Controller::Ports::RedisVipPort      | UPDATE_COMPLETE | 2015-06-21T18:18:07Z |
| StorageVirtualIP                  | 8efcadb9-99d5-46e7-9e7c-8882c202e88a          | OS::TripleO::Controller::Ports::StoragePort       | UPDATE_COMPLETE | 2015-06-21T18:18:09Z |
| VipMap                            | 80d2539f-9a40-40f5-9ce0-2d3bb43bb39c          | OS::TripleO::Network::Ports::NetIpMap             | UPDATE_COMPLETE | 2015-06-21T18:18:13Z |
| Compute                           | 0fa47c06-cefb-4624-ad51-c998079421ca          | OS::Heat::ResourceGroup                           | UPDATE_COMPLETE | 2015-06-21T18:18:15Z |
| Controller                        | 862960c9-0d2d-4525-97c0-b839dfd147a0          | OS::Heat::ResourceGroup                           | UPDATE_FAILED   | 2015-06-21T18:18:18Z |
| Cinder-Storage                    | ef694b10-b9c4-4b15-8b0d-395e5bc7a676          | OS::Heat::ResourceGroup                           | UPDATE_COMPLETE | 2015-06-21T18:18:31Z |
+-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+

[stack@instack ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 0a652eb7-0f0c-406d-a72a-48973d985ac0 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
| c49e5616-dcaa-412c-8406-2da201beb591 | overcloud-compute-0     | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
| 601e74ed-cec6-44a0-bf60-868b080e920e | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| 4b79329e-8b1e-4cfd-aa1e-90cda268fad2 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| 14f6ce5d-3821-4be4-bc85-c758bb76e4fc | None | 4b79329e-8b1e-4cfd-aa1e-90cda268fad2 | power on    | active          | False       |
| 1021710a-9e61-46f1-b417-7d943af31839 | None | 0a652eb7-0f0c-406d-a72a-48973d985ac0 | power on    | active          | False       |
| d8219c79-2ace-4865-8220-e1853611060d | None | None                                 | power off   | available       | False       |
| 457b9740-79ef-43cb-abf7-09391fa1cde5 | None | c49e5616-dcaa-412c-8406-2da201beb591 | power on    | active          | False       |
| 60a64d3a-5d82-46e8-a653-30c7f2942975 | None | 601e74ed-cec6-44a0-bf60-868b080e920e | power on    | active          | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+

[stack@instack ~]$ heat resource-show 2d463b67-856f-4985-84b5-dac704274803 Controller
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Property               | Value                                                                                                                                            |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| attributes             | {                                                                                                                                                |
|                        |   "attributes": null,                                                                                                                            |
|                        |   "refs": null                                                                                                                                   |
|                        | }                                                                                                                                                |
| description            |                                                                                                                                                  |
| links                  | http://192.0.2.1:8004/v1/7b4f213db6ee4517aaee45249cee5fc8/stacks/overcloud/2d463b67-856f-4985-84b5-dac704274803/resources/Controller (self)      |
|                        | http://192.0.2.1:8004/v1/7b4f213db6ee4517aaee45249cee5fc8/stacks/overcloud/2d463b67-856f-4985-84b5-dac704274803 (stack)                          |
|                        | http://192.0.2.1:8004/v1/7b4f213db6ee4517aaee45249cee5fc8/stacks/overcloud-Controller-5jnzh27yznnh/862960c9-0d2d-4525-97c0-b839dfd147a0 (nested) |
| logical_resource_id    | Controller                                                                                                                                       |
| physical_resource_id   | 862960c9-0d2d-4525-97c0-b839dfd147a0                                                                                                             |
| required_by            | ControllerCephDeployment                                                                                                                         |
|                        | ControllerBootstrapNodeDeployment                                                                                                                |
|                        | ControllerNodesPostDeployment                                                                                                                    |
|                        | SwiftDevicesAndProxyConfig                                                                                                                       |
|                        | ControllerClusterConfig                                                                                                                          |
|                        | ControllerClusterDeployment                                                                                                                      |
|                        | CephClusterConfig                                                                                                                                |
|                        | ControllerAllNodesDeployment                                                                                                                     |
|                        | allNodesConfig                                                                                                                                   |
|                        | ControllerIpListMap                                                                                                                              |
|                        | ControllerBootstrapNodeConfig                                                                                                                    |
|                        | ControllerSwiftDeployment                                                                                                                        |
|                        | VipDeployment                                                                                                                                    |
| resource_name          | Controller                                                                                                                                       |
| resource_status        | UPDATE_FAILED                                                                                                                                    |
| resource_status_reason | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "AttributeError: 'module' object has no attribute 'MessagingTimeout'"      |
| resource_type          | OS::Heat::ResourceGroup                                                                                                                          |
| updated_time           | 2015-06-21T18:18:18Z                                                                                                                             |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
From the output it seems that nova failed to find a suitable host for one of the controller nodes. After checking the ironic nodes on this deployment, it seems the problem is that the one remaining host which should have been used (but wasn't matched by the nova filter) has wrong capabilities settings:

+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| 14f6ce5d-3821-4be4-bc85-c758bb76e4fc | None | 4b79329e-8b1e-4cfd-aa1e-90cda268fad2 | power on    | active          | False       |
| 1021710a-9e61-46f1-b417-7d943af31839 | None | 0a652eb7-0f0c-406d-a72a-48973d985ac0 | power on    | active          | False       |
| d8219c79-2ace-4865-8220-e1853611060d | None | None                                 | power off   | available       | False       |
| 457b9740-79ef-43cb-abf7-09391fa1cde5 | None | c49e5616-dcaa-412c-8406-2da201beb591 | power on    | active          | False       |
| 60a64d3a-5d82-46e8-a653-30c7f2942975 | None | 601e74ed-cec6-44a0-bf60-868b080e920e | power on    | active          | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+

[stack@instack ~]$ ironic node-show d8219c79-2ace-4865-8220-e1853611060d
<snip>
| reservation   | None                                                                |
| properties    | {u'memory_mb': u'4096', u'cpu_arch': u'x86_64', u'local_gb': u'40', |
| instance_uuid | None                                                                |

(IOW the hash is terminated in the middle) in comparison to another "valid" node:

[stack@instack ~]$ ironic node-show 60a64d3a-5d82-46e8-a653-30c7f2942975
<snip>
| reservation   | None                                                                |
| properties    | {u'memory_mb': u'4096', u'cpu_arch': u'x86_64', u'local_gb': u'40', |
|               | u'cpus': u'1', u'capabilities': u'boot_option:local'}               |
| instance_uuid | 601e74ed-cec6-44a0-bf60-868b080e920e                                |
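The comparison between the two nodes can be sketched as a key diff. This is only an illustration: the helper name missing_property_keys is hypothetical, the two dictionaries are copied from the node-show output above (the first one exactly as far as it was printed), and nothing here queries ironic itself.

```python
def missing_property_keys(suspect, reference):
    """Return the property keys present on the reference node but not on the suspect node."""
    return sorted(set(reference) - set(suspect))


# Properties of the "valid" node 60a64d3a-..., as printed by ironic node-show.
reference_node = {
    "memory_mb": "4096",
    "cpu_arch": "x86_64",
    "local_gb": "40",
    "cpus": "1",
    "capabilities": "boot_option:local",
}

# Properties of node d8219c79-..., limited to what the truncated output shows.
suspect_node = {
    "memory_mb": "4096",
    "cpu_arch": "x86_64",
    "local_gb": "40",
}

print(missing_property_keys(suspect_node, reference_node))
```

Since the output for the suspect node was cut off, the diff only says which keys were not displayed, not that they are really absent from the node.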
The error looks similar to this upstream heat bug: https://bugs.launchpad.net/heat/+bug/1466239 It'd be useful to see the undercloud heat-engine logs so we can confirm if it's the same issue.
Disregard my comment #3 - although the ironic capabilities output is wrong, it seems to be irrelevant to the MessagingTimeout error. Also, jfoucal has just reproduced this error on a different setup where the ironic node settings are OK.
I had the same issue today, ryansb is looking into my deployment.
The fix for this has landed in upstream master https://review.openstack.org/#/c/192938
The fix is there, but this issue is actually the result of a timeout happening during nova server creation. In the logs, I still see a traceback (now with the correct error message telling us which message timed out) followed two minutes later by a response to the message that timed out. It seems that with more machines sharing the same host, nova startup is delayed. The temporary fix is to increase the RPC reply timeout.
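For readers wondering how a timeout can surface as an AttributeError, here is a minimal sketch of one way that can happen in Python. It is not heat's code: wrong_module, the local MessagingTimeout class and both functions are hypothetical stand-ins. The point is that the expression in an except clause is only evaluated once an exception is already propagating, so a bad attribute lookup there replaces the real error.

```python
import types

# Hypothetical stand-in for a module that does NOT define MessagingTimeout.
wrong_module = types.ModuleType("wrong_module")


class MessagingTimeout(Exception):
    """Stand-in for the timeout exception raised by an RPC client."""


def call_with_bad_handler():
    try:
        # Simulate the RPC call timing out.
        raise MessagingTimeout("timed out waiting for a reply")
    except wrong_module.MessagingTimeout:
        # Never reached: the attribute lookup above fails first.
        return "handled"


def describe_failure():
    """Return the names of the visible exception and of the one it displaced."""
    try:
        call_with_bad_handler()
    except AttributeError as exc:
        # The original timeout survives only as the implicit context.
        return type(exc).__name__, type(exc.__context__).__name__
    return None


print(describe_failure())
```

On Python 3 the wording of the AttributeError message differs from the Python 2 wording quoted in this report, but the masking behaviour is the same.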
@Ryan: ACK. The problem is the timeout Ryan described. As a workaround I edited /etc/heat/heat.conf, increased the timeout to rpc_response_timeout = 600 (uncomment it!), restarted openstack-heat-engine, and the deployment passed.
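A minimal sketch of the edit described above, written as a pure text transformation so it can be tried on a copy of the file first. The function name set_rpc_response_timeout is hypothetical, and the sketch assumes the option appears once in the file, either active or commented out; it does not look at section headers.

```python
import re

# An active assignment, e.g. "rpc_response_timeout = 60".
ACTIVE = re.compile(r"^[ \t]*rpc_response_timeout[ \t]*=.*$", re.MULTILINE)
# A commented-out assignment, e.g. "#rpc_response_timeout = 60".
COMMENTED = re.compile(r"^[ \t]*#[ \t]*rpc_response_timeout[ \t]*=.*$", re.MULTILINE)


def set_rpc_response_timeout(conf_text, seconds):
    """Return conf_text with rpc_response_timeout set to `seconds`.

    An active assignment is rewritten if present; otherwise the first
    commented-out assignment is uncommented and rewritten.
    """
    new_line = "rpc_response_timeout = %d" % seconds
    for pattern in (ACTIVE, COMMENTED):
        if pattern.search(conf_text):
            return pattern.sub(new_line, conf_text, count=1)
    raise ValueError("no rpc_response_timeout line found")


sample = (
    "[DEFAULT]\n"
    "# Seconds to wait for a response from a call. (integer value)\n"
    "#rpc_response_timeout = 60\n"
)
print(set_rpc_response_timeout(sample, 600))
```

The new value only takes effect after openstack-heat-engine is restarted, as noted in the comment above.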
Here's an instack-undercloud patch that bumps the timeout for you. https://code.engineering.redhat.com/gerrit/#/c/51906/
could be a dupe of bz#1231825?
Added
This also requires https://code.engineering.redhat.com/gerrit/#/c/51906/ before it can be closed.
The rpc_response_timeout is still 60 in /etc/heat/heat.conf in the latest puddle from July 10. The patch from comment 14 is failing the Jenkins build. Returning to Modified.
Ola, Can you tell me what puddle version you were on, and (if possible) the OPM and instack-undercloud versions? I tried on puddle 2015-07-13.1 (today's puddle) and the rpc_response_timeout is correct.
I had the puddle from July 10, i.e. 2015-07-10.1
Now, in puddle 2015-07-13.1 it's:

# Seconds to wait for a response from a call. (integer value)
#rpc_response_timeout = 60
rpc_response_timeout = 600

I assume the fix was dropped from the previous puddle, but the bug was set ON_QA...
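When checking a puddle, it is easy to misread the commented-out default as the active value. The following sketch reads the effective setting from the file text. The function name active_rpc_response_timeout is hypothetical, and the sketch assumes that commented lines are ignored, that the last active assignment wins, and that the default is the 60 shown in the commented line; sections are not taken into account.

```python
def active_rpc_response_timeout(conf_text, default=60):
    """Return the effective rpc_response_timeout found in conf_text."""
    value = default
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments, including the commented-out default.
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "rpc_response_timeout":
            value = int(val.strip())
    return value


excerpt = (
    "# Seconds to wait for a response from a call. (integer value)\n"
    "#rpc_response_timeout = 60\n"
    "rpc_response_timeout = 600\n"
)
print(active_rpc_response_timeout(excerpt))
```

Run against the heat.conf excerpt from puddle 2015-07-13.1 it reports 600; against a file that only has the commented line it falls back to the default.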
After IRC discussion & verifying myself, setting back to ON_QA for verification.
Verified with puddle 2015-07-17-1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1548