Bug 1291047

Summary: (RDO Mitaka) Overcloud deployment failed: Exceeded max scheduling attempts
Product: [Community] RDO Reporter: Ronnie Rasouli <rrasouli>
Component: openstack-heat Assignee: Zane Bitter <zbitter>
Status: CLOSED EOL QA Contact: Amit Ugol <augol>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: Liberty CC: jpeeler, rrasouli, srevivo, stefan.heusi
Target Milestone: ---   
Target Release: Kilo   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-05-19 15:31:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ronnie Rasouli 2015-12-13 06:22:20 UTC
Description of problem:

An attempt to deploy RDO with RDO-Manager failed with a timeout error. The deploy command was:

openstack overcloud deploy --templates ~/templates/my-overcloud -e ~/templates/my-overcloud/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e ~/templates/my-overcloud/compute.yaml  --control-scale 1 --compute-scale 1 --ntp-server clock.redhat.com --libvirt-type qemu --debug --log-file=overcloud_deploy.log

nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 5746f7fe-c3df-4eaf-93a0-2a580c6e36ba | overcloud-controller-0  | ERROR  | -          | NOSTATE     | ctlplane=192.0.2.22 |
| a7ee15c2-9dd9-497b-a4bd-ad9b9144b032 | overcloud-novacompute-0 | ERROR  | -          | NOSTATE     | ctlplane=192.0.2.21 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
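When triaging output like the table above, it can help to pull out the instances in ERROR state programmatically. The following is only a minimal sketch: `parse_cli_table` is a hypothetical helper (not part of any OpenStack client), and it assumes the table is available as plain text in the `+---+` / `| ... |` layout shown here.

```python
# Hypothetical helper: parses the ASCII table printed by "nova list".
# It is an illustration only and not part of novaclient.

NOVA_LIST_OUTPUT = """
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 5746f7fe-c3df-4eaf-93a0-2a580c6e36ba | overcloud-controller-0  | ERROR  | -          | NOSTATE     | ctlplane=192.0.2.22 |
| a7ee15c2-9dd9-497b-a4bd-ad9b9144b032 | overcloud-novacompute-0 | ERROR  | -          | NOSTATE     | ctlplane=192.0.2.21 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
"""


def parse_cli_table(text):
    """Return the data rows of an ASCII table as a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        # Splitting on "|" leaves empty strings at both ends; drop them.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells  # the first "|" row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


rows = parse_cli_table(NOVA_LIST_OUTPUT)
errored = [row["Name"] for row in rows if row["Status"] == "ERROR"]
print(errored)
```

Each name reported this way can then be inspected individually (for example with nova show) to read its fault field.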

| fault                                | {"message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance a7ee15c2-9dd9-497b-a4bd-ad9b9144b032. Last exception: [u'Traceback (most recent call last):\                   |
|                                      | ', u'  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 1", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 737, in build_instances |
|                                      |     instances[0].uuid)                                                              

2015-12-10 14:34:26.803 12005 DEBUG oslo_messaging._drivers.amqpdriver [req-9b18a425-9fbb-4036-bb01-218f4a934b0c 020427c3b76646d9b00d1e9804865ee7 e87cbee607ac4117b27f8d3888c8d058] MSG_ID is 6916026ffa704a9b949f5179acdf0b45 _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:392
2015-12-10 14:34:26.822 12005 DEBUG heat.common.serializers [req-9b18a425-9fbb-4036-bb01-218f4a934b0c 020427c3b76646d9b00d1e9804865ee7 e87cbee607ac4117b27f8d3888c8d058] JSON response : {"explanation": "The resource could not be found.", "code": 404, "error": {"message": "The Stack (overcloud) could not be found.", "traceback": "Traceback (most recent call last):\n\n  File \"/usr/lib/python2.7/site-packages/heat/common/context.py\", line 308, in wrapped\n    return func(self, ctx, *args, **kwargs)\n\n  File \"/usr/lib/python2.7/site-packages/heat/engine/service.py\", line 441, in identify_stack\n    raise exception.StackNotFound(stack_name=stack_name)\n\nStackNotFound: The Stack (overcloud) could not be found.\n", "type": "StackNotFound"}, "title": "Not Found"} to_json /usr/lib/python2.7/site-packages/heat/common/serializers.py:42
-

From nova.api
                                                                      
2015-12-13 05:26:06.935 11379 DEBUG nova.compute.api [req-d05aa865-4d2a-440e-892b-eff8a280b01c 49d48f34fec646498456aae09377a534 e87cbee607ac4117b27f8d3888c8d058 - - -] Searching by: {u'changes-since': datetime.datetime(2015, 12, 13, 5, 16, 7, 12592, tzinfo=<FixedOffset u'+00:00' datetime.timedelta(0)>)} get_all /usr/lib/python2.7/site-packages/nova/compute/api.py:2075
2015-12-13 05:26:07.003 11379 DEBUG nova.policy [req-d05aa865-4d2a-440e-892b-eff8a280b01c 49d48f34fec646498456aae09377a534 e87cbee607ac4117b27f8d3888c8d058 - - -] Policy check for os_compute_api:os-hide-server-addresses failed with credentials {'domain': None, 'project_name': u'admin', 'project_domain': None, 'timestamp': '2015-12-13T05:26:06.927067', 'remote_address': '192.0.2.1', 'quota_class': None, 'resource_uuid': None, 'is_admin': True, 'user': u'49d48f34fec646498456aae09377a534', 'service_catalog': [], 'tenant': u'e87cbee607ac4117b27f8d3888c8d058', 'read_only': False, 'project_id': u'e87cbee607ac4117b27f8d3888c8d058', 'user_id': u'49d48f34fec646498456aae09377a534', 'show_deleted': False, 'roles': [u'admin'], 'user_identity': '49d48f34fec646498456aae09377a534 e87cbee607ac4117b27f8d3888c8d058 - - -', 'read_deleted': 'no', 'request_id': 'req-d05aa865-4d2a-440e-892b-eff8a280b01c', 'instance_lock_checked': False, 'user_domain': None, 'user_name': u'ceilometer'} enforce /usr/lib/python2.7/site-packages/nova/policy.py:104
--
The resource could not be found.

   
 _http_log_response /usr/lib/python2.7/site-packages/keystoneclient/session.py:215
2015-12-13 05:26:08.254 11382 DEBUG neutronclient.v2_0.client [req-9058f5f6-f414-4e5f-a5a7-152dec32a3af 49d48f34fec646498456aae09377a534 e87cbee607ac4117b27f8d3888c8d058 - - -] Error message: 404 Not Found

                                            |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/utils.py\", line 178, in populate_retry                                                                                                                |
|                                      |     raise exception.MaxRetriesExceeded(reason=msg)                                                                                                                                                       


Version-Release number of selected component (if applicable):
RDO Mitaka

python-heatclient-0.8.1-dev2.el7.centos.noarch
openstack-heat-api-5.0.1-dev49.el7.centos.noarch

How reproducible:

100%

Steps to Reproduce:
1. Deploy the overcloud with the command above.
2. Wait a few hours until the deployment fails.

Actual results:

The deployment failed and the stack ended up in an error state.

Expected results:

Resources should be found, and any error should be raised well before 8 hours of deployment have elapsed.

Additional info:

Comment 1 Ronnie Rasouli 2015-12-13 06:31:26 UTC
Version tested: CentOS Linux release 7.1.1503 (Core)

Comment 2 Ronnie Rasouli 2015-12-13 06:39:29 UTC
| stack_status          | CREATE_FAILED                                           |
| stack_status_reason   | Resource CREATE failed: resources.Compute: BadRequest:  |
|                       | resources[0].resources.NovaCompute: Expecting to find   |
|                       | username or userId in passwordCredentials - the server  |
|                       | could not comply with the request since it is either    |
|                       | malformed or otherwise incorrect. The client is assumed |
|                       | to be in error. (HTTP 400)                              |

Comment 3 Zane Bitter 2016-01-07 19:50:31 UTC
This is not a Heat bug. The error "Exceeded max scheduling attempts" comes from Nova, and likely indicates that there are no servers available from Ironic to schedule.
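To tell this Nova scheduling failure apart from other errors in the logs (such as the Heat StackNotFound 404 quoted earlier), one could match the fault message text. This is a rough sketch only: the pattern is derived from the single fault message quoted in this report, `parse_scheduling_fault` is a hypothetical helper, and other Nova versions may word the message differently.

```python
import re

# Fault message text as quoted in the "nova show" output in this report.
FAULT_MESSAGE = (
    "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 "
    "for instance a7ee15c2-9dd9-497b-a4bd-ad9b9144b032."
)

# Assumption: the wording matches the message above; adjust for other releases.
SCHEDULING_FAULT = re.compile(
    r"Exceeded max scheduling attempts (?P<attempts>\d+) "
    r"for instance (?P<uuid>[0-9a-f-]{36})"
)


def parse_scheduling_fault(message):
    """Return attempts and instance UUID, or None if this is another error."""
    match = SCHEDULING_FAULT.search(message)
    if match is None:
        return None
    return {
        "attempts": int(match.group("attempts")),
        "instance": match.group("uuid"),
    }


result = parse_scheduling_fault(FAULT_MESSAGE)
print(result)
```

A match points at the Nova scheduler rather than Heat; the next step would be to check whether Ironic actually has nodes available to schedule onto.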

Comment 4 Chandan Kumar 2016-05-19 15:31:01 UTC
This bug is filed against a Version which has reached End of Life.
If it is still present in a supported release (http://releases.openstack.org), please update the Version and reopen.

Comment 5 Ronnie Rasouli 2017-07-23 13:27:32 UTC
EOL