Bug 1518221
| Summary: | [UPDATES] Error response from Zaqar. Code: 503. Title: Service temporarily unavailable | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> |
| Component: | python-tripleoclient | Assignee: | mathieu bultel <mbultel> |
| Status: | CLOSED ERRATA | QA Contact: | Yurii Prokulevych <yprokule> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 12.0 (Pike) | CC: | apevec, augol, dbecker, hbrock, jpichon, jschluet, jslagle, lbezdick, lhh, mbracho, mbultel, mburns, morazi, rhel-osp-director-maint, rrasouli, sclewis, tvignaud, yprokule |
| Target Milestone: | ga | Keywords: | Triaged |
| Target Release: | 12.0 (Pike) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | python-tripleoclient-7.3.3-7.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-12-13 22:23:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
From zaqar.log:
...
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims [(None,) 664ef39f4cff49ec8109f901af05eff8 8793d5e72bf74354b8b8194940c56daa - - -] Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa: QueueDoesNotExist: Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims Traceback (most recent call last):
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/transport/wsgi/v2_0/claims.py", line 85, in on_post
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     **claim_options)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/common/pipeline.py", line 97, in consumer
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     tmp = target(*args, **kwargs)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/storage/swift/claims.py", line 107, in create
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     include_claimed=False)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/storage/swift/messages.py", line 102, in _list
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     raise errors.QueueDoesNotExist(queue, project)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims QueueDoesNotExist: Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims

Can you elaborate on how frequently this occurs and whether or not modifying timeouts would likely resolve the issue? I'm leaning towards saying this is not a blocker, but it would help to understand the frequency and impact this bug actually has before making that statement.
And the other question is: will a re-run of the update reliably fix this?

Link to spec change on pike-rdo: https://review.rdoproject.org/r/#/c/10741/

Verified with python-tripleoclient-7.3.3-7.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:3462
Description of problem:
-----------------------
During a minor update of RHOS-12, the following error occurred:

openstack overcloud update stack --nodes Networker
...
u'TError response from Zaqar. Code: 503. Title: Service temporarily unavailable. Description: Claim could not be created. Please try again in a few seconds.. ASK [Set host puppet debugging fact string] ***********************************',
u'skipping: [192.168.24.8]',
u'',
u'TASK [Write the config_step hieradata] *****************************************',
u'changed: [192.168.24.8]',
u'',
u'TASK [Run puppet host configuration for step 4] ********************************',
u'changed: [192.168.24.8]']

and this causes the playbook to fail:
...
SG: non-zero return code
changed: [undercloud-0] => (item=Messaging)
msg: All items completed
to retry, use: --limit @/root/IR2/IR-SEALUSA-7/plugins/tripleo-upgrade/infrared_plugin/main.retry

PLAY RECAP ********************************************************************************************************************************************************************************************************
undercloud-0 : ok=17 changed=2 unreachable=0 failed=1

ERROR Playbook "/root/IR2/IR-SEALUSA-7/plugins/tripleo-upgrade/infrared_plugin/main.yml" failed!

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-zaqar-5.0.0-3.el7ost.noarch
python-zaqarclient-1.7.0-1.el7ost.noarch
puppet-zaqar-11.3.0-3.el7ost.noarch
openstack-tripleo-puppet-elements-7.0.1-1.el7ost.noarch
openstack-tripleo-common-containers-7.6.3-4.el7ost.noarch
python-tripleoclient-7.3.3-5.el7ost.noarch
puppet-tripleo-7.4.3-9.el7ost.noarch
openstack-tripleo-common-7.6.3-4.el7ost.noarch
openstack-tripleo-ui-7.4.3-4.el7ost.noarch
openstack-tripleo-validations-7.4.2-1.el7ost.noarch
openstack-tripleo-heat-templates-7.0.3-13.el7ost.noarch
openstack-tripleo-image-elements-7.0.1-1.el7ost.noarch

Steps to Reproduce:
-------------------
1. Run an update of a composable deployment (~15 nodes)
2. Unfortunately, this is not always reproducible

Actual results:
---------------
The update fails and has to be re-run

Expected results:
-----------------
Such events/tracebacks are handled and retried

Additional info:
----------------
Virtual setup: 3 controllers + 3 messaging + 3 database + 2 networker + 2 computes + 3 ceph
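
For illustration only, the "handled and retried" behaviour requested under Expected results could look roughly like the sketch below. This is not the change that shipped in python-tripleoclient-7.3.3-7.el7ost; `TransientServiceError` and `retry_on_transient` are hypothetical names, and the exception is a stand-in for however the client surfaces the 503 "Please try again in a few seconds" response from Zaqar.

```python
import time


class TransientServiceError(Exception):
    """Hypothetical stand-in for a 503 'try again later' error from the queue service."""


def retry_on_transient(func, attempts=5, initial_delay=1.0, backoff=2.0,
                       sleep=time.sleep):
    """Call func() and retry with exponential backoff on TransientServiceError.

    The last failure is re-raised once `attempts` calls have been made.
    `sleep` is injectable so the helper can be exercised without real waiting.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientServiceError:
            if attempt == attempts:
                # Out of retries: surface the error to the caller.
                raise
            sleep(delay)
            delay *= backoff
```

A caller would wrap only the operation that can fail transiently (here, creating the claim), so that unrelated errors such as QueueDoesNotExist still propagate immediately instead of being retried.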