Bug 1437016
Summary: | tripleo client stuck in IN_PROGRESS in overcloud update run | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Martin Schuppert <mschuppe> | |
Component: | openstack-tripleo-common | Assignee: | Julie Pichon <jpichon> | |
Status: | CLOSED ERRATA | QA Contact: | Julie Pichon <jpichon> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 13.0 (Queens) | CC: | afariasa, aschultz, cpaquin, dbecker, hbrock, hjensas, jjoyce, jpichon, jslagle, mburns, mburrows, mschuppe, pneedle, rhel-osp-director-maint, slinaber, therve, yprokule | |
Target Milestone: | async | Keywords: | OtherQA, ZStream | |
Target Release: | 13.0 (Queens) | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | openstack-tripleo-common-5.4.1-6.el7ost | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1457208 | Environment: | |
Last Closed: | 2019-02-14 15:13:21 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1434509, 1457208, 1520109 |
Description
Martin Schuppert
2017-03-29 09:47:21 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

It looks like the main logic behind this lives in tripleo-common:

https://github.com/openstack/python-tripleoclient/blob/stable/newton/tripleoclient/v1/overcloud_update.py
https://github.com/openstack/tripleo-common/blob/stable/newton/tripleo_common/update.py
https://github.com/openstack/tripleo-common/blob/stable/newton/tripleo_common/_stack_update.py

Martin, thank you for the environment information. There are multiple people connected to it and a stack update in progress, so I assume it is used as part of other bugs as well?

Despite my best efforts I've been unable to reproduce the bug exactly so it's difficult for me to confirm if the patch upstream will fix this particular case. Is it possible to apply the patch in your lab or another test environment where the issue has been confirmed?

1st run:

* put stack in failed state as mentioned in description

    [stack@undercloud-0 ~]$ ./overcloud_update_plan_only.sh
    Removing the current plan files
    Uploading new plan files
    Started Mistral Workflow. Execution ID: ec001f0d-49b7-4afb-b3cf-e4d9a3a5f287
    Plan updated
    Deploying templates in the directory /tmp/tripleoclient-vj1ILn/tripleo-heat-templates
    Started Mistral Workflow. Execution ID: 4b506ad6-f6ae-4396-816d-d8fbe9f9c0b0
    Overcloud Endpoint: http://10.0.0.103:5000/v2.0
    Overcloud Deployed

    [stack@undercloud-0 ~]$ heat stack-list
    WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
    +--------------------------------------+------------+---------------+----------------------+----------------------+
    | id                                   | stack_name | stack_status  | creation_time        | updated_time         |
    +--------------------------------------+------------+---------------+----------------------+----------------------+
    | 0add0f72-8693-424b-bf28-06b11402340d | overcloud  | UPDATE_FAILED | 2017-03-18T23:18:30Z | 2017-03-31T11:10:57Z |
    +--------------------------------------+------------+---------------+----------------------+----------------------+

* applied https://review.openstack.org/#/c/451725/3/tripleo_common/_stack_update.py

    [stack@undercloud-0 ~]$ diff -u _stack_update.py _stack_update.py-fix
    --- _stack_update.py 2017-03-31 11:50:01.356143531 +0000
    +++ _stack_update.py-fix 2017-03-31 11:03:14.169127182 +0000
    @@ -160,9 +160,9 @@
                     state = 'on_breakpoint'
                 elif ev.resource_status_reason == hook_clear_reason:
                     state = 'in_progress'
    -            elif ev.resource_status == 'UPDATE_IN_PROGRESS':
    +            elif ev.resource_status in ('CREATE_IN_PROGRESS', 'UPDATE_IN_PROGRESS'):
                     state = 'in_progress'
    -            elif ev.resource_status == 'UPDATE_COMPLETE':
    +            elif ev.resource_status in ('CREATE_COMPLETE', 'UPDATE_COMPLETE'):
                     state = 'completed'
                 resources[state][res.physical_resource_id] = res

* update was successful:

    [stack@undercloud-0 ~]$ openstack overcloud update stack -i overcloud
    starting package update on stack overcloud
    WAITING
    not_started: [u'controller-1']
    on_breakpoint: [u'compute-0', u'controller-2', u'controller-0']
    Breakpoint reached, continue? Regexp or Enter=proceed (will clear 1d95309a-cc90-4e1a-b3ae-5168c5aef841), no=cancel update, C-c=quit interactive mode: compute-0
    IN_PROGRESS
    IN_PROGRESS
    WAITING
    completed: [u'compute-0']
    on_breakpoint: [u'controller-1', u'controller-2', u'controller-0']
    Breakpoint reached, continue? Regexp or Enter=proceed (will clear 1d95309a-cc90-4e1a-b3ae-5168c5aef841), no=cancel update, C-c=quit interactive mode: controller-0
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    WAITING
    completed: [u'compute-0', u'controller-0']
    on_breakpoint: [u'controller-1', u'controller-2']
    Breakpoint reached, continue? Regexp or Enter=proceed (will clear 8963c6f9-ac10-4937-adc7-62114739a845), no=cancel update, C-c=quit interactive mode:
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    WAITING
    completed: [u'controller-2', u'compute-0', u'controller-0']
    on_breakpoint: [u'controller-1']
    Breakpoint reached, continue? Regexp or Enter=proceed (will clear 0e4c9349-9a54-4371-8fc6-f4f0b9428744), no=cancel update, C-c=quit interactive mode:
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    ...
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    COMPLETE
    update finished with status COMPLETE

* a second test run was also successful

* in a 3rd run I reverted the patch and the update is stuck again:

    [stack@undercloud-0 ~]$ openstack overcloud update stack -i overcloud
    starting package update on stack overcloud
    WAITING
    not_started: [u'controller-0']
    on_breakpoint: [u'controller-1', u'controller-2', u'compute-0']
    Breakpoint reached, continue? Regexp or Enter=proceed (will clear c820338e-79c5-4f13-8a13-0646911d07a9), no=cancel update, C-c=quit interactive mode: compute-0
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    ...

Confirmed that the fix is included in the "Fixed in Version" rpm and completed an update successfully locally. A build containing this fix was also confirmed to resolve the problem in environments that displayed the issue, cf. comment 6.

    $ rpm -qa openstack-tripleo-common
    openstack-tripleo-common-5.4.1-6.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1242
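
For readers who want to see why the status check in the patched `_stack_update.py` matters, here is a minimal standalone sketch of the classification loop from the diff in this report. It is not the tripleo-common implementation: the `Event` tuple, the `classify` helper, the `accept_create` switch, the initial `not_started` state and the two hook reason strings are all hypothetical stand-ins made up for illustration. Only the `elif` chain mirrors the diff. The assumption is that a resource which is created rather than updated emits `CREATE_*` events, which the unpatched check ignores, so the resource never leaves `in_progress` once its breakpoint is cleared.

```python
from collections import namedtuple

# Hypothetical stand-in for a Heat event; only these two fields are used.
Event = namedtuple("Event", ["resource_status", "resource_status_reason"])

# Placeholder strings, NOT the real reason texts used by Heat or tripleo-common.
HOOK_REASON = "hook reached"
HOOK_CLEAR_REASON = "hook cleared"


def classify(events, accept_create=True):
    """Return the state of one resource after replaying its events in order.

    accept_create=False mimics the check before the patch,
    accept_create=True mimics the check after the patch.
    """
    in_progress = {"UPDATE_IN_PROGRESS"}
    complete = {"UPDATE_COMPLETE"}
    if accept_create:
        in_progress.add("CREATE_IN_PROGRESS")
        complete.add("CREATE_COMPLETE")

    state = "not_started"  # assumed initial state
    for ev in events:
        if ev.resource_status_reason == HOOK_REASON:
            state = "on_breakpoint"
        elif ev.resource_status_reason == HOOK_CLEAR_REASON:
            state = "in_progress"
        elif ev.resource_status in in_progress:
            state = "in_progress"
        elif ev.resource_status in complete:
            state = "completed"
    return state


# A resource that is created (not updated) after its breakpoint is cleared.
events = [
    Event("INIT_COMPLETE", HOOK_REASON),
    Event("INIT_COMPLETE", HOOK_CLEAR_REASON),
    Event("CREATE_IN_PROGRESS", "state changed"),
    Event("CREATE_COMPLETE", "state changed"),
]

print(classify(events, accept_create=False))  # in_progress
print(classify(events, accept_create=True))   # completed
```

With the unpatched check the sample resource stays `in_progress` after its final event, which is consistent with the endless `IN_PROGRESS` output in the third run. With the patched check it reaches `completed`.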