Bug 1267558
| Summary: | Breakpoints are not deleted after stack-update operation | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Jan Provaznik <jprovazn> |
| Component: | python-rdomanager-oscplugin | Assignee: | Jan Provaznik <jprovazn> |
| Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.0 (Kilo) | CC: | calfonso, dnavale, jprovazn, jslagle, jstransk, mburns, mcornea, ohochman, rhel-osp-director-maint, sasha, vcojot, zbitter |
| Target Milestone: | y2 | Keywords: | Triaged |
| Target Release: | 7.0 (Kilo) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-common-0.0.1.dev6-4.git49b57eb.el7ost python-rdomanager-oscplugin-0.0.10-12.el7ost | Doc Type: | Bug Fix |
| Doc Text: | Previously, breakpoints were not removed when an update operation failed. If a user ran the "openstack overcloud update" command and it failed, a subsequent stack-update command (e.g. "openstack overcloud deploy") could get stuck in the 'IN_PROGRESS' state waiting for the removal of breakpoints. With this update, all existing CLI commands explicitly remove any existing breakpoints when running a stack-update operation. As a result, stack-update operations no longer get stuck in the 'IN_PROGRESS' state while waiting for breakpoint removal. | ||
| Story Points: | --- | ||
| Clone Of: | Environment: | ||
| Last Closed: | 2015-12-21 16:49:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Jan Provaznik
2015-09-30 12:06:13 UTC
Just to clarify, the issue here is not that the breakpoints themselves remain set across operations (they don't). The breakpoints, which we use as a temporary setting that applies only to the current operation, are configured in the environment, and the environment is now maintained between operations instead of being re-sent each time.

Jan confirmed that the way we set the breakpoints now is to generate a snippet of JSON that gets merged into the environment file to send. One option to stop this from sticking around would be for all other commands (i.e. the ones that don't want breakpoints) to generate a similar snippet with *no* breakpoints set and merge that into the environment, so that it overrides any stored breakpoint configuration. This isn't a great long-term solution because it means that every command has to know about every other command's use of breakpoints. However, we agreed that this is probably the best short-term solution.

I was able to update my overcloud successfully with:
openstack overcloud update stack --templates -e <yaml> -i overcloud
Then, when I tried to re-run the overcloud deployment command (only without the yaml files), it got stuck.
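To illustrate the short-term option discussed above (merging a snippet with no breakpoints so that it overrides the stored one), here is a minimal sketch. It is not the actual plugin code: the helper names `deep_update` and `breakpoint_snippet`, the `ControllerCount` value, and the exact layout of the `resource_registry` keys are assumptions made for illustration only.

```python
import copy


def deep_update(base, override):
    """Recursively merge `override` into `base` in place and return `base`.

    Hypothetical helper: nested dicts are merged key by key, any other
    value in `override` replaces the value in `base`.
    """
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)
    return base


def breakpoint_snippet(hooks):
    """Build an environment fragment that sets hooks on UpdateDeployment.

    The key layout is an assumption about how the environment is structured.
    """
    return {
        "resource_registry": {
            "resources": {"*": {"*": {"UpdateDeployment": {"hooks": hooks}}}}
        }
    }


# Environment as it might be stored after a package update set breakpoints.
stored_env = {"parameters": {"ControllerCount": 3}}
deep_update(stored_env, breakpoint_snippet("pre-update"))

# A later command that does not want breakpoints merges an empty override.
deep_update(stored_env, breakpoint_snippet([]))

hooks = stored_env["resource_registry"]["resources"]["*"]["*"]["UpdateDeployment"]["hooks"]
print(hooks)
```

The point of the sketch is only that the override has to name the same keys as the stored snippet, which is why every command ends up needing to know about every other command's breakpoints.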
I see the following for a long time:
heat resource-list -n 5 overcloud|grep -v COMPLE

| resource_name | physical_resource_id | resource_type | resource_status | updated_time | parent_resource |
|---|---|---|---|---|---|
| Controller | deba9153-d96c-4fae-8061-eb2cbe5ce390 | OS::Heat::ResourceGroup | UPDATE_IN_PROGRESS | 2015-10-01T13:33:14Z | |
| 2 | 33504937-1746-434a-bb58-b76b8923bb81 | OS::TripleO::Controller | UPDATE_IN_PROGRESS | 2015-10-01T13:33:21Z | Controller |
| Compute | 1f8c1164-ddaf-4ff7-981a-e9c3c61c7094 | OS::Heat::ResourceGroup | UPDATE_IN_PROGRESS | 2015-10-01T13:33:21Z | |
| 0 | bd529704-fec6-4de5-857f-ceada7c21d78 | OS::TripleO::Compute | UPDATE_IN_PROGRESS | 2015-10-01T13:33:24Z | Compute |
| 0 | 8ad0cb1b-efbc-4cda-9cde-f7b7f932a24f | OS::TripleO::Controller | UPDATE_IN_PROGRESS | 2015-10-01T13:33:29Z | Controller |
| 1 | 3ab44c19-87f6-4a4e-8702-be08b222698b | OS::TripleO::Controller | UPDATE_IN_PROGRESS | 2015-10-01T13:33:40Z | Controller |

This is starting to look like a y1 blocker.

This bug will manifest itself on every operation that does a Heat stack update (e.g. a subsequent overcloud deploy, scaling up or down, removing a node) after a package update. The workaround for now is to manually clear the breakpoints (hooks) using the "heat hook-clear" command.
This will have to be repeated each time.

*** Bug 1268252 has been marked as a duplicate of this bug. ***

*** Bug 1279544 has been marked as a duplicate of this bug. ***

We can see that the breakpoints get cleared during the update process:

stack@instack:~>>> heat hook-poll -n5 overcloud

| resource_name | id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|---|
| UpdateDeployment | 183b4a66-4521-4c70-90b9-4733ce017b3d | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:44:21Z | overcloud-Controller-gm34wwhcni7u-0-sm4gdmxjfesg |
| UpdateDeployment | 409c6074-25d1-4901-97e8-689e81f61e62 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:58Z | overcloud-Controller-gm34wwhcni7u-1-zgqpydv2oawr |
| UpdateDeployment | c8ed8054-4ebd-40a5-9d62-b665f3a1b754 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:46Z | overcloud-CephStorage-n3ft7va6txum-2-zfayegy6nz5u |
| UpdateDeployment | 19fcc094-9b79-46de-8fca-2c2a39cf61cb | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:39Z | overcloud-CephStorage-n3ft7va6txum-1-dmkkixd5whud |
| UpdateDeployment | 0d0fa35e-6bdd-4e3f-81b0-43fdcb43bd66 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:49Z | overcloud-Compute-ia5kvmciy4x2-0-se7fq5l3alx5 |

stack@instack:~>>> heat hook-poll -n5 overcloud

| resource_name | id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|---|
| UpdateDeployment | 183b4a66-4521-4c70-90b9-4733ce017b3d | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:44:21Z | overcloud-Controller-gm34wwhcni7u-0-sm4gdmxjfesg |
| UpdateDeployment | 409c6074-25d1-4901-97e8-689e81f61e62 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:58Z | overcloud-Controller-gm34wwhcni7u-1-zgqpydv2oawr |
| UpdateDeployment | c8ed8054-4ebd-40a5-9d62-b665f3a1b754 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:46Z | overcloud-CephStorage-n3ft7va6txum-2-zfayegy6nz5u |
| UpdateDeployment | 0d0fa35e-6bdd-4e3f-81b0-43fdcb43bd66 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:49Z | overcloud-Compute-ia5kvmciy4x2-0-se7fq5l3alx5 |

stack@instack:~>>> heat hook-poll -n5 overcloud

| resource_name | id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|---|
| UpdateDeployment | 183b4a66-4521-4c70-90b9-4733ce017b3d | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:44:21Z | overcloud-Controller-gm34wwhcni7u-0-sm4gdmxjfesg |
| UpdateDeployment | c8ed8054-4ebd-40a5-9d62-b665f3a1b754 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:46Z | overcloud-CephStorage-n3ft7va6txum-2-zfayegy6nz5u |
| UpdateDeployment | 0d0fa35e-6bdd-4e3f-81b0-43fdcb43bd66 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:49Z | overcloud-Compute-ia5kvmciy4x2-0-se7fq5l3alx5 |

stack@instack:~>>> heat hook-poll -n5 overcloud

| resource_name | id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|---|
| UpdateDeployment | 183b4a66-4521-4c70-90b9-4733ce017b3d | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:44:21Z | overcloud-Controller-gm34wwhcni7u-0-sm4gdmxjfesg |
| UpdateDeployment | c8ed8054-4ebd-40a5-9d62-b665f3a1b754 | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:43:46Z | overcloud-CephStorage-n3ft7va6txum-2-zfayegy6nz5u |

stack@instack:~>>> heat hook-poll -n5 overcloud

| resource_name | id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|---|
| UpdateDeployment | 183b4a66-4521-4c70-90b9-4733ce017b3d | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-12-12T22:44:21Z | overcloud-Controller-gm34wwhcni7u-0-sm4gdmxjfesg |

I don't think the verification above is sufficient for this BZ. This BZ addresses the case where "openstack overcloud update" fails and leaves some breakpoints on the stack. If some *other* command is then executed on the stack in the failed state, it hangs on the breakpoints left over from the update command. So the right verification of this BZ should be something like this:

1) run "openstack overcloud update" and cause it to fail on e.g. the first node
2) check with "heat hook-poll -n5 overcloud" that some breakpoints were left on the stack
3) run "openstack overcloud deploy" (or "openstack overcloud node delete"). If this BZ is fixed, the breakpoints should be cleared when you run this command. If it is not fixed, the command will hang and you will see breakpoints still set on the stack with "heat hook-poll -n5 overcloud"

(In reply to Jan Provaznik from comment #16)
> I don't think the verification above is sufficient for this BZ. This BZ
> addresses the case where "openstack overcloud update" fails and leaves some
> breakpoints on the stack. If some *other* command is then executed on the
> stack in the failed state, it hangs on the breakpoints left over from the
> update command. So the right verification of this BZ should be something
> like this:
> 1) run "openstack overcloud update" and cause it to fail on e.g. the first node
> 2) check with "heat hook-poll -n5 overcloud" that some breakpoints were left
> on the stack
> 3) run "openstack overcloud deploy" (or "openstack overcloud node delete").
> If this BZ is fixed, the breakpoints should be cleared when you run this
> command. If it is not fixed, the command will hang and you will see
> breakpoints still set on the stack with "heat hook-poll -n5 overcloud"

1. I triggered an update that failed on the first node of the stack:
overcloud | UPDATE_FAILED

2. I couldn't check the breakpoint status because at this point:
Stack status UPDATE_FAILED not IN_PROGRESS

3. I reran the initial deploy command and, when checking the breakpoints, the list is empty:

stack@instack:~>>> heat hook-poll -n5 overcloud

| id | resource_status_reason | resource_status | event_time | stack_name |
|---|---|---|---|---|

But the deploy command fails:

stack@instack:~>>> time bash deploy.command
Deploying templates in the directory /home/stack/templates/my-overcloud
Stack failed with status: resources.Compute: Stack overcloud-Compute-fpeywqgic2le already has an action (UPDATE) in progress.
ERROR: openstack Heat Stack update failed.

I'm not sure whether this relates to the initial report of this bug or is another issue (there are some heat stacks which are currently UPDATE_IN_PROGRESS). Can you confirm that the steps I did were enough for the verification of this ticket? Thanks.

Hi Marius, I agree that the issue you hit is unrelated to this BZ, and the fact that "heat hook-poll -n5 overcloud" returns an empty list after re-running the deploy command proves that this issue is fixed, thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2015:2650