Bug 1789822
| Summary: | Controller replacement breaks Swift config | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | David Rosenfeld <drosenfe> | |
| Component: | openstack-tripleo-heat-templates | Assignee: | Christian Schwede (cschwede) <cschwede> | |
| Status: | CLOSED ERRATA | QA Contact: | David Rosenfeld <drosenfe> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 16.1 (Train) | CC: | apetrich, aschultz, cschwede, gfidente, joflynn, jvisser, knoha, ltoscano, mabrams, mburns, mgarciac, mvalsecc, sasha, satmakur, slinaber, tkajinam | |
| Target Milestone: | zstream | Keywords: | Triaged | |
| Target Release: | 16.1 (Train on RHEL 8.2) | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-tripleo-heat-templates-11.3.2-1.20200905153422.e621f61.el8ost | Doc Type: | Known Issue | |
| Doc Text: |
Replacing an overcloud Controller node might cause the Swift rings to become inconsistent across nodes. This reduces the availability of the Object Storage service.

Workaround: Use SSH to log in to each previously existing Controller node, deploy the updated rings, and restart the Object Storage containers:
```
(undercloud) [stack@undercloud-0 ~]$ source stackrc
(undercloud) [stack@undercloud-0 ~]$ nova list
...
| 3fab687e-99c2-4e66-805f-3106fb41d868 | controller-1 | ACTIVE | - | Running | ctlplane=192.168.24.17 |
| a87276ea-8682-4f27-9426-6b272955b486 | controller-2 | ACTIVE | - | Running | ctlplane=192.168.24.38 |
| a000b156-9adc-4d37-8169-c1af7800788b | controller-3 | ACTIVE | - | Running | ctlplane=192.168.24.35 |
(undercloud) [stack@undercloud-0 ~]$ for ip in 192.168.24.17 192.168.24.38 192.168.24.35; do ssh $ip 'sudo podman restart swift_copy_rings ; sudo podman restart $(sudo podman ps -a --format="{{.Names}}" --filter="name=swift_*")'; done
```
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1793684 (view as bug list) | Environment: | ||
| Last Closed: | 2020-12-15 18:35:44 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1793684, 1794758 | |||
|
Description
David Rosenfeld
2020-01-10 14:07:35 UTC
I found the regression in Train/RHOSP16, opened an upstream bug [1], and proposed a patch to fix it [2].

[1] https://bugs.launchpad.net/tripleo/+bug/1892674
[2] https://review.opendev.org/#/c/747621/

This only applies to Stein and Train. I'm not sure if this is the same reason Takashi found on RHOSP13, but I will look into that next.

After debugging this further, it turns out that this is not a regression; it also affects OSP13, as Takashi noticed. I updated the Launchpad bug entry and the patch on Gerrit. This needs to be applied to our downstream releases as well.

Patch merged on master, proposed backports:

https://review.opendev.org/#/c/749883/ Train
https://review.opendev.org/#/c/749884/ Ussuri
https://review.opendev.org/#/c/749885/ Stein
https://review.opendev.org/#/c/749886/ Rocky
https://review.opendev.org/#/c/749887/ Queens

Yes, the controller job is passing with the current build. All the storage tempest tests that originally failed during the controller replacement job now pass. RHOS-16.1-RHEL-8-20201021.n.0 was used.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5413 |
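
On an affected deployment, it can help to confirm whether the ring files really differ between Controllers before and after applying the workaround from the Doc Text. The following is a minimal sketch and not part of the fix: `collect_ring_checksums` and `find_inconsistent_rings` are hypothetical helper names, and the ring directory `/etc/swift` is an assumption that may need adjusting for a containerized deployment.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and RING_DIR are assumptions, not part of the fix.

# Assumed location of the *.ring.gz files on each Controller; adjust as needed.
RING_DIR="${RING_DIR:-/etc/swift}"

# Print "<host> <md5> <ring file name>" for every ring file on the given hosts.
collect_ring_checksums() {
    local ip
    for ip in "$@"; do
        # -n keeps ssh from consuming the caller's stdin.
        ssh -n "$ip" "sudo md5sum ${RING_DIR}/*.ring.gz" |
            awk -v host="$ip" '{ n = split($2, parts, "/"); print host, $1, parts[n] }'
    done
}

# Read "<host> <md5> <ring file name>" lines on stdin, print every ring file
# that has more than one distinct checksum, and return 1 if any were found.
find_inconsistent_rings() {
    local mismatched
    mismatched=$(awk '
        {
            key = $3 SUBSEP $2
            if (!(key in seen)) { seen[key] = 1; distinct[$3]++ }
        }
        END { for (f in distinct) if (distinct[f] > 1) print f }
    ' | sort)
    if [ -z "$mismatched" ]; then
        return 0
    fi
    printf '%s\n' "$mismatched"
    return 1
}

# Example, run from the undercloud with the Controller IPs from `nova list`:
#   collect_ring_checksums 192.168.24.17 192.168.24.38 192.168.24.35 | find_inconsistent_rings
```

Splitting collection from comparison keeps the comparison step free of SSH, so it can be exercised with hand-written input before it is pointed at real nodes.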