Bug 1631848
Summary: | OSP 12->13 upgrade - ceph OSD containers are killed during removal of ceph-osd rpm | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Matt Flusche <mflusche> |
Component: | openstack-tripleo-common | Assignee: | Giulio Fidente <gfidente> |
Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 13.0 (Queens) | CC: | gabrioux, gfidente, johfulto, lmarsh, mburns, mcornea, slinaber, yprokule |
Target Milestone: | z3 | Keywords: | Triaged, ZStream |
Target Release: | 13.0 (Queens) | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | openstack-tripleo-common-8.6.3-16.el7ost | Doc Type: | Bug Fix |
Doc Text: | When upgrading from Red Hat OpenStack Platform 12 to 13, the ceph-osd package was removed. Removing the package stopped the running OSDs even though they were running in containers and no longer required the package. This release removes the playbook step that removed the package during the upgrade, so Ceph OSDs are no longer unintentionally stopped during the upgrade. | | |
Story Points: | --- | | |
Clone Of: | | Environment: | |
Last Closed: | 2018-11-13 22:28:50 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Matt Flusche
2018-09-21 17:32:08 UTC
I've suggested manually removing the ceph-osd package prior to the upgrade as a work-around:

rpm -e --noscripts ceph-osd

Giulio Fidente:

(In reply to Matt Flusche from comment #3)
> I've suggested manually removing the ceph-osd package prior to the upgrade
> as a work-around.
>
> rpm -e --noscripts ceph-osd

Hi Matt, thanks for reporting this bug! Can you check whether re-enabling and restarting the systemd units brings the OSDs back into a working state?

I don't think we can remove the package before upgrading, because during FFU the OSDs are not migrated into containers until the ceph-ansible run has finished, so removing the locally installed package would hit the running cluster anyway.

I believe we can start by changing the ceph-ansible workflow so that it does not remove the package from the OSD nodes at all, and then see where and how this is best resolved; it looks like something to address in either ceph-ansible or the ceph RPMs' uninstall scripts.

Matt Flusche:

(In reply to Giulio Fidente from comment #4)
> Thanks for reporting this bug! Can you check whether re-enabling and
> restarting the systemd units brings the OSDs back into a working state?

Yes, restarting the systemd units recovered the OSDs. Also, the manual work-around of removing the ceph-osd package (rpm -e --noscripts ceph-osd) prior to the upgrade worked successfully in the production environment.

Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3587
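For reference, a minimal sketch of the pre-upgrade work-around from comment #3 as it might be run on each OSD node. It assumes a containerized Ceph deployment managed by docker; the container-name filter is illustrative and may differ in your environment.

```bash
# Confirm the OSDs on this node are containerized before touching the package
# (container names are assumed to contain "ceph-osd"; adjust the filter if needed).
docker ps --format '{{.Names}}' | grep ceph-osd

# Remove the now-unneeded ceph-osd RPM without running its uninstall scriptlets,
# which would otherwise stop the running OSDs.
rpm -e --noscripts ceph-osd

# Verify the OSD containers are still running afterwards.
docker ps --format '{{.Names}}' | grep ceph-osd
```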
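If the OSDs have already been stopped by the package removal, a rough recovery sketch along the lines of comment #4 and the follow-up, assuming the OSDs are driven by ceph-osd@<id> systemd template units (the instance name "0" below is a placeholder; substitute the ids used on your OSD nodes):

```bash
# List the OSD units on this node and their current state.
systemctl list-units 'ceph-osd@*' --all

# Re-enable and restart each unit that the RPM scriptlets disabled/stopped;
# repeat for every OSD instance on the node.
systemctl enable ceph-osd@0
systemctl restart ceph-osd@0

# From a monitor node, confirm the OSDs rejoined the cluster.
ceph osd tree
```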
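To see why a plain package removal kills the OSDs (and what --noscripts skips), the RPM's uninstall scriptlets can be inspected; the exact contents vary between ceph builds.

```bash
# Print the package scriptlets; the preun/postun sections are what stop and
# disable the ceph-osd systemd units when the package is removed normally.
rpm -q --scripts ceph-osd
```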