Description of problem:
During the ceph upgrade, all the OSDs went down on 4 nodes.

Version-Release number of selected component (if applicable):
less installed-rpms | grep tripleo
ansible-tripleo-ipsec-8.1.1-0.20190513184007.7eb892c.el7ost.noarch    Wed Oct 7 07:14:55 2020
openstack-tripleo-common-8.7.1-20.el7ost.noarch                       Wed Oct 7 07:16:47 2020
openstack-tripleo-common-containers-8.7.1-20.el7ost.noarch            Wed Oct 7 07:14:32 2020
openstack-tripleo-heat-templates-8.4.1-58.1.el7ost.noarch             Wed Oct 7 07:16:48 2020
openstack-tripleo-image-elements-8.0.3-1.el7ost.noarch                Wed Oct 7 07:14:43 2020
openstack-tripleo-puppet-elements-8.1.1-2.el7ost.noarch               Wed Oct 7 07:14:35 2020
openstack-tripleo-ui-8.3.2-3.el7ost.noarch                            Wed Oct 7 07:22:42 2020
openstack-tripleo-validations-8.5.0-4.el7ost.noarch                   Wed Oct 7 07:14:55 2020
puppet-tripleo-8.5.1-14.el7ost.noarch                                 Wed Oct 7 07:14:30 2020
python-tripleoclient-9.3.1-7.el7ost.noarch                            Wed Oct 7 07:16:52 2020

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
ceph upgrade failed

Expected results:
Ceph upgrade should not fail

Additional info:
After the reboot of the storage nodes, the OSDs came up
Hi @ykarel, Sofer from DFG:Upgrades here.

If I understand this correctly, during the update_tasks of ceph-osd we had the docker restart that happens because [1] was true, i.e. docker needed to be updated. This, in turn, caused the ceph OSD services to restart and the cluster started to rebalance. This takes time and eventually led to all the OSDs being down as we progressed through the update[2].

If we add those commands[3] during the update (in the file mentioned in [3], but under "update_tasks") at step_1, so that they run before the docker restart in step_2, then we would avoid this kind of issue, right? I wonder whether we would need to check for a docker update, or whether we could just do that all the time, as those commands should be harmless during that process, provided we put the flags back at step_4, for instance.

WDYT?

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/puppet/services/docker.yaml#L161-L162
[2] I'm calling "update" the command "openstack overcloud upgrade run --nodes CephStorage"
[3] https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/puppet/services/ceph-osd.yaml#L92-L99
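For illustration only, here is a minimal sketch of what those flag commands could look like as update_tasks entries, assuming the block referenced in [3] sets and unsets OSD flags such as noout/norebalance through the mon container (the flag names, the ceph-mon-<hostname> container name and the step numbers are my assumptions, not copied from the linked file):

# Hypothetical update_tasks fragment; "step" is the usual tripleo update step variable.
- name: Set OSD flags before docker is updated (step 1)
  when: step|int == 1
  command: "docker exec ceph-mon-{{ ansible_hostname }} ceph osd set {{ item }}"
  with_items:
    - noout
    - norebalance

- name: Unset OSD flags once docker is back up (step 4)
  when: step|int == 4
  command: "docker exec ceph-mon-{{ ansible_hostname }} ceph osd unset {{ item }}"
  with_items:
    - noout
    - norebalance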
(In reply to Sofer Athlan-Guyot from comment #7)
> Hi @ykarel, Sofer from DFG:Upgrades,
>
> If I get this correctly, during the update_tasks of the ceph-osd we had the
> docker restart that happens because [1] was true, ie docker needed to be
> updated.
>

Actually, not a restart, but stop docker, upgrade docker, start docker. If it had been a restart, then systemd would have restarted the ceph OSDs as part of the docker restart; that does not happen when docker is stopped and started separately.

> This, in turn, caused the ceph osd service to restart and started to
> rebalance. This takes time and eventually led to all OSD down as we
> progressed into the update[2].
>

No, not a restart: it stops all the ceph OSDs (the ceph OSD daemons have "Requires: docker.service", added as part of bz 1846830), so when docker is stopped all the OSDs stop too. docker then gets started as part of [1], leaving the OSDs stopped. Since the OSDs are down, a rebalance of the PGs is triggered, and as the OSDs go down one by one, ceph ends up in an unhealthy state that will not recover until the OSDs are up again.

As part of "ceph-upgrade run", the OSDs get started/restarted[2][3] on one Ceph storage node at a time (it runs serially), and because the PGs are not in the "active+clean" state, the "waiting for clean pgs..." task[4] fails and aborts the upgrade. To clear this, the OSDs need to be started again and the rebalance allowed to complete so that the PGs return to "active+clean".

The workaround avoids rebalancing, leaving the PGs in active+clean, and lets "ceph-upgrade run" succeed, as the "waiting for clean pgs..." task will not fail and the upgrade continues across all the storage nodes. Also, it has nothing to do with timing: the OSDs are stopped as part of "openstack overcloud upgrade run --nodes CephStorage" and are only started again as part of "openstack overcloud ceph-upgrade run".

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/puppet/services/docker.yaml#L188-L195
[2] https://github.com/ceph/ceph-ansible/blob/v3.2.52/roles/ceph-osd/tasks/start_osds.yml#L117-L123
[3] https://github.com/ceph/ceph-ansible/blob/v3.2.52/infrastructure-playbooks/rolling_update.yml#L372-L378
[4] https://github.com/ceph/ceph-ansible/blob/v3.2.52/infrastructure-playbooks/rolling_update.yml#L410-L421

> If we add those commands[3] during the update, (in the file mentioned in [3]
> but for the "update_tasks") and in step_1 so that it happens before docker
> restart in step_2,
> then we would avoid this kind of issue, right ?
>

No, just that shouldn't help much, as the set and unset of the flags would be done without the OSDs being started.

> I wonder if we would need to check for docker update or if we could just do
> that all the time, as those command should be harmless during that process,
> provided we put the
> flag back in step_4 for instance.
>
> WDYT?

With respect to the fix, I think it needs to be done as part of the docker upgrade step, as that's the place where the OSDs get stopped but not started again. Something like: detect all the ceph OSD services that were in the started state, and if docker gets upgraded, start all of those OSDs again once docker is back up. The flags can be set before docker stops and unset after the OSDs are started. Someone from ceph should confirm/comment on this theory and suggest the best way to handle starts/restarts of the ceph services.
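To make the recovery path above concrete (start the OSDs again and wait for the PGs to return to active+clean before re-running the upgrade), here is a minimal Ansible sketch; the ceph-osd@* unit glob, the CephStorage/Controller group names and the ceph-mon-<hostname> container name are assumptions about a typical containerized queens deployment, not the exact workaround that was shared:

# Hypothetical recovery sketch, not the documented workaround.
- hosts: CephStorage
  become: true
  tasks:
    - name: Find the ceph-osd units present on this node
      shell: systemctl list-units --all --no-legend 'ceph-osd@*.service' | grep -o 'ceph-osd@[^ ]*\.service' || true
      register: osd_units
      changed_when: false

    - name: Make sure every ceph-osd unit is started
      systemd:
        name: "{{ item }}"
        state: started
      with_items: "{{ osd_units.stdout_lines }}"

- hosts: Controller[0]
  become: true
  tasks:
    - name: Wait for the cluster to report HEALTH_OK again
      command: "docker exec ceph-mon-{{ ansible_hostname }} ceph health"
      register: ceph_health
      until: "'HEALTH_OK' in ceph_health.stdout"
      retries: 60
      delay: 30
      changed_when: false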
This also applies to the other ceph services (mon/mgr, etc.), so those need to be fixed as well: I see the customer faced issues during the monitor upgrade too (a monitor went down and a manual restart was done). I didn't dig into it initially, as I focused only on the OSDs going down for the RCA; my bad, I should have considered the other services as well instead of just the OSDs that were asked about in the RCA.

As part of the OSDs-down RCA we shared a workaround for the upgrade, but not for the other ceph services (mon/mgr), which will also hit the issue (service stopped and never started again) due to the docker stop/start. So it would be good to update the customer on this, as they are planning more upgrade activities on other environments.

>
> [1]
> https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/
> puppet/services/docker.yaml#L161-L162
> [2] I'm calling "update" that command "openstack overcloud upgrade run
> --nodes CephStorage"
> [3]
> https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/
> puppet/services/ceph-osd.yaml#L92-L99
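One way to confirm which ceph units on a node are tied to docker in this way is to look for the Requires=docker.service line in their unit files; a small sketch, assuming the unit files live under /etc/systemd/system and follow the ceph-*.service naming used by ceph-ansible (both are assumptions here):

# Hypothetical check to list ceph unit files that hard-require docker.service.
- hosts: overcloud
  become: true
  tasks:
    - name: Find ceph unit files that declare Requires=docker.service
      shell: grep -l 'Requires=docker.service' /etc/systemd/system/ceph-*.service || true
      register: docker_bound_units
      changed_when: false

    - name: Show the affected unit files
      debug:
        var: docker_bound_units.stdout_lines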
Hi,

John, thanks for clearing up the sequential issue; it must have been one particular env on my side.

Yatin, thanks for the clarification, I think I get the full picture now. You are right then: the problem is not specific to the ceph-osd container, but applies to any docker container that is stopped and not restarted when docker is stopped. It seems to affect all ceph containers, but maybe there are others.

The solution would then be (a sketch follows below):
1. get the list of services that will be stopped if docker is stopped (the dependent services);
2. stop/start docker;
3. restart each dependent service.

Relative to the review, that definitely means this has to go into the docker service file. The first point may prove complicated as, in general, "calculated" actions are fragile. So maybe we go with a list of containers to restart.
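A minimal sketch of what those three steps might look like in Ansible; the reverse-dependency lookup via systemctl and the yum-based docker update are illustrative assumptions and are not taken from the actual review:

# Hypothetical list / update / restart sequence; docker.service itself may show
# up in the list, which is harmless since starting a running unit is a no-op.
- hosts: overcloud
  become: true
  tasks:
    - name: Record the active units that depend on docker.service
      shell: |
        for unit in $(systemctl list-dependencies --reverse --plain docker.service); do
          systemctl is-active --quiet "$unit" && echo "$unit"
        done
        true
      register: docker_dependents
      changed_when: false

    - name: Update docker (stopping it also stops the dependent units)
      yum:
        name: docker
        state: latest

    - name: Make sure docker itself is running again
      systemd:
        name: docker
        state: started

    - name: Start the units that were active before the update
      systemd:
        name: "{{ item }}"
        state: started
      with_items: "{{ docker_dependents.stdout_lines }}"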
(In reply to Yatin Karel from comment #10)
> With respect to the fix, I think it needs to be done as part of the docker
> upgrade step, as that's the place where the OSDs get stopped but not
> started again.

Thanks Sofer for this patch, which is run after the docker update:
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/769393/2/puppet/services/docker.yaml

> Something like: detect all the ceph OSD services that were in the started
> state, and if docker gets upgraded, start all of those OSDs again once
> docker is back up. The flags can be set before docker stops and unset after
> the OSDs are started. Someone from ceph should confirm/comment on this
> theory and suggest the best way to handle starts/restarts of the ceph
> services.
>
> This also applies to the other ceph services (mon/mgr, etc.), so those need
> to be fixed as well: I see the customer faced issues during the monitor
> upgrade too (a monitor went down and a manual restart was done). I didn't
> dig into it initially, as I focused only on the OSDs going down for the
> RCA; my bad, I should have considered the other services as well instead of
> just the OSDs that were asked about in the RCA.
>
> As part of the OSDs-down RCA we shared a workaround for the upgrade, but
> not for the other ceph services (mon/mgr), which will also hit the issue
> (service stopped and never started again) due to the docker stop/start. So
> it would be good to update the customer on this, as they are planning more
> upgrade activities on other environments.

Updated https://access.redhat.com/solutions/5679791 accordingly to cover all ceph services.
*** Bug 1911620 has been marked as a duplicate of this bug. ***
Hi, I started the downport of the patch. By the way, thanks Bogdan, we ended up using that command to get a nice list of services.
*** Bug 1926821 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 13.0 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0932