Bug 1976820
Summary: | [cephadm] 5.0 - Stopping mgr service using orch command is making cluster inaccessible - We need warning message and --force option for the "stop" service command | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Preethi <pnataraj> |
Component: | Cephadm | Assignee: | Adam King <adking> |
Status: | CLOSED ERRATA | QA Contact: | Sunil Kumar Nagaraju <sunnagar> |
Severity: | urgent | Docs Contact: | Mary Frances Hull <mhull> |
Priority: | urgent | ||
Version: | 5.0 | CC: | adking, agunn, asakthiv, gsitlani, mhackett, sewagner, sunnagar, tserlin, vereddy, vumrao |
Target Milestone: | --- | ||
Target Release: | 5.0z1 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | ceph-16.2.0-125.el8cp | Doc Type: | Bug Fix |
Doc Text: |
.Users are no longer able to remove the Ceph Manager service using `cephadm`
Previously, if a user ran a `ceph orch rm mgr` command, it would cause `cephadm` to remove all the Ceph Manager daemons in the storage cluster, making the storage cluster inaccessible.
With this release, attempting to remove the Ceph Manager, a Ceph Monitor, or a Ceph OSD service using the `ceph orch rm _SERVICE_NAME_` command displays a warning message stating that it is not safe to remove these services, and results in no actions taken.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2021-11-02 16:38:26 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1959686 |
Description
Preethi
2021-06-28 10:25:10 UTC
Moving this to 5.0z1.

The cluster can be recovered by following the steps below. The issue is currently seen only for MON and MGR. For other services such as OSDs and RGW, the cluster remains accessible, but all daemons of the stopped service will be down. Hence, we need warning messages for all service types when `ceph orch stop <service name>` is used.

How to recover:
1. Go to /var/lib/ceph/fsid and note the mon service name.
2. Run systemctl start ceph-<fsid>@monservicename, using the mon service name as it appears under /var/lib/ceph. Repeat this on all mon/mgr nodes.
3. Run ps -ef | grep ceph-mon (or ceph-mgr) and verify that a process ID was created for ceph-mon/ceph-mgr.
Ex: systemctl start ceph-f64f341c-655d-11eb-8778-fa163e914bcc

Hi Mike, can you review from a support perspective whether the recovery procedure is good enough to defer this BZ to 5.0z1?

The PR is merged upstream, but not yet in z1.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4105
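
The recovery steps described in this report can be sketched as a small shell helper. This is a minimal sketch, not a supported procedure: the function name `list_mon_mgr_units` is hypothetical, and it assumes the cephadm layout in which each daemon has a directory named <type>.<id> under /var/lib/ceph/<fsid> and a matching systemd unit named ceph-<fsid>@<type>.<id>.service. It only prints unit names; nothing is started until the names are passed to systemctl.

```shell
#!/bin/sh
# Hypothetical helper: print the systemd unit names for the mon and mgr
# daemons found under BASE/FSID (BASE would normally be /var/lib/ceph).
# Assumption: daemon directories are named mon.<id> and mgr.<id>.
list_mon_mgr_units() {
    base=$1
    fsid=$2
    for dir in "$base/$fsid"/mon.* "$base/$fsid"/mgr.*; do
        # Skip unmatched glob patterns and anything that is not a directory.
        [ -d "$dir" ] || continue
        printf 'ceph-%s@%s.service\n' "$fsid" "$(basename "$dir")"
    done
}
```

On each mon/mgr node, the printed names could then be started with a loop such as `for unit in $(list_mon_mgr_units /var/lib/ceph "$fsid"); do systemctl start "$unit"; done`, followed by the `ps -ef | grep ceph-mon` check from the description. Printing first, rather than starting directly, makes it possible to review the unit names before touching any service.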