.A health warning status is reported when no Ceph Managers or OSDs are in the storage cluster
In previous {storage-product} releases, the storage cluster health status was `HEALTH_OK` even though there were no Ceph Managers or OSDs in the storage cluster. With this release, this health status has changed: the cluster now reports a health warning if it is not set up with Ceph Managers, or if all the Ceph Managers go down. Because {storage-product} heavily relies on the Ceph Manager to deliver key features, it is not advisable to run a Ceph storage cluster without Ceph Managers or OSDs.
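With the warning in place, automation can gate on the overall health status alone rather than counting daemons itself. The following is a minimal sketch of such a gate, not a supported tool; it assumes the `ceph` CLI is on the PATH and that `ceph health --format json` returns an object with a top-level `status` field (adjust if your release formats the output differently).

[source,python]
----
#!/usr/bin/env python3
"""Gate an automation step on the overall cluster health.

Sketch only: assumes `ceph health --format json` returns a JSON object
with a top-level "status" field.
"""
import json
import subprocess
import sys


def cluster_health() -> str:
    out = subprocess.run(
        ["ceph", "health", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out).get("status", "UNKNOWN")


if __name__ == "__main__":
    status = cluster_health()
    print(f"cluster health: {status}")
    # On releases with this fix, a cluster without Ceph Managers or OSDs
    # reports a warning, so anything other than HEALTH_OK stops the run.
    sys.exit(0 if status == "HEALTH_OK" else 1)
----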
Description of problem: When a cluster has no OSDs or no managers, health is reported as HEALTH_OK
  cluster:
    id:     97ce8ce8-811c-46ce-9682-ce535d9859ab
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 11m)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
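On affected releases, where the summary above still says `HEALTH_OK`, the condition can be detected externally by inspecting the status output itself. The sketch below is illustrative only: it shells out to `ceph status --format json` and checks the manager and OSD sections; the field names it uses (`mgrmap.available`, `osdmap.num_osds`, possibly nested one level deeper) differ between releases and are assumptions here.

[source,python]
----
#!/usr/bin/env python3
"""Workaround check for releases affected by this bug.

Flags a cluster that reports HEALTH_OK while it has no active Ceph
Manager or no OSDs. The JSON field names below are assumptions and may
need adjusting for your release.
"""
import json
import subprocess


def ceph_status() -> dict:
    out = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def missing_services(status: dict) -> list:
    issues = []
    if not status.get("mgrmap", {}).get("available", False):
        issues.append("no active mgr")
    osdmap = status.get("osdmap", {})
    # Some releases nest the OSD counters one level deeper.
    osdmap = osdmap.get("osdmap", osdmap)
    if osdmap.get("num_osds", 0) == 0:
        issues.append("no OSDs")
    return issues


if __name__ == "__main__":
    status = ceph_status()
    reported = status.get("health", {}).get("status", "UNKNOWN")
    issues = missing_services(status)
    if issues and reported == "HEALTH_OK":
        print(f"cluster reports {reported}, but: {', '.join(issues)}")
    else:
        print(f"cluster health: {reported}")
----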
Version-Release number of selected component (if applicable): 14.2.X (any, including the latest 14.2.4)
How reproducible: all the time
Steps to Reproduce:
1. Deploy Ceph with no managers or OSDs
Actual results: report is HEALTH_OK
Expected results: report is HEALTH_WARN or HEALTH_ERR
Comment 1, RHEL Program Management, 2019-10-14 13:12:39 UTC
This was by design when ceph-mgr was created: the idea at the time was to avoid spurious warnings during cluster setup, and at that point ceph-mgr was not necessary for much functionality. Now ceph-mgr does much more. Currently the health status is only affected if a mgr was ever running; removing that condition, so that an error is raised after there has been no mgr for some time, would resolve this.
Moving to 4.1 since this is not a blocker for 4.0 (same behavior as 3.x).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231