Bug 1988773
| Field | Value |
|---|---|
| Summary | [RFE] Provide warning when the 'require-osd-release' flag does not match current release. |
| Product | [Red Hat Storage] Red Hat Ceph Storage |
| Component | RADOS |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Version | 4.2 |
| Target Milestone | --- |
| Target Release | 5.2 |
| Hardware | All |
| OS | Linux |
| Whiteboard | |
| Keywords | FutureFeature |
| Reporter | Michael J. Kidd <linuxkidd> |
| Assignee | Sridhar Seshasayee <sseshasa> |
| QA Contact | Tintu Mathew <tmathew> |
| Docs Contact | Akash Raj <akraj> |
| CC | akraj, akupczyk, bhubbard, ceph-eng-bugs, mmuench, ngangadh, nojha, pasik, pdhiran, rzarzyns, skanta, sseshasa, tserlin, vereddy, vumrao |
| Fixed In Version | ceph-16.2.8-2.el8cp |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | |
| Clones | 2033078 (view as bug list) |
| Environment | |
| Last Closed | 2022-08-09 17:35:53 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 2033078, 2126050 |

Doc Text:

.Ceph cluster issues a health warning if the `require-osd-release` flag is not set to the appropriate release after a cluster upgrade.
Previously, the logic that detects a `require-osd-release` flag mismatch after an upgrade was inadvertently removed during a code refactoring effort. Because the warning was no longer raised in the `ceph -s` output after an upgrade, changes made to the cluster without setting the flag to the appropriate release resulted in issues such as placement groups (PGs) stuck in certain states, excessive Ceph process memory consumption, and slow requests, among others.
With this fix, the Ceph cluster issues a health warning if the `require-osd-release` flag is not set to the appropriate release after a cluster upgrade.
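As a quick reference for the behavior described in the Doc Text, the following is a minimal post-upgrade check an operator can run. The `ceph osd require-osd-release` command is the standard upstream command for updating the flag once all OSDs run the new release; it is shown here as general guidance and is not taken from this bug report's output.

# Check which release the daemons are running and what the flag is set to:
ceph versions
ceph osd dump | grep require_osd_release

# After all OSDs run the new release, set the flag to match (standard upstream
# command, shown as a reference; adjust the release name to your target release):
ceph osd require-osd-release pacific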
Description: Michael J. Kidd, 2021-08-01 01:50:18 UTC
This issue was verified while upgrading from 4.3 [ceph version 14.2.22-110.el8cp (2e0d97dbe192cca7419bbf3f8ee6b7abb42965c4) nautilus (stable)] to 5.2 [ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)].
While the upgrade was in progress and the OSDs had already been upgraded to pacific, the warning message appeared in `ceph status`.
[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph versions
{
    "mon": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 14.2.22-110.el8cp (2e0d97dbe192cca7419bbf3f8ee6b7abb42965c4) nautilus (stable)": 2
    },
    "rgw-nfs": {
        "ceph version 14.2.22-110.el8cp (2e0d97dbe192cca7419bbf3f8ee6b7abb42965c4) nautilus (stable)": 1
    },
    "overall": {
        "ceph version 14.2.22-110.el8cp (2e0d97dbe192cca7419bbf3f8ee6b7abb42965c4) nautilus (stable)": 3,
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 19
    }
}
[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph osd dump | grep require_osd_release
require_osd_release nautilus
[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph status
  cluster:
    id:     85dba4a0-4e42-403f-b180-96bc8dab5f64
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            all OSDs are running pacific or later but require_osd_release < pacific
            2 pools have too few placement groups

  services:
    mon:     3 daemons, quorum ceph-relosd-q0ejl3-node3,ceph-relosd-q0ejl3-node2,ceph-relosd-q0ejl3-node1-installer (age 10m)
    mgr:     ceph-relosd-q0ejl3-node1-installer(active, since 7m), standbys: ceph-relosd-q0ejl3-node2, ceph-relosd-q0ejl3-node3
    mds:     1/1 daemons up
    osd:     12 osds: 12 up (since 2m), 12 in (since 3d)
    rgw:     4 daemons active (2 hosts, 1 zones)
    rgw-nfs: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 153 pgs
    objects: 251 objects, 15 KiB
    usage:   498 MiB used, 239 GiB / 240 GiB avail
    pgs:     153 active+clean

  io:
    client:  153 KiB/s rd, 0 B/s wr, 152 op/s rd, 88 op/s wr

[root@ceph-relosd-q0ejl3-node1-installer cephuser]#
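At this point in the upgrade, the mismatch shows up only as a HEALTH_WARN summary line. If more detail is wanted, `ceph health detail` (a standard Ceph command; its output was not captured in this report) lists each active health check with its full message:

ceph health detail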
By the time the cluster was fully upgraded to 5.2, the warning had disappeared and the require_osd_release flag was set to pacific.
[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph osd dump | grep require_osd_release
require_osd_release pacific
[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph status
  cluster:
    id:     85dba4a0-4e42-403f-b180-96bc8dab5f64
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            1 pools have too few placement groups

  services:
    mon:     3 daemons, quorum ceph-relosd-q0ejl3-node3,ceph-relosd-q0ejl3-node2,ceph-relosd-q0ejl3-node1-installer (age 15m)
    mgr:     ceph-relosd-q0ejl3-node1-installer(active, since 13m), standbys: ceph-relosd-q0ejl3-node2, ceph-relosd-q0ejl3-node3
    mds:     1/1 daemons up
    osd:     12 osds: 12 up (since 7m), 12 in (since 3d)
    rgw:     4 daemons active (2 hosts, 1 zones)
    rgw-nfs: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 177 pgs
    objects: 251 objects, 15 KiB
    usage:   528 MiB used, 239 GiB / 240 GiB avail
    pgs:     177 active+clean

  io:
    client:  2.5 KiB/s rd, 2 op/s rd, 0 op/s wr

[root@ceph-relosd-q0ejl3-node1-installer cephuser]# ceph versions
{
    "mon": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 4
    },
    "rgw-nfs": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.8-65.el8cp (79f0367338897c8c6d9805eb8c9ad24af0dcd9c7) pacific (stable)": 24
    }
}
[root@ceph-relosd-q0ejl3-node1-installer cephuser]#
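In this verification run the orchestrated upgrade set the flag automatically once all daemons were on pacific. For the scenario this RFE targets, where the flag is left behind after an upgrade, a minimal manual remediation sketch follows; the `ceph osd require-osd-release` command is the standard upstream command and is included as general guidance rather than output from this report:

# Set the flag to the release the OSDs are now running, then confirm the
# health warning clears (adjust the release name to your target release):
ceph osd require-osd-release pacific
ceph osd dump | grep require_osd_release
ceph status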
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997