Bug 2016936

Summary: [RFE] Support for staggered/control upgrade of ceph nodes by role, hosts/rack
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Veera Raghava Reddy <vereddy>
Component: Cephadm Assignee: Adam King <adking>
Status: CLOSED ERRATA QA Contact: Manasa <mgowri>
Severity: high Docs Contact: Akash Raj <akraj>
Priority: unspecified    
Version: 5.0 CC: adking, akraj, asriram, gjose, lithomas, mgowri, mhackett, tserlin, vashastr, vumrao
Target Milestone: --- Keywords: FutureFeature
Target Release: 5.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.2.8-14.el8cp Doc Type: Enhancement
Doc Text:
.{storage-product} 5.2 supports staggered upgrade
Starting with {storage-product} 5.2, you can selectively upgrade large Ceph clusters in `cephadm` in multiple smaller steps.

The `ceph orch upgrade start` command accepts the following parameters:

- `--daemon-types`
- `--hosts`
- `--services`
- `--limit`

These parameters selectively upgrade daemons that match the provided values.

NOTE: These parameters will be rejected if they would cause `cephadm` to upgrade daemons out of the supported order.

NOTE: These upgrade parameters will only be accepted if your active Ceph Manager daemon is on a 5.2 build. Upgrades to 5.2 from an earlier version will not support these parameters.
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-09 17:36:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2102272    

Description Veera Raghava Reddy 2021-10-25 08:32:16 UTC
Description of problem:
For rolling upgrades of large Ceph clusters with thousands of OSDs, we need an option to manage the upgrade in Cephadm in a controlled way: an option to upgrade by selecting specific roles (MON, MGR, OSD, or RGW nodes).
We also need an option to upgrade a particular set of hosts.
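
As an illustration of the request, the Doc Text of this bug lists four parameters for `ceph orch upgrade start`: `--daemon-types`, `--hosts`, `--services`, and `--limit`. The following is a minimal sketch, not part of cephadm, that assembles such a command line in Python. The helper name `build_upgrade_command`, the image name, the host names, and the positive-limit check are assumptions made for this example; the sketch assumes each list-valued parameter takes a comma-separated value.

```python
import shlex


def build_upgrade_command(image, daemon_types=None, hosts=None,
                          services=None, limit=None):
    """Return the argv list for a staggered `ceph orch upgrade start`.

    Hypothetical helper: it only assembles the command line and does not
    contact a cluster or check the supported upgrade order.
    """
    # Sanity check added for this sketch only.
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer")

    argv = ["ceph", "orch", "upgrade", "start", "--image", image]
    if daemon_types:
        argv += ["--daemon-types", ",".join(daemon_types)]
    if hosts:
        argv += ["--hosts", ",".join(hosts)]
    if services:
        argv += ["--services", ",".join(services)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Placeholder image and host names, not taken from this bug.
    cmd = build_upgrade_command(
        "registry.example.com/ceph/ceph:example",
        daemon_types=["mgr", "mon"],
        hosts=["host1", "host2"],
    )
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each value a separate argument, so it can be passed to `subprocess.run` without shell quoting concerns.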



Comment 1 Sebastian Wagner 2021-10-26 08:13:21 UTC
The upgrade works by upgrading one service type at a time, with the daemons of each service type upgraded in an arbitrary order. I don't think we should deviate from this order, but I think we could add a way to automatically pause the upgrade after one service type or after a number of daemons.
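
The behavior discussed here (a fixed order of service types, with a stop after one type or after a number of daemons) can be modeled roughly as follows. This is a simplified sketch and not cephadm's implementation: the `UPGRADE_ORDER` list is an assumption based on the upstream cephadm documentation and is truncated for brevity, and `select_batch` and the daemon records are hypothetical names used only for this example.

```python
# Assumed (and truncated) order of daemon types; not taken from this bug.
UPGRADE_ORDER = ["mgr", "mon", "crash", "osd", "mds", "rgw"]
RANK = {daemon_type: i for i, daemon_type in enumerate(UPGRADE_ORDER)}


def select_batch(daemons, daemon_types=None, limit=None):
    """Return the names of the daemons one staggered pass would upgrade.

    daemons: list of dicts with keys "name", "type", and "upgraded".
    Raises ValueError if the request would skip a daemon type that comes
    earlier in UPGRADE_ORDER and still has daemons left to upgrade.
    """
    wanted = set(daemon_types) if daemon_types else set(UPGRADE_ORDER)
    unknown = wanted - set(UPGRADE_ORDER)
    if unknown:
        raise ValueError(f"unknown daemon types: {sorted(unknown)}")
    for d in daemons:
        if d["type"] not in RANK:
            raise ValueError(f"unknown daemon type for {d['name']}")

    # Reject requests that would upgrade daemons out of order.
    last = max(RANK[t] for t in wanted)
    for d in daemons:
        if (not d["upgraded"] and d["type"] not in wanted
                and RANK[d["type"]] < last):
            raise ValueError(
                f"{d['name']} must be upgraded before the requested types")

    # Walk the types in order; stop after `limit` daemons if one is given.
    batch = [d["name"]
             for t in UPGRADE_ORDER if t in wanted
             for d in daemons
             if d["type"] == t and not d["upgraded"]]
    return batch if limit is None else batch[:limit]


if __name__ == "__main__":
    cluster = [
        {"name": "mgr.a", "type": "mgr", "upgraded": True},
        {"name": "mon.a", "type": "mon", "upgraded": False},
        {"name": "mon.b", "type": "mon", "upgraded": False},
        {"name": "osd.0", "type": "osd", "upgraded": False},
    ]
    print(select_batch(cluster, daemon_types=["mon"], limit=1))
```

In this model, asking for `osd` while the `mon` daemons are still on the old version raises an error, which mirrors the idea of keeping the existing order while letting the operator stop after a chosen type or count.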

Comment 2 Vikhyat Umrao 2021-10-29 15:24:10 UTC
(In reply to Sebastian Wagner from comment #1)
> The upgrade works by upgrading daemons in an arbitrary order each single
> service type one by one. I don't think we should deviate from this order,
> but I think that we add a way to automatically pause the upgrade after one
> service type or a number of daemons.

If I understand it correctly, the user has the flexibility to choose a `daemon-type`. When the user runs the upgrade for that `daemon-type`, the upgrade is serialized for that `daemon-type` and upgrades all daemons of that type in one go, and the user will not have any way to control it by failure domain, which could be a host or a rack.

Comment 4 Veera Raghava Reddy 2022-03-10 07:10:29 UTC
https://issues.redhat.com/browse/RHCEPH-3605

Comment 18 errata-xmlrpc 2022-08-09 17:36:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997