Bug 2111224 - [RFE] ceph orch upgrade should set noout, nodeep-scrub, and noscrub and unset when the upgrade will complete
Summary: [RFE] ceph orch upgrade should set noout, nodeep-scrub, and noscrub and unset...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 9.1
Assignee: Adam King
QA Contact: Manasa
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-26 19:06 UTC by Vikhyat Umrao
Modified: 2025-08-29 07:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-08-29 07:40:40 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 56670 0 None None None 2022-07-26 19:06:48 UTC
Red Hat Issue Tracker RHCEPH-4934 0 None None None 2022-07-26 19:07:31 UTC

Description Vikhyat Umrao 2022-07-26 19:06:49 UTC
Description of problem:
[RFE] ceph orch upgrade should set noout, nodeep-scrub, and noscrub, and unset them when the upgrade completes

Version-Release number of selected component (if applicable):
RHCS 5 and above

Upstream tracker - https://tracker.ceph.com/issues/56670

- ceph-ansible used to do this during upgrades
- This is a matter of feature parity between ceph-ansible and cephadm

- This feature can be designed as optional, with a default of True, so that users/admins who do not want it can set it to False.

Benefits:

1. Less load from scrubbing during the upgrade when we expect to have recovery in the cluster
2. If an OSD takes longer than usual to come back up after a restart, because of a slow boot or some other issue [1], noout keeps it from being marked out

[1] For example, the PG dups issue (https://tracker.ceph.com/issues/53729): with 50M dups, an NVMe OSD takes approximately 7 to 8 minutes to boot and a hybrid HDD OSD approximately 12 to 15 minutes.
If an OSD stays down for more than 10 minutes, the Monitor marks it out, which triggers backfill/recovery in the cluster while the upgrade is running, and we do not want that.
There can be multiple such examples, hence running the upgrade with the following flags is recommended:

noscrub
nodeep-scrub
noout
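
Until cephadm handles this itself, the requested behaviour can be approximated by hand. The following is a minimal sketch, not how cephadm does or would implement it: it wraps the existing `ceph osd set` / `ceph osd unset` and `ceph orch upgrade start` / `ceph orch upgrade status` commands. The helper names (`run_ceph`, `upgrade_flags`, `run_upgrade`) are hypothetical, and the sketch assumes that `ceph orch upgrade status` prints JSON containing an `in_progress` field.

```python
import json
import subprocess
import time
from contextlib import contextmanager

# Flags requested in this RFE.
UPGRADE_FLAGS = ("noout", "noscrub", "nodeep-scrub")


def run_ceph(*args):
    """Run a ceph CLI command and return its stdout; raises on a non-zero exit."""
    result = subprocess.run(
        ["ceph", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


@contextmanager
def upgrade_flags(flags=UPGRADE_FLAGS, run=run_ceph):
    """Set the OSD flags on entry and unset them on exit, even after an error."""
    applied = []
    try:
        for flag in flags:
            run("osd", "set", flag)
            applied.append(flag)
        yield
    finally:
        # Only unset what this helper managed to set, in reverse order.
        for flag in reversed(applied):
            run("osd", "unset", flag)


def run_upgrade(image, run=run_ceph, poll_seconds=60):
    """Start an orchestrator upgrade and hold the flags until it stops running."""
    with upgrade_flags(run=run):
        run("orch", "upgrade", "start", "--image", image)
        # Assumption: the status output is JSON with an "in_progress" boolean.
        while json.loads(run("orch", "upgrade", "status")).get("in_progress"):
            time.sleep(poll_seconds)
```

Note that this sketch unsets all three flags unconditionally at the end, so a flag that an admin had already set before the upgrade would be cleared as well. It also treats an upgrade that is no longer in progress as finished, without checking whether it succeeded.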

Comment 2 Vikhyat Umrao 2022-08-17 18:44:44 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1982056 - another RFE along the same lines, to take care of pg_autoscaler and balancer during upgrade!

Comment 11 Abhishek Kane 2025-06-10 04:15:18 UTC
Moving all the RFEs out of 9.0, as they were not prioritized by PMs and the engineering team does not have the bandwidth to take them up.

Comment 12 Sahina Bose 2025-08-29 07:40:40 UTC
Closing this bug as part of the bulk closing of bugs that have been open for more than a year without any significant updates. Please reopen with justification if you think this bug is still relevant and needs to be addressed in an upcoming release.

