1200967 – [RFE] Provide method to automatically suspend scrubs during backfill and recovery

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	RADOS
Sub Component:
Version:	1.2.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	low
Target Milestone:	rc
Target Release:	2.3
Assignee:	David Zafman
QA Contact:	Parikshith
Docs Contact:	Erin Donnelly
URL:
Whiteboard:
Depends On:
Blocks:	1258382 1437916

Reported:	2015-03-11 17:53 UTC by Tupper Cole
Modified:	2019-04-16 14:42 UTC
CC List:	14 users
Fixed In Version:	RHEL: ceph-10.2.7-2.el7cp Ubuntu: ceph_10.2.7-3redhat1xenial
Doc Type:	Enhancement
Doc Text:	Scrub processes can now be disabled during recovery. A new option, `osd_scrub_during_recovery`, has been added with this release. Setting this option to `false` in the Ceph configuration file prevents new scrub processes from starting during recovery. As a result, recovery runs faster.
Clone Of:
Environment:
Last Closed:	2017-06-19 13:24:55 UTC
Embargoed:
Dependent Products:
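
For illustration, the setting named in the Doc Text field could be written as a configuration fragment like the one generated below. This is only a sketch: the Doc Text says the option goes in the Ceph configuration file but does not say which section, so the `[osd]` section is an assumption, and the helper name `render_osd_scrub_setting` is hypothetical.

```python
import configparser
import io


def render_osd_scrub_setting(enabled: bool = False) -> str:
    """Return an INI-style fragment that sets osd_scrub_during_recovery.

    Assumption: the option is placed in an [osd] section; the bug report
    only says "the Ceph configuration file".
    """
    parser = configparser.ConfigParser()
    parser["osd"] = {
        "osd_scrub_during_recovery": "true" if enabled else "false",
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Print the fragment that disables starting new scrubs during recovery.
print(render_osd_scrub_setting(False))
```

The printed fragment would then be merged by hand into the existing configuration file rather than written over it.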

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	17866	0	None	None	None	2017-04-12 01:56:05 UTC
Red Hat Product Errata	RHBA-2017:1497	0	normal	SHIPPED_LIVE	Red Hat Ceph Storage 2.3 bug fix and enhancement update	2017-06-19 17:24:11 UTC

Description Tupper Cole 2015-03-11 17:53:29 UTC

Description of problem: Manually suspending scrubs and deep-scrubs during backfill, recovery, and rebalancing seems to improve speed and reduce the number of slow requests logged. A method to make this the default behavior is requested for customers that suffer poor performance during these operations.


Version-Release number of selected component (if applicable): Firefly


How reproducible: Consistent behavior


Steps to Reproduce:
1. Add or remove OSDs during scrub operations.

Actual results:


Expected results:


Additional info:
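
The manual workaround mentioned in the description can be sketched as follows. The report does not say how scrubs were suspended; this sketch assumes the cluster-wide `noscrub` and `nodeep-scrub` flags set with `ceph osd set` and cleared with `ceph osd unset`. The helper names `flag_commands` and `scrubs_suspended` are hypothetical.

```python
import subprocess
from contextlib import contextmanager
from typing import List

# Assumption: scrubs are suspended with these two cluster-wide OSD flags.
SCRUB_FLAGS = ("noscrub", "nodeep-scrub")


def flag_commands(action: str) -> List[List[str]]:
    """Build the `ceph osd set|unset <flag>` command lines."""
    if action not in ("set", "unset"):
        raise ValueError("action must be 'set' or 'unset'")
    return [["ceph", "osd", action, flag] for flag in SCRUB_FLAGS]


@contextmanager
def scrubs_suspended(run=subprocess.check_call):
    """Set the scrub flags, yield, then always clear them again."""
    for cmd in flag_commands("set"):
        run(cmd)
    try:
        yield
    finally:
        for cmd in flag_commands("unset"):
            run(cmd)


# Usage (requires a working `ceph` CLI and admin keyring):
#
#   with scrubs_suspended():
#       ...  # add or remove OSDs, wait for backfill/recovery to finish
```

Clearing the flags in a `finally` block keeps scrubbing from staying disabled if the maintenance step fails partway through.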

Comment 1 Federico Lucifredi 2015-03-26 01:13:35 UTC

This is a committed Tufnell feature.

Comment 2 Anthony D'Atri 2015-06-02 17:57:51 UTC

Thanks.  What's the timeline for Tufnell?  T seems a long way down the alphabet. Does the change from the cephalopod naming scheme to Spinal Tap skip a bunch of letters?

Comment 3 Neil Levine 2015-06-02 18:25:48 UTC

The Cephalopod names refer to the upstream community releases on which the downstream product is based.

Tufnell is the codename for Red Hat Ceph Storage v2.0, i.e. the downstream product. It will likely be based on Ceph v10 (Jewel), due out later this year, and is likely to arrive shortly after, in Q1 2016.

Comment 4 Anthony D'Atri 2015-06-02 18:40:54 UTC

Ah, gotcha, thanks.  Had heard something about community and RCS forking but had not known of the naming roadmap.

-- aad

Comment 8 Erin Donnelly 2017-05-22 17:02:31 UTC

Hi David,

I’m proposing to add this BZ to the 2.3 release notes. If you agree, could you set the “Doc Type” and “Doc Text” fields?

Thanks,
Erin

Comment 11 Parikshith 2017-05-25 04:04:14 UTC

Hello,

When a recovery/backfill operation starts, will scrubbing be suspended for scheduled scrubs, for manual scrubs, or for both? Will a manual scrub override the suspension?

Comment 16 John Poelstra 2017-05-31 15:11:17 UTC

Discussed at program meeting; nobody is clear what should happen to this bug for 2.3. Neil to discuss with engineering and figure out next steps.

Comment 19 Parikshith 2017-06-01 11:37:40 UTC

As per the clarification given, I ran the following steps:

1. Started a long recovery and ran scrub/deep-scrub on several OSDs.
2. Monitored cluster status and "ceph pg dump"; found no PGs that were both recovering and scrubbing.
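
The check in step 2 could be automated roughly as below. This is a sketch under stated assumptions, not the procedure QA actually used: it assumes each PG entry from `ceph pg dump --format json` carries a `pgid` and a `+`-separated `state` string, and that the entries sit under `pg_stats`, either at the top level or nested under `pg_map` (the layout differs between releases). The function names are hypothetical.

```python
import json
import subprocess
from typing import Dict, Iterable, List


def overlapping_pgs(pg_stats: Iterable[Dict[str, str]]) -> List[str]:
    """Return the IDs of PGs whose state has both 'recovering' and 'scrubbing'."""
    hits = []
    for pg in pg_stats:
        # A state looks like "active+recovering+degraded".
        flags = set(pg["state"].split("+"))
        if "recovering" in flags and "scrubbing" in flags:
            hits.append(pg["pgid"])
    return hits


def load_pg_stats() -> List[Dict[str, str]]:
    """Fetch PG stats from the CLI (assumed JSON layout, see lead-in)."""
    raw = subprocess.check_output(["ceph", "pg", "dump", "--format", "json"])
    dump = json.loads(raw)
    return dump.get("pg_map", dump).get("pg_stats", [])


# Hand-written sample data, not output from a real cluster.
sample = [
    {"pgid": "1.0", "state": "active+clean+scrubbing+deep"},
    {"pgid": "1.1", "state": "active+recovering+degraded"},
    {"pgid": "1.2", "state": "active+recovering+scrubbing"},
]
print(overlapping_pgs(sample))
```

On a live cluster, `overlapping_pgs(load_pg_stats())` returning an empty list would correspond to the result reported in step 2.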

Comment 20 David Zafman 2017-06-13 01:22:30 UTC

This has now been verified so I believe that any issues from me have been resolved.

Comment 22 errata-xmlrpc 2017-06-19 13:24:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1497
