Bug 1860196

Summary: Ability to cancel on-going scrubs
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: RADOS
Version: 3.3
Target Release: 5.0
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Neha Ojha <nojha>
Assignee: Neha Ojha <nojha>
QA Contact: skanta
CC: akupczyk, bhubbard, ceph-eng-bugs, kdreyer, nojha, pdhiran, rzarzyns, skanta, sseshasa, vereddy
Fixed In Version: ceph-16.0.0-8633.el8cp
Last Closed: 2021-08-30 08:26:28 UTC
Type: Bug

Description Neha Ojha 2020-07-23 23:51:03 UTC
Although it's possible to prevent initiating new scrubs, we don't have a way to cancel on-going ones. The problem comes up during maintenance of large deployments and can also hurt performance testing.
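
For context, the upstream change referenced later in this bug (https://github.com/ceph/ceph/pull/35909) makes the existing noscrub/nodeep-scrub flags abort in-progress scrubs instead of only blocking new ones. A minimal sketch of that intended workflow, assuming the flag-based abort behaves as described there:

   # Abort in-progress (deep) scrubs cluster-wide by setting the flags
   ceph osd set noscrub
   ceph osd set nodeep-scrub

   # Re-enable scrubbing once maintenance or a benchmark run is finished
   ceph osd unset noscrub
   ceph osd unset nodeep-scrub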

Comment 3 skanta 2021-05-04 11:13:45 UTC
   Followed the steps below to test the scenario:

1. Started a deep scrub on the cluster using the command "ceph osd pool deep-scrub <pool-name>":
  ceph osd pool deep-scrub testbench

2. Executed a pg dump to verify whether scrubbing had started:
   ceph pg dump pgs | grep "scrub"
  
   Output:
          [ceph: root@magna048 /]# ceph pg dump pgs | grep "scrub"
dumped pgs
* NOTE: Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilization. See http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for further details.
[ceph: root@magna048 /]# ceph osd pool deep-scrub testbench
[ceph: root@magna048 /]# ceph pg dump pgs | grep "scrub"
dumped pgs
2.16       44028                   0         0          0        0  360677376            0           0  10028     10028  active+clean+scrubbing+deep  2021-05-04T10:40:43.162721+0000  375'44028  375:44420  [13,11,17]          13  [13,11,17]              13   375'44028  2021-05-04T10:38:34.854170+0000              0'0  2021-05-04T01:27:36.270603+0000              0
2.4        43614                   0         0          0        0  357285888            0           0  10014     10014  active+clean+scrubbing+deep  2021-05-04T10:40:43.694362+0000  375'43614  375:43983  [21,14,12]          21  [21,14,12]              21   375'43614  2021-05-04T10:38:32.220660+0000              0'0  2021-05-04T01:27:36.270603+0000              0
2.d        44057                   0         0          0        0  360914944            0           0  10057     10057  active+clean+scrubbing+deep  2021-05-04T10:40:44.123293+0000  375'44057  375:44334    [5,10,3]           5    [5,10,3]               5   375'44057  2021-05-04T10:38:24.456627+0000              0'0  2021-05-04T01:27:36.270603+0000              0
* NOTE: Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilization. See http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for further details.
[ceph: root@magna048 /]#

3. To stop the scrubbing, executed "ceph osd set nodeep-scrub", but did not see the abort message.

    In the health status and logs, got the message "nodeep-scrub flag(s) set".
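
    A quick way to check whether the flag actually aborted the running deep scrubs (a sketch only, reusing the query from step 2; not output captured from this cluster):

       # Scrubbing PGs should drop out of pg dump shortly after the flag is set
       ceph osd set nodeep-scrub
       ceph pg dump pgs | grep scrubbing

       # The flag itself is reported in cluster status/health
       ceph health detail | grep -i scrub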

4. Once I unset the flag, scrubbing resumed.
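
   For completeness, a sketch of re-checking after the unset (the "testbench" pool is the one from step 1):

      # Clear the flag, re-issue the deep scrub, and confirm PGs enter scrubbing+deep again
      ceph osd unset nodeep-scrub
      ceph osd pool deep-scrub testbench
      ceph pg dump pgs | grep scrubbing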

The message added in the _scrub_abort() method at https://github.com/ceph/ceph/pull/35909/files#diff-b64d961f30043bfd4ac456027555f0dc96d58c4482bbc8186f70b46258802309 did not occur in the logs.
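
In case it helps narrow this down: the _scrub_abort() message would be emitted by the primary OSDs of the scrubbing PGs (osd.13, osd.21 and osd.5 in the pg dump above). A sketch of where to look on a containerized deployment, assuming the message is logged at the OSDs' current debug level (if not, debug_osd may need to be raised before re-testing):

   # On the host running the OSD container, e.g. for osd.13
   cephadm logs --name osd.13 2>&1 | grep -i scrub

   # Temporarily raise the OSD debug level if the message does not appear
   ceph tell osd.13 config set debug_osd 10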

Please let me know whether this procedure is correct or whether there is an issue with the feature.

Comment 6 skanta 2021-07-16 14:57:10 UTC
Able to schedule the scrub; removing the dependent bug.

Comment 11 errata-xmlrpc 2021-08-30 08:26:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294