Bug 1639712

Summary: dynamic bucket resharding unexpected behavior
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: John Harrigan <jharriga>
Component: RGW
Assignee: Mark Kogan <mkogan>
Status: CLOSED ERRATA
QA Contact: Tejas <tchandra>
Severity: medium
Docs Contact: Bara Ancincova <bancinco>
Priority: medium
Version: 3.1
CC: anharris, cbodley, ceph-eng-bugs, dfuller, edonnell, hnallurv, ivancich, kbader, khartsoe, mbenjamin, mhackett, mkogan, pasik, sweil, tserlin, vakulkar
Target Milestone: rc
Target Release: 3.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.12-30.el7cp; Ubuntu: ceph_12.2.12-26redhat1
Doc Type: Bug Fix
Doc Text:
.Bucket resharding status is now displayed in plain language
Previously, the `radosgw-admin reshard status --bucket _bucket_name_` command used identifier-like tokens as follows to display the resharding status of a bucket:
* CLS_RGW_RESHARD_NONE
* CLS_RGW_RESHARD_IN_PROGRESS
* CLS_RGW_RESHARD_DONE
With this update, the command uses plain language to display the status:
* not-resharding
* in-progress
* done
Story Points: ---
Clone Of:
: 1659647 (view as bug list)
Environment:
Last Closed: 2019-08-21 15:10:24 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1641792, 1659647, 1726135    

Description John Harrigan 2018-10-16 12:39:19 UTC
Description of problem:
Documentation indicates that dynamic bucket resharding is enabled by default in the Luminous release. I installed RHCS 3.1 GA software and ran RGW workloads,
but was never able to trigger this activity.

Version-Release number of selected component (if applicable):
RHCS 3.1 ceph version 12.2.5-42.el7cp (82d52d7efa6edec70f6a0fc306f40b89265535fb) luminous (stable)

How reproducible:
Always

Steps to Reproduce:
1. The documentation at
http://docs.ceph.com/docs/mimic/radosgw/dynamicresharding/ states:
  Enable/Disable dynamic bucket index resharding:
  rgw_dynamic_resharding: true/false, default: true
  rgw_max_objs_per_shard: maximum number of objects per bucket index shard,
       default: 100000 objects
2. Monitored a workload which created a pool with five buckets and 22M objects total. Using this command, I noticed the reshard queue was always empty:
  # radosgw-admin reshard list
3. Ran other workloads which created more objects and never saw any entries
in 'reshard list' output

Actual results:
Running various workloads, I never saw the 'reshard list' command return any entries.

Expected results:
  As stated in the documentation:
<BEGIN>
Each bucket index shard can handle its entries efficiently up until reaching a certain threshold of entries. If this threshold is exceeded the system could encounter performance issues. The dynamic resharding feature detects this situation and increases automatically the number of shards used by the bucket index, resulting in the reduction of the number of entries in each bucket index shard. This process is transparent to the user.

The detection process runs:
1. When new objects are added to the bucket
2. In a background process that periodically scans all the buckets. This is needed in order to deal with existing buckets in the system that are not being updated.
<END> 
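A quick sanity check supports that expectation. The sketch below is back-of-the-envelope arithmetic using figures from this report, assuming the ~22M objects were spread evenly across the five buckets and each bucket index still had a single shard (both assumptions, not facts taken from the status output):

```shell
# Rough check with numbers from this report. Assumes even object
# distribution across buckets and one index shard per bucket.
total_objects=22000000
buckets=5
threshold=100000          # rgw_max_objs_per_shard default
objects_per_bucket=$((total_objects / buckets))
if [ "$objects_per_bucket" -gt "$threshold" ]; then
  echo "resharding expected: $objects_per_bucket objects/bucket > $threshold"
fi
```

By that estimate each bucket is roughly 44x over the default per-shard threshold, so an empty 'reshard list' is surprising.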

Additional info:
On my cluster, the dynamic bucket resharding-related settings have these values:

    "rgw_dynamic_resharding": "true"
    "rgw_override_bucket_index_max_shards": "0"

which leaves the feature disabled, as indicated by "num_shards": -1:
# radosgw-admin reshard status --bucket mycontainers5
[
    {
        "reshard_status": 0,
        "new_bucket_instance_id": "",
        "num_shards": -1
    },
    ...  <repeated four times> ....
]
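For reference, the eventual fix (see the Doc Text above) replaces the identifier-like status tokens with plain-language strings. A minimal sketch of that mapping follows; 0 = none is consistent with the "reshard_status": 0 output shown here, but the 1/2 numeric values are assumptions for illustration, not taken from the patch:

```shell
# Hypothetical helper mirroring the plain-language status strings
# introduced by this fix. The 1/2 numeric values are assumptions.
reshard_status_name() {
  case "$1" in
    0) echo "not-resharding" ;;  # formerly CLS_RGW_RESHARD_NONE
    1) echo "in-progress"    ;;  # formerly CLS_RGW_RESHARD_IN_PROGRESS
    2) echo "done"           ;;  # formerly CLS_RGW_RESHARD_DONE
    *) echo "unknown"        ;;
  esac
}
```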

Comment 8 John Harrigan 2018-11-30 18:38:07 UTC
Thanks, Mark.
So is the issue with this command?
 # radosgw-admin reshard list

This BZ looks related to my observations
https://bugzilla.redhat.com/show_bug.cgi?id=1479801

- John

Comment 9 J. Eric Ivancich 2018-12-05 15:41:19 UTC
Since Mark Kogan is working on this, making him the assignee.

Comment 32 errata-xmlrpc 2019-08-21 15:10:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538