Bug 1611763
| Summary: | RGW Dynamic bucket index resharding keeps resharding same buckets | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Scoots Hamilton <schamilt> | ||||
| Component: | RGW | Assignee: | J. Eric Ivancich <ivancich> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Vidushi Mishra <vimishra> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 3.0 | CC: | assingh, cbodley, ceph-eng-bugs, ceph-qe-bugs, dzafman, hnallurv, ivancich, kbader, kchai, mbenjamin, mhackett, mmanjuna, nojha, pasik, sweil, tchandra, tserlin, vimishra, vumrao | ||||
| Target Milestone: | rc | ||||||
| Target Release: | 3.2 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | RHEL: ceph-12.2.8-26.el7cp Ubuntu: ceph_12.2.8-25redhat1 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-01-03 19:01:46 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Scoots Hamilton
2018-08-02 16:52:52 UTC
The upstream PR is here, currently DNM, cleaning up:
https://github.com/ceph/ceph/pull/24406
Pushed to ceph-3.2-rhel-patches.

Here are some of the tests I used to verify the cases. All tests are based on inserting code at the top of the inner-most loop in RGWBucketReshard::do_reshard. In each case we're also testing resharding on a bucket with more than 30 objects and with the rgw_reshard_bucket_lock_duration set to 30.

Test 1: Insert sleep(1); -- this tests whether renewing the lock works when the resharding is taking somewhat longer than expected.

Test 2: Insert sleep(32); -- this tests proper recovery when we're unable to renew the lock before it expires.

Test 3: Insert static int i = 0; if (++i > 10) exit(1); -- this tests crashing of the radosgw code leaving things in a non-cleaned up state. I then restart everything and make sure I can read/write the bucket index (e.g., list the bucket, remove an object from the bucket). Furthermore I checked the radosgw log file to see if it includes "apparently successfully cleared resharding flags for bucket...".

*** Bug 1644212 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0020
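For readers trying to follow why Test 1 and Test 2 in the verification comment above behave differently, here is a minimal stand-alone sketch of the timing involved. It is not the RGW code: `SimLock`, `simulate_reshard`, and the "renew once less than half the duration remains" policy are all hypothetical, chosen for illustration only, and time is simulated with a counter rather than real sleeps. Test 3 (the process exiting mid-loop) is not modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for a time-limited reshard lock (NOT the RGW class).
struct SimLock {
    int duration;    // seconds the lock stays valid after (re)acquisition
    int expires_at;  // simulated time at which the lock lapses

    SimLock(int dur, int now) : duration(dur), expires_at(now + dur) {
        assert(dur > 0);
    }

    bool held(int now) const { return now < expires_at; }

    // Renewal only succeeds while the lock is still held.
    bool renew(int now) {
        if (!held(now)) return false;
        expires_at = now + duration;
        return true;
    }
};

// Simulates a reshard loop over `entries` index entries, where each
// iteration is delayed by `delay` simulated seconds (standing in for the
// injected sleep). Returns true if the lock was held for the whole run.
// The half-duration renewal threshold is an assumption for illustration.
bool simulate_reshard(int entries, int delay, int lock_duration,
                      bool renew_enabled) {
    int now = 0;
    SimLock lock(lock_duration, now);
    for (int i = 0; i < entries; ++i) {
        now += delay;  // injected delay at the top of the inner loop
        if (!lock.held(now)) {
            return false;  // lock lapsed; the caller would have to recover
        }
        if (renew_enabled && lock.expires_at - now < lock_duration / 2) {
            if (!lock.renew(now)) return false;
        }
    }
    return true;
}
```

In this model, `simulate_reshard(31, 1, 30, true)` corresponds to Test 1 and returns true, because the lock is renewed before it lapses. `simulate_reshard(31, 32, 30, true)` corresponds to Test 2 and returns false, because the first 32-second delay already outlasts the 30-second lock. With renewal disabled, the Test 1 parameters also fail, which is why a bucket with more than 30 objects is needed to exercise renewal at all.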