Bug 1423402 - Three-way Multisite: Zonegroup rename ends up in incorrect state
Status: ASSIGNED
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RGW
Version: 2.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: 3.*
Assigned To: Casey Bodley
QA Contact: shilpa
Docs Contact: Erin Donnelly
Depends On:
Blocks: 1412948 1437916 1494421
Reported: 2017-02-17 04:14 EST by shilpa
Modified: 2018-10-18 13:03 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Old zone group name is sometimes displayed alongside the new one
In a multi-site configuration, when a zone group is renamed, other zones can in some cases continue to display the old zone group name in the output of the `radosgw-admin zonegroup list` command.

To work around this issue:

. Verify that the new zone group name is present on each cluster.
. Remove the old zone group name:
+
----
$ rados -p .rgw.root rm zonegroups_names.<old-name>
----
Type: Bug


Description shilpa 2017-02-17 04:14:04 EST
Description of problem:
Rename the zonegroup on the master zone and run a period update commit. The non-master zones end up with both the old and the new zonegroup names. However, the period is updated correctly on all zones.

Version-Release number of selected component (if applicable):
ceph-radosgw-10.2.5-22.el7cp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Configure 3-way multisite. 
2. Change zonegroup name on master zone and update the changes on all zones:

radosgw-admin zonegroup rename --rgw-zonegroup=us --zonegroup-new-name=US --master --default  --endpoints=http://magna039:8080

radosgw-admin period update --commit

  "period_map": {
        "id": "55c0334d-2fab-4eb5-b73a-532244717cb3",
        "zonegroups": [
            {
                "id": "7a07c13c-85ea-4660-9ae9-e70fa57ee2dc",
                "name": "US",
                "api_name": "us",
                "is_master": "true",
                "endpoints": [
                    "http:\/\/magna039:8080"



Actual results:

 After this, sync status errors out with:
# radosgw-admin sync status --debug-rgw=0
          realm 3d6b536c-0e74-446e-9f4e-08cd3ab01a6b (movies)
      zonegroup 7a07c13c-85ea-4660-9ae9-e70fa57ee2dc (us)
           zone 931420c9-f70c-4099-a843-b34c4ba3da7a (us-west)
  metadata sync syncing
                full sync: 0/64 shards
                metadata is caught up with master
                incremental sync: 64/64 shards
2017-02-17 08:59:22.141421 7f56eb6819c0  0 ERROR: failed to fetch datalog info
      data sync source: 94b94a1a-6aa1-4944-9064-a5ae68bf3811 (us-east)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
                source: b4607000-b77f-48c5-bd70-85913a788035 (us-central)
                        failed to retrieve sync info: (5) Input/output error


All swift/S3 commands hang.


Additional info:

On the non-master zones, we end up with both the old and the new zonegroup names:
{
    "default_info": "7a07c13c-85ea-4660-9ae9-e70fa57ee2dc",
    "zonegroups": [
        "us",
        "US",
        "default"
    ]
}
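
A minimal sketch of how the state above could be checked from a script. It assumes the JSON printed by `radosgw-admin zonegroup list` has the shape shown in this report. The helper name `rename_state` and its return labels are hypothetical and are not part of any Ceph tool.

```python
import json


def rename_state(zonegroup_list_json, old_name, new_name):
    """Classify one zone's view of a zonegroup rename (hypothetical helper).

    zonegroup_list_json: text in the shape printed by
    `radosgw-admin zonegroup list` in this report.
    """
    names = set(json.loads(zonegroup_list_json)["zonegroups"])
    if new_name in names and old_name in names:
        # Both names are listed: the state described in this bug.
        return "stale-old-name"
    if new_name in names:
        return "renamed"
    return "not-applied"


# Output copied from the non-master zone above.
non_master_output = """
{
    "default_info": "7a07c13c-85ea-4660-9ae9-e70fa57ee2dc",
    "zonegroups": ["us", "US", "default"]
}
"""

print(rename_state(non_master_output, "us", "US"))  # stale-old-name
```

The check is by exact name, so it stays case-sensitive and distinguishes `us` from `US`.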
Comment 6 shilpa 2017-02-23 00:55:46 EST
OK, I tried to reproduce it. What I don't see is the swift/S3 commands hanging. But I do see that the non-master zones end up with both the old and the new zonegroup names.

On master after renaming zg from 'us' to 'US':

# radosgw-admin zonegroup list
{
    "default_info": "8eb889a4-9716-4bba-96db-231b260f6f61",
    "zonegroups": [
        "us",
        "default"
    ]
}

On non-master zones:

{
    "default_info": "8eb889a4-9716-4bba-96db-231b260f6f61",
    "zonegroups": [
        "us",
        "US",
        "default"
    ]
}


But the period shows only the new zonegroup name, which is correct.

"period_map": {
        "id": "c2a1459d-eecc-4231-a340-0549f71b2d42",
        "zonegroups": [
            {
                "id": "8eb889a4-9716-4bba-96db-231b260f6f61",
                "name": "US",
                "api_name": "us",
                "is_master": "true",
                "endpoints": [
                    "http:\/\/magna039:8080"
Comment 7 Casey Bodley 2017-02-27 10:36:40 EST
Okay, thanks Shilpa. So the issue is a leak of the old zonegroup name object on non-master zones. I can confirm that radosgw doesn't have any logic that tries to clean these up when it gets a new period.

As a workaround, the rados tool can remove the 'zonegroups_names.us' object from the .rgw.root pool on non-master zones:

$ rados -p .rgw.root rm zonegroups_names.us
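
A rough sketch of how this workaround could be guarded in a script, so the old name object is removed only after the new name has been confirmed on that zone. The pool `.rgw.root` and the object name `zonegroups_names.<old-name>` come from this report; the helper name `cleanup_command` is hypothetical, and the sketch only builds the command rather than running it.

```python
def cleanup_command(zonegroup_names, old_name, new_name, pool=".rgw.root"):
    """Build the `rados rm` argv for a leaked zonegroup name (hypothetical helper).

    zonegroup_names: the "zonegroups" array from `radosgw-admin zonegroup list`
    on the zone being cleaned up.
    Returns None when the old name is not listed, so there is nothing to remove.
    """
    if new_name not in zonegroup_names:
        # The rename has not reached this zone yet; do not remove anything.
        raise ValueError("new zonegroup name %r is not present" % new_name)
    if old_name not in zonegroup_names:
        return None
    return ["rados", "-p", pool, "rm", "zonegroups_names." + old_name]


cmd = cleanup_command(["us", "US", "default"], "us", "US")
print(" ".join(cmd))  # rados -p .rgw.root rm zonegroups_names.us
```

Returning the argument list instead of executing it leaves the decision to run the removal with the operator, for example by passing the list to `subprocess.run` after review.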
Comment 15 Ken Dreyer (Red Hat) 2017-04-06 19:06:44 EDT
Casey, will this be fixed in Ceph v10.2.7, or will we need more downstream patches on top of that release?
Comment 16 Casey Bodley 2017-04-17 15:10:38 EDT
Ken, this has not been fixed upstream.
