.Old zone group name is sometimes displayed alongside the new one
In a multi-site configuration, when a zone group is renamed, non-master zones can in some cases continue to display the old zone group name alongside the new one in the output of the `radosgw-admin zonegroup list` command.
To work around this issue:
. Verify that the new zone group name is present on each cluster.
. Remove the old zone group name:
$ rados -p .rgw.root rm zonegroups_names.<old-name>
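The two workaround steps can be combined into one check, sketched below for a non-master zone. This is a minimal sketch, not a supported tool: the `OLD_NAME` and `NEW_NAME` variables and the `find_stale_name_object` helper are hypothetical names introduced here, and the sketch assumes the name objects appear as `zonegroups_names.<name>` in the output of `rados -p .rgw.root ls`. The removal itself is left commented out so that the result can be reviewed first.

```shell
#!/bin/sh
# Illustrative sketch: OLD_NAME, NEW_NAME, and the helper are assumptions.
OLD_NAME="${OLD_NAME:-us}"
NEW_NAME="${NEW_NAME:-US}"
POOL=".rgw.root"

# Reads an object listing on stdin. Prints the old name object only when
# both the new and the old name objects are present in the listing.
find_stale_name_object() {
    old="$1"
    new="$2"
    listing=$(cat)
    if printf '%s\n' "$listing" | grep -qxF "zonegroups_names.$new" &&
       printf '%s\n' "$listing" | grep -qxF "zonegroups_names.$old"; then
        printf '%s\n' "zonegroups_names.$old"
    fi
}

# Only query the cluster when the rados tool is installed.
if command -v rados >/dev/null 2>&1; then
    stale=$(rados -p "$POOL" ls | find_stale_name_object "$OLD_NAME" "$NEW_NAME")
    if [ -n "$stale" ]; then
        echo "Stale zone group name object found: $stale"
        # Uncomment after reviewing the output above:
        # rados -p "$POOL" rm "$stale"
    fi
fi
```

The helper prints nothing when the new name object is missing, so the old object is never proposed for removal before the rename has reached that zone.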
Description of problem:
Rename the zonegroup on the master zone and run a period update commit. The non-master zones end up with both the old and new zonegroup names. However, the period is updated correctly on all zones.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure 3-way multisite.
2. Change zonegroup name on master zone and update the changes on all zones:
radosgw-admin zonegroup rename --rgw-zonegroup=us --zonegroup-new-name=US --master --default --endpoints=http://magna039:8080
radosgw-admin period update --commit
After this, sync status errors out with:
# radosgw-admin sync status --debug-rgw=0
realm 3d6b536c-0e74-446e-9f4e-08cd3ab01a6b (movies)
zonegroup 7a07c13c-85ea-4660-9ae9-e70fa57ee2dc (us)
zone 931420c9-f70c-4099-a843-b34c4ba3da7a (us-west)
metadata sync syncing
full sync: 0/64 shards
metadata is caught up with master
incremental sync: 64/64 shards
2017-02-17 08:59:22.141421 7f56eb6819c0 0 ERROR: failed to fetch datalog info
data sync source: 94b94a1a-6aa1-4944-9064-a5ae68bf3811 (us-east)
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
source: b4607000-b77f-48c5-bd70-85913a788035 (us-central)
failed to retrieve sync info: (5) Input/output error
All swift/S3 commands hang.
On the non-master zones, we end up with both the old and new zonegroup names:
Ok, I tried to reproduce it. What I don't see is swift/S3 commands hanging. But I do see that on the non-master zones, we end up with both the old and the new zonegroup names.
On master after renaming zg from 'us' to 'US':
# radosgw-admin zonegroup list
On non-master zones:
But the period shows only the new zonegroup name, which is correct.
Okay, thanks Shilpa. So the issue is a leak of the old zonegroup name object on non-master zones. I can confirm that radosgw doesn't have any logic that tries to clean these up when it gets a new period.
As a workaround, the rados tool can remove the 'zonegroups_names.us' object from the .rgw.root pool on non-master zones:
$ rados -p .rgw.root rm zonegroups_names.us
Casey, will this be fixed in Ceph v10.2.7, or will we need more downstream patches on top of that release?
Ken, this has not been fixed upstream.
I have closed this issue because it has been inactive for some time now. If you feel this still deserves attention feel free to reopen it.