Bug 1359712 - A master zone switch requires radosgw to be restarted
Summary: A master zone switch requires radosgw to be restarted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: 2.1
Assignee: Casey Bodley
QA Contact: Rachana Patel
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1322504 1383917
Reported: 2016-07-25 10:39 UTC by shilpa
Modified: 2022-02-21 18:03 UTC (History)

Fixed In Version: RHEL: ceph-10.2.3-2.el7cp Ubuntu: ceph_10.2.3-3redhat1xenial
Doc Type: Bug Fix
Doc Text:
.A restart of the radosgw process is no longer required after switching the zone from master to non-master
When a non-master zone was promoted to the master zone, all I/O requests became unresponsive until the `radosgw` process was restarted on both zones. Consequently, the I/O requests timed out. The underlying source code has been modified, and restarting `radosgw` is no longer required in the described situation.
Clone Of:
Environment:
Last Closed: 2016-11-22 19:28:53 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:2815 0 normal SHIPPED_LIVE Moderate: Red Hat Ceph Storage security, bug fix, and enhancement update 2017-03-22 02:06:33 UTC

Description shilpa 2016-07-25 10:39:27 UTC
Description of problem:
When the zone is switched from master to non-master, all I/O requests hang until the rgw process is restarted on both zones.

Version-Release number of selected component (if applicable):
ceph-radosgw-10.2.2-27.el7cp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Modify the non-master zone with the '--master' flag:
radosgw-admin zone modify --rgw-zonegroup=us --rgw-zone=us-2 \
    --access_key=secret --secret=secret --endpoints=http://magna059:80 \
    --default --master
2. Update and commit the period
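
For anyone scripting the reproduction, the two steps above can be sketched in Python roughly as follows. This is only an illustration: the function names are made up for this example, the argument values are the ones from step 1, and step 2 is assumed to be the usual `radosgw-admin period update --commit` call, which the report does not spell out. By default the script only prints the commands instead of running them.

```python
import shlex
import subprocess


def master_switch_commands(zonegroup, zone, endpoint, access_key, secret):
    """Build the two radosgw-admin invocations as argv lists (hypothetical helper)."""
    modify = [
        "radosgw-admin", "zone", "modify",
        f"--rgw-zonegroup={zonegroup}",
        f"--rgw-zone={zone}",
        f"--access_key={access_key}",
        f"--secret={secret}",
        f"--endpoints={endpoint}",
        "--default",
        "--master",
    ]
    # Assumption: step 2 is done with a single "period update --commit".
    commit = ["radosgw-admin", "period", "update", "--commit"]
    return [modify, commit]


def run_master_switch(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    cmds = master_switch_commands(
        "us", "us-2", "http://magna059:80", "secret", "secret"
    )
    run_master_switch(cmds)  # dry run: prints the commands only
```

Passing `dry_run=False` would run the commands on the node where the script is executed, so it should only be used on the zone that is being promoted.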


Actual results:
A period configuration change should not require a radosgw restart. However, all I/O requests hang until the process is restarted.

Additional info:

Traceback (most recent call last):
  File "s3del.py", line 21, in <module>
    conn.delete_bucket(buck.name)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 641, in delete_bucket
    response = self.make_request('DELETE', bucket, headers=headers)
  File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 668, in make_request
    retry_handler=retry_handler
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1028, in _mexe
    raise BotoServerError(response.status, response.reason, body)
boto.exception.BotoServerError: BotoServerError: 504 Gateway Time-out
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
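
A quick way to check whether a gateway is in the state shown in the traceback, without going through boto, might look like the sketch below. It uses only the Python standard library. The function names and the 10-second timeout are assumptions made for this example, and the endpoint is the one from the reproduction steps. Treating "no response" and "504" as the same symptom is a simplification, not something the report states.

```python
import urllib.error
import urllib.request


def probe_gateway(url, timeout=10.0):
    """Return the HTTP status code from url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server (or a proxy in front of it) answered with an error status.
        return err.code
    except OSError:
        # Covers connection failures and timeouts (URLError, socket.timeout).
        return None


def looks_hung(status):
    """Assumption: no response at all, or a 504, indicates hanging I/O."""
    return status is None or status == 504


if __name__ == "__main__":
    status = probe_gateway("http://magna059:80")
    print("status:", status, "hung:", looks_hung(status))
```

Running the probe against both zones before and after the period commit gives a rough indication of whether requests are still being served.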

Comment 7 Casey Bodley 2016-09-08 18:01:57 UTC
Hi Shilpa,

We have upstream testing that makes multiple changes to the master zone without restarting gateways, and we haven't seen it hit this issue. Can you try to reproduce the issue with the latest build?

Comment 8 shilpa 2016-09-09 08:10:27 UTC
(In reply to Casey Bodley from comment #7)
> Hi Shilpa,
> 
> We have upstream testing that makes multiple changes to the master zone
> without restarting gateways, and we haven't seen it hit this issue. Can you
> try to reproduce the issue with the latest build?

Hi Casey, 

Sure, I will try it on the 2.0 Async build.

Comment 11 shilpa 2016-11-04 10:29:57 UTC
Verified on 10.2.3-12. No gateway restart is required to switch master.

Comment 14 shilpa 2016-11-22 13:44:51 UTC
Hi Bara,

It looks good to me.

Comment 16 errata-xmlrpc 2016-11-22 19:28:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2815.html

