1363689 – [RGW] After a cluster reboot, a rgw service restart is needed on the clients

Summary: [RGW] After a cluster reboot, a rgw service restart is needed on the clients

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	RGW
Sub Component:
Version:	2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	2.2
Assignee:	Matt Benjamin (redhat)
QA Contact:	shilpa
Docs Contact:	Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks:	1322504 1383917 1412948

Reported:	2016-08-03 11:05 UTC by Tejas
Modified:	2022-02-21 18:17 UTC
CC List:	9 users
Fixed In Version:	RHEL: ceph-10.2.5-14.el7cp Ubuntu: ceph_10.2.5-7redhat1xenial
Doc Type:	Bug Fix
Doc Text:	.Restart of the `radosgw` service on clients is no longer needed after rebooting the cluster
Previously, after rebooting the Ceph cluster, it was necessary to restart the `radosgw` service on the Ceph Object Gateway clients to restore the connection with the cluster. With this update, the restart of `radosgw` is no longer needed.
Clone Of:
Environment:
Last Closed:	2017-03-14 15:44:52 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1406047	0	unspecified	CLOSED	[RGW - Ubuntu] - RGW service doesn't start automatically after reboot	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHBA-2017:0514	0	normal	SHIPPED_LIVE	Red Hat Ceph Storage 2.2 bug fix and enhancement update	2017-03-21 07:24:26 UTC

Internal Links: 1406047

Description Tejas 2016-08-03 11:05:04 UTC

Description of problem:
An RGW service restart is needed on all RGW clients after a Ceph cluster reboot.

Version-Release number of selected component (if applicable):
ceph version 10.2.2-21redhat1xenial

How reproducible:
Always

Steps to Reproduce:
1. Run RGW I/O on a two-way active-active multisite configuration.
2. Reboot the primary cluster.
3. After the primary cluster comes back up, the RGW client has lost its connection to the cluster and needs a service restart.

Expected results:
Not sure, but at least this needs to be mentioned to the customer.


root@magna086:~# systemctl status ceph-radosgw.service 
* ceph-radosgw.service - Ceph rados gateway
   Loaded: loaded (/lib/systemd/system/ceph-radosgw@.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2016-08-01 09:37:43 UTC; 2 days ago
 Main PID: 3423 (radosgw)
   CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
           `-3423 /usr/bin/radosgw -f --cluster master --name client.rgw.magna086 --setuser ceph --setgroup ceph

Aug 01 09:37:43 magna086 systemd[1]: Started Ceph rados gateway.
Aug 02 07:43:14 magna086 radosgw[3423]: 2016-08-02 07:43:14.837382 7f135effd700 -1 RGWWatcher::handle_error cookie 94141816127056 err (110) Connection timed out
Aug 02 07:43:14 magna086 radosgw[3423]: 2016-08-02 07:43:14.838232 7f135effd700 -1 RGWWatcher::handle_error cookie 94141816151104 err (110) Connection timed out
Aug 03 09:35:06 magna086 radosgw[3423]: 2016-08-03 09:35:06.263131 7f135effd700 -1 RGWWatcher::handle_error cookie 94141816111696 err (107) Transport endpoint is not connected
Aug 03 09:40:06 magna086 radosgw[3423]: 2016-08-03 09:40:06.295158 7f135effd700 -1 RGWWatcher::handle_error cookie 94141816148448 err (107) Transport endpoint is not connected
root@magna086:~#
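
The watcher errors in the journal output above are the visible symptom of the lost connection. As a rough illustration only, the following Python sketch extracts the `RGWWatcher::handle_error` entries from such log lines and maps the two errno values seen here (110 and 107) to their Linux names. The function names `parse_watcher_errors` and `needs_restart` are hypothetical and are not part of Ceph, and the restart heuristic is an assumption that applies only to builds without the fix, not a documented rule.

```python
import re

# Linux errno values that appear in the log above (names from errno.h).
ERRNO_NAMES = {107: "ENOTCONN", 110: "ETIMEDOUT"}

# Matches the tail of a radosgw log line such as:
#   ... RGWWatcher::handle_error cookie 94141816127056 err (110) Connection timed out
WATCHER_ERROR = re.compile(
    r"RGWWatcher::handle_error cookie (?P<cookie>\d+) "
    r"err \((?P<errno>\d+)\) (?P<message>.+)$"
)


def parse_watcher_errors(lines):
    """Return one dict per RGWWatcher::handle_error entry found in `lines`."""
    errors = []
    for line in lines:
        match = WATCHER_ERROR.search(line.rstrip())
        if match is None:
            continue  # not a watcher error line
        code = int(match.group("errno"))
        errors.append({
            "cookie": int(match.group("cookie")),
            "errno": code,
            "name": ERRNO_NAMES.get(code, "UNKNOWN"),
            "message": match.group("message"),
        })
    return errors


def needs_restart(lines):
    """Hypothetical heuristic: any ENOTCONN/ETIMEDOUT watcher error
    suggests the gateway has lost its connection to the cluster."""
    return any(e["errno"] in ERRNO_NAMES for e in parse_watcher_errors(lines))


# Sample lines copied from the journal output in this report.
SAMPLE = [
    "Aug 01 09:37:43 magna086 systemd[1]: Started Ceph rados gateway.",
    "Aug 02 07:43:14 magna086 radosgw[3423]: 2016-08-02 07:43:14.837382 "
    "7f135effd700 -1 RGWWatcher::handle_error cookie 94141816127056 "
    "err (110) Connection timed out",
    "Aug 03 09:35:06 magna086 radosgw[3423]: 2016-08-03 09:35:06.263131 "
    "7f135effd700 -1 RGWWatcher::handle_error cookie 94141816111696 "
    "err (107) Transport endpoint is not connected",
]

if __name__ == "__main__":
    for error in parse_watcher_errors(SAMPLE):
        print(error["name"], error["cookie"], error["message"])
```

On affected versions, a check like this could be fed the gateway's journal output to decide whether a manual `radosgw` restart is still required. With the fixed packages listed in "Fixed In Version", that restart should no longer be needed.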

Comment 5 Matt Benjamin (redhat) 2016-10-03 17:37:36 UTC

We'd like more information on the amount of delay required before the secondary/primary radosgw process can be successfully restarted.

Comment 6 Ken Dreyer (Red Hat) 2017-01-20 23:29:48 UTC

Matt, who can provide that information?

Comment 7 Matt Benjamin (redhat) 2017-01-20 23:56:17 UTC

(In reply to Ken Dreyer (Red Hat) from comment #6)
> Matt, who can provide that information?

I don't recall the need for this info--I'll coordinate w/Shilpa Monday.

Comment 11 shilpa 2017-02-20 09:23:04 UTC

Verified on ceph-10.2.5-27.

Comment 15 errata-xmlrpc 2017-03-14 15:44:52 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html
