Bug 1347625 - [geo-rep] Stopped geo-rep session gets started automatically once all the master nodes are upgraded
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Saravanakumar
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1351071 1351522 1351530 1368053 1368055
Reported: 2016-06-17 09:31 UTC by Rahul Hinduja
Modified: 2017-03-23 05:37 UTC (History)

Fixed In Version: glusterfs-3.8.4-1
Doc Type: Bug Fix
Doc Text:
If geo-replication status was requested after an upgrade but before glusterd was started again, an empty monitor.status was created and the session status was listed as 'Started'. This meant that when glusterd restarted, because monitor.status was empty, a fresh geo-replication session started instead of the previous session being resumed. This has been corrected so that an empty monitor.status results in an error, and geo-replication status is listed as 'Stopped' after an upgrade but before glusterd restarts.
Clone Of:
: 1351071 (view as bug list)
Environment:
Last Closed: 2017-03-23 05:37:09 UTC
Embargoed:
sarumuga: needinfo+


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 0 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Rahul Hinduja 2016-06-17 09:31:29 UTC
Description of problem:
=======================

The existing geo-replication session was stopped, and then each master node was upgraded one at a time. The upgraded nodes needed a reboot because the upgrade included a new kernel version.

There were 3 master nodes (n1, n2 and n3), all running 3.1.2 bits, with geo-replication in stopped state.

1. Upgrade n1 to 3.1.3 and do not reboot
2. Check geo-rep session, it is in stopped state.
3. Upgrade n2 to 3.1.3 and reboot
4. Check geo-rep session, it is in stopped state.
5. Upgrade n3 to 3.1.3 and reboot
6. Check geo-rep session, it is in stopped state.
7. Reboot n1
8. Check geo-rep session, it is in started state.

Expected: the session remains in stopped state.
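
The "Check geo-rep session" steps above can be scripted. The following is a minimal sketch, assuming the standard `gluster volume geo-replication <master> <slave-host>::<slave-vol> status` command is run on a master node. The function names and the volume and host names passed to them are hypothetical and do not come from this report.

```python
import subprocess


def georep_status_cmd(master_vol, slave_host, slave_vol):
    """Build the argv list for a geo-replication status query.

    All three arguments are placeholders for the names used in a
    particular deployment (hypothetical, not taken from this bug).
    """
    return [
        "gluster", "volume", "geo-replication",
        master_vol, f"{slave_host}::{slave_vol}", "status",
    ]


def check_georep_status(master_vol, slave_host, slave_vol):
    """Run the status query and return its raw text output.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    The output is returned unparsed because its column layout varies
    between releases.
    """
    result = subprocess.run(
        georep_status_cmd(master_vol, slave_host, slave_vol),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `check_georep_status` after each upgrade or reboot step and recording the output would give a log of when the session state changes.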

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-10


How reproducible:
=================
2/2

Additional info:
================

I tested reboot scenarios as part of 3.1.3 and, as described above, the session remains in stopped state until every node has been rebooted once. I will try to narrow down the use case without an upgrade.

Comment 5 Saravanakumar 2016-07-01 06:42:17 UTC
Patch posted upstream:
http://review.gluster.org/14830

Comment 7 Atin Mukherjee 2016-09-17 13:26:21 UTC
Upstream mainline : http://review.gluster.org/14830
Upstream 3.8 : http://review.gluster.org/15196

And the fix is available in rhgs-3.2.0 as part of rebase to GlusterFS 3.8.4.
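
The corrected behavior described in the Doc Text (an empty monitor.status is an error, and the session is reported as 'Stopped') can be illustrated with a short sketch. This is not the code from the upstream patch. The function names, the exception class, and the handling of a missing file are assumptions made for illustration only.

```python
class EmptyStatusFileError(Exception):
    """Raised when the status file exists but holds no data (hypothetical)."""


def read_monitor_status(path):
    """Return the stripped contents of a monitor.status file.

    An empty file is treated as an error rather than as a valid state,
    mirroring the fix described in the Doc Text.
    """
    with open(path, "r") as status_file:
        content = status_file.read().strip()
    if not content:
        raise EmptyStatusFileError(path)
    return content


def session_status(path):
    """Map a status file to the state reported to the user.

    An empty file is reported as 'Stopped', never as 'Started'.
    Treating a missing file the same way is an assumption of this
    sketch, not something stated in the bug report.
    """
    try:
        return read_monitor_status(path)
    except (FileNotFoundError, EmptyStatusFileError):
        return "Stopped"
```

With this policy, a status query issued after an upgrade but before glusterd restarts cannot turn an empty file into a 'Started' session.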

Comment 11 Rahul Hinduja 2017-03-09 14:54:54 UTC
Upgraded from 3.1.2 to 3.2.0 (glusterfs-geo-replication-3.8.4-18.el7rhgs.x86_64). Geo-rep sessions remain in stopped state after rebooting all nodes in the cluster. Moving this bug to verified state.

Comment 14 errata-xmlrpc 2017-03-23 05:37:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

