Bug 1445591 - Unable to take snapshot on a geo-replicated volume, even after stopping the session
Summary: Unable to take snapshot on a geo-replicated volume, even after stopping the session
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.2.0 Async
Assignee: Kotresh HR
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On: 1443977
Blocks:
 
Reported: 2017-04-26 04:38 UTC by Atin Mukherjee
Modified: 2020-12-14 08:34 UTC
CC: 11 users

Fixed In Version: glusterfs-3.8.4-18.1
Doc Type: Bug Fix
Doc Text:
Previously, snapshot creation sometimes failed on a geo-replicated volume, even after the session was stopped. This was due to a bug in the way glusterd built up its in-memory state of active geo-replication sessions. With this fix, you can successfully create snapshots of a geo-replicated volume.
Clone Of: 1416024
Environment:
Last Closed: 2017-06-08 09:36:44 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1418 0 normal SHIPPED_LIVE glusterfs bug fix update 2017-06-08 13:33:58 UTC
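
For reference, the workflow the Doc Text describes is the stop-then-snapshot sequence below. This is a minimal sketch using the volume and slave names from the verification run in comment 8; adjust host names and volume names for your environment.

# Stop the geo-replication session between the master volume and its slave.
gluster volume geo-replication master 10.70.37.155::slave stop

# Before the fix, this step could still fail because glusterd retained stale
# in-memory state marking the geo-replication session as active; with the fix
# it succeeds once the session is stopped (or paused).
gluster snapshot create snap1 master

# Restart geo-replication once the snapshot has been taken.
gluster volume geo-replication master 10.70.37.155::slave start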

Comment 2 Atin Mukherjee 2017-04-26 04:42:02 UTC
upstream patch : https://review.gluster.org/17093

Comment 3 Kotresh HR 2017-05-04 09:34:19 UTC
Downstream 3.2 patch:

https://code.engineering.redhat.com/gerrit/#/c/105181/

Comment 8 Rahul Hinduja 2017-05-24 11:59:46 UTC
Validated with build: glusterfs-geo-replication-3.8.4-18.1.el7rhgs.x86_64

With geo-replication stopped after glusterd restart:
====================================================

[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    10.70.37.157    Active     Changelog Crawl    2017-05-24 16:28:04          
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    10.70.37.157    Passive    N/A                N/A                          
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    10.70.37.155    Passive    N/A                N/A                          
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    10.70.37.155    Active     Changelog Crawl    2017-05-24 16:28:04          
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# service glusterd restart
Redirecting to /bin/systemctl restart  glusterd.service
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave stop
Stopping geo-replication session between master & 10.70.37.155::slave has been successful
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED          
----------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    N/A           Stopped    N/A             N/A                  
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    N/A           Stopped    N/A             N/A                  
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    N/A           Stopped    N/A             N/A                  
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    N/A           Stopped    N/A             N/A                  
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# gluster snapshot create snap1 master
snapshot create: success: Snap snap1_GMT-2017.05.24-10.59.03 created successfully
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# gluster snapshot list
snap1_GMT-2017.05.24-10.59.03
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave start
Starting geo-replication session between master & 10.70.37.155::slave has been successful
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS     LAST_SYNCED                  
---------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    10.70.37.157    Passive    N/A              N/A                          
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    10.70.37.157    Passive    N/A              N/A                          
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    10.70.37.155    Active     History Crawl    2017-05-24 16:28:07          
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    10.70.37.155    Active     History Crawl    2017-05-24 16:28:07          
[root@dhcp37-152 scripts]# 


With geo-replication paused after glusterd restart:
===================================================

[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS     LAST_SYNCED                  
---------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    10.70.37.157    Passive    N/A              N/A                          
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    10.70.37.157    Passive    N/A              N/A                          
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    10.70.37.155    Active     History Crawl    2017-05-24 16:28:07          
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    10.70.37.155    Active     History Crawl    2017-05-24 16:28:07          
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# service glusterd restart
Redirecting to /bin/systemctl restart  glusterd.service
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave pause
Pausing geo-replication session between master & 10.70.37.155::slave has been successful
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# 
[root@dhcp37-152 scripts]# gluster snapshot create snap2 master
snapshot create: success: Snap snap2_GMT-2017.05.24-11.00.29 created successfully
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
---------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    N/A           Paused    N/A             N/A                  
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    N/A           Paused    N/A             N/A                  
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    N/A           Paused    N/A             N/A                  
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    N/A           Paused    N/A             N/A                  
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave resume
Resuming geo-replication session between master & 10.70.37.155::slave has been successful
[root@dhcp37-152 scripts]# gluster volume geo-replication master 10.70.37.155::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.37.152    master        /rhs/brick1/b1    root          10.70.37.155::slave    10.70.37.157    Passive    N/A                N/A                          
10.70.37.152    master        /rhs/brick2/b3    root          10.70.37.155::slave    10.70.37.157    Passive    N/A                N/A                          
10.70.37.153    master        /rhs/brick1/b2    root          10.70.37.155::slave    10.70.37.155    Active     Changelog Crawl    2017-05-24 16:28:07          
10.70.37.153    master        /rhs/brick2/b4    root          10.70.37.155::slave    10.70.37.155    Active     Changelog Crawl    2017-05-24 16:28:07          
[root@dhcp37-152 scripts]# 


Performed the basic validation above; it works as expected. Moving this bug to the verified state.

Comment 9 Divya 2017-05-29 09:50:48 UTC
Kotresh,

Could you review and sign-off the edited doc text?

Comment 10 Kotresh HR 2017-05-29 10:43:07 UTC
(In reply to Divya from comment #9)
> Kotresh,
> 
> Could you review and sign-off the edited doc text?

Minor comment. Snapshot was not always failing. You can add the word 'sometimes'.
Rest of it looks good

Comment 11 Divya 2017-05-29 10:59:59 UTC
(In reply to Kotresh HR from comment #10)
> (In reply to Divya from comment #9)
> > Kotresh,
> > 
> > Could you review and sign-off the edited doc text?
> 
> Minor comment. Snapshot was not always failing. You can add the word
> 'sometimes'.
> Rest of it looks good

Added "sometimes". Thanks for the review.

Comment 13 errata-xmlrpc 2017-06-08 09:36:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1418

