Bug 1233575

Summary: [geo-rep]: Setting meta volume config to false when the meta volume is stopped/deleted leaves geo-rep Faulty
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rahul Hinduja <rhinduja>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Rahul Hinduja <rhinduja>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: aavati, csaba, khiremat, nlevinki, rcyriac
Target Milestone: ---
Target Release: RHGS 3.1.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.7.1-6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1234694
Environment:
Last Closed: 2015-07-29 05:06:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1202842, 1223636, 1234694, 1234695    

Description Rahul Hinduja 2015-06-19 08:13:11 UTC
Description of problem:
======================

In a scenario where the shared volume (gluster_shared_storage) is stopped, deleted, or non-existent, and the config option use_meta_volume is then set to false, the worker still fails with "_GMaster: Meta-volume is not mounted. Worker Exiting..." and the session stays Faulty.
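
For illustration, here is a minimal sketch (hypothetical, not the actual gsyncd source) of the kind of pre-fix check implied by the error message: the worker verifies the meta-volume mount before every crawl decision, so it keeps exiting even after use_meta_volume has been turned off.

# Hypothetical sketch of the pre-fix worker behaviour implied by the log
# message above; the names and the mount path are assumptions, not gsyncd code.
import logging
import os
import sys

META_MNT = "/var/run/gluster/shared_storage"  # assumed meta-volume mount point

def should_crawl_pre_fix():
    # The mount is checked unconditionally, ignoring the use_meta_volume setting.
    if not os.path.ismount(META_MNT):
        logging.error("Meta-volume is not mounted. Worker Exiting...")
        sys.exit(1)  # worker exits, the monitor respawns it, status stays Faulty
    return True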

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.101    Active     Changelog Crawl    2015-06-19 18:10:14          
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.101    Active     Changelog Crawl    2015-06-19 18:10:14          
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.103    Passive    N/A                N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.154    Passive    N/A                N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.154    Passive    N/A                N/A                          
[root@georep1 scripts]# 

[root@georep1 scripts]# gluster volume stop gluster_shared_storage
Stopping the shared storage volume(gluster_shared_storage), will affect features like snapshot scheduler, geo-replication and NFS-Ganesha. Do you still want to continue? (y/n) y
volume stop: gluster_shared_storage: success
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]#

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave config use_meta_volume false
geo-replication config updated successfully
[root@georep1 scripts]#

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]#


Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.7.1-3.el6rhs.x86_64


How reproducible:
=================

Always

Comment 4 Kotresh HR 2015-06-23 13:25:30 UTC
Upstream Patch (master):
http://review.gluster.org/#/c/11358/

Upstream Patch (3.7):
http://review.gluster.org/#/c/11359/

Comment 5 Kotresh HR 2015-06-25 09:46:32 UTC
Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/51580/

Comment 6 Rahul Hinduja 2015-07-04 11:30:42 UTC
Verified the bug with the build: glusterfs-3.7.1-7.el6rhs.x86_64

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
true
[root@georep1 scripts]# gluster volume list
gluster_shared_storage
master
[root@georep1 scripts]# gluster volume stop gluster_shared_storage
Stopping the shared storage volume(gluster_shared_storage), will affect features like snapshot scheduler, geo-replication and NFS-Ganesha. Do you still want to continue? (y/n) y
volume stop: gluster_shared_storage: success
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
true
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume false
geo-replication config updated successfully
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
false
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS     LAST_SYNCED                  
--------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:24          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:15          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
[root@georep1 scripts]# 
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
[root@georep1 scripts]#


From the fixed code (the first three lines are the tail of the preceding lock-acquisition method; should_crawl carries the relevant change):
            raise
        logging.debug("Got the lock")
        return True

    def should_crawl(self):
        if not boolify(gconf.use_meta_volume):
            return gconf.glusterd_uuid in self.master.server.node_uuid()
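
With this change, should_crawl short-circuits when use_meta_volume is false: the worker skips the meta-volume mount check and falls back to node-UUID based Active/Passive selection, which is why the session above recovers once the config is set to false even though gluster_shared_storage is still stopped. A minimal, self-contained sketch of that decision logic follows (only the use_meta_volume short-circuit is taken from the excerpt above; the mount-check branch, default mount path, and helper signatures are assumptions for illustration, not a verbatim copy of the patch):

# Sketch of the post-fix decision, mirroring the excerpt above.
import logging
import os
import sys

def boolify(value):
    # Accept the usual string spellings of a boolean geo-rep config value.
    return str(value).strip().lower() in ("true", "yes", "on", "1")

def should_crawl(use_meta_volume, local_node_uuid, brick_node_uuids,
                 meta_mnt="/var/run/gluster/shared_storage"):
    if not boolify(use_meta_volume):
        # No meta volume in use: decide Active/Passive from the node UUID alone.
        return local_node_uuid in brick_node_uuids

    # Meta volume expected: it must be mounted before the worker can proceed
    # (assumed to match the pre-existing mount check that produces the error above).
    if not os.path.ismount(meta_mnt):
        logging.error("Meta-volume is not mounted. Worker Exiting...")
        sys.exit(1)
    return True  # the real worker would now try to take the per-brick lock

This matches the verified behaviour: with use_meta_volume set to false and the shared volume still stopped, the workers come back Active/Passive and move from History Crawl to Changelog Crawl.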



Moving the bug to the Verified state.

Comment 7 errata-xmlrpc 2015-07-29 05:06:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html