Bug 1233575

Summary: [geo-rep]: Setting meta volume config to false when the meta volume is stopped/deleted leaves geo-rep Faulty
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rahul Hinduja <rhinduja>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Rahul Hinduja <rhinduja>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: aavati, csaba, khiremat, nlevinki, rcyriac
Target Milestone: ---
Target Release: RHGS 3.1.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.7.1-6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1234694
Environment:
Last Closed: 2015-07-29 05:06:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1202842, 1223636, 1234694, 1234695    

Description Rahul Hinduja 2015-06-19 08:13:11 UTC
Description of problem:
======================

In a scenario where the shared volume (gluster_shared_storage) is stopped, deleted, or non-existent, and the config option use_meta_volume is then set to false, the worker still fails with "_GMaster: Meta-volume is not mounted. Worker Exiting..." and the session stays Faulty.
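
For illustration, here is a minimal sketch (hypothetical, not the actual gsyncd source) of the kind of pre-fix check implied by the error message: the worker verifies the meta-volume mount before every crawl decision, so it keeps exiting even after use_meta_volume has been turned off.

# Hypothetical sketch of the pre-fix worker behaviour implied by the log
# message above; the names and the mount path are assumptions, not gsyncd code.
import logging
import os
import sys

META_MNT = "/var/run/gluster/shared_storage"  # assumed meta-volume mount point

def should_crawl_pre_fix():
    # The mount is checked unconditionally, ignoring the use_meta_volume setting.
    if not os.path.ismount(META_MNT):
        logging.error("Meta-volume is not mounted. Worker Exiting...")
        sys.exit(1)  # worker exits, the monitor respawns it, status stays Faulty
    return True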

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.101    Active     Changelog Crawl    2015-06-19 18:10:14          
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.101    Active     Changelog Crawl    2015-06-19 18:10:14          
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.103    Passive    N/A                N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    10.70.46.154    Passive    N/A                N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    10.70.46.154    Passive    N/A                N/A                          
[root@georep1 scripts]# 

[root@georep1 scripts]# gluster volume stop gluster_shared_storage
Stopping the shared storage volume(gluster_shared_storage), will affect features like snapshot scheduler, geo-replication and NFS-Ganesha. Do you still want to continue? (y/n) y
volume stop: gluster_shared_storage: success
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]#

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave config use_meta_volume false
geo-replication config updated successfully
[root@georep1 scripts]#

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]#


Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.7.1-3.el6rhs.x86_64


How reproducible:
=================

Always

Comment 4 Kotresh HR 2015-06-23 13:25:30 UTC
Upstream Patch (master):
http://review.gluster.org/#/c/11358/

Upstream Patch (3.7):
http://review.gluster.org/#/c/11359/

Comment 5 Kotresh HR 2015-06-25 09:46:32 UTC
Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/51580/

Comment 6 Rahul Hinduja 2015-07-04 11:30:42 UTC
Verified the bug with the build: glusterfs-3.7.1-7.el6rhs.x86_64

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
true
[root@georep1 scripts]# gluster volume list
gluster_shared_storage
master
[root@georep1 scripts]# gluster volume stop gluster_shared_storage
Stopping the shared storage volume(gluster_shared_storage), will affect features like snapshot scheduler, geo-replication and NFS-Ganesha. Do you still want to continue? (y/n) y
volume stop: gluster_shared_storage: success
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    N/A           Faulty    N/A             N/A                  
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
true
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume false
geo-replication config updated successfully
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config use_meta_volume
false
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS     LAST_SYNCED                  
--------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:24          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     History Crawl    2015-07-04 16:56:15          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A              N/A                          
[root@georep1 scripts]# 
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:24          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-04 16:56:15          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                          
[root@georep1 scripts]#


From the fixed code (the first three lines are the tail of the preceding lock-acquisition method; should_crawl carries the relevant change):
            raise
        logging.debug("Got the lock")
        return True

    def should_crawl(self):
        if not boolify(gconf.use_meta_volume):
            return gconf.glusterd_uuid in self.master.server.node_uuid()
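
With this change, should_crawl short-circuits when use_meta_volume is false: the worker skips the meta-volume mount check and falls back to node-UUID based Active/Passive selection, which is why the session above recovers once the config is set to false even though gluster_shared_storage is still stopped. A minimal, self-contained sketch of that decision logic follows (only the use_meta_volume short-circuit is taken from the excerpt above; the mount-check branch, default mount path, and helper signatures are assumptions for illustration, not a verbatim copy of the patch):

# Sketch of the post-fix decision, mirroring the excerpt above.
import logging
import os
import sys

def boolify(value):
    # Accept the usual string spellings of a boolean geo-rep config value.
    return str(value).strip().lower() in ("true", "yes", "on", "1")

def should_crawl(use_meta_volume, local_node_uuid, brick_node_uuids,
                 meta_mnt="/var/run/gluster/shared_storage"):
    if not boolify(use_meta_volume):
        # No meta volume in use: decide Active/Passive from the node UUID alone.
        return local_node_uuid in brick_node_uuids

    # Meta volume expected: it must be mounted before the worker can proceed
    # (assumed to match the pre-existing mount check that produces the error above).
    if not os.path.ismount(meta_mnt):
        logging.error("Meta-volume is not mounted. Worker Exiting...")
        sys.exit(1)
    return True  # the real worker would now try to take the per-brick lock

This matches the verified behaviour: with use_meta_volume set to false and the shared volume still stopped, the workers come back Active/Passive and move from History Crawl to Changelog Crawl.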



Moving the bug to the Verified state.

Comment 7 errata-xmlrpc 2015-07-29 05:06:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html