Description of problem:
Starting with 2 RHHI pods, I attempted to follow the documentation to configure geo-replication and got the following error: "Unable to fetch slave volume details. Please check the slave cluster and slave volume."

The exact steps followed, starting with Maintaining Red Hat Hyperconverged Infrastructure, section 3.1 Configuring geo-replication for disaster recovery:

1 - [On the pod I want to replicate FROM] # gluster volume set all cluster.enable-shared-storage enable
2 - [On the pod I want to replicate TO] # gluster volume set data features.shard enable (where data is the name of the destination volume)
3 - Pointed to the Gluster documentation, section 10.3.4.1 Setting up your environment for a geo-replication session
4 - [On the pod I want to replicate FROM] # gluster system:: execute gsec_create
5 - [On the pod I want to replicate FROM] # gluster volume geo-replication data 192.168.50.36::data create push-pem (The volume name on both the source and target systems is 'data', and 192.168.50.36 is the IP of the master node on the target pod.)
6 - Resulting error:
Unable to fetch slave volume details. Please check the slave cluster and slave volume.
geo-replication command failed

Passwordless SSH is configured.

Version-Release number of selected component (if applicable):

How reproducible:
I've run these commands exactly, multiple times, in either direction, and they fail the same way each time. 100% repeatable in my configuration.

Steps to Reproduce:
(See above)

Actual results:
Unable to fetch slave volume details. Please check the slave cluster and slave volume.
geo-replication command failed

Expected results:
Successfully create a geo-replication session

Additional info:
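As a pre-flight check before step 5, a minimal sketch (assuming the slave host 192.168.50.36 and the volume name 'data' from the steps above; adjust for your environment):

[On the pod I want to replicate TO] # gluster volume info data
[On the pod I want to replicate TO] # gluster volume status data
[On the pod I want to replicate FROM] # ssh root@192.168.50.36 gluster volume info data
[On the pod I want to replicate FROM] # gluster volume geo-replication data 192.168.50.36::data create push-pem

If the SSH step prompts for a password or the slave volume is not in the Started state, the create push-pem step is expected to fail with the error shown above.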
Anoop - It's better to have this mentioned as [CSS] followed by the title. This is similar to the method the GSS group follows as well. The reason I say this is that people might otherwise read CSS as a term in the problem title itself.
Hi,

That just indicates the master node could not mount the slave volume. We need the following log file to find out what exactly the issue is:

/var/log/glusterfs/geo-replication-slaves/slave.log

I agree that the CLI output should have mentioned this log file. This improvement is already merged upstream [1]. With the patch, it correctly displays the log file to be looked at.

[1] https://review.gluster.org/#/c/19242/
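To pull the relevant messages, a minimal sketch assuming the default log locations mentioned above (the gverify-slavemnt.log path is only present on builds carrying the patch):

[On the master] # tail -n 50 /var/log/glusterfs/geo-replication-slaves/slave.log
[On the master, with the patched build] # tail -n 50 /var/log/glusterfs/geo-replication/gverify-slavemnt.log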
Created attachment 1393239 [details]
Slave.log from the SOURCE system

Attaching the slave.log from the source system, the system I'm attempting to start the geo-replication process from.
Log attached previously
Following a re-install of the destination pod, we were able to successfully create and start geo-replication. We'd still like to know what the root cause of this was.
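For the record, a minimal sketch of the create/start/status sequence that worked after the re-install, assuming the same volume name ('data') and slave IP (192.168.50.36) as in the original report:

[On the pod I want to replicate FROM] # gluster volume geo-replication data 192.168.50.36::data create push-pem
[On the pod I want to replicate FROM] # gluster volume geo-replication data 192.168.50.36::data start
[On the pod I want to replicate FROM] # gluster volume geo-replication data 192.168.50.36::data status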
Not sure why this was flagged as needs info. What information are you looking for?
Verified the bug against the log improvement that shows the log location in case of a wrong slave volume or a stopped slave volume.

Use Case 1: Wrong slave volume

3.3.1:
[root@dhcp47-167 ~]# gluster volume geo-replication master 10.70.47.17::slave1 create push-pem
Unable to fetch slave volume details. Please check the slave cluster and slave volume.
geo-replication command failed
[root@dhcp47-167 ~]#

3.4:
[root@dhcp42-53 ~]# gluster volume geo-replication master 10.70.41.221::slave1 create push-pem
Unable to mount and fetch slave volume details. Please check the log: /var/log/glusterfs/geo-replication/gverify-slavemnt.log
geo-replication command failed
[root@dhcp42-53 ~]#

Use Case 2: Slave volume is stopped

3.3.1:
[root@dhcp47-167 ~]# gluster volume geo-replication master 10.70.47.17::slave create push-pem
Unable to fetch slave volume details. Please check the slave cluster and slave volume.
geo-replication command failed
[root@dhcp47-167 ~]#

3.4:
[root@dhcp42-53 ~]# gluster volume geo-replication master 10.70.41.221::slave create push-pem
Unable to mount and fetch slave volume details. Please check the log: /var/log/glusterfs/geo-replication/gverify-slavemnt.log
geo-replication command failed
[root@dhcp42-53 ~]#

Additionally, the patch brings clarity to the log locations:

3.3.1 => Log location which points to these errors:
/var/log/glusterfs/geo-replication-slaves/slave.log

3.4 => Specific logs representing master and slave:
[root@dhcp42-53 ~]# ls /var/log/glusterfs/geo-replication/
gverify-mastermnt.log  gverify-slavemnt.log  master
[root@dhcp42-53 ~]# ls /var/log/glusterfs/geo-replication-slaves/
mbr
[root@dhcp42-53 ~]#

Moving this bug to verified state against the fix. Any other enhancements to the log will be tracked in a separate bug.
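For reference, a minimal sketch of how the stopped-slave case (Use Case 2) is typically resolved, reusing the host and volume names from the verification run above; this is not part of the verification output itself:

[On the slave] # gluster volume start slave
[On the slave] # gluster volume status slave
[On the master] # gluster volume geo-replication master 10.70.41.221::slave create push-pem
[On the master] # tail -n 50 /var/log/glusterfs/geo-replication/gverify-slavemnt.log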
We no longer see this issue, closing this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607