Description of problem:
Calling "gluster volume geo-replication <master_volume> <slave_host>::<slave_volume> create push-pem" returns "Unable to fetch master volume details. Please check the master cluster and master volume. geo-replication command failed".

Version-Release number of selected component (if applicable):

How reproducible:
Each time during geo-replication initialization.

Steps to Reproduce:
1. Set up passwordless SSH for root.
2. Execute "gluster system:: execute gsec_create".
3. Call "gluster volume geo-replication <master_volume> root@<slave_host>::<slave_volume> create push-pem".

Actual results:
"Unable to fetch master volume details. Please check the master cluster and master volume. geo-replication command failed"

Expected results:
Creating geo-replication session between <master_volume> & root@<slave_host>::<slave_volume> has been successful

Additional info:
The problem could be fixed by modifying "gverify.sh":
- add the full path to all instances of "gluster" and "glusterfs";
- replace awk "{print \\\$4}" with awk "{print \\\$2}" in function cmd_slave().
In addition, the "df" command in "gverify.sh" has to be run with the option "-P, --portability" (use the POSIX output format).
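To illustrate why the POSIX output format matters: without -P, df may wrap a long device name onto a line of its own, which shifts the field positions that awk sees. The sketch below is not the gverify.sh code. The function name avail_kb is hypothetical, and it only shows the general pattern of reading the "Available" column from "df -P" output.

```shell
#!/bin/sh
# Hypothetical helper, not taken from gverify.sh.
# Print the available space, in 1024-byte blocks, of the file system
# that holds the path given as $1.
avail_kb() {
    # -P forces exactly one output line per file system and -k forces
    # 1024-byte units, so "Available" is field 4 of the second line.
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

avail_kb /
```

With -P the header and the data line have a fixed column layout, so the awk field index does not depend on how long the device name is.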
Fixed in

commit 5224a785faaa615adac7cd25207912ab109ad7e6
Author: Avra Sengupta <asengupt>
Date:   Tue May 20 14:50:25 2014 +0000

    glusterd/geo-rep: Creating .ssh dir with right ownership

    Also adding -P option to the usage of df for portability in gverify.sh

    Change-Id: I0be19d26ea63769a934c6ccbfc04ef80768ebc9a
    BUG: 1099041
    Signed-off-by: Avra Sengupta <asengupt>
    Reviewed-on: http://review.gluster.org/7812
    Reviewed-by: Kotresh HR <khiremat>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>
    Tested-by: Venky Shankar <vshankar>
    Reviewed-by: Aravinda VK <avishwan>
This bug is getting closed because version 3.5 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.
This bug and the similar problem reported as bug 1099041 are back in GlusterFS 5.2.

Calling:

[root@SC-10-10-63-182 log]# /usr/libexec/glusterfs/gverify.sh master-volume-0005 nasgorep 10.10.60.182 slave-volume-0006 22 /var/log/glusterfs/create_verify_log

returns:

Testing_Passwordless_SSH
[root@SC-10-10-63-182 log]# echo $?
1
[root@SC-10-10-63-182 log]# cat /var/log/glusterfs/create_verify_log
FORCE_BLOCKER|gluster command on nasgorep.60.182 failed. Error: bash: gluster: command not found

This means that the script gverify.sh fails to find the "gluster" command. All required settings for geo-replication are in place:

[root@SC-10-10-63-182 log]# /usr/bin/ssh -q -oConnectTimeout=5 nasgorep.60.182 /bin/pwd
/home/nasgorep
[root@SC-10-10-63-182 log]# echo $?
0
[root@SC-10-10-63-182 log]# gluster volume list
master-volume-0005
slave-volume-0006
A similar problem is exposed by gsyncd:

[2018-12-29 00:36:20.785577] I [gsyncd(monitor):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-0005_10.10.60.182_slave-volume-0006/gsyncd.conf
[2018-12-29 00:36:21.47989] E [syncdutils(monitor):809:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 nasgorep.60.182 gluster --xml --remote-host=localhost volume info slave-volume-0006 error=127
[2018-12-29 00:36:21.48366] E [syncdutils(monitor):813:logerr] Popen: ssh> bash: gluster: command not found

The GlusterFS 5.2 source code is used in all cases. It is built and installed on CentOS 6 with the following commands:

./autogen.sh
./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --without-tmpfilesdir --disable-syslog --enable-gnfs
make
make install
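One possible explanation, which is an assumption and not confirmed in this report, is that the non-interactive shell started by ssh on the slave has a PATH that does not include the directory where gluster is installed. The sketch below is a hypothetical diagnostic helper, not part of GlusterFS. It checks whether a command resolves under a given PATH value, and the PATH string in the usage lines is only an example.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of GlusterFS.
# Succeeds if command $1 can be resolved when PATH is set to $2.
resolves_with_path() {
    # The subshell keeps the PATH change from leaking to the caller.
    ( PATH="$2"; command -v "$1" >/dev/null 2>&1 )
}

# Example PATH value only; the real one depends on the slave's setup.
if resolves_with_path gluster "/usr/local/bin:/bin:/usr/bin"; then
    echo "gluster found"
else
    echo "gluster not found"
fi
```

Running the same kind of check through ssh as the geo-replication user would show what that session actually sees.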
Here is another problem with geo-replication in GlusterFS 5.2 that is related to the previous ones.

After a hack is put into the python script to provide the path for "gluster", so that geo-replication can be initialized, the next problem is reported when an attempt is made to create a second geo-replication session on the same system. The first one is created successfully and is "Active". The file /var/log/glusterfs/cli.log is created by the first geo-replication session. But when the second one tries to use the same file, the geo-replication initialization hits a problem:

[2018-12-31 19:08:34.415534] E [syncdutils(monitor):809:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 nasgorep.60.182 /usr/sbin/gluster --xml --remote-host=localhost volume info slave-volume-0008 error=255
[2018-12-31 19:08:34.415894] E [syncdutils(monitor):813:logerr] Popen: ssh> ERROR: failed to create logfile "/var/log/glusterfs/cli.log" (Permission denied)
[2018-12-31 19:08:34.416060] E [syncdutils(monitor):813:logerr] Popen: ssh> ERROR: failed to open logfile /var/log/glusterfs/cli.log

[root@SC-10-10-63-182 log]# ls -l /var/log/glusterfs/geo-replication/cli.log
-rw-r--r-- 1 root root 0 Dec 31 10:32 /var/log/glusterfs/geo-replication/cli.log
(In reply to vnosov from comment #6)

Information about the wrong file "cli.log" was put into the original message. It is fixed now.

> Here an other problem with geo-replication in GlusterFS 5.2 that is related
> to the previous ones.
>
> After hack is put into python script to provide path for "gluster" so
> geo-replication could be initialized the next problem is reported when
> attempt is made to create second geo-replication
> on the same system. The first one is created successfully and is "Active".
> But when the second one is going to use the same file the geo-replication
> initialization hits problem:
>
> [2018-12-31 19:08:34.415534] E [syncdutils(monitor):809:errlog] Popen:
> command returned error cmd=ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> -p 22 nasgorep.60.182 /usr/sbin/gluster --xml --remote-host=localhost
> volume info slave-volume-0008 error=255
> [2018-12-31 19:08:34.415894] E [syncdutils(monitor):813:logerr] Popen: ssh>
> ERROR: failed to create logfile "/var/log/glusterfs/cli.log" (Permission
> denied)
> [2018-12-31 19:08:34.416060] E [syncdutils(monitor):813:logerr] Popen: ssh>
> ERROR: failed to open logfile /var/log/glusterfs/cli.log
> [root@SC-10-10-63-182 log]# ls -l /var/log/glusterfs/cli.log
> -rw------- 1 root root 72629 Dec 31 15:24 /var/log/glusterfs/cli.log
Here is some additional info about the geo-replication failure to use the log file /var/log/glusterfs/cli.log.

This problem is exposed on the geo-replication slave system. The log file /var/log/glusterfs/cli.log is created and updated by the gluster CLI that runs on the slave system. This gives the log file the following attributes:

[root@SC-10-10-63-182 log]# ls -l /var/log/glusterfs/cli.log
-rw------- 1 root root 72629 Dec 31 15:24 /var/log/glusterfs/cli.log

If geo-replication is based on SSH access to the slave for a non-root user, for example "nasgorep" from group "nasgorep", all handling of /var/log/glusterfs/cli.log on the slave, including by the slave's gluster, succeeds when the log file has these attributes:

[root@SC-10-10-63-182 log]# ls -l /var/log/glusterfs/cli.log
-rw-rw---- 1 root nasgorep 41553 Jan 2 16:00 /var/log/glusterfs/cli.log

The problem is that GlusterFS 5.2 does not apply these settings to the log file, nor does it otherwise let geo-replication use the file.
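As a rough sketch of how the working attributes above could be applied and verified by hand: the two function names below are hypothetical and are not part of GlusterFS, and the attributes may not persist if the log file is recreated later. The example path and group are the ones from this report.

```shell
#!/bin/sh
# Hypothetical helpers, not part of GlusterFS.

# Give group $2 read/write access to log file $1, matching the
# "-rw-rw---- root nasgorep" attributes reported to work above.
share_log_with_group() {
    chgrp "$2" "$1" && chmod 660 "$1"
}

# Succeeds if both the group read and group write bits are set on $1.
group_can_rw() {
    [ -n "$(find "$1" -prune -perm -060)" ]
}

# Example (run as root on the slave):
#   share_log_with_group /var/log/glusterfs/cli.log nasgorep
#   group_can_rw /var/log/glusterfs/cli.log && echo "group access ok"
```

This is only a manual workaround sketch. It does not change how GlusterFS creates the log file.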
vnosov, thanks for all the details here. Missed this in triaging, as it already had 'Triaged' keyword. We will look into this.
https://review.gluster.org/#/c/glusterfs/+/22890/. This will solve this problem.