Description of problem:
When NET=HOST is used, a Geo-replication session cannot be created.

Setup:
------
- CoreOS + CentOS containers
- RHGS 3.1 RPMs installed inside the containers
- Net=Host setup
- 4 containers: 2 for Master and 2 for Slave (one brick in each container)
- Replica volumes for both Master and Slave

Issue:
--------
- Unable to create the session. Port mapping is used for SSH (a custom port is mapped to port 22), but Geo-replication runs all of its SSH commands without passing a port option to ssh.
- After the session was created, rsync failed because of validation in gsyncd (gsyncd is used as the shell instead of bash).
Workaround:
-----------
- Kotresh modified gverify.sh and the hook script to use the custom SSH port instead of the default.
  (https://gist.github.com/kotreshhr/dd16c5fca425b417c097)
- Set the Geo-rep config options so that the ssh options are applied at runtime:
    gluster vol geo-rep <master vol> <slavehost>::<slavevol> config ssh_command_tar "ssh -p 50002 -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem"
    gluster vol geo-rep <master vol> <slavehost>::<slavevol> config ssh_command "ssh -p 50002 -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem"
- Suggestion: remove "command=" from the authorized_keys files on the Slave nodes, so that commands are no longer all forced through the gsyncd shell.
- Replaced /nonexistent/gsyncd with the actual path of gsyncd (/usr/libexec/glusterfs/gsyncd) in the Geo-replication session config file.
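The two config commands in the workaround differ only in the option name and the key file, so they can be generated from a port number. The following is a minimal sketch, not part of any patch: the function names georep_ssh_cmd and print_workaround_cmds and the sample arguments are hypothetical, and the commands are only printed, never executed.

```shell
#!/bin/sh
# Hypothetical helper: build the ssh command string used in the workaround.
# $1 = custom SSH port, $2 = key file name under /var/lib/glusterd/geo-replication
georep_ssh_cmd() {
    printf 'ssh -p %s -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/%s' "$1" "$2"
}

# Print (do not run) the two config commands for one session.
# $1 = master volume, $2 = slave host, $3 = slave volume, $4 = custom SSH port
print_workaround_cmds() {
    printf 'gluster volume geo-replication %s %s::%s config ssh_command_tar "%s"\n' \
        "$1" "$2" "$3" "$(georep_ssh_cmd "$4" tar_ssh.pem)"
    printf 'gluster volume geo-replication %s %s::%s config ssh_command "%s"\n' \
        "$1" "$2" "$3" "$(georep_ssh_cmd "$4" secret.pem)"
}

# Sample values only; substitute the real volume and host names.
print_workaround_cmds mastervol slavehost slavevol 50002
```

Printing the commands first makes it easy to review the port and key paths before pasting them on a Master node.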
Identified the following changes:

1. The gsec_create command accepts a new parameter for generating SSH keys for containers:
     gluster system:: execute gsec_create container
2. The create command accepts an ssh-port option, e.g. --ssh-port=22:
     gluster volume geo-replication <MASTERVOL> <[SLAVEUSER@]SLAVEHOST>::<SLAVEVOL> create push-pem --ssh-port=52022
3. New configurable option ssh-port:
     gluster volume geo-replication <MASTERVOL> <[SLAVEUSER@]SLAVEHOST>::<SLAVEVOL> config ssh-port 52022
4. Enable setting the remote_gsyncd path:
     gluster volume geo-replication <MASTERVOL> <[SLAVEUSER@]SLAVEHOST>::<SLAVEVOL> config remote_gsyncd /usr/libexec/glusterfs/gsyncd

The following patches have been sent upstream; one more patch is expected.
http://review.gluster.org/#/c/12459/
http://review.gluster.org/#/c/12444/
http://review.gluster.org/#/c/12472/
Created attachment 1096026: 0001-geo-rep-Fix-portability-issues-with-NetBSD.patch
Created attachment 1096027: 0002-gverify-Adding-StrictHostKeyChecking-no-for-ssh-veri.patch
Created attachment 1096028: 0003-glusterd-geo-rep-Adding-ssh-port-option-for-geo-rep-.patch
Created attachment 1096029: 0004-geo-rep-New-Config-option-for-ssh_port.patch
Created attachment 1096030: 0005-geo-rep-Make-restrictive-ssh-keys-optional.patch
Created attachment 1096031: 0006-geo-rep-Allow-setting-config-remote_gsyncd.patch
Patches attached for creating a Hotfix on top of RHGS 3.1 (two dependent patches are also included):

  0001-geo-rep-Fix-portability-issues-with-NetBSD.patch
  0002-gverify-Adding-StrictHostKeyChecking-no-for-ssh-veri.patch
  0003-glusterd-geo-rep-Adding-ssh-port-option-for-geo-rep-.patch
  0004-geo-rep-New-Config-option-for-ssh_port.patch
  0005-geo-rep-Make-restrictive-ssh-keys-optional.patch
  0006-geo-rep-Allow-setting-config-remote_gsyncd.patch

For a container setup, the changes in the steps are:

1. Delete /var/lib/glusterd/geo-replication/common_secret.pem.pub if it exists.
2. Run gsec_create with the container option:
     gluster system:: execute gsec_create container
3. Add the port option during Geo-replication CREATE:
     gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> create push-pem ssh-port 52022
4. Set the config:
     gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config ssh-port 52022
     gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config remote_gsyncd /usr/libexec/glusterfs/gsyncd

Use the meta volume as specified in the documentation. Start Geo-replication as usual.
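The container setup steps can be collected into one script. This is only a sketch under stated assumptions: the variable names, their default values, the DRY_RUN switch and the helper functions run, georep and setup_container_georep are hypothetical and are not part of the patches. With the default DRY_RUN=1 the commands are printed rather than executed. The argument order of the create step is copied from step 3 as written; the later verification comment in this report uses "create ssh-port <port> push-pem" instead.

```shell
#!/bin/sh
# Sketch of the container setup steps. All names and defaults are placeholders.
MASTERVOL=${MASTERVOL:-mastervol}
SLAVEHOST=${SLAVEHOST:-slavehost}
SLAVEVOL=${SLAVEVOL:-slavevol}
SSH_PORT=${SSH_PORT:-52022}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Shorthand for geo-replication subcommands on this Master/Slave pair.
georep() {
    run gluster volume geo-replication "$MASTERVOL" "${SLAVEHOST}::${SLAVEVOL}" "$@"
}

setup_container_georep() {
    # 1. Remove a stale common public key, if present.
    run rm -f /var/lib/glusterd/geo-replication/common_secret.pem.pub
    # 2. Generate the keys with the container option.
    run gluster system:: execute gsec_create container
    # 3. Create the session on the custom SSH port (order as written in step 3).
    georep create push-pem ssh-port "$SSH_PORT"
    # 4. Store the port and the real gsyncd path in the session config.
    georep config ssh-port "$SSH_PORT"
    georep config remote_gsyncd /usr/libexec/glusterfs/gsyncd
}

setup_container_georep
```

Run it once in dry-run mode to check the printed commands, then set DRY_RUN=0 on a Master node where the hotfix build is installed.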
Downstream patches sent:
https://code.engineering.redhat.com/gerrit/62770
https://code.engineering.redhat.com/gerrit/62771
https://code.engineering.redhat.com/gerrit/62772
https://code.engineering.redhat.com/gerrit/62773
Tried the following 2 scenarios:

Case 1: The container SSH port is the default, 22.
Result: No additional steps are needed to create geo-replication. The procedure is the same as for a non-container setup.

Case 2: The container SSH port is customized.
Result: The following additional steps are needed:
  <i>.   Set up passwordless SSH with -p <port>
  <ii>.  Distribute the key with ssh-port <port> push-pem
  <iii>. Configure ssh_port <port>

ssh-copy-id -i /root/.ssh/id_rsa.pub root.eng.blr.redhat.com -p 60000
gluster system:: execute gsec_create
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave create ssh-port 60000 push-pem force
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave config use_meta_volume true
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave config ssh_port 60000
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave status
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave start
gluster volume geo-replication master vm6-rhsqa13.lab.eng.blr.redhat.com::slave status

Moving the bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html