Bug 1617903

Summary: Geo-rep tests fails with Permission denied even though keys are distributed
Product: [Community] GlusterFS Reporter: Kotresh HR <khiremat>
Component: project-infrastructureAssignee: Nigel Babu <nigelb>
Status: CLOSED WORKSFORME QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: mainlineCC: bugs, gluster-infra, nigelb
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-12-17 06:44:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Kotresh HR 2018-08-16 06:52:43 UTC
Description of problem:
Occasionally geo-rep tests fail with Permission denied even though keys are distributed.

Geo-rep generates keys using following command.
1. gluster system:: exec gsec_create

Geo-rep distributes keys using following command.
2. gluster vol geo-rep <mastervol> <slavehost>::<slavevol> create push-pem

Actually hook script is used behind to distribute as part of above command

extras/hook-scripts/S56glusterd-geo-rep-create-post.sh

In the test case, we do wait and confirm the key is distributed to authorized_keys before proceeding further

#Verify the keys are distributed
EXPECT_WITHIN $GEO_REP_TIMEOUT  0 check_keys_distributed

tests/geo-rep.rc
function check_keys_distributed()
{
    local search_key=$(cat /var/lib/glusterd/geo-replication/master_slave_common_secret.pem.pub)
    grep -F "$search_key" ~/.ssh/authorized_keys > /dev/null
    echo $?
}



Version-Release number of selected component (if applicable):


How reproducible:
sometimes

Steps to Reproduce:
1. regression run of tests/00-geo-rep/00-georep-verify-setup.t


Actual results:
Fails sometimes

Expected results:
Should always pass

Additional info:

Comment 1 Kotresh HR 2018-08-16 06:56:13 UTC
The latest failure

https://build.gluster.org/job/centos7-regression/2333/


use_meta_volume true ++++++++++
[2018-08-16 06:38:14.631127] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:38:14.788916] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:38:14.870143] I [gsyncd(config-set):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:38:14.956933] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:38:14.970256]:++++++++++ G_LOG:./tests/00-geo-rep/00-georep-verify-setup.t: TEST: 67 0 check_common_secret_file ++++++++++
[2018-08-16 06:40:14.848440]:++++++++++ G_LOG:./tests/00-geo-rep/00-georep-verify-setup.t: TEST: 70 0 check_keys_distributed ++++++++++
[2018-08-16 06:40:14.870809]:++++++++++ G_LOG:./tests/00-geo-rep/00-georep-verify-setup.t: TEST: 73 gluster --mode=script --wignore volume geo-replication master 127.0.0.1::slave start ++++++++++
[2018-08-16 06:40:15.93890] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:15.236028] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:15.375828] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:15.516060] I [gsyncd(monitor):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:15.547323]:++++++++++ G_LOG:./tests/00-geo-rep/00-georep-verify-setup.t: TEST: 75 2 check_status_num_rows Active ++++++++++
[2018-08-16 06:40:15.829257] I [gsyncd(config-get):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:15.969827] I [gsyncd(status):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:16.125322] I [gsyncd(status):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:16.279227] I [gsyncd(status):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:16.429445] I [gsyncd(status):308:main] <top>: Using session config file	path=/var/lib/glusterd/geo-replication/master_127.0.0.1_slave/gsyncd.conf
[2018-08-16 06:40:16.597454] E [syncdutils(monitor):804:errlog] Popen: command returned error	cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 127.0.0.1 gluster --xml --remote-host=localhost volume info slave	error=255
[2018-08-16 06:40:16.598094] E [syncdutils(monitor):808:logerr] Popen: ssh> Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

Comment 2 Nigel Babu 2018-10-03 08:45:32 UTC
Does this happen anymore? Did we get to the bottom of this?

Comment 3 Nigel Babu 2018-12-17 06:44:18 UTC
Haven't seen this happen lately, so marking as closed.