Red Hat Bugzilla – Bug 1480507
tests: Pre-requisite setup to run geo-rep test case on regression machines.
Last modified: 2017-11-17 05:11:04 EST
Description of problem:
Geo-replication test cases were disabled in master. I have sent a patch to re-enable them, but it will fail because a few prerequisite steps are required.
1. Set up passwordless SSH for root on all regression test machines. Please add it to the script that spawns new regression machines, so that it does not have to be redone whenever failures are seen.
I remember we made a few other changes, as the path where the gluster binaries are installed is different on the regression machines. But let's give it a run once passwordless SSH is in place. We can get through this step by step.
Let me know if you have any doubts.
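For reference, a minimal sketch of how the passwordless SSH prerequisite could be checked on a machine before the geo-rep tests run. The function name and the SSH_BIN override are hypothetical and not part of any existing regression script; the only ssh behaviour relied on is that BatchMode=yes makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Sketch only: report whether key-based (passwordless) SSH works for user@host.
# SSH_BIN is a hypothetical override so the check can be exercised with a stub.
SSH_BIN="${SSH_BIN:-ssh}"

check_passwordless_ssh() {
    # $1 = host to test, $2 = user (defaults to root)
    chk_host="$1"
    chk_user="${2:-root}"
    # BatchMode=yes disables password prompts, so a missing key is a hard failure.
    if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 \
        "$chk_user@$chk_host" true >/dev/null 2>&1; then
        echo "OK: passwordless ssh works for $chk_user@$chk_host"
    else
        echo "FAIL: passwordless ssh is not set up for $chk_user@$chk_host"
        return 1
    fi
}
```

Calling `check_passwordless_ssh "$(hostname)"` from the provisioning script would flag a machine where the setup is missing before a regression run hits it.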
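We already have the authentication set up:
https://github.com/gluster/gluster.org_ansible_configuration/blob/master/roles/jenkins_builder/tasks/authroot_georep.yml
Can you detail what kind of failure you are seeing now?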
(In reply to M. Scherer from comment #1)
> We already do have authentication setup:
> Can you detail what kind of failure you are seeing now ?
Cool, I will trigger a run and try it out. I thought that with the new regression machines it was no longer there.
(In reply to Kotresh HR from comment #2)
> (In reply to M. Scherer from comment #1)
> > We already do have authentication setup:
> > https://github.com/gluster/gluster.org_ansible_configuration/blob/master/
> > roles/jenkins_builder/tasks/authroot_georep.yml
> > Can you detail what kind of failure you are seeing now ?
> Cool, I will trigger run and try it out. I thought, with new regression
> machines, it is no longer there.
> Kotresh HR
But the regression is still failing because passwordless SSH is not set up for root, as shown below.
13:10:11 [13:10:11] Running tests in file ./tests/geo-rep/georep-basic-dr-rsync.t
13:10:24 Passwordless ssh login has not been setup with slave32.cloud.gluster.org for user root.
13:11:25 Geo-replication session between master and slave32.cloud.gluster.org::slave does not exist.
13:11:26 Geo-replication session between master and slave32.cloud.gluster.org::slave does not exist.
Maybe it was not copied to authorized_keys properly?
I suggest skipping the gsec_create and push-pem steps in the geo-rep tests.
- Run the gsec_create command as part of setup and store all the files outside the build setup (maybe in /root/data/georep_keys/).
- Add the common_secret.pem.pub content to the authorized_keys file on the same node.
Every geo-rep test will then copy the secret.*.pem files to $BUILD/var/lib/glusterd/geo-replication/ and create the session without `push-pem`.
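The proposal above could be sketched roughly as follows. This is an illustration under assumptions, not an existing test helper: the function name is hypothetical, the key store is assumed to already hold the files produced by a one-time gsec_create run, and the glob copies every *.pem and *.pem.pub file in the store rather than a specific list of names.

```shell
#!/bin/sh
# Sketch only: stage pre-generated geo-rep keys for one test run.
# Assumes gsec_create was run once during machine setup and its output
# was saved to the key store (for example /root/data/georep_keys/).

stage_georep_keys() {
    # $1 = key store directory, $2 = build prefix ($BUILD),
    # $3 = authorized_keys file to update
    key_store="$1"
    build_prefix="$2"
    auth_file="$3"
    key_dest="$build_prefix/var/lib/glusterd/geo-replication"

    mkdir -p "$key_dest" || return 1
    # Copy the stored key files into the build tree used by this run.
    cp "$key_store"/*.pem* "$key_dest"/ || return 1

    # Append each line of the common public key only if it is not already
    # present, so repeated runs do not grow authorized_keys.
    touch "$auth_file"
    while IFS= read -r key_line || [ -n "$key_line" ]; do
        [ -n "$key_line" ] || continue
        grep -qxF -- "$key_line" "$auth_file" ||
            printf '%s\n' "$key_line" >> "$auth_file"
    done < "$key_store/common_secret.pem.pub"
}
```

Because the append is idempotent, a test that aborts midway leaves nothing behind in authorized_keys that the next run would duplicate.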
(In reply to Aravinda VK from comment #4)
> I suggest skip gsec_create and push-pem steps from the geo-rep tests.
> - Run gsec_create command as part of setup and store all the files outside
> the build setup(May be in /root/data/georep_keys/)
> - Add the common_secret.pem.pub content to same node authorized_keys file.
> Every Geo-rep test will copy secret.*.pem files to
> $BUILD/var/lib/glusterd/geo-replication/ and create session without
> `push-pem`.
Yeah, that seems to be a better approach.
@mscherer, can we get that done?
No, I would rather figure out what was wrong, and why it was working before.
And if it was never working, why it wasn't detected sooner.
So far, I have found a quoting issue (and pushed a fix), but it has likely been present for more than a year.
Have the tests been disabled since that time?
Also, the key to connect is /root/.ssh/id_georep; you can use it for the test. I made sure it is properly limited to localhost, to avoid various security issues.
The reason it is done this way is that the tests were not cleaning up the old root key when they broke (or when it blocked the system), which in turn caused trouble connecting as root after a while, IIRC (because there is a limit, if only a computational one, to the number of keys you can place in authorized_keys).
We managed to avoid breakage last time since we were using salt, but now that we use ansible, any sshd breakage would be much more annoying to fix.
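To illustrate the two concerns above (limiting the key to localhost and not letting stale keys pile up), here is a hedged sketch. It is not the actual ansible task: the function name and the marker comment are hypothetical, and the public key file is assumed to hold a single line. The only sshd feature relied on is the from="pattern-list" option of authorized_keys.

```shell
#!/bin/sh
# Sketch only: install one localhost-restricted key, replacing any entry
# left behind by an earlier run instead of appending another one.

install_restricted_key() {
    # $1 = public key file (single line), $2 = authorized_keys file
    pub_file="$1"
    auth_keys="$2"
    marker="georep-regression-key"   # hypothetical comment tag for our entry

    # Keep only "<type> <base64>" and drop the original comment field.
    key_body=$(cut -d' ' -f1,2 "$pub_file") || return 1

    touch "$auth_keys"
    new_file="$auth_keys.new.$$"
    # Carry over every entry except the one tagged by a previous run.
    grep -vF -- "$marker" "$auth_keys" > "$new_file" || true
    # from= makes sshd accept this key only for connections from localhost.
    printf 'from="127.0.0.1,::1" %s %s\n' "$key_body" "$marker" >> "$new_file"
    chmod 600 "$new_file"
    mv "$new_file" "$auth_keys"
}
```

However many times this runs, authorized_keys holds at most one tagged entry, which addresses the accumulation problem described above.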
(In reply to M. Scherer from comment #6)
> No, I rather try to figure what was wrong, and why it was working before.
> And if that was never working, why it wasn't detected sooner.
Agreed. Kotresh and Aravinda, please do not fix this from the tests, but let us fix this from the infra end.
Hi M. Scherer/Nigel,
I see that it is creating a separate ssh key pair for geo-rep (/root/.ssh/id_georep). This will not work, as geo-rep won't use ssh with the -i option.
It requires passwordless SSH for root with the default ssh key pair (/root/.ssh/id_rsa)
(without the -i flag, root should be able to log in without a password on all regression machines).
I think this had previously been set up on only a few machines; it worked on those and failed on the ones where it was not set up.
Hi M. Scherer/Nigel,
Did we make any progress with this? It's been a really long time.
So you need to be able to do email@example.com from inside slave21.cloud.gluster.org without a password and have it work?
Misc, is this okay as far as we're concerned? Doing it isn't actually a big deal. Probably a tweak in our scripts.
IIRC, this was fixed; there is a link from id_rsa to the right key.
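A link of that kind could be created along these lines. This is a sketch, not the change that was actually deployed: the function name is hypothetical, and it assumes the key lives in the same directory as the default id_rsa path. It refuses to replace a regular file so that a real default key is never clobbered.

```shell
#!/bin/sh
# Sketch only: make the default key path (id_rsa) point at an existing key,
# so that ssh invoked without -i picks it up.

link_default_key() {
    # $1 = ssh directory (e.g. /root/.ssh), $2 = key name (default id_georep)
    ssh_dir="$1"
    key_name="${2:-id_georep}"

    if [ ! -f "$ssh_dir/$key_name" ]; then
        echo "missing key: $ssh_dir/$key_name" >&2
        return 1
    fi
    # Link the private key and, if present, the matching public key.
    for suffix in "" ".pub"; do
        link_target="$key_name$suffix"
        link_path="$ssh_dir/id_rsa$suffix"
        [ -f "$ssh_dir/$link_target" ] || continue
        # Never overwrite a real key file; only create or refresh a symlink.
        if [ -e "$link_path" ] && [ ! -L "$link_path" ]; then
            echo "refusing to replace regular file: $link_path" >&2
            return 1
        fi
        ln -sf "$link_target" "$link_path" || return 1
    done
}
```

The link target is relative, so it keeps resolving if the directory is copied or mounted elsewhere.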
Do you mean that the test case mentioned in comment 8 or comment 11 works on all the regression machines?
Sorry, the latest run also failed for the same reason. Please fix it soon.