Bug 1260918 - [BACKUP]: If more than one node in the cluster is missing from known_hosts, the glusterfind create command hangs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterfind
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Aravinda VK
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1260119
Blocks: 1284735
Reported: 2015-09-08 08:44 UTC by Aravinda VK
Modified: 2016-06-16 13:35 UTC

Fixed In Version: glusterfs-3.8rc2
Clone Of: 1260119
Clones: 1284735
Environment:
Last Closed: 2016-06-16 13:35:52 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Aravinda VK 2015-09-08 08:44:10 UTC
+++ This bug was initially created as a clone of Bug #1260119 +++

Description of problem:
======================

If more than one node in the cluster has no entry in the known_hosts file of the node that is creating the glusterfind session, the create command hangs forever.

[root@georep1 scripts]# glusterfind create s1 master
The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? yes


[root@georep1 scripts]# cat /root/.ssh/known_hosts


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.1-14.el7rhgs.x86_64

How reproducible:
=================

Always

Steps to Reproduce:
===================
1. Flush known_hosts on the node, or remove the entries for the cluster hosts
2. Create a glusterfind session


Actual results:
===============
glusterfind session creation hangs

Expected results:
================

The session should be created.


Workaround:
===========

SSH to every node in the cluster so that known_hosts is updated.

[root@georep1 scripts]# cat /root/.ssh/known_hosts
[root@georep1 scripts]# for i in {97,93,154}; do ssh root@10.70.46.$i; done
The authenticity of host '10.70.46.97 (10.70.46.97)' can't be established.
ECDSA key fingerprint is 76:e4:6d:07:1e:82:26:1c:0a:95:b2:4c:a3:3f:f1:e2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.97' (ECDSA) to the list of known hosts.
root@10.70.46.97's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root@georep2 ~]# exit
logout
Connection to 10.70.46.97 closed.
The authenticity of host '10.70.46.93 (10.70.46.93)' can't be established.
ECDSA key fingerprint is 0d:bc:e3:70:e0:86:65:5e:3e:d2:ea:9c:fb:a9:53:66.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.93' (ECDSA) to the list of known hosts.
root@10.70.46.93's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root@georep3 ~]# exit
logout
Connection to 10.70.46.93 closed.
The authenticity of host '10.70.46.154 (10.70.46.154)' can't be established.
ECDSA key fingerprint is b4:a8:00:41:ec:f8:12:a9:89:88:cb:7a:20:a8:83:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.70.46.154' (ECDSA) to the list of known hosts.
root@10.70.46.154's password: 
Last login: Fri Sep  4 12:38:50 2015 from 10.70.6.115
[root@georep4 ~]# exit
logout
Connection to 10.70.46.154 closed.
[root@georep1 scripts]# cat /root/.ssh/known_hosts
10.70.46.97 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNDWD/hxpscM20kGEWOTsiIzgmnBd78d2uyQRI7AGIX2JRRr0hIoZPOGCrW/ytRpluPEnJVr7s+vAYglVYLZlOo=
10.70.46.93 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGCE5KvbJtmgXmQXfVVjUVjG0bjkP7fb0v7owFJnzAxy5FKjtTDQSF+qVAHA17MBh9Br7KP+SZQOxSmHyY9Tq8s=
10.70.46.154 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNMULJn47vZ1Azq/SCi4i5VBSrLQAqs6sMZTSamzpwkhedtHrNhKe5QW7W5l+mirLJTIrLuqy8HQYSp5jDYyfrk=
[root@georep1 scripts]# glusterfind create s1 master
Session s1 created with volume master
[root@georep1 scripts]#
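
Before creating a session, the check behind this workaround can be sketched in Python: given the contents of known_hosts and the list of cluster nodes, report which nodes still lack an entry. This is only an illustration. The function name `missing_known_hosts` is hypothetical and not part of glusterfind, and it assumes plain (unhashed) known_hosts entries like the ones shown above; hashed entries would not be matched.

```python
def missing_known_hosts(known_hosts_text, hosts):
    """Return the hosts that have no plain-text entry in known_hosts.

    Hypothetical helper for illustration; assumes unhashed entries.
    """
    known = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Skip an optional marker such as @cert-authority or @revoked.
        if fields[0].startswith("@"):
            fields = fields[1:]
        if not fields:
            continue
        # The host field is a comma-separated list of names/addresses.
        known.update(fields[0].split(","))
    # Preserve the caller's ordering of hosts.
    return [h for h in hosts if h not in known]


if __name__ == "__main__":
    sample = (
        "10.70.46.97 ecdsa-sha2-nistp256 AAAA...\n"
        "10.70.46.93 ecdsa-sha2-nistp256 AAAA...\n"
    )
    nodes = ["10.70.46.97", "10.70.46.93", "10.70.46.154"]
    print(missing_known_hosts(sample, nodes))
```

Any host returned by the function would still trigger the yes/no prompt and needs a manual SSH login (or an equivalent known_hosts update) first.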

--- Additional comment from Rahul Hinduja on 2015-09-07 02:28:54 EDT ---

glusterfind pre also hangs on the host key check for the local host:


[root@georep1 scripts]# glusterfind pre --output-prefix '/mnt/glusterfs/' s1 master /root/log2
10.70.46.97 - pre failed: /rhs/brick2/b6 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.97 - pre failed: /rhs/brick3/b10 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.97 - pre failed: /rhs/brick1/b2 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick1/b3 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick2/b7 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.93 - pre failed: /rhs/brick3/b11 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick2/b8 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick3/b12 Historical Changelogs not available: [Errno 2] No such file or directory

10.70.46.154 - pre failed: /rhs/brick1/b4 Historical Changelogs not available: [Errno 2] No such file or directory

The authenticity of host '10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '10.70.46.96 (10.70.46.96)' can't be established.
ECDSA key fingerprint is 44:23:1a:4b:3c:78:63:a6:66:3d:18:01:8d:dd:17:74.
Are you sure you want to continue connecting (yes/no)? yes

--- Additional comment from Aravinda VK on 2015-09-07 06:52:20 EDT ---

RCA:

While connecting to other nodes programmatically, Geo-rep passes an additional option to ssh (-oStrictHostKeyChecking=no). Glusterfind needs to use the same option.

The other issue is the yes/no prompt for localhost, which comes from the scp command. scp needs the same option as ssh. A further fix is also required: do not run scp at all when the node is the local node.

Workaround:
Add all the peer hosts, including the local node, to known_hosts.
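
The two fixes described in the RCA can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from the actual patch: the helper names `remote_command` and `copy_command`, the `root` user, and the `local_hosts` parameter are all hypothetical. Only the `-oStrictHostKeyChecking=no` option is taken from the RCA.

```python
# Option named in the RCA; shared by ssh and scp.
SSH_OPTS = ["-oStrictHostKeyChecking=no"]


def remote_command(host, command):
    """Build an ssh argument list that does not stop at the host key prompt.

    Hypothetical helper; the "root" user is an assumption.
    """
    return ["ssh"] + SSH_OPTS + ["root@%s" % host] + list(command)


def copy_command(host, remote_path, local_path, local_hosts):
    """Build the command that fetches a file from a node.

    For the local node a plain cp is used, so scp (and its host key
    prompt) is never involved. local_hosts is an assumed collection of
    names/addresses that identify the local node.
    """
    if host in local_hosts:
        return ["cp", remote_path, local_path]
    source = "root@%s:%s" % (host, remote_path)
    return ["scp"] + SSH_OPTS + [source, local_path]


if __name__ == "__main__":
    local = {"10.70.46.96"}
    print(remote_command("10.70.46.97", ["ls", "/tmp"]))
    print(copy_command("10.70.46.97", "/tmp/out", "/tmp/out.97", local))
    print(copy_command("10.70.46.96", "/tmp/out", "/tmp/out.96", local))
```

Note that StrictHostKeyChecking=no makes ssh accept previously unknown host keys without asking, so it removes the hang at the cost of skipping that verification step for new hosts.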

Comment 1 Vijay Bellur 2015-09-08 08:46:19 UTC
REVIEW: http://review.gluster.org/12124 (tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification) posted (#1) for review on master by Aravinda VK (avishwan)

Comment 2 Vijay Bellur 2015-11-19 05:08:11 UTC
REVIEW: http://review.gluster.org/12124 (tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification) posted (#2) for review on master by Aravinda VK (avishwan)

Comment 3 Vijay Bellur 2015-11-21 14:19:28 UTC
REVIEW: http://review.gluster.org/12124 (tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification) posted (#3) for review on master by Aravinda VK (avishwan)

Comment 4 Vijay Bellur 2015-11-23 05:02:46 UTC
REVIEW: http://review.gluster.org/12124 (tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification) posted (#4) for review on master by Aravinda VK (avishwan)

Comment 5 Vijay Bellur 2015-11-23 17:29:27 UTC
COMMIT: http://review.gluster.org/12124 committed in master by Vijay Bellur (vbellur) 
------
commit d47323d0e6f543a8ece04c32b8d77d2785390c3c
Author: Aravinda VK <avishwan>
Date:   Mon Sep 7 14:18:45 2015 +0530

    tools/glusterfind: StrictHostKeyChecking=no for ssh/scp verification
    
    Also do not use scp command in case copy file from local
    node.
    
    Change-Id: Ie78c77eb0252945867173937391b82001f29c3b0
    Signed-off-by: Aravinda VK <avishwan>
    BUG: 1260918
    Reviewed-on: http://review.gluster.org/12124
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 6 Niels de Vos 2016-06-16 13:35:52 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

