Bug 1165269

Summary: [RHSC] Adding a host to Console using FQDN fails, after it is once added using IP address and then removed.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Shruti Sampat <ssampat>
Component: rhsc    Assignee: Shubhendu Tripathi <shtripat>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: medium Docs Contact:
Priority: medium    
Version: rhgs-3.0    CC: asriram, asrivast, divya, dpati, kmayilsa, knarra, nlevinki, rhs-bugs, rhsc-qe-bugs, rnachimu, shtripat
Target Milestone: ---   
Target Release: RHGS 3.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rhsc-3.1.0-57 Doc Type: Bug Fix
Doc Text:
Previously, when a Red Hat Gluster Storage node was added to the Red Hat Gluster Storage Console using its IP address, removed from the Red Hat Gluster Storage trusted storage pool, and then added again to the trusted storage pool using its FQDN, the operation failed. With this fix, the node can be added successfully using its FQDN even if it was earlier added using its IP address and later removed from the trusted storage pool.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-29 05:26:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1153907, 1202842    
Attachments:
Description Flags
engine logs
none
host-deploy-FQDN
none
host-deploy-IP
none
vdsm logs none

Description Shruti Sampat 2014-11-18 17:06:56 UTC
Created attachment 958666 [details]
engine logs

Description of problem:
------------------------

A host is added to a cluster managed by the Console using its IP address, and after it comes up, it is removed from the cluster. The same host is then added back to the Console by importing it into a cluster, this time giving its FQDN as the address. The host fails to come up and remains in the non-responsive state.

The following is seen in the engine logs -
-----------------------------------
<snip>

2014-11-18 15:20:30,809 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-4-thread-3) [4a6edfb8] Host installation failed for host 038f10c5-76b7-4af7-a431-2cc851495066, dhcp37-100.lab.eng.blr.redhat.com.: org.ovirt.engine.core.bll.InstallVdsCommand$VdsInstallException: Network error during communication with the host

</snip>

The following log is seen repeatedly in the engine.log file -
---------------------------------------------------------------

2014-11-18 15:32:07,251 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-57) Command GetCapabilitiesVDSCommand(HostName = dhcp37-100.lab.eng.blr.redhat.com, HostId = 038f10c5-76b7-4af7-a431-2cc851495066, vds=Host[dhcp37-100.lab.eng.blr.redhat.com,038f10c5-76b7-4af7-a431-2cc851495066]) execution failed. Exception: VDSNetworkException: javax.net.ssl.SSLHandshakeException: server certificate change is restricted during renegotiation
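The exception above suggests that the host is presenting a different TLS certificate than the one the engine recorded when the host was first added by IP, and JSSE refuses to continue when the server's certificate changes during renegotiation. The mismatch detection can be illustrated with a short Python sketch (illustrative only, not the engine's actual code; the `cached_pem`/`presented_pem` names and the fingerprint-comparison approach are assumptions):

```python
import hashlib
import ssl


def cert_fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    # PEM_cert_to_DER_cert strips the BEGIN/END markers and base64-decodes
    # the body, giving the raw DER bytes that fingerprints are computed over.
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def host_changed_certificate(cached_pem: str, presented_pem: str) -> bool:
    """True if the certificate the host presents no longer matches the cached one."""
    return cert_fingerprint(cached_pem) != cert_fingerprint(presented_pem)
```

In a live setup, the certificate the host actually presents (vdsm listens on port 54321 by default) could be fetched with `ssl.get_server_certificate((host, 54321))` and compared against the engine's cached copy for that host.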

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
rhsc-3.0.3-1.17.el6rhs.noarch

How reproducible:
------------------
Consistently.

Steps to Reproduce:
--------------------
1. Create a cluster via the Console and add a node to it, using the IP address in the 'address' field of the New Host dialog.
2. After the host comes up, move the host to maintenance and remove it from the cluster.
3. Create a new cluster and import this host into it. Provide the FQDN of the host in the address field this time.

Actual results:
----------------
The host does not come up after installation and ends up in the non-responsive state. The events log shows -

Host <hostname> installation failed. Network error during communication with the host.

Expected results:
------------------
The host is expected to come up.

Additional info:
------------------
Find logs attached.

Comment 1 Shruti Sampat 2014-11-18 17:11:58 UTC
Created attachment 958667 [details]
host-deploy-FQDN

Comment 2 Shruti Sampat 2014-11-18 17:12:40 UTC
Created attachment 958668 [details]
host-deploy-IP

Comment 3 Shruti Sampat 2014-11-18 17:14:36 UTC
Created attachment 958669 [details]
vdsm logs

Comment 4 Shruti Sampat 2014-11-18 17:31:24 UTC
Re-installing the non-responsive host does not bring it up. Removing the host and adding it again using the FQDN also leaves it non-responsive. Removing the host and adding it using the IP address works.

Comment 5 Kanagaraj 2014-11-19 05:47:20 UTC
There might be an issue in generating /etc/pki/vdsm/certs/vdsmcert.pem when the host is re-added using FQDN.

Comment 6 Pavithra 2014-12-01 07:07:00 UTC
Hi Kanagaraj,

Could you please review the edited doc text for the known issue and sign off?

Comment 7 Pavithra 2014-12-01 07:24:37 UTC
Corrected a typo

Comment 8 Kanagaraj 2014-12-01 07:25:33 UTC
looks good.

Comment 9 Shubhendu Tripathi 2015-04-24 10:56:15 UTC
Verified the below scenarios against the ovirt-master and ovirt-engine-3.5-gluster branches, and they work fine -

SCENARIO-1
1. Created a cluster say cluster-1 and added the node-1 (IP: x1.x1.x1.x1) to the cluster
2. Once node-1 came UP, brought it to maintenance mode and removed it from the cluster
3. Created a new cluster say cluster-2 and added the same node-1 to cluster-2, but this time using FQDN (e.g. <name>.<domain>...)
4. Node got added successfully to the cluster-2

SCENARIO-2
1. Created a cluster say cluster-1 and added the node-1 (IP: x1.x1.x1.x1) to the cluster
2. Once node-1 came UP, brought it to maintenance mode and removed it from the cluster
3. Again added the same node-1 to cluster-1, but this time using FQDN (e.g. <name>.<domain>...)
4. Node got added successfully to the cluster-1

Comment 11 RamaKasturi 2015-05-22 07:14:06 UTC
Please provide the fixed in version.

Comment 12 RamaKasturi 2015-05-27 07:19:17 UTC
Verified and works fine with build rhsc-3.1.0-0.57.master.el6.noarch.

Performed following steps to verify the bug:

1. Created a cluster say cluster-1 and added the node-1 (IP: x1.x1.x1.x1) to the cluster
2. Once node-1 is UP, brought it to maintenance mode and removed from the cluster
3. Again added the same node-1 to cluster-1, but this time using FQDN (e.g. <name>.<domain>...)
4. Node got added successfully to the cluster-1

Comment 13 Divya 2015-07-26 10:11:12 UTC
Shubhendu,

Could you review and sign off on the edited doc text?

Comment 14 Shubhendu Tripathi 2015-07-27 04:26:52 UTC
doc-text edited to correct names as "Red Hat Gluster Storage" and "Red Hat Gluster Storage Console".
Looks fine now.

Comment 16 errata-xmlrpc 2015-07-29 05:26:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1494.html