Bug 1278904 - credential RHEV hosts fail
Status: CLOSED ERRATA
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.5.0
Hardware: Unspecified Unspecified
Priority: high Severity: high
Target Milestone: GA
Target Release: 5.5.0
Assigned To: Greg Blomquist
QA Contact: Nandini Chandra
Duplicates: 1221707 1245171 1291858
Depends On:
Blocks: RHCS-CFME 1291858
Reported: 2015-11-06 12:19 EST by Sergio Ocon
Modified: 2016-02-19 18:11 EST
CC List: 16 users

See Also:
Fixed In Version: 5.5.0.12
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-08 08:45:16 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
automation.log (350.16 KB, text/plain)
2015-11-06 12:19 EST, Sergio Ocon
production.log (1.26 MB, text/plain)
2015-11-06 12:21 EST, Sergio Ocon
Example of Appliance Hostname Error in the UI (69.30 KB, image/png)
2015-11-16 22:19 EST, Greg Blomquist


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2015:2551
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: CFME 5.5.0 bug fixes and enhancement update
Last Updated: 2015-12-08 12:58:09 EST

Description Sergio Ocon 2015-11-06 12:19:51 EST
Created attachment 1090742
automation.log

Description of problem:
On a new Beta2 appliance, I added RHEV as a provider. Adding the SSH credentials to its hosts fails.

Version-Release number of selected component (if applicable):
5.5.0.9-beta2.20151102161742_5530c9a 

How reproducible:
Always

Steps to Reproduce:
1. Add a new RHEVM
2. Refresh it and go to hosts
3. Try to add credentials to the hosts

Actual results:
Unexpected response returned from system, see log for details

Expected results:
The host credentials validate successfully.

Additional info:
/var/log/secure on the host shows no connection attempt.
Comment 2 Sergio Ocon 2015-11-06 12:21 EST
Created attachment 1090743
production.log
Comment 3 Dave Johnson 2015-11-14 13:35:23 EST
Reproduced this; it is failing to connect. Forward and reverse DNS work from the command line, and credentialing VMware hosts works as well.

Snippet from evm.log:
[----] E, [2015-11-14T13:21:50.704088 #2922:1185988] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#connect_ssh) SSH connection failed for [<ip_address>] with [SocketError: getaddrinfo: Name or service not known]
[----] W, [2015-11-14T13:21:50.704467 #2922:1185988]  WARN -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#verify_credentials_with_ssh) #<SocketError: getaddrinfo: Name or service not known>
[----] E, [2015-11-14T13:21:50.704568 #2922:1185988] ERROR -- : MIQ(host_controller-update): Unexpected response returned from system, see log for details
Comment 5 Greg Blomquist 2015-11-16 17:38:08 EST
So, this looks like it's happening because the *appliance* has an invalid hostname.  When trying to ssh to the hosts, the code first attempts to validate the appliance's fully qualified domain name:

https://gist.github.com/blomquisg/b88c3ac018fc00f14a34
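The failure mode described in this comment can be illustrated with plain Ruby. This is only a minimal sketch, not the ManageIQ code from the gist: `hostname_resolvable?` and its `resolver:` parameter are hypothetical names introduced here, and the sketch assumes the check amounts to a name lookup on the appliance's own hostname that raises `SocketError` when the name does not resolve.

```ruby
require "socket"

# Hypothetical helper (not ManageIQ's code): returns true when `hostname`
# resolves and false when the lookup raises SocketError.
# `resolver` is injectable so the check can be exercised without real DNS.
def hostname_resolvable?(hostname, resolver: ->(name) { Addrinfo.getaddrinfo(name, nil) })
  resolver.call(hostname)
  true
rescue SocketError
  # This is the "getaddrinfo: Name or service not known" case from the log.
  false
end
```

Calling `hostname_resolvable?(Socket.gethostname)` on an affected appliance would be expected to return false, even though the target RHEV host itself resolves fine.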
Comment 6 tim.moor 2015-11-16 17:58:33 EST
Greg, that does indeed appear to be the issue. Any idea whether this validation was added in 4.0, or whether it existed in earlier versions?

[SOLUTION:] Provide the CloudForms appliance with a valid hostname that matches its resolvable FQDN.
Comment 7 tim.moor 2015-11-16 18:26:49 EST
Can we update the error message to be something a little more meaningful?
Comment 8 Greg Blomquist 2015-11-16 18:54:54 EST
Hi Tim,

Yeah, I've been playing with one of the QE appliances a little to attempt to improve the logging.  Here's what I've got so far:

> [----] I, [2015-11-16T18:48:39.813591 #45741:3d9988]  INFO -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#verify_credentials_with_ssh) Verifying Host SSH credentials for [ibm-x3250m4-05]
> [----] I, [2015-11-16T18:48:39.820598 #45741:3d9988]  INFO -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#connect_ssh) Initiating SSH connection to Host:[ibm-x3250m4-05] using [ibm-x3250m4-05.REDACTED.com] for user:[test].  Options:[{:remember_host=>false}]
> [----] I, [2015-11-16T18:48:39.820742 #45741:3d9988]  INFO -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#connect_ssh) SSH connection established to [ibm-x3250m4-05.REDACTED.com]
> [----] E, [2015-11-16T18:48:39.980280 #45741:3d9988] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#connect_ssh) SSH connection failed for [ibm-x3250m4-05.REDACTED.com] with [SocketError: Unable to get fully qualified domain name for appliance localhost.localdomain.localdomain, error: getaddrinfo: Name or service not known]
> [----] W, [2015-11-16T18:48:39.980563 #45741:3d9988]  WARN -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Host#verify_credentials_with_ssh) #<SocketError: Unable to get fully qualified domain name for appliance localhost.localdomain.localdomain, error: getaddrinfo: Name or service not known>
> [----] E, [2015-11-16T18:48:39.980941 #45741:3d9988] ERROR -- : MIQ(host_controller-update): Unexpected response returned from system, see log for details

That shows the flow from the beginning of the validation process down through the error that ends up getting presented to the user in the UI.

If I can connect the dots, I might even be able to get the updated SocketError message (Unable to get fully qualified domain name for appliance localhost.localdomain.localdomain) percolated up to the UI.  I'll see what I can do about that.
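The reworded error in the log above could be produced by wrapping the lookup and re-raising with the appliance hostname in the message. The following is a rough sketch under that assumption, not the code from the branch: `resolve_appliance_hostname` and the `resolver:` parameter are hypothetical, and only the message format is taken from the log lines quoted in this comment.

```ruby
require "socket"

# Hypothetical helper: resolve the appliance's own hostname and, on failure,
# re-raise a SocketError whose message names the hostname that failed.
def resolve_appliance_hostname(hostname = Socket.gethostname,
                               resolver: Addrinfo.method(:getaddrinfo))
  # Addrinfo.getaddrinfo(nodename, service); service is not needed here.
  resolver.call(hostname, nil)
rescue SocketError => e
  raise SocketError,
        "Unable to get fully qualified domain name for appliance #{hostname}, " \
        "error: #{e.message}"
end
```

Keeping the original `getaddrinfo` text at the end of the message preserves the low-level cause, while the prefix tells the reader that the appliance, not the target host, is the name that failed to resolve.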
Comment 9 Greg Blomquist 2015-11-16 22:19 EST
Created attachment 1095199
Example of Appliance Hostname Error in the UI

I've attached an example of what the Appliance Hostname error would look like in the UI.  So far, this is all with just testing on a QE appliance (that exhibits the same lack of DNS-resolvable hostname).

If this looks right, I can put together a pull request with these changes.
Comment 10 Greg Blomquist 2015-11-16 22:26:25 EST
Here's the branch tracking the changes shown in the screenshot from comment #9:

https://github.com/blomquisg/manageiq/tree/bz1278904-invalid-appliance-hostname
Comment 11 John Matthews 2015-11-17 14:39:34 EST
From our testing we ran into this same BZ.
We saw that /etc/hostname on cfme-rhevm-5.5.0.9-2.x86_64.rhevm.ova had a bad entry:

# cat /etc/hostname
localhost.localdomain.localdomain


as opposed to:
 localhost.localdomain


Once I updated the hostname to "localhost.localdomain" the issue was resolved:

From "rails console":

 MiqSockUtil.getFullyQualifiedDomainName
=> "localhost"

Added credentials for the RHEV Hypervisor and it worked.
Comment 12 John Matthews 2015-11-17 14:52:08 EST
Filed Bug 1282927 to track the change for /etc/hostname so it is "localhost.localdomain" by default and SSH functionality will then work.
Comment 13 Greg Blomquist 2015-11-18 15:41:51 EST
https://github.com/ManageIQ/manageiq/pull/5502
Comment 14 Nick Carboni 2015-11-19 08:47:41 EST
Would this also be solved if the hostname in /etc/hostname was resolvable to 127.0.0.1? (i.e. if we added whatever was in /etc/hostname to the /etc/hosts file)
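Whether the name in /etc/hostname is listed in /etc/hosts can be checked with a few lines of Ruby. This is a minimal sketch for illustration only: `hostname_in_hosts?` is a hypothetical helper, not part of ManageIQ, and it assumes the usual hosts file layout of an address followed by one or more names, with `#` starting a comment.

```ruby
# Hypothetical helper: true when `hostname` appears as a name or alias on any
# line of the given /etc/hosts-style text, ignoring comments.
def hostname_in_hosts?(hostname, hosts_text)
  hosts_text.each_line.any? do |line|
    entry = line.split("#", 2).first.to_s # drop everything after a '#'
    _address, *names = entry.split        # first field is the IP address
    names.include?(hostname)
  end
end
```

On an appliance this could be called as `hostname_in_hosts?(File.read("/etc/hostname").strip, File.read("/etc/hosts"))`. A false result would match the situation reported here, where the hostname was not resolvable locally.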
Comment 16 CFME Bot 2015-11-19 14:42:59 EST
New commit detected on ManageIQ/manageiq-appliance/master:
https://github.com/ManageIQ/manageiq-appliance/commit/1926c54093577c1c0542eea14dd80b086b9438ce

commit 1926c54093577c1c0542eea14dd80b086b9438ce
Author:     Nick Carboni <ncarboni@redhat.com>
AuthorDate: Tue Nov 17 16:02:44 2015 -0500
Commit:     Nick Carboni <ncarboni@redhat.com>
CommitDate: Thu Nov 19 13:50:01 2015 -0500

    Remove cloud-init's ability to change the appliance hostname
    
    The altered hostname was not included in /etc/hosts
    causing us to not be able to resolve it when attempting
    to run `MiqSockUtil.getFullyQualifiedDomainName`
    
    The decision was made to disallow cloud-init from
    changing the hostname at all as to not conflict with our
    existing methods of changing the hostname using the
    appliance_console.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1282927
    https://bugzilla.redhat.com/show_bug.cgi?id=1278904

 COPY/etc/cloud/cloud.cfg.d/miq_cloud.cfg | 2 ++
 1 file changed, 2 insertions(+)
Comment 17 Nick Carboni 2015-11-19 14:44:34 EST
The decision was made to change the cloud-init config to not touch the hostname files. This should leave the hostname as localhost.localdomain on new appliances.
Comment 18 Greg Blomquist 2015-11-19 18:21:55 EST
*** Bug 1221707 has been marked as a duplicate of this bug. ***
Comment 19 CFME Bot 2015-11-23 12:26:45 EST
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/5e18768df48ad7dc8985a861ba0e16197b50202c

commit 5e18768df48ad7dc8985a861ba0e16197b50202c
Author:     Greg Blomquist <gblomqui@redhat.com>
AuthorDate: Mon Nov 16 22:18:39 2015 -0500
Commit:     Greg Blomquist <gblomqui@redhat.com>
CommitDate: Thu Nov 19 17:59:15 2015 -0500

    No longer validate source hostname with MiqSshUtil
    
    Way back in e6dcb57e41a5b9a2326dbadecd995808f9e043d3, code was added to make
    sure that the appliance had a valid hostname before attempting to SSH from the
    appliance to another box.  The reasoning at the time was that the SSH gem would
    ignore a failure caused by having an invalid appliance hostname, but then later
    blow up because of the invalid appliance hostname.
    
    It does not appears that SSH even cares about the appliance's hostname anymore
    when establishing an SSH connection to another server (in fact, it was
    surprising that it even would care).
    
    With this check removed, validating a host's SSH credentials will no longer
    throw a misleading "getaddrinfo" error when the appliance has a bad hostname.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1278904

 gems/pending/util/MiqSshUtilV2.rb | 4 ----
 1 file changed, 4 deletions(-)
Comment 20 CFME Bot 2015-11-23 16:12:11 EST
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=1fe4c792c748030a405c0ab02d98e491f3056152

commit 1fe4c792c748030a405c0ab02d98e491f3056152
Author:     Greg Blomquist <gblomqui@redhat.com>
AuthorDate: Mon Nov 16 22:18:39 2015 -0500
Commit:     Greg Blomquist <gblomqui@redhat.com>
CommitDate: Mon Nov 23 12:24:34 2015 -0500

    No longer validate source hostname with MiqSshUtil
    
    Way back in e6dcb57e41a5b9a2326dbadecd995808f9e043d3, code was added to make
    sure that the appliance had a valid hostname before attempting to SSH from the
    appliance to another box.  The reasoning at the time was that the SSH gem would
    ignore a failure caused by having an invalid appliance hostname, but then later
    blow up because of the invalid appliance hostname.
    
    It does not appears that SSH even cares about the appliance's hostname anymore
    when establishing an SSH connection to another server (in fact, it was
    surprising that it even would care).
    
    With this check removed, validating a host's SSH credentials will no longer
    throw a misleading "getaddrinfo" error when the appliance has a bad hostname.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1278904

 gems/pending/util/MiqSshUtilV2.rb | 4 ----
 1 file changed, 4 deletions(-)
Comment 21 CFME Bot 2015-11-23 16:12:30 EST
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=3fb9303616a99ac2217f035ee22c5d0bbfa1c86f

commit 3fb9303616a99ac2217f035ee22c5d0bbfa1c86f
Merge: e3a6ea0 1fe4c79
Author:     Jason Frey <jfrey@redhat.com>
AuthorDate: Mon Nov 23 15:49:06 2015 -0500
Commit:     Jason Frey <jfrey@redhat.com>
CommitDate: Mon Nov 23 15:49:06 2015 -0500

    Merge branch 'bz1283195-5.5.z-backport' into '5.5.z'
    
    No longer validate source hostname with MiqSshUtil
    
    Clean cherry pick from upstream PR: https://github.com/ManageIQ/manageiq/pull/5502
    
    Way back in e6dcb57e41a5b9a2326dbadecd995808f9e043d3, code was added to make
    sure that the appliance had a valid hostname before attempting to SSH from the
    appliance to another box.  The reasoning at the time was that the SSH gem would
    ignore a failure caused by having an invalid appliance hostname, but then later
    blow up because of the invalid appliance hostname.
    
    It does not appears that SSH even cares about the appliance's hostname anymore
    when establishing an SSH connection to another server (in fact, it was
    surprising that it even would care).
    
    With this check removed, validating a host's SSH credentials will no longer
    throw a misleading "getaddrinfo" error when the appliance has a bad hostname.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1278904
    
    See merge request !523

 gems/pending/util/MiqSshUtilV2.rb | 4 ----
 1 file changed, 4 deletions(-)
Comment 22 Nandini Chandra 2015-11-24 23:19:05 EST
Verified in 5.5.0.12
Comment 24 errata-xmlrpc 2015-12-08 08:45:16 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2551
Comment 25 Greg Blomquist 2016-02-19 18:11:09 EST
*** Bug 1291858 has been marked as a duplicate of this bug. ***
Comment 26 Greg Blomquist 2016-02-19 18:11:12 EST
*** Bug 1245171 has been marked as a duplicate of this bug. ***
