Bug 1245171

Summary: [RFE] Provide improved message when CFME Appliance is not resolvable
Product: Red Hat CloudForms Management Engine Reporter: Thom Carlin <tcarlin>
Component: Providers Assignee: Greg Blomquist <gblomqui>
Status: CLOSED DUPLICATE QA Contact: Dave Johnson <dajohnso>
Severity: medium Docs Contact:
Priority: low    
Version: 5.4.0 CC: jfrey, jhardy, jprause, obarenbo, tcarlin
Target Milestone: GA Keywords: FutureFeature
Target Release: 5.6.0 Flags: tcarlin: automate_bug?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra:provider:host
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 1291858 Environment:
Last Closed: 2016-02-19 23:11:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1291858    

Description Thom Carlin 2015-07-21 11:26:58 UTC
Description of problem:

If the CFME appliance's FQDN is not resolvable, the symptoms may lead the user to the wrong conclusion about the cause.

Version-Release number of selected component (if applicable):

5.4.1.0.20150717083323_6ed7e1c

How reproducible:

100%

Steps to Reproduce:
1. Have a CFME appliance with non-resolvable fqdn
2. Define Infrastructure provider
3. Refresh Relationships and Power States
4. Enter Host Credentials
5. Validate Credentials

Actual results:

User sees flash message on web UI: "Unexpected response returned from system, see log for details"
Looking at evm.log, the user sees:
./evm.log-[----] I, [2015-07-20T15:01:52.156213 #2450:777ea0]  INFO -- : host.connect_ssh: Initiating SSH connection to Host:[ibm-x3250m4-05] using [host_fqdn] for user:[root].  Options:[{:remember_host=>false}]
./evm.log-[----] I, [2015-07-20T15:01:52.156392 #2450:777ea0]  INFO -- : host.connect_ssh: SSH connection established to [host_fqdn]
./evm.log:[----] E, [2015-07-20T15:01:52.176895 #2450:777ea0] ERROR -- : host.connect_ssh: SSH connection failed for [host_fqdn] with [SocketError: getaddrinfo: Name or service not known]
./evm.log:[----] W, [2015-07-20T15:01:52.177253 #2450:777ea0]  WARN -- : MIQ(Host-verify_credentials_with_ssh): #<SocketError: getaddrinfo: Name or service not known>
./evm.log-[----] E, [2015-07-20T15:01:52.177427 #2450:777ea0] ERROR -- : MIQ(host_controller-update): Unexpected response returned from system, see log for details

Expected results:

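A clearer error message that identifies which name failed to resolve (here, the CFME appliance FQDN rather than the host FQDN).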

Additional info:

The sequence of events could lead the user to think the issue is with the host FQDN when it is actually with the CFME appliance FQDN.

Suggestion: wrap the call to MiqSockUtil.getFullyQualifiedDomainName in /var/www/miq/lib/util/MiqSshUtilV2.rb (line 269) in a begin/rescue block and emit a more user-friendly error message indicating the problem (see lines 267-268).
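
A minimal sketch of the begin/rescue idea, not the actual MiqSshUtilV2.rb code. The names appliance_fqdn and ApplianceNameResolutionError are hypothetical, and the default resolver uses only the Ruby standard library as a stand-in for MiqSockUtil.getFullyQualifiedDomainName, whose implementation is not shown in this report:

```ruby
require "socket"

# Hypothetical error class for illustration; not part of the CFME code base.
class ApplianceNameResolutionError < StandardError; end

# Stand-in for MiqSockUtil.getFullyQualifiedDomainName (assumption): look up
# the local hostname and return the name reported by the resolver.
DEFAULT_FQDN_RESOLVER = lambda do
  Addrinfo.getaddrinfo(Socket.gethostname, nil).first.getnameinfo.first
end

# Resolves the appliance's own FQDN. A SocketError from the lookup is
# re-raised with a message that says the appliance name, not the target
# host name, is what failed to resolve.
def appliance_fqdn(resolver: DEFAULT_FQDN_RESOLVER)
  resolver.call
rescue SocketError => err
  raise ApplianceNameResolutionError,
        "Unable to resolve the CFME appliance's own hostname (#{err.message}). " \
        "Check DNS or /etc/hosts on the appliance."
end
```

Passing the resolver as a parameter is only a convenience so the failure path can be exercised without breaking name resolution on a real machine.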

Comment 2 Greg Blomquist 2015-07-30 13:46:12 UTC
Thom,

can you suggest a better error message here?

I just happen to know that "getaddrinfo" means that there's a DNS or name resolution problem somewhere.  I can see how it would be nicer to know whose name can't be resolved.  But I'm not sure what to convey to the user to make this easier to diagnose.

Thanks!

Comment 3 Thom Carlin 2015-07-30 22:26:32 UTC
"SSH connection failed for [host_fqdn] with [SocketError: getaddrinfo: Name or service not known for <host_or_ip_that_failed>]"
or
"DNS Error: <host_or_ip_that_failed> doesn't resolve properly"
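
Either message format could be produced by carrying the failing name along with the exception. The following is only a sketch under assumptions: resolve_with_context is a hypothetical helper, and the lookup parameter exists so the behavior can be shown without a real DNS failure. It produces the first format; the second ("DNS Error: ...") would only change the string.

```ruby
require "socket"

# Hypothetical helper: resolves `name` and, on failure, re-raises the
# SocketError with the failing name appended, so the log shows which
# lookup failed.
def resolve_with_context(name, lookup: ->(n) { Addrinfo.getaddrinfo(n, nil) })
  lookup.call(name)
rescue SocketError => err
  raise SocketError, "#{err.message} for #{name}"
end
```

A caller that rescues SocketError, as host.connect_ssh appears to do in the log above, would then log the name that actually failed instead of only the target host.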

Comment 5 Greg Blomquist 2016-02-19 23:11:12 UTC
https://github.com/ManageIQ/manageiq/pull/5502

*** This bug has been marked as a duplicate of bug 1278904 ***