Created attachment 1025484: detailed error log

Description of problem:
Host fleecing fails for nodes discovered in the Openstack Infrastructure provider

Version-Release number of selected component (if applicable):
5.4.0.1.20150512111354_4368716

How reproducible:
Fresh provisioned appliance with Openstack Infra provider attached.

Steps to Reproduce:
1. Add a new Openstack Infra provider.
2. Wait for the nodes to be discovered.
3. Trigger SmartState analysis for the discovered nodes.

Actual results:
SmartState analysis fails.

Expected results:
SmartState analysis runs and CFME gets nodes info via SSH.

Additional info:
Logs show the following:

ERROR -- : host.connect_ssh: SSH connection failed for [192.0.2.11] with [SocketError: getaddrinfo: Name or service not known]
ERROR -- : [SocketError]: getaddrinfo: Name or service not known  Method:[rescue in block in scan_from_queue]
ERROR -- : /var/www/miq/lib/util/MiqSockUtil.rb:11:in `gethostbyname'

CFME instance can access hosts by IP but it appears that it's trying to do some name resolution. The nodes show the IP address in both Hostname and IP Address fields.
full log http://paste.openstack.org/show/223658/
The problem was caused by the CFME appliance not having a valid fqdn. It got solved after running:

[root@localhost ~]# hostname -f
hostname: Unknown host
[root@localhost ~]# hostname localhost.localdomain
[root@localhost ~]# hostname -f
localhost
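For anyone who wants to run the same check from Ruby rather than the shell, here is a minimal sketch. It assumes only the Ruby standard library; `hostname_status` and its `resolver:` parameter are hypothetical names made up for illustration and are not part of ManageIQ.

```ruby
require "socket"

# Report whether a hostname resolves on this machine. This approximates the
# lookup that `hostname -f` depends on; it is not the ManageIQ code path.
# `resolver` is a hypothetical injection point so the check can be exercised
# without real DNS.
def hostname_status(name = Socket.gethostname,
                    resolver: ->(n) { Addrinfo.getaddrinfo(n, nil) })
  resolver.call(name)
  { hostname: name, resolvable: true, error: nil }
rescue SocketError => err
  # getaddrinfo failures surface as SocketError (or a subclass of it).
  { hostname: name, resolvable: false, error: err.message }
end

if __FILE__ == $PROGRAM_NAME
  # Checks the appliance's own hostname, as reported by the OS.
  puts hostname_status.inspect
end
```

If `resolvable` comes back false for the appliance's own hostname, the appliance is in the state described in this comment.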
Marius, sounds like we should close this one then as not a bug?
Yes, it's not a bug but the log error message is misleading because it points to remote hosts whilst it's actually generated by the local host. Not sure if we should close it or mark as low priority.
I am setting the priority to low. Marius, where should the fix go: into the CFME installation docs, or into some CFME installer? Or should we not require an fqdn at all?
I think that SSH should work without having a valid fqdn set; I don't know why it's not possible here (see [1]). Would it be possible to log an error when this is hit that clearly states that no valid fqdn is set on the CFME machine? [1] https://github.com/ManageIQ/manageiq/blob/2a6ac9973eab0ad759c8382a013626fd775b8f06/lib/util/MiqSshUtilV2.rb#L269
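To illustrate the kind of error message being asked for, here is a rough sketch. It is an assumption about how such a check could look, not the code in MiqSshUtilV2.rb or MiqSockUtil.rb; `LocalHostnameError`, `local_fqdn`, and `DEFAULT_RESOLVER` are hypothetical names, and only the Ruby standard library is used.

```ruby
require "socket"

# Hypothetical error class, not part of ManageIQ.
class LocalHostnameError < StandardError; end

# Default lookup: ask the resolver for the canonical name of the given host.
DEFAULT_RESOLVER = lambda do |name|
  Addrinfo.getaddrinfo(name, nil, nil, :STREAM, nil, Socket::AI_CANONNAME).first.canonname
end

# Resolve the appliance's own hostname. On failure, raise an error that names
# the local machine, so the log does not appear to blame the remote SSH target.
# `resolver` is injectable (an assumption for illustration) to allow testing
# without DNS.
def local_fqdn(hostname = Socket.gethostname, resolver: DEFAULT_RESOLVER)
  resolver.call(hostname)
rescue SocketError => err
  raise LocalHostnameError,
        "cannot resolve this appliance's own hostname [#{hostname}]: #{err.message}. " \
        "Set a resolvable hostname or FQDN on the appliance."
end
```

With something like this, the log would point at the appliance's hostname instead of the remote host's IP address when local resolution fails.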
Right. @Greg, can you figure out if the fqdn check is still needed? The comment about the unclear error is unclear. :-) Possibly just some old bug in Net::SSH?
I'm moving this over to the appliance component. Basically, we need to make sure that the appliance has *some* hostname set. If it's already guaranteed, then this bug could just be closed, I guess.
@Marius, could you check whether setting an FQDN will be documented as a required setting in the installer? That should be enough to close this one. Otherwise we would need to investigate the errors with a missing fqdn, as mentioned in comment #7.
Marius / Ladislav, From a QE perspective, can you please let us know what needs to be tested to close this issue? As far as the error is concerned, I am still able to reproduce it. I am not sure if there are any updates on the documentation. If you can share the updates, we can go ahead with further action. Thanks, Ramesh
I've just tried deploying an appliance on an Openstack environment and I got this issue. I believe the same happens when deploying it on other infrastructure types (RHEV or VMware).

[root@host-192-168-0-101 ~]# hostname -f
hostname: Unknown host
[root@host-192-168-0-101 ~]# hostname
host-192-168-0-101

This bug remains valid in my opinion and should not be ON_QA, as no patches were done. The error message is misleading because it refers to not being able to resolve the local machine's hostname in the context of sshing to a remote host. If there is a need to do name resolution for the localhost while sshing to a remote node, then the error message should explicitly mention why it is failing. I'm going to file an additional docs BZ for this.
Filed docs bug BZ#1276521.
Hi Ladislav, As per the comment#12 from Marius, I am moving this back to ON_DEV. Thanks, Ramesh
https://github.com/ManageIQ/manageiq/pull/5403
*** This bug has been marked as a duplicate of bug 1278904 ***
@Greg, so can I drop the fix https://github.com/ManageIQ/manageiq/pull/5403, since the hostname will always be set?
@Ladas, yes, you can drop that fix (looks like it's closed already). https://github.com/ManageIQ/manageiq/pull/5502 fixed this by just not checking for a fully qualified domain name of the appliance before SSHing out. The check is no longer necessary with current versions of SSH.