Bug 1120426

Summary: rubygem-staypuft: HA deployment fails because rabbitmq-server failed to start on one of the hosts, because DNS resolution failed, because the hostname was all numeric
Product: Red Hat OpenStack Reporter: Omri Hochman <ohochman>
Component: rubygem-staypuft Assignee: Mike Burns <mburns>
Status: CLOSED ERRATA QA Contact: Leonid Natapov <lnatapov>
Severity: urgent Docs Contact:
Priority: high    
Version: 5.0 (RHEL 6) CC: ajeain, dnavale, hbrock, mburns
Target Milestone: ga Keywords: TestOnly
Target Release: Installer   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
The Foreman discovery plugin generated a hostname from the host's MAC address. If the MAC address was all numeric, the resulting hostname was also all numeric. All-numeric hostnames violate Internet host naming rules and caused the glibc resolver to fail. As a workaround, change the hostnames of the machines you discover to include at least one non-numeric character, so that DNS resolution succeeds.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-08-21 18:05:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Omri Hochman 2014-07-16 22:16:09 UTC
rubygem-staypuft: HA deployment fails because rabbitmq-server failed to start on one of the hosts (Mnesia on rabbit@525400868093 could not connect to node).
 

Steps:
-------
(1) Install Staypuft.
(2) Attempt to create a nova-network HA deployment.

Results:
---------
- On one of the hosts, the rabbitmq-server service failed to start.
- The deployment failed.

It seems that there was a DNS issue with the short names:
 ping to 525400868092
 ping: unknown host 525400868092
but...
 ping -c1 525400868092.lab.eng.rdu2.redhat.com
 64 bytes from 525400868092.lab.eng.rdu2.redhat.com (192.168.0.6): icmp_seq=1   ttl=64 time=0.253 ms
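
The short-name versus fully qualified behaviour shown above can also be checked from a script. The following is only a minimal sketch, assuming Python 3 is available on the affected host; the helper names `first_label_is_numeric` and `try_resolve` are hypothetical and are not part of Staypuft or Foreman. `socket.gethostbyname` goes through the system resolver, so its result may differ between environments.

```python
import socket


def first_label_is_numeric(hostname):
    """Return True if the first DNS label consists only of ASCII digits."""
    label = hostname.split(".", 1)[0]
    return label.isascii() and label.isdigit()


def try_resolve(hostname):
    """Return the IPv4 address the system resolver gives, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


if __name__ == "__main__":
    # Hostnames taken from the ping output in this report.
    for name in ("525400868092", "525400868092.lab.eng.rdu2.redhat.com"):
        print(name, first_label_is_numeric(name), try_resolve(name))
```

A host whose short name is flagged by `first_label_is_numeric` is a candidate for the renaming workaround, whether or not the fully qualified name resolves.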
 


Environment:
-------------
ruby193-rubygem-staypuft-0.1.17-1.el6ost.noarch
puppet-3.6.2-1.1.el6.noarch
puppet-server-3.6.2-1.1.el6.noarch
openstack-puppet-modules-2014.1-19.1.el6ost.noarch
openstack-foreman-installer-2.0.15-1.el6ost.noarch
foreman-1.6.0.21-1.el6sat.noarch
rhel-osp-installer-0.1.1-1.el6ost.noarch

journalctl -u puppet 
---------------------
Jul 16 20:33:35 525400868093.lab.eng.rdu2.redhat.com puppet-agent[3036]: Could not start Service[rabbitmq-server]: Execution of '/usr/bin/systemctl start rabbitmq-server' returned 1: Job for rabbitmq-server.serv
Jul 16 20:33:35 525400868093.lab.eng.rdu2.redhat.com puppet-agent[3036]: Wrapped exception:
Jul 16 20:33:35 525400868093.lab.eng.rdu2.redhat.com puppet-agent[3036]: Execution of '/usr/bin/systemctl start rabbitmq-server' returned 1: Job for rabbitmq-server.service failed. See 'systemctl status rabbitmq
Jul 16 20:33:35 525400868093.lab.eng.rdu2.redhat.com puppet-agent[3036]: (/Stage[main]/Rabbitmq::Service/Service[rabbitmq-server]/ensure) change from stopped to running failed: Could not start Service[rabbitmq-s


/var/log/rabbitmq/rabbit\@525400868093.log
-------------------------------------------
=INFO REPORT==== 16-Jul-2014::20:30:36 ===
Limiting to approx 924 file handles (829 sockets)

=ERROR REPORT==== 16-Jul-2014::20:30:36 ===
Mnesia(rabbit@525400868093): ** ERROR ** Mnesia on rabbit@525400868093 could not connect to node(s) [rabbit@525400868092]
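
To see which cluster peers are affected, the node names in a Mnesia error line like the one above can be scanned for all-numeric host parts. This is a rough sketch only: the regular expression and the function name `numeric_node_hosts` are assumptions made for illustration, and the pattern only covers the `rabbit@<host>` form that appears in this log.

```python
import re
from typing import List

# Matches the host part of Erlang node names such as rabbit@525400868093.
NODE_RE = re.compile(r"rabbit@([A-Za-z0-9.-]+)")


def numeric_node_hosts(log_line: str) -> List[str]:
    """Return the distinct all-numeric host parts found in a log line."""
    hosts = []
    for host in NODE_RE.findall(log_line):
        if host.isdigit() and host not in hosts:
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    line = ("Mnesia(rabbit@525400868093): ** ERROR ** Mnesia on "
            "rabbit@525400868093 could not connect to node(s) "
            "[rabbit@525400868092]")
    print(numeric_node_hosts(line))
```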

Comment 2 Hugh Brock 2014-07-17 14:36:58 UTC
Marked as a known issue. The fix will be to update the discovery plugin so that it never assigns an all-numeric hostname. Set known issue, removed blocker and exception, added GA3 flag.

Comment 4 Mike Burns 2014-08-05 17:48:54 UTC
Hostnames should now be prefixed with "mac".
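
The naming scheme described in this comment can be illustrated as follows. This is a hedged sketch, not the discovery plugin's actual code (the plugin is written in Ruby); the function name `hostname_from_mac` is hypothetical, and the MAC address 52:54:00:86:80:93 is inferred from the hostname 525400868093 seen in the logs rather than stated in the report.

```python
def hostname_from_mac(mac, prefix="mac"):
    """Build a hostname from a MAC address, with a non-numeric prefix.

    Separators (':' or '-') are dropped and hex digits are lowercased.
    With prefix="" this mimics the old behaviour, which can yield an
    all-numeric name when the MAC contains no a-f digits.
    """
    digits = mac.replace(":", "").replace("-", "").lower()
    return prefix + digits


if __name__ == "__main__":
    mac = "52:54:00:86:80:93"  # assumed MAC, inferred from the hostname
    print(hostname_from_mac(mac, prefix=""))  # old style, all numeric
    print(hostname_from_mac(mac))             # prefixed, not all numeric
```

Because the prefix always contributes alphabetic characters, the resulting name can never be all numeric, regardless of the MAC address.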

Comment 6 Omri Hochman 2014-08-10 18:06:02 UTC
Verified: ruby193-rubygem-staypuft-0.2.2-1.el6ost.noarch

Unable to reproduce when using the workaround:
Change the hostnames of your discovered machines to include at least one non-numeric character.

Result: DNS resolution succeeds.

Comment 7 errata-xmlrpc 2014-08-21 18:05:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1090.html