Bug 1227017

Summary: [foreman-discovery-image][RHOS 6]: issues on rhel-osp-installer - Could not send facts to Foreman: getaddrinfo: Name or service not known
Product: Red Hat OpenStack
Reporter: Aaron Thomas <aathomas>
Component: foreman-discovery-image
Assignee: Mike Burns <mburns>
Status: CLOSED EOL
QA Contact: Omri Hochman <ohochman>
Severity: high
Priority: high
Version: 6.0 (Juno)
CC: brad.beam, mburns, mjs, rhos-maint, sclewis, srevivo
Keywords: ZStream
Target Release: Installer
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2016-09-29 13:38:32 UTC
Type: Bug

Description Aaron Thomas 2015-06-01 17:21:29 UTC
Description of problem:
-----------------------------------------
The customer reporting this problem cannot auto-discover new hosts without logging on to the bootstrapped Foreman discovery image and restarting the network with the script below. The host pulls the image from rhel-osp-installer's TFTP server and boots the discovery image, but then fails with the following DNS error: "Could not send facts to Foreman: getaddrinfo: Name or service not known".

# cat autostart.d/01_fix_networking.sh
-----------------------------------------
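export PATH="$PATH:/usr/sbin:/sbin"
# Retry up to 11 times until enp1s0f0 holds an address in 10.0.0.x
for x in {0..10}; do
	echo;echo;echo
	ip a
	# Quote the pattern so the shell cannot glob-expand it; escape the dots
	ip_line=$(ip a show enp1s0f0 | grep 'inet.*10\.0\.0\.')
	[[ $ip_line ]] && break
	echo "Attempt $x at restarting networking"
	sudo systemctl restart network
done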
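
Because the reported error is a getaddrinfo failure, one quick check from the discovery image's shell is whether the Foreman host name resolves at all. The following is a minimal sketch, not part of the discovery image: it assumes glibc's getent is available on the image, and host_from_url and resolves are hypothetical helper names. The URL in the usage comment is a placeholder, and IPv6 literal URLs are not handled.

```shell
# Hypothetical helper: print the host part of a URL using parameter expansion.
host_from_url() {
	local rest
	rest=${1#*://}       # drop the scheme, e.g. "https://"
	rest=${rest%%/*}     # drop any path
	rest=${rest##*@}     # drop userinfo, if present
	printf '%s\n' "${rest%%:*}"   # drop the port
}

# Hypothetical helper: return 0 if the system resolver finds the name.
# "getent ahosts" goes through getaddrinfo on glibc systems.
resolves() {
	getent ahosts "$1" >/dev/null
}

# Usage (placeholder URL):
#   host=$(host_from_url "https://foreman.example.com")
#   resolves "$host" || echo "cannot resolve $host" >&2
```

If the name only resolves after the network restart, that would point at the resolver configuration being applied late rather than at Foreman itself; this bug report does not establish which.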


Additional information from the tcpdump collected on the rhel-osp-installer. These are observations of traffic between the rhel-osp-installer and two discovery hosts, 10.0.0.7 and 10.0.0.14. We asked the customer to capture a tcpdump on the installer's provisioning network interface while the discovery host attempts to register and fails, and then to restart the network on the discovery host with '01_fix_networking.sh', which allows the discovery host to register to Foreman. This tcpdump is named 'rhel-osp-installet.pcap' and is attached to case 01441568.
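
A capture like the one described could be taken along these lines. This is a sketch under assumptions: tcpdump is installed on the installer and run as root, the interface name and output file in the usage comment are placeholders, and mac_filter and capture_discovery are hypothetical helper names. Filtering on the discovery host's MAC address rather than its IP keeps frames sent before the host has an address, as well as TFTP data that moves to ephemeral ports.

```shell
# Print a BPF filter matching all frames to or from one MAC address.
# Returns non-zero if the argument does not look like a MAC address.
mac_filter() {
	printf '%s\n' "$1" | grep -Eqi '^([0-9a-f]{2}:){5}[0-9a-f]{2}$' || return 1
	printf 'ether host %s\n' "$1"
}

# Capture on interface $1 for MAC $2 into file $3 (needs root; Ctrl-C stops it).
capture_discovery() {
	local filter
	filter=$(mac_filter "$2") || { echo "bad MAC: $2" >&2; return 1; }
	# -n: no name lookups, -s 0: full packets, -w: write raw pcap
	tcpdump -i "$1" -n -s 0 -w "$3" "$filter"
}

# Usage (placeholder interface and file name):
#   capture_discovery eth1 fc:aa:14:7e:43:b9 /tmp/discovery.pcap
```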

Tcpdump Observations: 

Wireshark Time 32 40.695776: rhel-osp-installer: (ip: 10.0.0.5 && mac: fc:aa:14:7e:48:75)  discovery host: (ip: 10.0.0.7 && mac: fc:aa:14:7e:43:b9)  
-----------------------------------------
The initial DHCP request from the discovery host 10.0.0.7 to the rhel-osp-installer 10.0.0.5 completes successfully. We then see ARP requests from the discovery host to the rhel-osp-installer over broadcast/multicast. The sender MAC address is reported correctly as fc:aa:14:7e:43:b9, but the discovery host reports a sender IP of 0.0.0.0, a target MAC address of ff:ff:ff:ff:ff:ff and a target IP address of 10.0.0.7. This occurs twice, and then we see a successful unicast ARP request from the discovery host which reports a target MAC address for the rhel-osp-installer of 00:00:00:00:00:00.
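
ARP requests with a sender IP of 0.0.0.0 and the host's own address as the target are what address-conflict probes (RFC 5227) look like; whether that explains the failure here is not established. To pull such requests out of a capture, a small filter over tcpdump's text output may help. This is a sketch that assumes tcpdump's usual one-line "who-has X tell Y" ARP summary; arp_zero_sender is a hypothetical helper name and the file name in the usage comment is a placeholder.

```shell
# Read tcpdump text output on stdin and keep only ARP requests whose
# sender IP is 0.0.0.0.
arp_zero_sender() {
	grep -E 'ARP, Request who-has .* tell 0\.0\.0\.0(,|$)'
}

# Usage against a saved capture (placeholder file name):
#   tcpdump -n -e -r capture.pcap arp | arp_zero_sender
```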


Wireshark Time 308 173.583369: following the UDP stream for the conversation 10.0.0.5:55400 --> 10.0.0.14:49169 shows messages about direct floppy boot not being supported, limited functionality with an ancient bootloader, and the processor being unsupported in RHEL 7.
-----------------------------------------
Direct floppy boot is not supported. Use a boot loader program instead.

[...]

This kernel requires an %s CPU, .but only detected an %s CPU.
.This kernel requires the following features not present on the CPU:
.%s .%d:%d ...@...a...... ........................................................................earlyprintk.serial.0x.ttyS.console.uart8250,io,.uart,io,............edd.skipmbr.skip.off.on.quiet.Probing EDD (edd=off to disable)... .ok
.early console in setup code
.debug.WARNING: Ancient bootloader, some functionality may be limited!
.This processor is unsupported in RHEL7.
.A20 gate not responding, unable to boot...
.........................................................g.......0123456789ABCDEFPress <ENTER> to see video modes available, <SPACE> to continue, or wait.... 30 sec
.Mode: Resolution:  Type: .%dx%d.%c %03X %4dx%-7s %-6s.Enter a video mode or "scan" to scan for additional modes: .. ..Undefined video mode number: %x
.3.10.0-123.20.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) #1 SMP Wed Jan 21 09:45:55 EST 2015.VGA.CGA/MDA/HGC.EGA.<+...*..D+...+..L+..T+..x+..VESA.BIOS.........


Version-Release number of selected component (if applicable):
-----------------------------------------
foreman-discovery-image-7.0-20150227.0.el7ost.noarch 

How reproducible:
-----------------------------------------
100% in the customer's environment


Steps to Reproduce:
-----------------------------------------
1. PXE boot a new host for auto-discovery in the rhel-osp-installer environment.
2. Once the host boots the foreman-discovery image, it eventually produces the error outlined in the description and does not register to rhel-osp-installer as a discovery host.

Actual results:
-----------------------------------------
Host does not register to rhel-osp-installer as a discovery host.

Expected results:
-----------------------------------------
Host registers as a discovery host.


Additional info:
-----------------------------------------
The case this bug is attached to has the most current debugs and tcpdumps of the discovery process on foreman's provisioning interface.

Comment 64 Jaromir Coufal 2016-09-29 13:38:32 UTC
Closing list of bugs for RHEL OSP Installer since its support cycle has already ended [0]. If there is some bug closed by mistake, feel free to re-open.

For new deployments, please use RHOSP director (starting with version 7).

-- Jaromir Coufal
-- Sr. Product Manager
-- Red Hat OpenStack Platform

[0] https://access.redhat.com/support/policy/updates/openstack/platform