Bug 1391758

Summary: OSP Director - overcloud deployment should fail if there is a domain name mismatch between Heat and Nova.
Product: Red Hat OpenStack
Reporter: Matt Flusche <mflusche>
Component: openstack-tripleo-heat-templates
Assignee: Jason E. Rist <jrist>
Status: CLOSED CURRENTRELEASE
QA Contact: Ola Pavlenko <opavlenk>
Severity: high
Docs Contact:
Priority: high
Version: 9.0 (Mitaka)
CC: aschultz, dbecker, jrist, jschluet, mburns, morazi, opavlenk, rhel-osp-director-maint
Target Milestone: ---
Keywords: Triaged
Target Release: 12.0 (Pike)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-6.0.0-4.el7ost
Doc Type: Known Issue
Last Closed: 2017-06-08 14:23:12 UTC
Type: Bug

Description Matt Flusche 2016-11-03 22:52:13 UTC
Description of problem:

These two parameters are required to be consistent in order to properly set a new domain name for overcloud nodes:

 - The tripleo heat template parameter: CloudDomain
 - The [DEFAULT] dhcp_domain parameter in nova.conf on the Undercloud.

However, it is currently possible to deploy an overcloud with these parameters mismatched.  

Because a mismatch causes confusion and is difficult to resolve (often requiring redeployment), the overcloud deployment should fail as early as possible if the configuration is not consistent.
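
As an illustration of the kind of early check being asked for, here is a minimal sketch that compares the two values before a deployment starts. It is not part of tripleo-heat-templates. The function names (nova_dhcp_domain, heat_cloud_domain, check_domains) are hypothetical, and it assumes CloudDomain is set on a single "CloudDomain: value" line in a Heat environment file and that dhcp_domain is set explicitly under [DEFAULT] in nova.conf. Defaults are deliberately not modeled: a value that cannot be read is reported as unknown rather than guessed.

```bash
# Print dhcp_domain from the [DEFAULT] section of an ini-style file.
# Prints nothing if the option is absent or commented out.
nova_dhcp_domain() {
  awk -F= '
    /^\[/ { in_default = ($0 ~ /^\[DEFAULT\]/) }
    in_default && $1 ~ /^[ \t]*dhcp_domain[ \t]*$/ {
      gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit
    }' "$1"
}

# Print CloudDomain from a Heat environment file (assumed layout:
# a single "CloudDomain: value" line, optionally double-quoted).
heat_cloud_domain() {
  awk -F: '
    $1 ~ /^[ \t]*CloudDomain[ \t]*$/ {
      gsub(/^[ \t"]+|[ \t"]+$/, "", $2); print $2; exit
    }' "$1"
}

# Compare the two values.
# Returns 0 on match, 1 on mismatch, 2 if either value could not be read.
check_domains() {
  local heat_domain nova_domain
  heat_domain=$(heat_cloud_domain "$1")
  nova_domain=$(nova_dhcp_domain "$2")
  if [ -z "$heat_domain" ] || [ -z "$nova_domain" ]; then
    echo "UNKNOWN: CloudDomain='${heat_domain}' dhcp_domain='${nova_domain}'" >&2
    return 2
  fi
  if [ "$heat_domain" != "$nova_domain" ]; then
    echo "MISMATCH: CloudDomain='${heat_domain}' dhcp_domain='${nova_domain}'" >&2
    return 1
  fi
  echo "OK: both domains are '${heat_domain}'"
}
```

On the undercloud this could be run as `check_domains ~/templates/cloud-domain.yaml /etc/nova/nova.conf` before `openstack overcloud deploy`, where the environment file path is only an example.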

Version-Release number of selected component (if applicable):
Current OSP 9

How reproducible:
100%

Steps to Reproduce:
1. Set the CloudDomain parameter for the tripleo deployment (e.g. example.com)
2. Leave dhcp_domain with default settings in nova.conf
3. Deploy overcloud
4. After deployment, log in to an overcloud node and observe issues with the configured FQDN.

$ hostname -i
hostname: Name or service not known

$ hostname -f
hostname: Name or service not known

$ grep $(hostname) /etc/hosts
(null)


Actual results:
FQDN is broken on overcloud nodes.

Expected results:
Deployment fails as early as possible, or a single parameter is used to set the domain name.


Additional info:

The documentation should be enhanced to make users more aware of the impact and configuration needed.

Currently:

CloudDomain

    Type: string

    The DNS domain used for the hosts. This should match the dhcp_domain configured in the Undercloud’s networking. Defaults to localdomain.

Comment 1 Matt Flusche 2016-11-04 23:05:42 UTC
The following patch will catch the mismatch, but it seems like there should be a better way that runs earlier in the deployment.  It would also break any existing deployments where this mismatch is already present.

tripleo-heat-templates
diff --git a/validation-scripts/all-nodes.sh b/validation-scripts/all-nodes.sh
index 0b8b352..88a8147 100644
--- a/validation-scripts/all-nodes.sh
+++ b/validation-scripts/all-nodes.sh
@@ -67,5 +67,21 @@ function ping_default_gateways() {
   echo "SUCCESS"
 }
 
+# Verify the FQDN from the nova/ironic deployment matches
+# FQDN in the heat templates.
+function fqdn_check() {
+  HOSTNAME=$(hostname)
+  SHORT_NAME=$(hostname -s)
+  FQDN_FROM_HOSTS=$(awk '$3 == "'${SHORT_NAME}'"{print $2}' /etc/hosts)
+  echo -n "Checking hostname vs /etc/hosts entry..."
+  if [[ $HOSTNAME != $FQDN_FROM_HOSTS ]]; then
+    echo "FAILURE"
+    echo -e "System hostname: ${HOSTNAME}\nEntry from /etc/hosts: ${FQDN_FROM_HOSTS}\n"
+    exit 1
+  fi
+  echo "SUCCESS"
+}
+
 ping_controller_ips "$ping_test_ips"
 ping_default_gateways
+fqdn_check
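
For trying the same comparison outside a deployment, the check can be sketched as standalone functions that take the hostname and hosts file as arguments and return a status instead of exiting. This is a hedged variant, not the submitted patch: fqdn_from_hosts and fqdn_matches are hypothetical names, and like the patch it assumes hosts entries of the form "<ip> <fqdn> <short-name>".

```bash
# Print the FQDN that a hosts file records for a short hostname.
# Assumes "<ip> <fqdn> <short-name> ..." entries; comment lines are skipped.
fqdn_from_hosts() {
  local short_name hosts_file
  short_name=$1
  hosts_file=${2:-/etc/hosts}
  awk -v short="$short_name" \
    '$1 !~ /^#/ && $3 == short { print $2; exit }' "$hosts_file"
}

# Return 0 when the given system hostname equals the FQDN in the hosts
# file, 1 when it differs or when no entry is found.
fqdn_matches() {
  local system_name hosts_file short_name expected
  system_name=$1
  hosts_file=${2:-/etc/hosts}
  short_name=${system_name%%.*}   # strip everything after the first dot
  expected=$(fqdn_from_hosts "$short_name" "$hosts_file")
  [ -n "$expected" ] && [ "$system_name" = "$expected" ]
}
```

On an overcloud node this would be called as `fqdn_matches "$(hostname)" || echo "hostname does not match /etc/hosts" >&2`. Passing the short name through awk's -v option, rather than splicing it into the awk program text, avoids quoting problems with unusual hostnames.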

Comment 2 Matt Flusche 2017-03-27 12:51:49 UTC
I submitted this code upstream as an optional validation.

https://review.openstack.org/#/c/433820/

Comment 3 Jason E. Rist 2017-04-03 16:18:44 UTC
Change 433820 was merged.
The backport to Ocata is https://review.openstack.org/#/c/452809/

Comment 4 Jason E. Rist 2017-04-03 22:46:19 UTC
Ocata backport merged as well.

Comment 6 Jason E. Rist 2017-04-04 13:16:06 UTC
Sorry, did I put it in the wrong status? Might need to be in post.

For Ocata the commit is Ib5689acae66baf63ecccbc3b1c0b96684781b863 so it would be nice to have it in 11 (and 12)

Comment 8 Mike Burns 2017-06-08 14:23:12 UTC
Looks like the patch shipped in 11 GA, closing