Bug 1564654
Summary: | OSP13: Overcloud deployment fails when using capital letters in customized stack name ( --stack TEST-STACK34 ). | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Omri Hochman <ohochman> | |
Component: | puppet-tripleo | Assignee: | RHOS Maint <rhos-maint> | |
Status: | CLOSED ERRATA | QA Contact: | nlevinki <nlevinki> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 13.0 (Queens) | CC: | agurenko, aschultz, bdobreli, chjones, dciabrin, jcoufal, jjoyce, joflynn, jschluet, mburns, mcornea, mflusche, michele, mkrcmari, rhos-maint, rscarazz, sathlang, slinaber, tvignaud | |
Target Milestone: | z2 | Keywords: | Regression, ReleaseNotes, Triaged, ZStream | |
Target Release: | 13.0 (Queens) | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | puppet-tripleo-8.3.4-2.el7ost openstack-tripleo-common-8.6.3-2.el7ost | Doc Type: | Bug Fix | |
Doc Text: |
If you used uppercase letters in the stack name, the deployment failed.
Fixes have been introduced so that a stack name with uppercase letters leads to a successful deployment. Specifically, the bootstrap_host scripts inside the containers now convert strings to lowercase correctly, and the same is done for pacemaker properties.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1585189 (view as bug list) | Environment: | ||
Last Closed: | 2018-08-29 16:35:26 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1576148, 1585189 |
Description
Omri Hochman
2018-04-06 19:29:15 UTC
(Thanks Marius)

Debug: try 15/20: /usr/sbin/pcs -f /var/lib/pacemaker/cib/puppet-cib-backup20180405-8-1sqw3dc property set --node TEST-STACK34-controller-1 redis-role=true
Debug: Error: Error: unable to set attribute redis-role
Could not map name=TEST-STACK34-controller-1 to a UUID

while the name in the cluster is test-stack34-controller-1

The stack name has capital letters, whereas the node is registered in the cluster in lowercase.

------------------------------------------------------------------------

A possible solution would be to block, at the CLI level, the option to name a stack with capital letters.

Your sosreport has been generated and saved in:
  /var/tmp/sosreport-undercloud75-20180224155142.tar.xz
The checksum is: fbf9f2ce310d1020e3bea43bd64203d5
Please send this file to your support representative.
Copying the results to the publicly available URL
The reports should be available here: http://rhos-release.virt.bos.redhat.com/log/bz1564654

Node names in tripleo puppet are mostly cast via downcase(). This is probably a puppet configuration issue for the corosync cluster/pacemaker bundles.

This BZ has the potential to cause a major upgrade to OSP13 to fail if capital letters were used in the stack name of the previous version.

*** Bug 1575517 has been marked as a duplicate of this bug. ***

Dropping the DF because it's not related to the framework. It's failing in the pacemaker configuration.

*** Bug 1575752 has been marked as a duplicate of this bug. ***

Quick update on the failure: Unlike previous versions of OSP, in OSP13 the pcs command that is invoked to set up cluster node properties is given a hostname with capitals. We deployed an OSP10 with Raoul earlier today and it succeeded with capital letters in the stack name.
The generated hiera keys have capitals in their names in both OSP10 and OSP13, but one notable difference between our OSP10 and OSP13 deploys is that the hostname's FQDN now includes capitals in OSP13:

# hostname
uppercaseovercloud-controller-0
# hostname -f
UPPERCASEOverCloud-controller-0.localdomain

whereas it was all lowercase in OSP10.

Alex, is that expected? If so, would that change cause issues with upgrades?

Anyway, we'll downcase the name we use to invoke the pcs command, and see whether that fixes the deployment or if other issues arise.

This is likely an issue with our pacemaker implementation with docker. In <OSP12, since we used puppet, $::hostname was always lowercased as facter was switching it out. This is why we had to use downcase for bootstrap node checks in puppet-tripleo. If we're relying on fqdn for anything, it will need to continue to be lowercased to match the previous implementations.

Ok so there are a bunch of aspects to this:

1) The pcs property stuff, which we are fixing via https://review.openstack.org/570413. I tested that part specifically and it works.

2) The other aspect is that in tripleo-common we have a bunch of bootstrap_* scripts which also break this assumption:

HOSTNAME=$(/bin/hostname -s)
SERVICE_NODEID=$(/bin/hiera -c /etc/puppet/hiera.yaml "${SERVICE_NAME}_short_bootstrap_node_name")
if [[ "$HOSTNAME" == "$SERVICE_NODEID" ]]; then
    eval $*
else
    echo "Skipping execution since this is not the bootstrap node for this service."
fi

So all bootstrap tasks will fail to run because the hostname never matches the service_nodeid. We will need another patch fixing tripleo-common as well.

3) Also I'll add that this has not worked since OSP12 and is not really OSP13 specific.

So it's likely that the tripleo-common scripts are fine because hostname -s will return the capital letters.

$ hostname -s
UNDERCLOUD

This primarily affects anything puppet related where $::hostname from facter is lowercase.
Actually it's not, because from comment #9, the host's shortname is all lowercase on the overcloud controller nodes.

After applying https://review.openstack.org/#/c/570413/, the deploy continues but all the {service}_sync_db containers do nothing, and the subsequent service containers fail to run.

I've manually updated all the containers locally to force a lowercase comparison like what is proposed in https://review.openstack.org/#/c/570484/, and the deploy finishes successfully.

So fixing this bz requires:
. a fix in puppet-tripleo https://review.openstack.org/#/c/570413
. a fix in tripleo-common https://review.openstack.org/#/c/570484
. rebuilding all container images with the updated tripleo-common

Can we please get the fixes for this bug into a downstream puddle?

*** Bug 1610498 has been marked as a duplicate of this bug. ***

This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:
* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574
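
For readers who want to see the two halves of the fix in miniature, here is a rough shell sketch of the idea discussed in this report: compare node names case-insensitively, and lowercase the node name before handing it to pcs. It is an illustration only and is not the code from either review. The helper names (to_lower, same_node, pcs_property_cmd) and the example values are assumptions made for this sketch; the real puppet-tripleo change is written in puppet, and the pcs command here is only printed, never executed.

```shell
#!/bin/bash
# Illustration only: not the code from the puppet-tripleo or tripleo-common patches.

# Hypothetical helper: lowercase a string.
to_lower() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed when two node names match, ignoring case.
same_node() {
    [ "$(to_lower "$1")" = "$(to_lower "$2")" ]
}

# Hypothetical helper: print (not run) a pcs command with a lowercased node name.
pcs_property_cmd() {
    printf 'pcs property set --node %s %s\n' "$(to_lower "$1")" "$2"
}

# Example values modelled on this report: the host's short name is lowercase,
# while the name derived from the stack name keeps its capitals.
short_hostname="test-stack34-controller-1"
bootstrap_node="TEST-STACK34-controller-1"

if same_node "$short_hostname" "$bootstrap_node"; then
    echo "Names match once case is ignored."
else
    echo "Skipping execution since this is not the bootstrap node for this service."
fi

pcs_property_cmd "$bootstrap_node" "redis-role=true"
```

With a plain string comparison, the two example names above would never be equal, which is the condition that made the bootstrap tasks skip themselves; normalising both sides before comparing avoids depending on which tool happened to preserve the capitals.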