Bug 1412352 - rhel-registration script broken with satellite 6.2.5+
Summary: rhel-registration script broken with satellite 6.2.5+
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: diskimage-builder
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: Upstream M2
Target Release: 10.0 (Newton)
Assignee: James Slagle
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks: 1411935
Reported: 2017-01-11 20:18 UTC by Jason Montleon
Modified: 2018-02-12 21:48 UTC

Fixed In Version: openstack-tripleo-heat-templates-5.3.0-5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-12 21:48:43 UTC
Target Upstream Version:
Embargoed:



Links
System     ID       Private  Priority  Status  Summary  Last Updated
Launchpad  1711435  0        None      None    None     2017-08-17 18:31:33 UTC

Description Jason Montleon 2017-01-11 20:18:24 UTC
Description of problem:
The script /usr/share/diskimage-builder/elements/rhel-common/os-refresh-config/pre-configure.d/06-rhel-registration, which ships in the diskimage-builder package and in the overcloud images included with RHEL OSP 10, is broken with Satellite 6.2.5+.

The problem is that the katello-ca-consumer package now runs the command /usr/bin/katello-rhsm-consumer as a post-install script.

At the very end, that command writes the file /etc/rhsm/facts/katello.facts by running:
if [ -d /etc/rhsm/facts/ ]; then
  echo "{\"network.hostname-override\":\"`hostname -f`\"}" > /etc/rhsm/facts/katello.facts
fi

The problem is that every overcloud node populates this file with:
{"network.hostname-override":"localhost"}

This in turn causes every overcloud host to register with the name 'localhost', overwriting previous registrations. Depending on timing, this can cause later commands in the script to fail with an error, which causes the deployment to fail.

Version-Release number of selected component (if applicable):
On the Satellite:
[root@qci ~]# rpm -q satellite
satellite-6.2.6-2.0.el7sat.noarch

On the Director:
[root@undercloud ~]# rpm -qa | grep rhosp
rhosp-director-images-ipa-10.0-20161212.1.el7ost.noarch
rhosp-director-images-10.0-20161212.1.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install a Satellite 6.2.6 host
2. Set up a director with plan parameters to register to the Satellite
3. Run the deployment

Actual results:
You end up with one host registered to Satellite with the name localhost. All previous host registrations are invalidated.

Expected results:
All hosts register correctly and deployment does not fail.

Additional Information:
It looks like hostname -f returns localhost until /etc/hosts is updated, which as far as I can tell happens well after registration.
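
The mismatch described above can be checked on a node before registration runs. The following is only a minimal diagnostic sketch; the function name fqdn_is_unresolved is hypothetical and is not part of any shipped script:

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the shipped scripts):
# returns 0 when the given `hostname -f` result looks like the
# unresolved placeholder seen in this bug.
fqdn_is_unresolved() {
    fqdn="$1"
    case "$fqdn" in
        ""|localhost|localhost.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the configured hostname with what name resolution reports.
short_name="$(hostname 2>/dev/null || true)"
full_name="$(hostname -f 2>/dev/null || true)"
if fqdn_is_unresolved "$full_name"; then
    echo "hostname -f is '${full_name}' while hostname is '${short_name}'; /etc/hosts is probably not populated yet"
fi
```

If the message is printed while `hostname` already shows the node name, the facts file written by katello-rhsm-consumer would be expected to carry the localhost override.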

Comment 1 Jason Montleon 2017-01-12 14:58:15 UTC
The script being run is from the plan:
extraconfig/pre_deploy/rhel-registration/scripts/rhel-registration

Although I see 06-rhel-registration running on the hosts in /var/log/messages, it does not appear to be doing anything, as it exits within a couple of seconds.

Again `hostname -f` is returning localhost because /etc/hosts has not been populated yet and the name does not resolve from DNS, despite the hostname being set.

To work around this I added the line:
echo "{\"network.hostname\":\"$HOSTNAME\"}" > /etc/rhsm/facts/katello.facts
immediately after:
rpm -Uvh katello-ca-consumer-latest.noarch.rpm || true

The registration process then goes on using the correct hostname.
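
The workaround above can be sketched as a small guarded helper. This is an illustration only: the function name write_hostname_fact and its parameters are hypothetical, and the directory check is borrowed from the katello-rhsm-consumer snippet in the description.

```shell
#!/bin/bash
# Hypothetical helper mirroring the workaround: overwrite the facts file
# written by the katello-ca-consumer script with the node's own hostname.
write_hostname_fact() {
    facts_dir="$1"   # normally /etc/rhsm/facts
    node_name="$2"   # normally "$HOSTNAME"
    # Only write when the facts directory exists and a name is available.
    if [ -d "$facts_dir" ] && [ -n "$node_name" ]; then
        printf '{"network.hostname":"%s"}\n' "$node_name" > "$facts_dir/katello.facts"
    fi
}

# Intended placement in the rhel-registration script (assumed):
#   rpm -Uvh katello-ca-consumer-latest.noarch.rpm || true
#   write_hostname_fact /etc/rhsm/facts "$HOSTNAME"
```

Passing the directory as an argument keeps the helper testable against a scratch directory instead of the live /etc/rhsm/facts.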

Comment 3 Alex Schultz 2017-03-13 16:20:07 UTC
This might be a duplicate of BZ#1421228

Comment 4 James Slagle 2017-04-13 17:25:24 UTC
The NodeExtraConfig resource that is mapped to rhel registration definitely comes before the {{role.name}}HostsDeployment in the templates. Still, I would have expected cloud-init to have already set the hostname.

If the workaround from comment 1 works, then let's just add that to tripleo-heat-templates/extraconfig/pre_deploy/rhel-registration/scripts/rhel-registration
as the fix. Perhaps we could use hostnamectl to get the hostname though instead of $HOSTNAME. Would have to investigate that.

Comment 5 Ben Nemec 2017-04-19 22:01:00 UTC
James has this.  Cancelling needinfo.

Comment 6 James Slagle 2017-08-16 15:54:19 UTC
Can you clarify whether you are building RHEL images with diskimage-builder and registering them to Satellite during that image build process?

Or are you only attempting to register the unmodified director images to Satellite with the extraconfig/pre_deploy/rhel-registration/environment-rhel-registration.yaml template from tripleo-heat-templates?

From what I can tell, katello-ca-consumer would only be installed by either the script during the image build or during the overcloud deploy.

If used during the image build, then I'm not surprised the hostname is "localhost".

If used during the overcloud deploy, then the hostname should be set by cloud-init, and if it's not then we need to investigate why that's not the case.

Or possibly if you're doing both, then the image build fact is still taking precedence.

Comment 7 James Slagle 2017-08-16 21:30:01 UTC
There are a few more details at https://bugzilla.redhat.com/show_bug.cgi?id=1476760

I believe I've tracked this down to Heat configuring /etc/hosts after rhel registration. This is a change from when we were using the 51-hosts script to configure /etc/hosts.

Since Satellite forces a hostname override in /etc/rhsm/facts/katello.facts to the result of "hostname -f", that value will always be localhost if /etc/hosts is not yet configured.

As I mentioned in the other bug, a quick fix may be to rm /etc/rhsm/facts/katello.facts after katello-ca-consumer is installed in the rhel-registration script.
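
A minimal sketch of that quick fix, assuming it would sit directly after the katello-ca-consumer install in the rhel-registration script. The function name remove_katello_override is hypothetical, and this is not necessarily the change that was eventually merged:

```shell
#!/bin/bash
# Hypothetical helper for the quick fix: delete the facts file that forces
# the hostname override, so the "localhost" value is no longer applied.
remove_katello_override() {
    facts_file="${1:-/etc/rhsm/facts/katello.facts}"
    # -f keeps this a no-op (exit status 0) when the file is already absent.
    rm -f "$facts_file"
}

# Intended placement (assumed):
#   rpm -Uvh katello-ca-consumer-latest.noarch.rpm || true
#   remove_katello_override
```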

