Bug 1816481
| Summary: | case-insensitive entries are created in /etc/hosts, leading to undercloud install failure | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Srinivas Atmakuri <satmakur> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Cédric Jeanneret <cjeanner> |
| Status: | CLOSED ERRATA | QA Contact: | David Rosenfeld <drosenfe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 16.0 (Train) | CC: | cjeanner, dbecker, dciabrin, lmiccini, mburns, mgarciac, morazi |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-11.3.2-0.20200401082140.cd5c992.el8ost | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-14 12:16:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Srinivas Atmakuri
2020-03-24 05:19:06 UTC
Hello PIDONE!
We might need to collaborate on this case. Could you provide some info about the way mariadb/galera is configured? Is there a way to make it case-insensitive for the "host" field?
In upstream master, /etc/hosts is apparently lower-cased, especially since https://review.opendev.org/714409. There might be a regression in osp-16 from when the new tripleo_hosts_entries role was introduced, but still, lower-case doesn't seem to be mandatory according to the RFCs :/
Thank you for your help and thoughts!
C.

I'm assuming this deployment enabled TLS-everywhere, because that's the only case I see where we explicitly create OpenStack users with an FQDN as host:
()[root@database-0 /]# mysql -u root -e "select user,host from mysql.user where user = 'nova';"
+------+------------------------------------+
| user | host                               |
+------+------------------------------------+
| nova | %                                  |
| nova | 172.17.1.13                        |
| nova | overcloud.internalapi.redhat.local |
+------+------------------------------------+
In other, non-TLS-e environments, only the OpenStack users 'nova'@'%' and 'nova'@'<IP>' are created at deployment time.
I suspect the original intent for creating those users was to give mysql a means to verify the client side of the TLS connections. I have to dig more to refresh my memory.
So right now we always store the FQDN the way Puppet generates it for us (which is probably why we see it in lowercase in /etc/hosts as well).
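A quick way to spot hosts-file entries that are not lower-case can be sketched in Python. This is an illustrative helper, not part of TripleO: the function name `mixed_case_host_entries` and the sample file content (including the IP address) are assumptions made for the example.

```python
def mixed_case_host_entries(hosts_text):
    """Return (address, name) pairs whose name contains uppercase characters."""
    flagged = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the address, the rest are names and aliases.
        address, *names = line.split()
        for name in names:
            if name != name.lower():
                flagged.append((address, name))
    return flagged


# Sample data for illustration only (not taken from the reported environment).
sample = """\
127.0.0.1 localhost localhost.localdomain
# undercloud entry
192.168.24.1 tripleo-vRAN-example.localdomain tripleo-vRAN-example
"""

for address, name in mixed_case_host_entries(sample):
    print(f"{address}: {name} is not lower-case")
```

On a real system the same function could be fed the content of /etc/hosts read with `open("/etc/hosts").read()`.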
Ideally, the value in field 'host' in the DB should match what is stored in the certificate...:
[root@database-0 ~]# openssl x509 -in /etc/pki/tls/certs/mysql.crt -text -noout | grep -e Subject: -e DNS
Subject: O = REDHAT.LOCAL, CN = database-0.internalapi.redhat.local
DNS:overcloud.internalapi.redhat.local, DNS:database-0.internalapi.redhat.local, othername:<unsupported>, othername:<unsupported>
... and also what is used in the client config:
[root@controller-0 ~]# grep pymysql /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf
connection=mysql+pymysql://nova_api:<...>@overcloud.internalapi.redhat.local/nova_api?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo
connection=mysql+pymysql://nova:<...>@overcloud.internalapi.redhat.local/nova?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo
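The consistency check described above (DB 'host' values versus certificate names) can be sketched in Python. This is a minimal illustration, not TripleO code: the helper names `san_dns_names`, `is_ip` and `hosts_missing_from_cert` are hypothetical, and the input is assumed to be text in the shape printed by the `openssl x509 -text` command shown earlier.

```python
import ipaddress
import re


def san_dns_names(openssl_text):
    """Extract the DNS:... names from `openssl x509 -text` style output."""
    return re.findall(r"DNS:([^,\s]+)", openssl_text)


def is_ip(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def hosts_missing_from_cert(db_hosts, dns_names):
    """Return FQDN-style DB host values with no exact, case-sensitive match."""
    names = set(dns_names)
    missing = []
    for host in db_hosts:
        # The wildcard and plain IP entries are not expected in the DNS names.
        if host == "%" or is_ip(host):
            continue
        if host not in names:
            missing.append(host)
    return missing


cert_text = (
    "Subject: O = REDHAT.LOCAL, CN = database-0.internalapi.redhat.local\n"
    "DNS:overcloud.internalapi.redhat.local, "
    "DNS:database-0.internalapi.redhat.local, "
    "othername:<unsupported>, othername:<unsupported>\n"
)
db_hosts = ["%", "172.17.1.13", "overcloud.internalapi.redhat.local"]

names = san_dns_names(cert_text)
print(hosts_missing_from_cert(db_hosts, names))
```

With the values from this report the list is empty; a host stored with different capitalisation would be reported, because the comparison is deliberately exact.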
@Srinivas, could you verify that the certificate has only lowercase characters in its Subject and DNS fields?
Also, can you check whether the hostname in the pymysql connection URL contains any uppercase characters?
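When scripting the connection URL check, note that Python's `urllib.parse.urlsplit(...).hostname` returns the host in lower case, so it would hide exactly the condition being looked for. A minimal sketch that preserves the original case follows; the function names `raw_host` and `has_uppercase`, the mixed-case hostname and the password `secret` are illustrative assumptions, not values from the reported deployment.

```python
from urllib.parse import urlsplit


def raw_host(url):
    """Return the host part of a URL without changing its case."""
    netloc = urlsplit(url).netloc
    # Drop the optional "user:password@" prefix.
    hostport = netloc.rpartition("@")[2]
    # Drop an optional ":port" suffix (assumes no IPv6 literal).
    return hostport.partition(":")[0]


def has_uppercase(name):
    """True if the name differs from its lower-cased form."""
    return name != name.lower()


url = (
    "mysql+pymysql://nova:secret@Overcloud.InternalAPI.redhat.local/nova"
    "?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo"
)

host = raw_host(url)
print(host, has_uppercase(host))
# The .hostname attribute is lower-cased and would not reveal the issue:
print(urlsplit(url).hostname)
```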
Some more information here:
- I'm able to get "an error" during the undercloud deployment, using upstream master code, in the following cases:
  1. the undercloud FQDN is "camelcased", such as tripleo-vRAN-example.localdomain
  2. when I set undercloud_hostname in undercloud.conf (if you don't set it, it takes the current host name)
- the error isn't clear: it fails to get container status at some point, without any log
- enforcing lower-case makes it work
- it's using tls-everywhere; I'll try the same cases without TLS, just to be sure it is indeed "the thing"
At least I'm able to get an env for this kind of issue, which helps a bit. Damien, if you want to check a broken env, let me know, it can be up in about 30 minutes on my end.
Right now I'm deploying with all lower-case just to check; I'll restart with camelcase and without tls-e.
Cheers,
C.

Me again,
So, quick update: without TLS-e, it also fails with a camelcase hostname... Now I'm trying to find the cause of this failure; logs are apparently masked.

Hello there,
So it's a bit more complicated: apparently, when we use a «camelcased» hostname, all the *_init_tasks containers are just ignored. This leads to missing entries in the mariadb "user" table, which leads to the failures.
We're trying to understand why this happens. It's probably linked to a "key" somewhere based on the fqdn without the "to_lower()" thing - we apparently DO some lowering in the deploy process, but I'm pretty sure we missed some places. A solution might be to regenerate /etc/hosts completely, but this will probably create other issues at some point.
Stay tuned!
C.

Me again,
found the issue: some weak string comparisons in tripleo-heat-templates content led to some tasks being filtered out and not executed. A patch is under test in my env; I should hopefully be able to get its status in less than an hour.
Stay tuned!
C.
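The class of bug described in the last comment can be illustrated outside of TripleO. The following Python sketch is hypothetical and does not reproduce the actual tripleo-heat-templates code (the upstream review has the real change). It only shows how an exact string comparison between a mixed-case configured hostname and a lower-cased one silently selects no tasks, and how normalising both sides avoids that. The task names, the dictionary layout and the `select_tasks_*` helpers are invented for the example.

```python
def select_tasks_strict(tasks, current_host):
    """Select tasks by exact, case-sensitive hostname match."""
    return [t["name"] for t in tasks if t["host"] == current_host]


def select_tasks_normalised(tasks, current_host):
    """Select tasks after lower-casing both sides of the comparison."""
    wanted = current_host.lower()
    return [t["name"] for t in tasks if t["host"].lower() == wanted]


# Hypothetical data: task keys were lower-cased somewhere in the pipeline,
# while the configured hostname kept its original case.
tasks = [
    {"name": "mysql_init_tasks", "host": "tripleo-vran-example.localdomain"},
    {"name": "keystone_init_tasks", "host": "tripleo-vran-example.localdomain"},
]
configured = "tripleo-vRAN-example.localdomain"

print(select_tasks_strict(tasks, configured))      # nothing is selected
print(select_tasks_normalised(tasks, configured))  # both tasks are selected
```

The strict variant raises no error; it just returns an empty list, which matches the symptom of tasks being skipped without any log.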
Upstream just merged in stable/train: https://review.opendev.org/#/c/716166/
Automated sync should kick in downstream today - I'll provide the new package version as soon as it's available.

Moving to POST.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:2114