Description Marian Krcmarik
2016-03-24 01:44:29 UTC
Description of problem:
nova-compute-wait complains about an invalid Nova host name when the domain resource attribute is specified, the node's hostname is an FQDN, and the Nova host is also specified as an FQDN.
There is a check in the resource agent code:
NOVA_HOST=$(openstack-config --get /etc/nova/nova.conf DEFAULT host 2>/dev/null)
if [ $? = 1 ]; then
    if [ "x${OCF_RESKEY_domain}" != x ]; then
        NOVA_HOST=$(uname -n | awk -F. '{print $1}')
    else
        NOVA_HOST=$(uname -n)
    fi
fi
# We only need to check a configured value, calculated ones are fine
if [ $? = 0 ]; then
    if [ "x${OCF_RESKEY_domain}" != x ]; then
        short_host=$(uname -n | awk -F. '{print $1}')
        if [ "x$NOVA_HOST" != "x${short_host}" ]; then
            ocf_exit_reason "Invalid Nova host name, must be ${short_host} in order for instance recovery to function"
            rc=$OCF_ERR_CONFIGURED
        fi
    elif [ "x$NOVA_HOST" != "x$(uname -n)" ]; then
        ocf_exit_reason "Invalid Nova host name, must be $(uname -n) in order for instance recovery to function"
        rc=$OCF_ERR_CONFIGURED
    fi
fi
This seems to cause the following:
1. If the Nova host in nova.conf is an FQDN, the node's hostname is an FQDN, and the domain attribute of the resource agent is specified, then nova-compute-wait refuses to start and complains about an invalid Nova host name.
- If I do not specify the domain attribute, it works, but I think the resource agent should either handle this situation or state explicitly in the attribute description that the domain attribute must not be specified when the setup uses FQDNs.
2. The check actually tests the opposite of what the resource description says. The description of the domain attribute reads: "domain: DNS domain in which hosts live, useful when the cluster uses short names and nova uses FQDN", but in reality the check requires the Nova host to be a short name equal to the node hostname without the domain. If the Nova host is specified in nova.conf as an FQDN and the node's hostname is not an FQDN, the resource fails to start.
I do not really know the reason for the check, so I may be wrong somewhere. Initially I understood from the description that the host attribute from nova.conf is compared with the host name from Pacemaker's point of view, but in the code it is compared with the node hostname.
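The scenarios above can be tried without a cluster using a small standalone model of the comparison. This is only a sketch: check_nova_host is a hypothetical helper, the hostnames are made up, and the three values the agent obtains from nova.conf, `uname -n`, and OCF_RESKEY_domain are passed in as plain arguments.

```shell
#!/bin/sh
# Hypothetical standalone model of the quoted check (not the agent itself).
#   $1: "host" value from nova.conf
#   $2: node hostname, as `uname -n` would print it
#   $3: value of the resource "domain" attribute (may be empty)
check_nova_host() {
    nova_host=$1
    node_name=$2
    domain=$3

    if [ -n "$domain" ]; then
        # Same result as: uname -n | awk -F. '{print $1}'
        expected=${node_name%%.*}
    else
        expected=$node_name
    fi

    if [ "$nova_host" = "$expected" ]; then
        echo "ok"
    else
        echo "invalid: must be $expected"
        return 1
    fi
}

# Scenario 1: everything is an FQDN and domain is set, so the check rejects it.
check_nova_host compute-0.example.com compute-0.example.com example.com || true
# Same setup without the domain attribute is accepted.
check_nova_host compute-0.example.com compute-0.example.com ""
# Scenario 2: Nova uses an FQDN, the node hostname is short; rejected as well.
check_nova_host compute-0.example.com compute-0 example.com || true
```

With domain set, the model only accepts a short Nova host name, which is the mismatch with the attribute description noted in point 2.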
Version-Release number of selected component (if applicable):
resource-agents-3.9.5-54.el7_2.9
How reproducible:
Always
Does 'pcs status' show FQDN?
If so this would be expected but I can appreciate that it is suboptimal.
The ideal solution is bug #1289410, which would let us avoid any heuristics and just use the value nova is using.
Until then, we can look at improving the agent.
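One possible direction for such an improvement, sketched under assumptions: when domain is set, accept either the short name or the short name with the domain appended. The function name nova_host_matches and the hostnames are hypothetical, and this is not necessarily the change that was eventually shipped in the agent.

```shell
#!/bin/sh
# Hypothetical relaxed comparison; an illustration, not the agent's code.
#   $1: "host" value from nova.conf
#   $2: node hostname, as `uname -n` would print it
#   $3: value of the resource "domain" attribute (may be empty)
# Returns 0 when the Nova host name is acceptable, non-zero otherwise.
nova_host_matches() {
    nova_host=$1
    node_name=$2
    domain=$3
    short=${node_name%%.*}   # part before the first dot

    if [ -n "$domain" ]; then
        # Accept the short name or the short name qualified with the domain.
        [ "$nova_host" = "$short" ] || [ "$nova_host" = "$short.$domain" ]
    else
        # Without a domain, require an exact match with the node hostname.
        [ "$nova_host" = "$node_name" ]
    fi
}

if nova_host_matches compute-0.example.com compute-0.example.com example.com; then
    echo "FQDN accepted with domain set"
fi
```

This keeps the existing behaviour for short names and for setups without the domain attribute, and only widens what is accepted when domain is set.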
(In reply to Andrew Beekhof from comment #2)
> Does 'pcs status' show FQDN?
> If so this would be expected but I can appreciate that it is suboptimal.
No, cluster node names are not FQDN in "pcs status" output, but the nodes are defined as FQDN in nova.conf and node hostnames are FQDN as well.
Verified by code inspection: /usr/lib/ocf/resource.d/openstack/nova-compute-wait in resource-agents-3.9.5-71.el7 does not include the problematic code.
Comment 10 Oyvind Albrigtsen
2016-09-09 07:52:21 UTC
*** Bug 1374327 has been marked as a duplicate of this bug. ***
I agree on the need for the backport. We are straddling RHEL 7.2 and 7.3 in our OSP10 testing efforts and because of the tight delivery timeframe we need to remove all blockers (potential or otherwise) to our progress. This change is low-risk and the business value is that we can move ahead with our OSP 10 test plan on 7.2 if needed, regardless of any RHEL 7.3 delays.
Comment 13 Fabio Massimo Di Nitto
2016-09-11 12:20:16 UTC
*** Bug 1374980 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-2174.html