Bug 1320783 - nova-compute-wait complains about Invalid Nova host name.
Summary: nova-compute-wait complains about Invalid Nova host name.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Oyvind Albrigtsen
QA Contact: Leonid Natapov
URL:
Whiteboard:
Duplicates: 1374327 1374980
Depends On:
Blocks: 1334162 1380314
Reported: 2016-03-24 01:44 UTC by Marian Krcmarik
Modified: 2020-02-14 17:43 UTC (History)

Fixed In Version: resource-agents-3.9.5-71.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1334162 1380314
Environment:
Last Closed: 2016-11-04 00:02:12 UTC
Target Upstream Version:
Embargoed:
royoung: needinfo+


Links
System: Red Hat Product Errata
ID: RHBA-2016:2174
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: resource-agents bug fix and enhancement update
Last Updated: 2016-11-03 13:16:36 UTC

Description Marian Krcmarik 2016-03-24 01:44:29 UTC
Description of problem:
nova-compute-wait complains about an invalid Nova host name when the domain resource attribute is specified, the hostname of the node is an FQDN, and the Nova host is specified as an FQDN.
There is a check in the resource agent code:
    NOVA_HOST=$(openstack-config --get /etc/nova/nova.conf DEFAULT host 2>/dev/null)
    if [ $? = 1 ]; then
        if [ "x${OCF_RESKEY_domain}" != x ]; then
            NOVA_HOST=$(uname -n | awk -F. '{print $1}')
        else
            NOVA_HOST=$(uname -n)
        fi
    fi

    # We only need to check a configured value, calculated ones are fine
    if [ $? = 0 ]; then
        if [ "x${OCF_RESKEY_domain}" != x ]; then
            short_host=$(uname -n | awk -F. '{print $1}')
            if [ "x$NOVA_HOST" != "x${short_host}" ]; then
                ocf_exit_reason "Invalid Nova host name, must be ${short_host} in order for instance recovery to function"
                rc=$OCF_ERR_CONFIGURED
            fi

        elif [ "x$NOVA_HOST" != "x$(uname -n)" ]; then
            ocf_exit_reason "Invalid Nova host name, must be $(uname -n) in order for instance recovery to function"
            rc=$OCF_ERR_CONFIGURED
        fi
    fi

This seems to cause the following:
1. If the Nova host in nova.conf is specified as an FQDN, the hostname of the node is an FQDN, and the domain attribute of the resource agent is specified, then nova-compute-wait refuses to start and complains about an invalid Nova host name.
- If I do not specify the domain attribute it works, but I think the resource agent should either handle this situation or state explicitly in the description of the domain attribute that it must not be specified when the setup uses FQDNs.
2. The check tests the opposite of what the resource description says. The description of the domain attribute reads: "domain: DNS domain in which hosts live, useful when the cluster uses short names and nova uses FQDN", but in reality the check requires that Nova does not use an FQDN: the Nova host must equal the node hostname without the domain. If the Nova host is specified in nova.conf as an FQDN and the hostname of the node is not an FQDN, the resource would fail to start.
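
The two cases above can be tried without a cluster using a minimal sketch of the comparison. This is not the agent's code: check_nova_host is a hypothetical helper that takes the nova.conf host value, the node hostname and the domain attribute as arguments instead of reading them from openstack-config, uname -n and OCF_RESKEY_domain, and the compute-0.example.com names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the resource agent. It mirrors only the
# string comparison from the quoted check, with all inputs passed in.
check_nova_host() {
    nova_host=$1    # "host" value configured in nova.conf
    node_name=$2    # what uname -n would print on the node
    domain=$3       # value of the domain attribute, may be empty

    if [ -n "$domain" ]; then
        # With domain set, the check compares against the short host name
        # (same result as: uname -n | awk -F. '{print $1}').
        expected=${node_name%%.*}
    else
        expected=$node_name
    fi

    if [ "$nova_host" != "$expected" ]; then
        echo "Invalid Nova host name, must be $expected"
        return 1
    fi
    echo "ok"
}

# "|| true" only keeps the demo running after a failed check.
# Case 1: everything is an FQDN and domain is set -> rejected.
check_nova_host compute-0.example.com compute-0.example.com example.com || true
# Same setup without the domain attribute -> accepted.
check_nova_host compute-0.example.com compute-0.example.com "" || true
# Case 2: short node hostname, Nova uses an FQDN, domain set -> rejected.
check_nova_host compute-0.example.com compute-0 example.com || true
```

Under these assumptions the first and third calls print "Invalid Nova host name, must be compute-0" and the second prints "ok", which matches the behaviour described in points 1 and 2.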

I do not really know what the reason for the check is, so I may be wrong somewhere. Initially I understood from the description that the host attribute from nova.conf is compared with the name of the host from Pacemaker's point of view, but in the code it is compared with the node hostname.

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-54.el7_2.9

How reproducible:
Always

Comment 2 Andrew Beekhof 2016-03-24 05:29:35 UTC
Does 'pcs status' show FQDN?
If so this would be expected but I can appreciate that it is suboptimal.


The ideal solution is bug #1289410, which would allow us to avoid any heuristics and just use the value nova is using.

Until then, we can look at improving the agent.

Comment 3 Marian Krcmarik 2016-03-24 09:29:51 UTC
(In reply to Andrew Beekhof from comment #2)
> Does 'pcs status' show FQDN?
> If so this would be expected but I can appreciate that it is suboptimal.

No, cluster node names are not FQDN in "pcs status" output, but the nodes are defined as FQDN in nova.conf and node hostnames are FQDN as well.

Comment 4 Andrew Beekhof 2016-03-24 09:34:52 UTC
That's exceedingly strange then.

I will investigate (there will be some delay due to Easter/PTO).

Comment 5 Andrew Beekhof 2016-04-22 04:56:03 UTC
These are the two patches currently being tested for this:

  https://github.com/beekhof/fence-agents/commit/564b70d
  https://github.com/beekhof/openstack-resource-agents/commit/6a42076e

Comment 6 Andrew Beekhof 2016-05-09 04:43:47 UTC
Oyvind: Can we get a build for the resource-agents piece please?

I will clone this bug for the fence-agents piece.

Comment 7 Oyvind Albrigtsen 2016-05-13 10:35:33 UTC
Patched from upstream and updated the metadata longdesc/shortdesc to "Deprecated - do not use anymore." for the parameters that aren't in use anymore.

Comment 9 Leonid Natapov 2016-05-25 14:47:53 UTC
Did code verification. In resource-agents-3.9.5-71.el7, /usr/lib/ocf/resource.d/openstack/nova-compute-wait doesn't include the problematic code mentioned in the description.

Comment 10 Oyvind Albrigtsen 2016-09-09 07:52:21 UTC
*** Bug 1374327 has been marked as a duplicate of this bug. ***

Comment 12 Rob Young 2016-09-09 14:15:27 UTC
I agree on the need for the backport. We are straddling RHEL 7.2 and 7.3 in our OSP10 testing efforts and because of the tight delivery timeframe we need to remove all blockers (potential or otherwise) to our progress. This change is low-risk and the business value is that we can move ahead with our OSP 10 test plan on 7.2 if needed, regardless of any RHEL 7.3 delays.

Comment 13 Fabio Massimo Di Nitto 2016-09-11 12:20:16 UTC
*** Bug 1374980 has been marked as a duplicate of this bug. ***

Comment 16 errata-xmlrpc 2016-11-04 00:02:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html

