Bug 1320777 - rhel-osp-director: After upgrade 7.3->8.0 nova compute has state "Down" in the nova services list
Summary: rhel-osp-director: After upgrade 7.3->8.0 nova compute has state "Down" in th...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assignee: James Slagle
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Duplicates: 1322427
Depends On:
Blocks:
Reported: 2016-03-24 01:17 UTC by Alexander Chuzhoy
Modified: 2017-09-11 17:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-31 18:52:49 UTC
Target Upstream Version:



Description Alexander Chuzhoy 2016-03-24 01:17:52 UTC
rhel-osp-director: After upgrade 7.3->8.0 nova compute has state "Down" in the nova services list

Environment:
openstack-nova-compute-12.0.2-4.el7ost.noarch
python-novaclient-3.1.0-2.el7ost.noarch
openstack-tripleo-heat-templates-0.8.12-2.el7ost.noarch
instack-undercloud-2.2.6-1.el7ost.noarch
openstack-puppet-modules-7.0.15-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.12-2.el7ost.noarch


Steps to reproduce:
1. Deploy 7.3 overcloud.
2. Upgrade to 8.0.
3. Run "nova service-list"

Result:

+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host                               | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| 2  | nova-scheduler   | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-03-24T01:07:09.000000 | -               |
| 5  | nova-scheduler   | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-03-24T01:07:08.000000 | -               |
| 8  | nova-scheduler   | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-03-24T01:07:08.000000 | -               |
| 11 | nova-consoleauth | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-03-24T01:07:00.000000 | -               |
| 14 | nova-consoleauth | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-03-24T01:07:01.000000 | -               |
| 17 | nova-consoleauth | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-03-24T01:07:07.000000 | -               |
| 20 | nova-conductor   | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-03-24T01:07:04.000000 | -               |
| 23 | nova-conductor   | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-03-24T01:06:59.000000 | -               |
| 26 | nova-compute     | overcloud-compute-0.localdomain    | nova     | enabled | down  | 2016-03-23T22:49:40.000000 | -               |
| 29 | nova-conductor   | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-03-24T01:07:06.000000 | -               |
| 32 | nova-compute     | overcloud-compute-0                | nova     | enabled | up    | 2016-03-24T01:07:00.000000 | -               |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+


Expected result:
nova-compute     | overcloud-compute-0.localdomain    | nova     | enabled | up

Comment 2 Alexander Chuzhoy 2016-03-24 01:18:56 UTC
Note:
The service is active on the compute machine:

[root@overcloud-compute-0 ~]# openstack-service status
MainPID=4314 Id=neutron-openvswitch-agent.service ActiveState=active
MainPID=4481 Id=openstack-ceilometer-compute.service ActiveState=active
MainPID=4414 Id=openstack-nova-compute.service ActiveState=active     


and I was able to launch an instance.

Comment 4 Alexander Chuzhoy 2016-03-30 13:57:41 UTC
*** Bug 1322427 has been marked as a duplicate of this bug. ***

Comment 6 James Slagle 2016-03-30 14:50:29 UTC
Ben, can you comment here and confirm that this doesn't cause an actual issue and is only cosmetic? It is related to whether the domain name (localdomain in this case) is included in the compute service's host name: the old entry now shows as down, and the new one shows as up.
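
To spot this pattern in a larger deployment, a rough sketch along the following lines may help. It is only an illustration, not part of director or novaclient: the function names parse_service_table and find_renamed_hosts are made up here, and SAMPLE is a trimmed excerpt (fewer columns) of the "nova service-list" output from the description.

```python
from collections import defaultdict

# Trimmed excerpt of the table in the description (only four columns kept).
SAMPLE = """\
+----+----------------+------------------------------------+-------+
| Id | Binary         | Host                               | State |
+----+----------------+------------------------------------+-------+
| 23 | nova-conductor | overcloud-controller-1.localdomain | up    |
| 26 | nova-compute   | overcloud-compute-0.localdomain    | down  |
| 32 | nova-compute   | overcloud-compute-0                | up    |
+----+----------------+------------------------------------+-------+
"""


def parse_service_table(text):
    """Parse an ASCII table (as printed by the CLI) into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data-looking line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def find_renamed_hosts(rows, binary="nova-compute"):
    """Return short host names registered under more than one spelling.

    Hosts are grouped by the part before the first dot, so
    "overcloud-compute-0" and "overcloud-compute-0.localdomain"
    land in the same group.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["Binary"] == binary:
            short_name = row["Host"].split(".", 1)[0]
            groups[short_name].append((row["Host"], row["State"]))
    return {
        short_name: entries
        for short_name, entries in groups.items()
        if len({host for host, _ in entries}) > 1
    }
```

Calling find_renamed_hosts(parse_service_table(SAMPLE)) reports overcloud-compute-0 with both spellings and their states, which matches the "old one down, new one up" picture above.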

Comment 7 Emilien Macchi 2016-03-31 14:02:20 UTC
This is not a bug. Your domain name is not being sent over DHCP; look at your resolv.conf.

See https://github.com/puppetlabs/facter/blob/2.4.3/lib/facter/domain.rb#L44-L71

Facter first tries to run hostname -f. If that does not yield a domain, it tries to read resolv.conf and find the domain there; otherwise it returns nothing, which is your case.
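
As a rough illustration of that fallback order, here is a small Python sketch. It is not the Facter code (Facter is written in Ruby, and the linked file is the authority and may contain further steps); it only mirrors the two steps described in this comment. The function names are invented for the example, and the inputs are passed in as strings rather than read from the system.

```python
import re


def domain_from_fqdn(fqdn):
    """Return everything after the first dot, or None if there is no domain part."""
    match = re.match(r"[^.]*\.(.+)$", fqdn.strip())
    return match.group(1) if match else None


def domain_from_resolv_conf(text):
    """Look for a domain in resolv.conf contents.

    Prefers the first name on a "domain" line, then the first name on a
    "search" line. Returns None if neither directive is present.
    """
    domain = None
    search = None
    for line in text.splitlines():
        found = re.match(r"\s*domain\s+(\S+)", line)
        if found:
            domain = found.group(1)
            continue
        found = re.match(r"\s*search\s+(\S+)", line)
        if found:
            search = found.group(1)
    return domain or search


def guess_domain(hostname_f_output, resolv_conf_text):
    """Try the "hostname -f" output first, then fall back to resolv.conf."""
    return domain_from_fqdn(hostname_f_output) or domain_from_resolv_conf(
        resolv_conf_text
    )
```

With a short host name such as overcloud-compute-0 and a resolv.conf that contains only nameserver lines, guess_domain returns None, which corresponds to the case reported here. Adding a "search localdomain" line makes it return localdomain.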

Comment 8 James Slagle 2016-03-31 16:20:37 UTC
After debugging a bit with Sasha, we found that /etc/resolv.conf was being manually set after the deployment. This meant that the "search" line in resolv.conf, which would have specified a domain (and caused puppet to return the correct value for fqdn), was not set.

The recommendation is to set the correct values for the DNS servers via the DnsServers parameter in network-environment.yaml and try to reproduce without manually configuring DNS.

Comment 9 James Slagle 2016-03-31 18:52:49 UTC
Didn't reproduce when setting DNS via DnsServers.

Comment 10 Marius Cornea 2016-04-07 10:39:24 UTC
*** Bug 1324739 has been marked as a duplicate of this bug. ***

