Bug 1477768

Summary: OSP11 -> OSP12 upgrade: nova-compute binaries listed by nova service-list are disabled with reason: AUTO: Failed to connect to libvirt: Failed to find user record for uid '162'
Product: Red Hat OpenStack Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templatesAssignee: Emilien Macchi <emacchi>
Status: CLOSED ERRATA QA Contact: Marius Cornea <mcornea>
Severity: urgent Docs Contact:
Priority: high    
Version: 12.0 (Pike)CC: dbecker, mandreou, mbultel, mburns, morazi, rhel-osp-director-maint, sathlang, tvignaud
Target Milestone: betaKeywords: Triaged
Target Release: 12.0 (Pike)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-7.0.0-0.20170805163046.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-13 21:48:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1399762    

Description Marius Cornea 2017-08-02 21:28:12 UTC
Description of problem:
OSP11 -> OSP12 upgrade: nova-compute binaries listed by nova service-list are disabled with reason: AUTO: Failed to connect to libvirt: Failed to find user record for uid '162':

(overcloud) [stack@undercloud-0 ~]$ nova service-list
+-----+------------------+--------------------------+----------+----------+-------+----------------------------+------------------------------------------------------------------------------+
| Id  | Binary           | Host                     | Zone     | Status   | State | Updated_at                 | Disabled Reason                                                              |
+-----+------------------+--------------------------+----------+----------+-------+----------------------------+------------------------------------------------------------------------------+
| 29  | nova-conductor   | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
| 35  | nova-conductor   | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:39.000000 | -                                                                            |
| 44  | nova-compute     | compute-1.localdomain    | nova     | disabled | up    | 2017-08-02T21:20:39.000000 | AUTO: Failed to connect to libvirt: Failed to find user record for uid '162' |
| 77  | nova-scheduler   | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
| 80  | nova-compute     | compute-0.localdomain    | nova     | disabled | up    | 2017-08-02T21:20:46.000000 | AUTO: Failed to connect to libvirt: Failed to find user record for uid '162' |
| 83  | nova-scheduler   | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:40.000000 | -                                                                            |
| 86  | nova-consoleauth | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:44.000000 | -                                                                            |
| 89  | nova-consoleauth | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:38.000000 | -                                                                            |
| 92  | nova-conductor   | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:46.000000 | -                                                                            |
| 98  | nova-scheduler   | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:46.000000 | -                                                                            |
| 101 | nova-consoleauth | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
| 29  | nova-conductor   | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
| 35  | nova-conductor   | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:39.000000 | -                                                                            |
| 44  | nova-compute     | compute-1.localdomain    | nova     | disabled | up    | 2017-08-02T21:20:39.000000 | AUTO: Failed to connect to libvirt: Failed to find user record for uid '162' |
| 77  | nova-scheduler   | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
| 80  | nova-compute     | compute-0.localdomain    | nova     | disabled | up    | 2017-08-02T21:20:46.000000 | AUTO: Failed to connect to libvirt: Failed to find user record for uid '162' |
| 83  | nova-scheduler   | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:40.000000 | -                                                                            |
| 86  | nova-consoleauth | controller-1.localdomain | internal | enabled  | up    | 2017-08-02T21:20:44.000000 | -                                                                            |
| 89  | nova-consoleauth | controller-2.localdomain | internal | enabled  | up    | 2017-08-02T21:20:38.000000 | -                                                                            |
| 92  | nova-conductor   | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:46.000000 | -                                                                            |
| 98  | nova-scheduler   | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:46.000000 | -                                                                            |
| 101 | nova-consoleauth | controller-0.localdomain | internal | enabled  | up    | 2017-08-02T21:20:45.000000 | -                                                                            |
+-----+------------------+--------------------------+----------+----------+-------+----------------------------+------------------------------------------------------------------------------+


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.0-0.20170721174554.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP11 with 3 controllers + 2 computes
2. Upgrade to OSP12

Actual results:
After the upgrade process was completed (major-upgrade-converge-docker.yaml) the nova-compute binaries reported by nova service-list are disabled with reason: AUTO: Failed to connect to libvirt: Failed to find user record for uid '162'

Expected results:
All nova services are up and running.

Additional info:


On the compute node:

[root@compute-0 heat-admin]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; disabled; vendor preset: enabled)
   Active: inactive (dead) since Wed 2017-08-02 18:37:48 UTC; 2h 47min ago
     Docs: man:libvirtd(8)
           http://libvirt.org
 Main PID: 45532 (code=exited, status=0/SUCCESS)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
[root@compute-0 heat-admin]# docker ps
CONTAINER ID        IMAGE                                                                   COMMAND             CREATED             STATUS              PORTS               NAMES
882fb3d06dab        192.168.24.1:8787/rhosp12/openstack-nova-compute-docker:2017-07-26.10   "kolla_start"       2 hours ago         Up 2 hours                              nova_compute
e2603fe25223        192.168.24.1:8787/rhosp12/openstack-iscsid-docker:2017-07-26.10         "kolla_start"       2 hours ago         Up 2 hours                              iscsid
f1e68c536bf5        192.168.24.1:8787/rhosp12/openstack-nova-libvirt-docker:2017-07-26.10   "kolla_start"       2 hours ago         Up 2 hours                              nova_libvirt

nova-compute service is not disabled:

[root@compute-0 heat-admin]# systemctl status openstack-nova-compute
● openstack-nova-compute.service - OpenStack Nova Compute Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
   Active: activating (start) since Wed 2017-08-02 21:27:04 UTC; 2s ago
 Main PID: 207659 (nova-compute)
   Memory: 88.1M
   CGroup: /system.slice/openstack-nova-compute.service
           └─207659 /usr/bin/python2 /usr/bin/nova-compute

Aug 02 21:27:04 compute-0 systemd[1]: Starting OpenStack Nova Compute Server...
Aug 02 21:27:05 compute-0 nova-compute[207659]: Option "rabbit_use_ssl" from group "oslo_messaging_rabbit" is deprecated. Use option "ssl" from group "oslo_messaging_rabbit".


/var/log/nova/nova-compute.log:

/oslo_config/cfg.py:2889
2017-08-02 21:25:01.872 205711 INFO nova.service [-] Starting compute node (version 16.0.0-0.20170721224719.fe46cfa.el7ost)
2017-08-02 21:25:01.873 205711 DEBUG nova.virt.libvirt.host [-] Starting native event thread _init_events /usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py:354
2017-08-02 21:25:01.874 205711 DEBUG nova.virt.libvirt.host [-] Starting green dispatch thread _init_events /usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py:360
2017-08-02 21:25:01.874 205711 DEBUG nova.virt.libvirt.host [-] Starting connection event dispatch thread initialize /usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py:480
2017-08-02 21:25:01.874 205711 DEBUG nova.virt.libvirt.host [-] Connecting to libvirt: qemu:///system _get_new_connection /usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py:365
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host [-] Connection to libvirt failed: Failed to find user record for uid '162': libvirtError: Failed to find user record for uid '162'
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host Traceback (most recent call last):
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 451, in get_connection
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     conn = self._get_connection()
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 434, in _get_connection
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     {'msg': ex})
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     self.force_reraise()
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     six.reraise(self.type_, self.value, self.tb)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 423, in _get_connection
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     self._wrapped_conn = self._get_new_connection()
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 368, in _get_new_connection
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     wrapped_conn = self._connect(self._uri, self._read_only)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 224, in _connect
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     libvirt.openAuth, uri, auth, flags)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     rv = execute(f, *args, **kwargs)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     six.reraise(c, e, tb)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     rv = meth(*args, **kwargs)
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 105, in openAuth
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host     if ret is None:raise libvirtError('virConnectOpenAuth() failed')
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host libvirtError: Failed to find user record for uid '162'
2017-08-02 21:25:01.881 205711 ERROR nova.virt.libvirt.host 
2017-08-02 21:25:01.892 205711 INFO nova.virt.libvirt.driver [-] Connection event '0' reason 'Failed to connect to libvirt: Failed to find user record for uid '162''
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service [req-728d7d18-b624-4098-ac21-6f94029982c2 - - - - -] Error starting thread.: HypervisorUnavailable: Connection to the hypervisor is broken on host: compute-0.localdomain
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service Traceback (most recent call last):
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 721, in run_service
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     service.start()
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/service.py", line 143, in start
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     self.manager.init_host()
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1100, in init_host
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     self.driver.init_host(host=self.host)
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 455, in init_host
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     self._do_quality_warnings()
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 432, in _do_quality_warnings
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     caps = self._host.get_capabilities()
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 636, in get_capabilities
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     xmlstr = self.get_connection().getCapabilities()
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 460, in get_connection
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service     raise exception.HypervisorUnavailable(host=CONF.host)
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service HypervisorUnavailable: Connection to the hypervisor is broken on host: compute-0.localdomain
2017-08-02 21:25:01.907 205711 ERROR oslo_service.service

Comment 1 Marius Cornea 2017-08-02 21:31:52 UTC
After stopping and disabling the openstack-nova-compute service on the host and starting the nova-compute container the service comes back ok:

[root@compute-0 heat-admin]# systemctl stop openstack-nova-compute
[root@compute-0 heat-admin]# systemctl disable openstack-nova-compute
Removed symlink /etc/systemd/system/multi-user.target.wants/openstack-nova-compute.service.
[root@compute-0 heat-admin]# docker stop nova_compute
docker start nova_compute
nova_compute
[root@compute-0 heat-admin]# docker start nova_compute
nova_compute

Comment 7 errata-xmlrpc 2017-12-13 21:48:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462