Bug 1475969 - [OSP 6] Packstack tempest job failure on keystone identity error [NEEDINFO]
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-keystone
6.0 (Juno)
x86_64 Linux
Severity: urgent
: async
: 6.0 (Juno)
Assigned To: John Dennis
Prasanth Anbalagan
: Triaged, ZStream
Depends On: 1478176
Blocks: 1478149 1478173
Reported: 2017-07-27 12:22 EDT by Jason Joyce
Modified: 2017-09-06 13:52 EDT (History)
10 users

See Also:
Fixed In Version: openstack-keystone-2014.2.4-3.el7ost,openstack-selinux-0.8.9-0.1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1478149 1478173 (view as bug list)
Environment:
Last Closed: 2017-09-06 13:52:34 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
lruzicka: needinfo? (jdennis)


Attachments: None
Description Jason Joyce 2017-07-27 12:22:03 EDT
Description of problem:

tempest is failing with:
stdout: 2017-07-26 11:46:30.409 13362 INFO tempest [-] Using tempest config file /etc/tempest/tempest.conf
2017-07-26 11:46:30.491 13362 INFO __main__ [-] Reading defaults from file '/home/cloud-user/tempest-dir/etc/default-overrides.conf'
2017-07-26 11:46:30.492 13362 INFO __main__ [-] Adding options from deployer-input file '/home/cloud-user/tempest-dir/etc/deployer-input-juno.conf'
2017-07-26 11:46:30.492 13362 DEBUG __main__ [-] Setting [volume-feature-enabled] bootable = false set tools/config_tempest.py:467
2017-07-26 11:46:30.492 13362 DEBUG __main__ [-] Setting [compute-feature-enabled] preserve_ports = false set tools/config_tempest.py:467
2017-07-26 11:46:30.492 13362 DEBUG __main__ [-] Setting [compute-feature-enabled] live_migrate_paused_instances = false set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [data_processing] catalog_type = data_processing set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [identity] uri = http://192.168.1.13:5000/v2.0/ set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [identity] admin_password = redhat set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [scenario] img_dir = /home/cloud-user/tempest-dir/etc set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [image] http_image = http://rhos-qe-mirror-qeos.usersys.redhat.com/images/cirros-0.3.5-x86_64-uec.tar.gz set tools/config_tempest.py:467
2017-07-26 11:46:30.493 13362 DEBUG __main__ [-] Setting [identity] uri_v3 = http://192.168.1.13:5000/v3/ set tools/config_tempest.py:467
2017-07-26 11:51:34.990 13362 INFO tempest.lib.common.rest_client [-] Request (main): 500 POST http://192.168.1.13:5000/v2.0/tokens
2017-07-26 11:51:34.990 13362 CRITICAL tempest [-] Unhandled error: IdentityError: Got identity error
Details: Unexpected status code 500
2017-07-26 11:51:34.990 13362 ERROR tempest Traceback (most recent call last):
2017-07-26 11:51:34.990 13362 ERROR tempest   File "tools/config_tempest.py", line 842, in <module>
2017-07-26 11:51:34.990 13362 ERROR tempest     main()
2017-07-26 11:51:34.990 13362 ERROR tempest   File "tools/config_tempest.py", line 155, in main
2017-07-26 11:51:34.990 13362 ERROR tempest     clients = ClientManager(conf, not args.non_admin)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "tools/config_tempest.py", line 417, in __init__
2017-07-26 11:51:34.990 13362 ERROR tempest     tenant_name)['id']
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/common/identity.py", line 20, in get_tenant_by_name
2017-07-26 11:51:34.990 13362 ERROR tempest     tenants = client.list_tenants()['tenants']
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/services/identity/v2/json/tenants_client.py", line 56, in list_tenants
2017-07-26 11:51:34.990 13362 ERROR tempest     resp, body = self.get('tenants')
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/common/rest_client.py", line 275, in get
2017-07-26 11:51:34.990 13362 ERROR tempest     return self.request('GET', url, extra_headers, headers)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/common/rest_client.py", line 633, in request
2017-07-26 11:51:34.990 13362 ERROR tempest     headers=headers, body=body)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/common/rest_client.py", line 527, in _request
2017-07-26 11:51:34.990 13362 ERROR tempest     method, url, headers, body, self.filters)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/auth.py", line 187, in auth_request
2017-07-26 11:51:34.990 13362 ERROR tempest     filters, method, url, headers, body)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/auth.py", line 272, in _decorate_request
2017-07-26 11:51:34.990 13362 ERROR tempest     auth_data = self.get_auth()
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/auth.py", line 149, in get_auth
2017-07-26 11:51:34.990 13362 ERROR tempest     self.set_auth()
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/auth.py", line 158, in set_auth
2017-07-26 11:51:34.990 13362 ERROR tempest     self.cache = self._get_auth()
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/auth.py", line 309, in _get_auth
2017-07-26 11:51:34.990 13362 ERROR tempest     token, auth_data = auth_func(**auth_params)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/services/identity/v2/token_client.py", line 104, in get_token
2017-07-26 11:51:34.990 13362 ERROR tempest     body = self.auth(user, password, tenant)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/services/identity/v2/token_client.py", line 54, in auth
2017-07-26 11:51:34.990 13362 ERROR tempest     resp, body = self.post(self.auth_url, body=body)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/common/rest_client.py", line 260, in post
2017-07-26 11:51:34.990 13362 ERROR tempest     return self.request('POST', url, extra_headers, headers, body)
2017-07-26 11:51:34.990 13362 ERROR tempest   File "/home/cloud-user/tempest-dir/tempest/lib/services/identity/v2/token_client.py", line 98, in request
2017-07-26 11:51:34.990 13362 ERROR tempest     'Unexpected status code {0}'.format(resp.status))
2017-07-26 11:51:34.990 13362 ERROR tempest IdentityError: Got identity error
2017-07-26 11:51:34.990 13362 ERROR tempest Details: Unexpected status code 500
2017-07-26 11:51:34.990 13362 ERROR tempest 


Version-Release number of selected component (if applicable):
openstack-keystone-2014.2.4-1.el7ost

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP 6 with packstack allinone
2. Run tempest

Actual results:
tempest fails with Unexpected status code 500

Expected results:
tempest runs
Comment 1 Jason Joyce 2017-07-27 12:27:06 EDT
This only appears on RHEL 7.4; RHEL 7.3 seems to run without error.
Comment 2 Attila Fazekas 2017-07-28 06:46:29 EDT
In the above experiment keystone is not responsive; you can find messages like the following in the log:

>  [Fri Jul 28 06:23:36.857715 2017] [core:error] [pid 6813] [client 192.168.1.13:34040] Script timed out before returning headers: admin


There is no sign of database activity, and restarting the httpd process does not help.

On the same system the stand-alone version of keystone (the openstack-keystone service, started manually) was responsive.

This kind of issue can happen if some part of the code tries to use eventlet while running under mod_wsgi.

keystone was likely responsive when packstack created the endpoints; some extra config or library added later in the job might be responsible for the issue.
Comment 3 Attila Fazekas 2017-07-28 10:49:11 EDT
The keystoneclient used by keystone tries to do something with gnome-keyring when it is installed; the gnome-keyring package was pulled in as a dependency of git.

Uninstalling the gnome-keyring package helps.

The CLI version of the keystone client also emits the following warning:

/usr/lib/python2.7/site-packages/keyring/backends/Gnome.py:6: PyGIWarning: GnomeKeyring was imported without specifying a version first. Use gi.require_version('GnomeKeyring', '1.0') before import to ensure that the right version gets loaded.
  from gi.repository import GnomeKeyring


Incidentally, the service code was inside a gi call when it hung:
Thread 8 (Thread 0x7f316298b700 (LWP 20239)):
#4 Waiting for a lock (e.g. GIL)
#10 Frame 0x7f315cda1bb0, for file /usr/lib64/python2.7/site-packages/gi/module.py, line 157, in __getattr__ (self=<IntrospectionModule(_namespace='GLib', __path__='/usr/lib64/girepository-1.0/GLib-2.0.typelib', Variant=<StructMeta(new_fixed_array=<gi.FunctionInfo at remote 0x7f315cd81458>, new_array=<gi.FunctionInfo at remote 0x7f315cd81298>, __module__='gi.repository.GLib', new_int64=<gi.FunctionInfo at remote 0x7f315cd815a8>, hash=<gi.FunctionInfo at remote 0x7f315cd81fb8>, get_uint32=<gi.FunctionInfo at remote 0x7f315cd81f10>, new_objv=<gi.FunctionInfo at remote 0x7f315cd81650>, get_uint16=<gi.FunctionInfo at remote 0x7f315cd81ed8>, get_byte=<gi.FunctionInfo at remote 0x7f315cd81ab0>, parse_error_print_context=<gi.FunctionInfo at remote 0x7f315cd8a378>, parse_error_quark=<gi.FunctionInfo at remote 0x7f315cd8a3b0>, new_uint16=<gi.FunctionInfo at remote 0x7f315cd81768>, get_int16=<gi.FunctionInfo at remote 0x7f315cd81c70>, equal=<gi.FunctionInfo at remote 0x7f315cd81a40>, new_signature=<gi.FunctionInfo at remot...(truncated)
    wrapper = enum_register_new_gtype_and_add(info)
#19 Frame 0x7f315cd12fa0, for file /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py, line 489, in <module> ()
    globals()[attr] = getattr(GLib.UserDirectory, 'DIRECTORY_' + n)
#31 Frame 0x7f315cd11b60, for file /usr/lib64/python2.7/importlib/__init__.py, line 37, in import_module (name='gi.overrides.GLib', package=None)
    __import__(name)
#35 Frame 0x7f315cd108f0, for file /usr/lib64/python2.7/site-packages/gi/overrides/__init__.py, line 125, in load_overrides (introspection_module=<IntrospectionModule(_namespace='GLib', __path__='/usr/lib64/girepository-1.0/GLib-2.0.typelib', Variant=<StructMeta(new_fixed_array=<gi.FunctionInfo at remote 0x7f315cd81458>, new_array=<gi.FunctionInfo at remote 0x7f315cd81298>, __module__='gi.repository.GLib', new_int64=<gi.FunctionInfo at remote 0x7f315cd815a8>, hash=<gi.FunctionInfo at remote 0x7f315cd81fb8>, get_uint32=<gi.FunctionInfo at remote 0x7f315cd81f10>, new_objv=<gi.FunctionInfo at remote 0x7f315cd81650>, get_uint16=<gi.FunctionInfo at remote 0x7f315cd81ed8>, get_byte=<gi.FunctionInfo at remote 0x7f315cd81ab0>, parse_error_print_context=<gi.FunctionInfo at remote 0x7f315cd8a378>, parse_error_quark=<gi.FunctionInfo at remote 0x7f315cd8a3b0>, new_uint16=<gi.FunctionInfo at remote 0x7f315cd81768>, get_int16=<gi.FunctionInfo at remote 0x7f315cd81c70>, equal=<gi.FunctionInfo at remote 0x7f315cd81a40>, new_sign...(truncated)
    override_mod = importlib.import_module(override_package_name)
#38 Frame 0x7f315cd403a0, for file /usr/lib64/python2.7/site-packages/gi/importer.py, line 146, in load_module (self=<DynamicImporter(path='gi.repository') at remote 0x7f315cd50b90>, fullname='gi.repository.GLib', path='gi.repository', namespace='GLib', stacklevel=4, introspection_module=<IntrospectionModule(_namespace='GLib', __path__='/usr/lib64/girepository-1.0/GLib-2.0.typelib', Variant=<StructMeta(new_fixed_array=<gi.FunctionInfo at remote 0x7f315cd81458>, new_array=<gi.FunctionInfo at remote 0x7f315cd81298>, __module__='gi.repository.GLib', new_int64=<gi.FunctionInfo at remote 0x7f315cd815a8>, hash=<gi.FunctionInfo at remote 0x7f315cd81fb8>, get_uint32=<gi.FunctionInfo at remote 0x7f315cd81f10>, new_objv=<gi.FunctionInfo at remote 0x7f315cd81650>, get_uint16=<gi.FunctionInfo at remote 0x7f315cd81ed8>, get_byte=<gi.FunctionInfo at remote 0x7f315cd81ab0>, parse_error_print_context=<gi.FunctionInfo at remote 0x7f315cd8a378>, parse_error_quark=<gi.FunctionInfo at remote 0x7f315cd8a3b0>, new_uint16=<gi.FunctionI...(truncated)
    dynamic_module = load_overrides(introspection_module)
#53 Frame 0x7f315cd3e190, for file /usr/lib64/python2.7/importlib/__init__.py, line 37, in import_module (name='gi.repository.GLib', package=None)
    __import__(name)
#57 Frame 0x7f315cd3db20, for file /usr/lib64/python2.7/site-packages/gi/importer.py, line 145, in load_module (self=<DynamicImporter(path='gi.repository') at remote 0x7f315cd50b90>, fullname='gi.repository.GObject', path='gi.repository', namespace='GObject', stacklevel=4, introspection_module=<IntrospectionModule(__name__='gi.repository.GObject', _namespace='GObject', _version='2.0', __path__='/usr/lib64/girepository-1.0/GObject-2.0.typelib') at remote 0x7f315cd50dd0>, dep='GLib-2.0')
    importlib.import_module('gi.repository.' + dep.split("-")[0])
#72 Frame 0x7f315cd35c20, for file /usr/lib64/python2.7/importlib/__init__.py, line 37, in import_module (name='gi.repository.GObject', package=None)
    __import__(name)
#76 Frame 0x7f315cd16090, for file /usr/lib64/python2.7/site-packages/gi/importer.py, line 145, in load_module (self=<DynamicImporter(path='gi.repository') at remote 0x7f315cd50b90>, fullname='gi.repository.GnomeKeyring', path='gi.repository', namespace='GnomeKeyring', stacklevel=4, introspection_module=<IntrospectionModule(__name__='gi.repository.GnomeKeyring', _namespace='GnomeKeyring', _version='1.0', __path__='/usr/lib64/girepository-1.0/GnomeKeyring-1.0.typelib') at remote 0x7f315cd50c50>, dep='GObject-2.0')
    importlib.import_module('gi.repository.' + dep.split("-")[0])
#92 Frame 0x7f315c71d900, for file /usr/lib/python2.7/site-packages/keyring/backends/Gnome.py, line 6, in <module> ()
    from gi.repository import GnomeKeyring
#104 Frame 0x7f315cc67a50, for file /usr/lib64/python2.7/importlib/__init__.py, line 37, in import_module (name='keyring.backends.Gnome', package='keyring.backends', level=1, character='G')
    __import__(name)
#108 Frame 0x7f315cc67860, for file /usr/lib/python2.7/site-packages/keyring/backend.py, line 114, in _load_backend (name='Gnome', package='keyring.backends')
    mod = importlib.import_module('.'+name, package)
#115 Frame 0x7f315cc67660, for file /usr/lib/python2.7/site-packages/keyring/backend.py, line 129, in _load_backends (backends=('file', 'Gnome', 'Google', 'keyczar', 'kwallet', 'multi', 'OS_X', 'pyfs', 'SecretService', 'Windows'))
    list(map(_load_backend, backends))
#119 Frame 0x7f315cc67470, for file /usr/lib/python2.7/site-packages/keyring/backend.py, line 137, in get_all_keyring ()
    _load_backends()
#124 Frame 0x7f315cc60c40, for file /usr/lib/python2.7/site-packages/keyring/util/__init__.py, line 24, in wrapper (args=(), kwargs={})
    func.always_returns = func(*args, **kwargs)
#128 Frame 0x7f315cc0fd80, for file /usr/lib/python2.7/site-packages/keyring/core.py, line 66, in _get_best_keyring ()
    keyrings = backend.get_all_keyring()
#131 Frame 0x7f315cc174d0, for file /usr/lib/python2.7/site-packages/keyring/core.py, line 58, in init_backend ()
    set_keyring(load_config() or _get_best_keyring())
#134 Frame 0x7f315cc08e40, for file /usr/lib/python2.7/site-packages/keyring/core.py, line 158, in <module> ()
    init_backend()
#147 Frame 0x7f315cbfc960, for file /usr/lib/python2.7/site-packages/keyring/__init__.py, line 12, in <module> ()
    from .core import (set_keyring, get_keyring, set_password, get_password,
#162 Frame 0x7f315ca45050, for file /usr/lib/python2.7/site-packages/keystoneclient/httpclient.py, line 40, in <module> ()
    import keyring
#175 Frame 0x7f315cb63b80, for file /usr/lib/python2.7/site-packages/keystoneclient/v2_0/client.py, line 20, in <module> ()
    from keystoneclient import httpclient
#188 Frame 0x7f315cb681d0, for file /usr/lib/python2.7/site-packages/keystoneclient/v2_0/__init__.py, line 2, in <module> ()
    from keystoneclient.v2_0.client import Client
#203 Frame 0x7f315c95a720, for file /usr/lib/python2.7/site-packages/keystoneclient/discover.py, line 21, in <module> ()
    from keystoneclient.v2_0 import client as v2_client
#216 Frame 0x7f315c962850, for file /usr/lib/python2.7/site-packages/keystoneclient/client.py, line 13, in <module> ()
    from keystoneclient import discover
#229 Frame 0x7f315c0df7b0, for file /usr/lib/python2.7/site-packages/keystoneclient/__init__.py, line 26, in <module> ()
    from keystoneclient import client
#244 Frame 0x7f315c945d80, for file /usr/lib/python2.7/site-packages/keystone/models/token_model.py, line 15, in <module> ()
    from keystoneclient.common import cms
#257 Frame 0x7f315c949380, for file /usr/lib/python2.7/site-packages/keystone/common/authorization.py, line 21, in <module> ()
    from keystone.models import token_model
#270 Frame 0x7f315c139300, for file /usr/lib/python2.7/site-packages/keystone/common/controller.py, line 18, in <module> ()
    from keystone.common import authorization
#283 Frame 0x7f315c141ef0, for file /usr/lib/python2.7/site-packages/keystone/assignment/controllers.py, line 26, in <module> ()
    from keystone.common import controller
#296 Frame 0x7f315c0a70a0, for file /usr/lib/python2.7/site-packages/keystone/assignment/__init__.py, line 15, in <module> ()
    from keystone.assignment import controllers  # noqa
#311 Frame 0x7f315c144e10, for file /usr/lib/python2.7/site-packages/keystone/backends.py, line 13, in <module> ()
    from keystone import assignment
#324 Frame 0x7f315c00cbd0, for file /var/www/cgi-bin/keystone/admin, line 28, in <module> ()
    from keystone import backends

Related packages:

libgnome-keyring-3.12.0-1.el7
python-keyring-4.0-1.el7ost.noarch
python-keystoneclient-0.11.1-2.el7ost.noarch
openstack-keystone-2014.2.4-1.el7ost.noarch
python-keystonemiddleware-1.3.2-3.el7ost.noarch
python-keystone-2014.2.4-1.el7ost.noarch
Comment 4 Filip Hubík 2017-07-28 10:54:11 EDT
I deployed and stopped before the playbooks that perform the Tempest configuration and run, i.e. before this step of the job:

$ ir-tester tempest -v --inventory hosts -i install.yml -o test.yml --openstack-installer packstack --openstack-version 6 --deployer-input-method local                         --deployer-input-file etc/deployer-input-juno.conf --config-options="image.http_image=http://<mirror>/images/cirros-0.3.5-x86_64-uec.tar.gz" --setup-revision=liberty --tests=sanity

httpd looks sane and restartable there, and keystone responds fine to local and even remote requests, even after:

$ systemctl restart httpd # keystone still responds as expected.

When I continue with the mentioned automation command (the Tempest stage), keystone commands usually stop responding and HTTP 500 responses are returned; neither "systemctl restart httpd" nor "reboot" helps from that point, and the deployment is left in an unusable state as described.
Comment 5 Filip Hubík 2017-07-28 10:57:24 EDT
Confirmed that Attila's workaround (https://bugzilla.redhat.com/show_bug.cgi?id=1475969#c3) also resolves this issue on my setup (packstack, nova, flatdhcp, qpid deployment).
Comment 6 Attila Fazekas 2017-07-28 12:36:18 EDT
Using keyring >= 8.0 helps.
The other releases avoid this issue in a different way, because we never shipped anything newer than python-keyring-5.7.1-1.el7ost.noarch.

12/Pike is not affected; I am unsure about the others at the moment.
Comment 7 John Dennis 2017-07-28 15:41:10 EDT
Shouldn't the component for this bug be changed from keystone to gnome-keyring?
Comment 8 Jason Joyce 2017-07-28 16:02:20 EDT
I was able to reproduce with OSP 7 using a basic packstack --allinone install.
Comment 9 Jason Joyce 2017-07-28 16:10:26 EDT
Changing the component to python-keyring.
Comment 10 Attila Fazekas 2017-07-29 10:15:44 EDT
Failed to reproduce the issue with OSP 8 (and 9), so only 6 and 7 are affected.
Comment 11 Attila Fazekas 2017-07-29 11:14:36 EDT
Using python-keystoneclient and its dependencies from OSP 8 can help.

Installing:
 python-monotonic                                 noarch                            0.3-1.el7ost                                     /python-monotonic-0.3-1.el7ost.noarch                                    12 k
Updating:
 python-keystoneclient                            noarch                            1:1.7.2-1.el7ost                                 /python-keystoneclient-1.7.2-1.el7ost.noarch                            3.0 M
 python-oslo-config                               noarch                            2:2.4.0-1.1.el7ost                               /python-oslo-config-2.4.0-1.1.el7ost.noarch                             759 k
 python-oslo-utils                                noarch                            2.8.0-1.el7ost                                   /python-oslo-utils-2.8.0-1.el7ost.noarch                                440 k
 python-stevedore                                 noarch                            1.8.0-1.el7ost                                   /python-stevedore-1.8.0-1.el7ost.noarch                                 221 k


Some of the above changes are not compatible with horizon.

Downgrading to :
python-keystoneclient-1.3.0-3.el7ost.noarch                                                                                                                                                   
python-oslo-utils-1.4.0-2.el7ost.noarch  

from:
python-keystoneclient-1.7.2-1.el7ost.noarch                                                                                                                                                       
python-oslo-utils-2.8.0-1.el7ost.noarch 

reintroduces the issue.

The actual culprit is likely one of these two packages.
Comment 12 Rob Crittenden 2017-07-31 21:28:55 EDT
One thing I noticed is that Gnome was updated to 3.22.0 in RHEL 7.4. Given that the problem seems related to loading libgnome-keyring, I wonder if that is connected.

My guess is that gnome-keyring is trying to prompt for something, but apache has closed stdin/stdout/stderr, so it hangs waiting for input that will never come.
Comment 13 Rob Crittenden 2017-07-31 22:50:36 EDT
On my install I narrowed down where it hangs in python-gobject-base (/usr/lib64/python2.7/site-packages/gi/overrides/GLib.py):

for n in ['ERR', 'HUP', 'IN', 'NVAL', 'OUT', 'PRI']:
    globals()['IO_' + n] = getattr(GLib.IOCondition, n)
    __all__.append('IO_' + n)

I feel like I'm poking in the dark here, though. To get this far I annotated the code with print statements and ran apache like:

# strace -f -o /tmp/out -s 1024 /usr/sbin/httpd -X -k start -d /etc/httpd -f /etc/httpd/conf/httpd.conf

This keeps Apache single-process and makes it easier to capture output. It gets to the point where it tries getattr(GLib.IOCondition, 'ERR'), followed by a ton of:

select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>

with the select() timeouts repeating until I kill it.
Comment 14 Rob Crittenden 2017-08-01 13:50:39 EDT
I made a little progress on a workaround. If you create /var/lib/keystone/.local/share/python_keyring/keyringrc.cfg with the contents:

[backend]
default-keyring=keyring.backends.file.EncryptedKeyring

then keystone will work with SELinux in permissive mode.

In enforcing mode httpd doesn't have read access to this file:

type=AVC msg=audit(1501609484.063:13177): avc:  denied  { open } for  pid=10111 comm="httpd" path="/var/lib/keystone/.local/share/python_keyring/keyringrc.cfg" dev="dm-0" ino=396418 scontext=system_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:keystone_var_lib_t:s0 tclass=file
type=AVC msg=audit(1501609484.063:13177): avc:  denied  { read } for  pid=10111 comm="httpd" name="keyringrc.cfg" dev="dm-0" ino=396418 scontext=system_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:keystone_var_lib_t:s0 tclass=file

A temporary workaround is to use chcon to grant access to the file:

# chcon system_u:system_r:httpd_t:s0 /var/lib/keystone/.local/share/python_keyring/keyringrc.cfg

But this is likely to be lost on the next relabel.
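A common way to make a label survive relabels is a `semanage fcontext` rule followed by `restorecon`. This is only a sketch: the target type `httpd_sys_content_t` is my assumption here, not something taken from the eventual openstack-selinux fix, and it requires the policycoreutils-python package.

```shell
# Sketch only: persist a label so future relabels keep the keyring config
# readable by httpd. The chosen type (httpd_sys_content_t) is an assumption.
semanage fcontext -a -t httpd_sys_content_t \
    '/var/lib/keystone/\.local/share/python_keyring(/.*)?'
restorecon -Rv /var/lib/keystone/.local/share/python_keyring
```
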
Comment 15 John Dennis 2017-08-02 20:52:29 EDT
Thanks to Rob Crittenden for doing the heavy lifting of figuring out that it's somehow related to libgnome-keyring.

I did some further debugging of what happens when python-keyring initializes.

It attempts to load the following backends from its backends directory:

file, Gnome, Google, keyczar, kwallet, multi, OS_X, pyfs, SecretService, Windows

Each backend Keyring class is supposed to implement a priority() method, used to sort the backends so the best one can be selected. If the priority() method raises an exception, the backend is eliminated from consideration.

The backend priority list (even with libgnome-keyring installed) is:

[keyring.backends.file.EncryptedKeyring, keyring.backends.file.PlaintextKeyring]

Gnome is skipped because its priority() checks for the existence of the DISPLAY and DBUS_SESSION_BUS_ADDRESS environment variables, which are not present.

So nothing in the Gnome backend actually executes in any meaningful way.
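The selection mechanism described above can be sketched as follows. The class names are hypothetical stand-ins, not the actual keyring source; and note that in this bug the hang happens even earlier, during the `from gi.repository import GnomeKeyring` executed at module import time, before priority() is ever reached.

```python
import operator
import os

class FileKeyring(object):
    """Stand-in for keyring.backends.file.*: always usable."""
    @classmethod
    def priority(cls):
        return 1

class GnomeKeyringBackend(object):
    """Stand-in for the Gnome backend."""
    @classmethod
    def priority(cls):
        # Mirrors the check described above: without a display and a
        # session bus, the Gnome backend removes itself from consideration.
        if 'DISPLAY' not in os.environ or 'DBUS_SESSION_BUS_ADDRESS' not in os.environ:
            raise RuntimeError('no Gnome session available')
        return 5

def best_backend(backends):
    viable = []
    for backend in backends:
        try:
            viable.append((backend.priority(), backend))
        except Exception:
            continue  # priority() raised: backend is eliminated
    # Highest priority among the survivors wins.
    return max(viable, key=operator.itemgetter(0))[1]
```

Under mod_wsgi neither environment variable is set, so the file backends are the only survivors, matching the priority list pasted above.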

Simply commenting out this import in backends/Gnome.py

        from gi.repository import GnomeKeyring

gets rid of the problem.

gi.repository is a Gnome-specific mechanism for dynamically loading Python modules. I looked at the gnome-keyring source code and it's not obvious to me what is being invoked when it is loaded as a Python module, but clearly something during the module load hangs the process.

The file /var/lib/keystone/.local/share/python_keyring/keyringrc.cfg is a python-keyring construct; as far as I can tell it has nothing to do with loading the Gnome backend, so why creating this file mitigates the hang currently eludes me.
Comment 16 Rob Crittenden 2017-08-03 09:43:35 EDT
It works because it skips the discovery process. You'll find where keyringrc.cfg is loaded and processed in keyring/core.py::load_config(). If a backend is defined there and is loadable, it is used; otherwise _get_best_keyring() is called and all available backends are examined.
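That short-circuit can be sketched like this (a Python 3 sketch, not the actual keyring 4.x source, which is Python 2 and differs in detail); the point is that when the config names a backend, the discovery step that imports every backend module, including Gnome.py, never runs:

```python
import configparser
import importlib

def load_config(path):
    """Return the backend class named in keyringrc.cfg, or None."""
    cp = configparser.ConfigParser()
    if not cp.read(path) or not cp.has_option('backend', 'default-keyring'):
        return None
    dotted = cp.get('backend', 'default-keyring')
    module_name, cls_name = dotted.rsplit('.', 1)
    return getattr(importlib.import_module(module_name), cls_name)

def _get_best_keyring():
    # In the real library this imports *every* backend module, including
    # Gnome.py -- the step that hangs under mod_wsgi.
    raise RuntimeError('discovery would import Gnome.py here')

def init_backend(path):
    # The workaround works because load_config() returns a backend and
    # discovery is never reached.
    return load_config(path) or _get_best_keyring()
```
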
Comment 17 John Dennis 2017-08-03 09:55:51 EDT
Re comment #16: thanks Rob, of course that makes sense; the discovery process is short-circuited.
Comment 18 John Dennis 2017-08-03 13:44:37 EDT
I was finally able to identify the Python call where the hang occurs by instrumenting the WSGI app to log every Python call. Here is the call sequence, followed by the mod_wsgi log messages indicating the process has stalled:

Call to _setup_methods on line 54 of /usr/lib64/python2.7/site-packages/gi/types.py from line 332 of /usr/lib64/python2.7/site-packages/gi/types.py
Call to _VariantCreator on line 87 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py from line 87 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
Call to __getattr__ on line 135 of /usr/lib64/python2.7/site-packages/gi/module.py from line 90 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
Call to __init__ on line 323 of /usr/lib64/python2.7/site-packages/gi/types.py from line 222 of /usr/lib64/python2.7/site-packages/gi/module.py
Call to _setup_fields on line 69 of /usr/lib64/python2.7/site-packages/gi/types.py from line 331 of /usr/lib64/python2.7/site-packages/gi/types.py
Call to _setup_methods on line 54 of /usr/lib64/python2.7/site-packages/gi/types.py from line 332 of /usr/lib64/python2.7/site-packages/gi/types.py
Call to Variant on line 229 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py from line 229 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
Call to __init__ on line 323 of /usr/lib64/python2.7/site-packages/gi/types.py from line 229 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
Call to deprecated_attr on line 227 of /usr/lib64/python2.7/site-packages/gi/overrides/__init__.py from line 488 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
Call to __getattr__ on line 135 of /usr/lib64/python2.7/site-packages/gi/module.py from line 489 of /usr/lib64/python2.7/site-packages/gi/overrides/GLib.py
mod_wsgi (pid=9627): Daemon process deadlock timer expired, stopping process 'keystone_main'.
mod_wsgi (pid=9627): Shutdown requested 'keystone_main'.
mod_wsgi (pid=9627): Aborting process 'keystone_main'.

Here is the relevant snippet of code from GLib.py:

# backwards compatible names from old static bindings                                                                                                                                                                                         
for n in ['DESKTOP', 'DOCUMENTS', 'DOWNLOAD', 'MUSIC', 'PICTURES',
          'PUBLIC_SHARE', 'TEMPLATES', 'VIDEOS']:
    attr = 'USER_DIRECTORY_' + n
    deprecated_attr("GLib", attr, "GLib.UserDirectory.DIRECTORY_" + n)
    globals()[attr] = getattr(GLib.UserDirectory, 'DIRECTORY_' + n)
    __all__.append(attr)

Line 489 is:

    globals()[attr] = getattr(GLib.UserDirectory, 'DIRECTORY_' + n)
Comment 19 John Dennis 2017-08-03 13:51:38 EDT
Here is what I think we should do:

Let's use Rob's workaround of creating keyringrc.cfg as described in comment #14. That avoids python-keyring's autodiscovery process, where it executes this module load:

from gi.repository import GnomeKeyring

That will get us to a working system in the shortest time.

However, we really need to fix whatever is wrong in Gnome. To that end I'm going to clone this bug and change the component to one of the Gnome components so the Gnome devs can start digging into the failure. This way we can keep the two bugs separate while each team works on its solution independently.
Comment 20 John Dennis 2017-08-03 14:24:59 EDT
To clarify where the hang occurs: it's in __getattr__ on line 135 of /usr/lib64/python2.7/site-packages/gi/module.py.
Comment 21 Nathan Kinder 2017-08-03 15:26:06 EDT
Moving this bug back to the openstack-keystone component, which is where we will apply the identified workaround.
Comment 22 John Dennis 2017-08-03 17:13:22 EDT
I figure it's worthwhile to capture this in the bug report. I had a heck of a time figuring out where things were hanging; I tried both strace and gdb, and neither allowed me to pinpoint where the hang was. This is how I instrumented the WSGI script to get the call list:

import sys

def trace_calls(frame, event, arg):
    if event != 'call':
        return
    co = frame.f_code
    func_name = co.co_name
    if func_name == 'write':
        # Ignore write() calls from print statements
        return
    func_line_no = frame.f_lineno
    func_filename = co.co_filename
    caller = frame.f_back
    if caller:
        caller_line_no = caller.f_lineno
        caller_filename = caller.f_code.co_filename
        sys.stdout.write('Call to %s on line %s of %s from line %s of %s\n' % \
        (func_name, func_line_no, func_filename,
         caller_line_no, caller_filename))
    return

sys.settrace(trace_calls)

This was added at the top of /var/www/cgi-bin/keystone/main.
Comment 25 Filip Hubík 2017-08-23 08:35:40 EDT
Verified manually using a mirror of this automated deployment. Both openstack-selinux-0.8.9-0.1.el7ost.noarch and openstack-keystone-2014.2.4-3.el7ost.noarch must be present to fix this issue.

$ systemctl restart httpd
$ keystone user-list
# output is produced and keystone is no longer stuck
$ echo $?
0
Comment 27 errata-xmlrpc 2017-09-06 13:52:34 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2666
