Bug 1218322 - Keystone auth fails after bare-metal deployment via instack-deploy-overcloud --tuskar
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ga
Target Release: Director
Assignee: Jay Dobies
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-05-04 15:27 UTC by jliberma@redhat.com
Modified: 2015-08-05 13:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-05 13:51:20 UTC
Target Upstream Version:
Embargoed:


Attachments
Output from heat stack create and instack-deploy-overcloud script (100.50 KB, text/plain)
2015-05-04 15:27 UTC, jliberma@redhat.com


Links
System: Red Hat Product Errata
ID: RHEA-2015:1549
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise Linux OpenStack Platform director Release
Last Updated: 2015-08-05 17:49:10 UTC

Description jliberma@redhat.com 2015-05-04 15:27:05 UTC
Created attachment 1021784
Output from heat stack create and instack-deploy-overcloud script

Description of problem: Keystone auth fails after bare-metal deployment via instack-deploy-overcloud --tuskar. The heat stack creation completes successfully, but the automated overcloud customization then fails with keystone auth errors.

Example:
Creating user-role assignment for user ec2, role admin, tenant service
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)
Authorization Failed: Unable to establish connection to http://172.16.2.6:5000/v2.0/tokens
[root@rhos0 ~]# heat stack-list
+--------------------------------------+------------+-----------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        |
+--------------------------------------+------------+-----------------+----------------------+
| 798160e3-db9b-4743-b53c-2967471e1f04 | overcloud  | CREATE_COMPLETE | 2015-05-04T14:49:31Z |
+--------------------------------------+------------+-----------------+----------------------+


Version-Release number of selected component (if applicable):
[root@rhos0 ~]# rpm -qa | grep openstack | grep -E 'tripleo|ironic'
openstack-ironic-common-2015.1-dev682.el7.centos.noarch
openstack-tripleo-heat-templates-0.8.4-post33.el7.centos.noarch
openstack-ironic-discoverd-1.1.0-0.99.20150429.1425git.el7.centos.noarch
openstack-tripleo-puppet-elements-0.0.0-post56.el7.centos.noarch
openstack-tripleo-image-elements-0.9.4-post7.el7.centos.noarch
openstack-ironic-conductor-2015.1-dev682.el7.centos.noarch
openstack-tripleo-0.0.6-dev1698.el7.centos.noarch
openstack-ironic-api-2015.1-dev682.el7.centos.noarch

How reproducible: Every time


Steps to Reproduce:
1. Deploy undercloud
2. Configure deploy-overcloudrc
3. Discover bare metal servers
4. instack-deploy-overcloud --tuskar

Actual results: Fails to customize environment


Expected results: Customization completes successfully


Additional info: If I ssh to the overcloud as heat-admin after the deploy and restart all OpenStack services, it temporarily works in a read-only fashion. If I attempt to create anything (e.g. a neutron network or a glance image), the auth errors return and the services must be restarted again.

Comment 3 Marios Andreou 2015-05-14 09:56:18 UTC
On the undercloud, after sourcing overcloudrc, I can't talk to the overcloud services (same connection issues as above). Restarting haproxy on the overcloud controller reliably fixes the service connectivity. Still investigating.

Comment 4 Marios Andreou 2015-05-14 11:33:01 UTC
Discussion/investigation ongoing. After a tip from derekh we increased max_conn in both haproxy and mysqld (haproxy was logging > 150, which was the previous/default max_conn). Currently stable at ~185 for the last 25 minutes or so.

Comment 5 Giulio Fidente 2015-05-14 11:45:41 UTC
astapor seems to be setting the limits at 4000 for haproxy and 1024 for galera, Jason can you confirm so we port the same values to tripleo?

Comment 6 Derek Higgins 2015-05-14 11:49:33 UTC
The key here was that connections to keystone through the VIP and haproxy were still working even when keystone commands were reporting a problem. This particular curl command was responding immediately:

$ curl http://10.8.147.22:5000/v2.0/tokens
{"error": {"message": "The resource could not be found.", "code": 404, "title": "Not Found"}}

The difference is that the curl request can't authenticate, so keystone responds before it attempts to connect to the database.
This points us at a problem with DB connections.

This probably won't happen in a virt env, because the number of SQL connections on this bare-metal env is higher: a lot of our processes scale based on the number of CPUs. The bare-metal host in question had 24 CPUs.
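
The reasoning in this comment can be turned into a two-probe check: an unauthenticated request that keystone answers without touching the database, and a real token request that needs it. The following is a minimal shell sketch of that idea, not the procedure used in this bug. The function name classify_probe, the KEYSTONE_URL variable and the token-request.json file are hypothetical; only the decision logic is executed, and the curl invocations are shown as comments.

```shell
#!/bin/sh
# classify_probe UNAUTH_CODE AUTH_CODE
#   UNAUTH_CODE: HTTP status of an unauthenticated GET on the tokens URL
#   AUTH_CODE:   HTTP status of a real token request (POST with credentials)
# curl's %{http_code} is 000 when no HTTP response was received at all.
classify_probe() {
    unauth=$1
    auth=$2
    if [ "$unauth" = "000" ]; then
        echo "endpoint unreachable: check VIP and haproxy"
    elif [ "$auth" = "200" ]; then
        echo "token issued: keystone and database path look healthy"
    elif [ "$auth" = "401" ]; then
        echo "credentials rejected: check the request, not the database"
    else
        echo "endpoint answers but token request fails: suspect database connections"
    fi
}

# How the two codes could be collected (not run here; KEYSTONE_URL and
# token-request.json are placeholders assumed for this sketch):
#   unauth=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
#       "$KEYSTONE_URL/v2.0/tokens")
#   auth=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
#       -H 'Content-Type: application/json' -d @token-request.json \
#       "$KEYSTONE_URL/v2.0/tokens")

classify_probe 404 000
```

The final call roughly mirrors the situation described here: the bare GET returns 404 immediately, while the token request gets no usable response. The classification is only a hint about where to look next; it does not prove that the database limit is the cause.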

Comment 7 Jason Guiditta 2015-05-14 13:02:55 UTC
Haproxy should have maxconn = 10000
Galera needs:
$limit_no_file    = "16384", (this in both config and passed into pcs RA)
$max_connections  = "1024",
$open_files_limit = '-1',

These galera settings should also be configurable, as different hardware may have different needs
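
To check whether a controller picked up the recommended haproxy limit, the value can be read back from the config. This is a small sketch, assuming the limit is set as a maxconn line in the global section (this report does not show where the templates place it). The function name global_maxconn and the sample fragment, including the value in its defaults section, are made up for illustration.

```shell
#!/bin/sh
# Print the maxconn value from the "global" section of an haproxy
# config read on stdin. Section keywords start in column one;
# settings inside a section are indented.
global_maxconn() {
    awk '
        /^[a-z]/ { section = $1 }
        section == "global" && $1 == "maxconn" { print $2; exit }
    '
}

# Hypothetical fragment for illustration only; a real haproxy.cfg
# contains many more settings.
sample_cfg='global
    maxconn 10000

defaults
    maxconn 3000
'

printf '%s' "$sample_cfg" | global_maxconn
```

On a controller this could be run as `global_maxconn < /etc/haproxy/haproxy.cfg`, assuming the default config path. The galera values would need a separate check against the running mysqld.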

Comment 8 Marios Andreou 2015-05-14 13:43:12 UTC
half of the fix in https://review.openstack.org/#/c/183044/1

Comment 9 Giulio Fidente 2015-05-14 14:31:19 UTC
the other half https://review.openstack.org/#/c/183046/

Or rather a third, until we make this configurable and pass the needed options to the pacemaker resource agent as well.

Comment 10 jliberma@redhat.com 2015-05-21 16:23:43 UTC
I saw this merged on May 14. Are these fixes incorporated into the latest code base? I.e., how can I test on bare metal?

Comment 11 Ronelle Landy 2015-05-22 10:44:07 UTC
CI job for Dell hw is running green atm.
So thanks to gfidente, you should be able to test this out.

Comment 14 Omri Hochman 2015-07-30 19:25:16 UTC
Verified. --tuskar has been replaced with --plan; deployment succeeds with HA and non-HA, on both virt-env and bare metal.
 
instack-undercloud-2.1.2-22.el7ost.noarch
openstack-tuskar-0.4.18-3.el7ost.noarch
python-tuskarclient-0.1.18-3.el7ost.noarch
openstack-tuskar-ui-extras-0.0.4-1.el7ost.noarch
openstack-tuskar-ui-0.3.0-13.el7ost.noarch
openstack-puppet-modules-2015.1.8-8.el7ost.noarch

Comment 16 errata-xmlrpc 2015-08-05 13:51:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

