Bug 1325475

Summary: rhel-osp-director: upgrade 7.3->8.0, that follows update 7.2->7.3, times out due to os-collect-config auth failure
Product: Red Hat OpenStack
Component: rhosp-director
Version: 8.0 (Liberty)
Target Milestone: async
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Status: CLOSED WORKSFORME
Severity: high
Priority: high
Keywords: Reopened
Reporter: Alexander Chuzhoy <sasha>
Assignee: Steven Hardy <shardy>
QA Contact: Alexander Chuzhoy <sasha>
CC: augol, dbecker, ggillies, jcoufal, jschluet, mburns, mcornea, morazi, ohochman, ramishra, rhel-osp-director-maint, sasha, sathlang, sbaker, shardy, therve, zbitter
Doc Type: Known Issue
Doc Text:
Cause: In the course of upgrading the undercloud from OSPd 7 to OSPd 8, the _member_ role is removed from the admin user because Keystone no longer uses that idiom. Trusts stored in the Heat database rely on the trustor user retaining all of their delegated roles, which includes the _member_ role.
Consequence: Heat stack updates after the undercloud upgrade fail with authentication errors.
Fix: Re-add the _member_ role to the admin user by running: openstack role add _member_ --user admin --project admin
Result: The trusts authenticate as expected.
Cloned to: 1523192 (view as bug list)
Last Closed: 2017-01-25 01:48:46 UTC
Type: Bug
Bug Blocks: 1523192    
Attachments:
- keystone logs from the undercloud
- The requested files/info.

Description Alexander Chuzhoy 2016-04-09 04:13:52 UTC
rhel-osp-director: upgrade 7.3->8.0, that follows update 7.2->7.3, fails with "ERROR: Authentication failed: Authentication required"

Environment:
openstack-tripleo-heat-templates-kilo-0.8.14-5.el7ost.noarch
openstack-puppet-modules-7.0.17-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-5.el7ost.noarch
instack-undercloud-2.2.7-2.el7ost.noarch


Steps to reproduce:
1. Deploy and populate OC 7.2
2. Update to 7.3
3. Attempt to upgrade to 8.0

Result:
Fails during the /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml step:

2016-04-08 21:21:00 [NodeTLSCAData]: UPDATE_IN_PROGRESS  state changed
2016-04-08 21:21:02 [NetIpMap]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:03 [NodeTLSCAData]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:03 [NodeTLSData]: UPDATE_IN_PROGRESS  state changed
2016-04-08 21:21:06 [NodeTLSCAData]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:06 [NodeTLSData]: UPDATE_IN_PROGRESS  state changed
2016-04-08 21:21:06 [NodeTLSCAData]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:07 [NodeTLSData]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:07 [NodeTLSData]: UPDATE_IN_PROGRESS  state changed
2016-04-08 21:21:10 [NodeTLSData]: UPDATE_COMPLETE  state changed
2016-04-08 21:21:12 [NodeTLSData]: UPDATE_COMPLETE  state changed
ERROR: Authentication failed: Authentication required


in os-collect-config[4239]: <ErrorResponse><Error><Message>The request processing has failed due to an internal error:Remote error: Forbidden Trustee has no delegate
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dis
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: /heat/common/heat_keystoneclient.py", line 155, in _v3_client_init\n    auth_ref = self.context.auth_plugin.get_access(self.session)\n',
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: + '[' 500 '!=' 200 ']'
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: + exit 1
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: [2016-04-09 04:09:11,160] (os-refresh-config) [ERROR] during post-configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-c
Apr 09 04:09:11 overcloud-controller-0.localdomain os-collect-config[4239]: [2016-04-09 04:09:11,161] (os-refresh-config) [ERROR] Aborting...


/var/log/messages has many occurrences of:
Apr  8 15:35:15 localhost os-collect-config: <ErrorResponse><Error><Message>The request processing has failed due to an internal error:Remote error: Forbidden Trustee has no delegated roles. (Disable debug mode to suppress these details.) (HTTP 403) (Request-ID: req-9f2916c0-203b-45c5-84a4-5c3f51b7c7cf)
Apr  8 15:35:15 localhost os-collect-config: [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n    executor_callback))\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n    executor_callback)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch\n    result = func(ctxt, **new_args)\n', u'  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper\n    return f(*args, **kwargs)\n', u'  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 308, in wrapped\n    return func(self, ctx, *args, **kwargs)\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 1404, in resource_signal\n    stack = parser.Stack.load(cnxt, stack=s, use_stored_context=True)\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 408, in load\n    cache_data=cache_data)\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 459, in _from_db\n    current_deps=stack.current_deps, cache_data=cache_data)\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 192, in __init__\n    \'keystone\').auth_ref.role_names\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/clients/__init__.py", line 70, in client\n    return client_plugin.client()\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/clients/client_plugin.py", line 78, in client\n    self._client = self._create()\n', u'  File "/usr/lib/python2.7/site-packages/heat/engine/clients/os/keystone.py", line 29, in _create\n    return hkc.KeystoneClient(self.context)\n', u'  File "/usr/lib/python2.7/site-packages/heat/common/heat_keystoneclient.py", line 573, in __new__\n    return KeystoneClientV3(context)\n', u'  File "/usr/lib/python2.7/site-packages/heat/common/heat_keystoneclient.py", line 84, in __init__\n    self._client = self._v3_client_init()\n', u'  File "/usr/lib/python2.7/site-packages
Apr  8 15:35:15 localhost os-collect-config: /heat/common/heat_keystoneclient.py", line 155, in _v3_client_init\n    auth_ref = self.context.auth_plugin.get_access(self.session)\n', u'  File "/usr/lib/python2.7/site-packages/keystoneclient/auth/identity/base.py", line 240, in get_access\n    self.auth_ref = self.get_auth_ref(session)\n', u'  File "/usr/lib/python2.7/site-packages/keystoneclient/auth/identity/v3/base.py", line 190, in get_auth_ref\n    authenticated=False, log=False, **rkwargs)\n', u'  File "/usr/lib/python2.7/site-packages/keystoneclient/session.py", line 501, in post\n    return self.request(url, \'POST\', **kwargs)\n', u'  File "/usr/lib/python2.7/site-packages/keystoneclient/utils.py", line 337, in inner\n    return func(*args, **kwargs)\n', u'  File "/usr/lib/python2.7/site-packages/keystoneclient/session.py", line 401, in request\n    raise exceptions.from_response(resp, method, url)\n', u'Forbidden: Trustee has no delegated roles. (Disable debug mode to suppress these details.) (HTTP 403) (Request-ID: req-9f2916c0-203b-45c5-84a4-5c3f51b7c7cf)\n'].</Message><Code>InternalFailure</Code><Type>Server</Type></Error></ErrorResponse>+ rm /tmp/tmp.MXD5RUC8mH
Apr  8 15:35:15 localhost os-collect-config: + '[' 500 '!=' 200 ']'
Apr  8 15:35:15 localhost os-collect-config: + exit 1
Apr  8 15:35:15 localhost os-collect-config: [2016-04-08 19:35:15,724] (os-refresh-config) [ERROR] during post-configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/post-configure.d']' returned non-zero exit status 1]
Apr  8 15:35:15 localhost os-collect-config: [2016-04-08 19:35:15,725] (os-refresh-config) [ERROR] Aborting...
Apr  8 15:35:15 localhost os-collect-config: 2016-04-08 19:35:15.733 4239 ERROR os-collect-config [-] Command failed, will not cache new data. Command 'os-refresh-config' returned non-zero exit status 1


Expected result: successful upgrade

Comment 3 Mike Burns 2016-04-11 13:36:33 UTC
This appears to be due to a different bug where the nova hostname got changed.

*** This bug has been marked as a duplicate of bug 1324739 ***

Comment 4 Mike Burns 2016-04-11 13:39:35 UTC
Sorry, closed the wrong bz.

Comment 5 Zane Bitter 2016-04-11 15:05:33 UTC
Can you attach the keystone log from the undercloud? (Heat log wouldn't hurt either, although you already pasted the most interesting part.)

Comment 6 Alexander Chuzhoy 2016-04-11 15:23:55 UTC
Created attachment 1146031 [details]
keystone logs from the undercloud

Comment 8 Thomas Hervé 2016-04-12 09:10:40 UTC
Is it possible that roles changed during the upgrade? This error happens if a trust was created for a user and then one or more roles were removed from that user. As Heat delegates all roles by default, if one role is missing, Keystone then fails to authenticate the trust.
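
A minimal reproduction sketch of that failure mode with the openstack CLI (the trustor/trustee user names are illustrative, not from this environment, and the trustee's credentials are assumed to be set in the environment):

# create a trust delegating both of the trustor's roles (illustrative names)
openstack trust create --project admin --role admin --role _member_ trustor-user trustee-user

# simulate what the upgrade does: remove one delegated role from the trustor
openstack role remove _member_ --user trustor-user --project admin

# consuming the trust now fails, because the trust still references _member_
OS_TRUST_ID=<trust-id> openstack token issue
# => Forbidden: Trustee has no delegated roles. (HTTP 403)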

Comment 9 Zane Bitter 2016-04-14 20:54:22 UTC
From the log:

Using the keystone_authtoken user as the heat trustee user directly is deprecated. Please add the trustee credentials you need to the trustee section of your heat.conf file.
Making authentication request to http://192.0.2.1:5000/v3/auth/tokens get_auth_ref /usr/lib/python2.7/site-packages/keystoneclient/auth/identity/v3/base.py:188
Request returned failure status: 403 request /usr/lib/python2.7/site-packages/keystoneclient/session.py:400
Exception during message handling: Trustee has no delegated roles. (Disable debug mode to suppress these details.) (HTTP 403) (Request-ID: req-0be8b5b3-8bdc-4ae1-9dff-b27ac9d2ac61) (HTTP 403)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
    executor_callback))
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
    executor_callback)
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 308, in wrapped
    return func(self, ctx, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 1404, in resource_signal
    stack = parser.Stack.load(cnxt, stack=s, use_stored_context=True)
  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 408, in load
    cache_data=cache_data)
  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 459, in _from_db
    current_deps=stack.current_deps, cache_data=cache_data)
  File "/usr/lib/python2.7/site-packages/heat/engine/stack.py", line 192, in __init__
    'keystone').auth_ref.role_names
  File "/usr/lib/python2.7/site-packages/heat/engine/clients/__init__.py", line 70, in client
    return client_plugin.client()
  File "/usr/lib/python2.7/site-packages/heat/engine/clients/client_plugin.py", line 78, in client
    self._client = self._create()
  File "/usr/lib/python2.7/site-packages/heat/engine/clients/os/keystone.py", line 29, in _create
    return hkc.KeystoneClient(self.context)
  File "/usr/lib/python2.7/site-packages/heat/common/heat_keystoneclient.py", line 573, in __new__
    return KeystoneClientV3(context)
  File "/usr/lib/python2.7/site-packages/heat/common/heat_keystoneclient.py", line 84, in __init__
    self._client = self._v3_client_init()
  File "/usr/lib/python2.7/site-packages/heat/common/heat_keystoneclient.py", line 155, in _v3_client_init
    auth_ref = self.context.auth_plugin.get_access(self.session)
  File "/usr/lib/python2.7/site-packages/keystoneclient/auth/identity/base.py", line 240, in get_access
    self.auth_ref = self.get_auth_ref(session)
  File "/usr/lib/python2.7/site-packages/keystoneclient/auth/identity/v3/base.py", line 190, in get_auth_ref
    authenticated=False, log=False, **rkwargs)
  File "/usr/lib/python2.7/site-packages/keystoneclient/session.py", line 501, in post
    return self.request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneclient/utils.py", line 337, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneclient/session.py", line 401, in request
    raise exceptions.from_response(resp, method, url)
Forbidden: Trustee has no delegated roles. (Disable debug mode to suppress these details.) (HTTP 403) (Request-ID: req-0be8b5b3-8bdc-4ae1-9dff-b27ac9d2ac61)

So it seems that the trustee (i.e. the heat user) is lacking some role it requires. It also appears that we are using the keystone_authtoken user for this purpose, and that this is deprecated. I wonder if there is some change in the configuration that we were supposed to make when upgrading to Liberty, but didn't. Steve, any ideas?

Comment 10 Rabi Mishra 2016-04-15 00:54:21 UTC
While creating a trust-scoped token for the trustee (during stack initialization with a stored context), keystone matches the roles in the trust against the trustor's current roles [1]. It seems to be failing there.

I suspect this could be due to either of the following:

a. Keystone v3 does not create the '_member_' role by default. If the upgrade to 8.0 involves a migration to v3, the trust probably carries this role whereas the trustor no longer has it.

b. The trustor's roles have changed and no longer match the trust's roles (for the trust_id in the stored context).


[1] https://github.com/openstack/keystone/blob/master/keystone/token/providers/common.py#L374-L383
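
A quick way to make that comparison by hand (a sketch; the trust ID placeholder has to come from the heat DB, see comment 13 below):

# roles frozen into the trust at creation time (<trust-id> is a placeholder)
openstack trust show <trust-id>

# roles the trustor currently holds; any role present in the trust but missing
# here makes keystone reject the trust-scoped token with HTTP 403
openstack role list --user admin --project admin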

Comment 11 Steven Hardy 2016-04-15 12:10:14 UTC
> So it seems that the trustee (i.e. the heat user) is lacking some role it requires. 

So, this isn't my interpretation of the error, because the scope of the delegation (e.g. what roles are delegated) is defined by the trust, of which we only hold the ID; all trust->role mapping is done inside keystone.

Thus if trustor "steve" has role "a", and it's delegated via trust 123 to trustee "heat", it shouldn't matter what roles "heat" has, only that "steve" has role "a", and that it exists at the time of delegation to "heat".

I suspect Rabi is on the right track here, in that we created the trust with a v2 issued token, which probably means we delegated all roles including _member_, then we upgraded, so we need to confirm all roles delegated via the trust are still available when requesting a trust-scoped token via the v3 API.

I'll try to make a small reproducer that proves/disproves the theory described by Rabi (unless he's already done so).

> It also appears that we are using the keystone_authtoken user for this purpose, and that this is deprecated. I wonder if there is some change in the configuration that we were supposed to make when upgrading to Liberty, but didn't. Steve, any ideas?

Yes, I fixed it in devstack and it's now also fixed in puppet-heat:

https://review.openstack.org/#/c/254755
https://review.openstack.org/#/c/261398/

https://review.openstack.org/#/c/261326/

I don't think that is related to this problem though; it's only a warning, and the resulting trustee user will be the same.

Comment 12 Steven Hardy 2016-04-15 12:49:26 UTC
So, here's a small script which will list all the trusts, and the roles they delegate.  If we can get access to a platform with this problem, we can get the trust ID heat is trying to use, filter these results, and compare the roles in the trust with those in the ROLES print below.


from keystoneclient.v3 import client

# Placeholders: point these at the undercloud keystone and its admin credentials.
OS_AUTH_URL_V3 = 'http://192.168.1.13:5000/v3'
OS_ENDPOINT_V3 = 'http://192.168.1.13:5000/v3'

USERNAME = 'admin'
PASSWORD = 'foobar'
TENANT_NAME = 'admin'
DEBUG = True

# Create a v3 client with username/password
c = client.Client(debug=DEBUG,
                  username=USERNAME,
                  password=PASSWORD,
                  tenant_name=TENANT_NAME,
                  auth_url=OS_AUTH_URL_V3,
                  endpoint=OS_ENDPOINT_V3)
c.authenticate()

# Print every trust and the roles it delegates, then every role keystone
# knows about, so the two can be compared for the trust_id heat is using.
for t in c.trusts.list():
    the_t = c.trusts.get(t)
    print "SHDEBUG trust get=%s" % the_t
    print "SHDEBUG trust roles=%s" % the_t.roles
    print "---"
print "SHDEBUG ROLES=%s" % c.roles.list()

Comment 13 Steven Hardy 2016-04-15 12:56:13 UTC
Note we need to run that script, with credentials/IP adjusted appropriately, on the undercloud after this problem has occurred - ideally correlating with the actual trust_id, which we can obtain by querying the user_creds table of the heat DB, by adding a line of debug to the heat code that shows us the trust_id, or by turning on debug logging and checking whether the keystone logs contain it.
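
For the user_creds option, a sketch of the query (assuming root access to the stock undercloud MariaDB; the table and column names are from heat's schema):

# trust IDs heat has stored for its stacks, plus the trustor each one expects
sudo mysql heat -e "SELECT id, trust_id, trustor_user_id FROM user_creds WHERE trust_id IS NOT NULL;"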

Comment 14 Steven Hardy 2016-04-15 13:03:32 UTC
If anyone attempts to reproduce this, please:

1. Collect all the roles including their IDs from keystone before/after the upgrade, either using CLI tools or the script above

2. Collect before/after heat.conf files from the undercloud.

Note we need these before/after the point of re-running "openstack undercloud install" during the undercloud upgrade, not the overcloud stack upgrade; a sketch of the collection steps follows.
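
A sketch of that collection on the undercloud (file paths are illustrative):

. stackrc
openstack role list > /tmp/roles.before.txt    # role names and IDs pre-upgrade
sudo cp /etc/heat/heat.conf /tmp/heat.conf.before

# ... re-run "openstack undercloud install" as part of the upgrade ...

openstack role list > /tmp/roles.after.txt
sudo cp /etc/heat/heat.conf /tmp/heat.conf.after
diff /tmp/roles.before.txt /tmp/roles.after.txt
diff /tmp/heat.conf.before /tmp/heat.conf.after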

Comment 15 Alexander Chuzhoy 2016-04-16 04:48:04 UTC
Created attachment 1147843 [details]
The requested files/info.

Comment 16 Thomas Hervé 2016-04-16 09:22:51 UTC
I don't see anything in it. Can we get access to a running environment?

Comment 17 Steven Hardy 2016-04-18 09:34:42 UTC
The debug data does show we're trying to delegate _member_, so we need to prove whether getting a trust-scoped token from v3 keystone with _member_ delegated actually works - the before/after heat.conf files should also help with attempting an accurate reproduction. A sketch of such a check follows.
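
A sketch of that check against the v3 API directly (the trustee user name, password, and trust ID are placeholders; the request body is the standard v3 OS-TRUST scope format, and the endpoint is the one from the log in comment 9):

# request a trust-scoped token as the trustee; HTTP 201 means the delegation
# still resolves, "Trustee has no delegated roles" (HTTP 403) means it does not
curl -si http://192.0.2.1:5000/v3/auth/tokens -H "Content-Type: application/json" -d '
{"auth": {"identity": {"methods": ["password"],
                       "password": {"user": {"name": "heat",
                                             "domain": {"id": "default"},
                                             "password": "<trustee-password>"}}},
          "scope": {"OS-TRUST:trust": {"id": "<trust-id>"}}}}'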

Comment 18 Steven Hardy 2016-04-18 14:53:29 UTC
We don't change the undercloud token provider to fernet as part of the upgrade do we?

https://review.openstack.org/#/c/278693/

Comment 19 Steven Hardy 2016-04-18 15:46:02 UTC
So, therve got access to an environment suffering from this issue, and noticed that we delete the _member_ role assignment for all users.  This is probably happening due to puppet-keystone getting reapplied during the upgrade process on the undercloud.

I reproduced this locally and raised an upstream bug to further investigate:

https://bugs.launchpad.net/tripleo/+bug/1571708

Comment 20 Thomas Hervé 2016-04-18 16:19:08 UTC
I got access to the platform with the problem, and the admin user doesn't have the _member_ role anymore. It's removed somewhere in the process. I found this in the keystone logs:

DELETE http://192.0.2.1:35357/v3/projects/5a2e8afcd79b440c9ae13d43d6488f71/users/618f9860e14c49ec93add932332847a9/roles/9fe2ff9ee4384b1894a90878d3e92bab

5a2e8afcd79b440c9ae13d43d6488f71 is the admin tenant

618f9860e14c49ec93add932332847a9 is the admin user

9fe2ff9ee4384b1894a90878d3e92bab is the _member_ role.

Around that request we can see that the role seems to be removed from all the other service users. We should identify which part of the upgrade does that, as I think it's the wrong thing to do.

One possible preventive workaround is to set trusts_delegated_roles to admin in the undercloud beforehand.

Otherwise, it would be nice to be able to regenerate a stack's trust with a Heat API call. It seems bogus that a stack breaks once a role is removed from a user.
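
A sketch of that preventive setting (crudini is assumed to be available, as it usually is on the undercloud; comment 21 below explains why this does not help trusts that already exist):

# delegate only the admin role in trusts created from now on
sudo crudini --set /etc/heat/heat.conf DEFAULT trusts_delegated_roles admin
sudo systemctl restart openstack-heat-engine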

Comment 21 Steven Hardy 2016-04-18 21:06:24 UTC
> I can think of one possible workaround beforehand, which is to set trusts_delegated_roles to admin in the undercloud beforehand.

This won't work because the trusts referenced from the heat DB (e.g. for an existing overcloud deployment) will still reference the old setting (which is to delegate all roles, including _member_).

I agree having a way to update the trust stored by heat would be good, but we'd have to be careful to just update the current user_creds record, because nested stacks all reference the same record.

Anyway, I discussed this with EmilienM and chem in #tripleo: it is puppet that deletes the role assignments, because puppet has no knowledge of the special _member_ role that used to be created by keystone directly in the DB.


We agreed the least-bad solution was to detect when the _member_ role is present and pass a boolean in so puppet can append the _member_ role and maintain the assignment to the admin user (investigation shows the role itself is left intact, so we just need puppet to leave the assignment to the user alone).

Emilien and I worked on this patch (only partially tested so far, feedback welcome):

https://review.openstack.org/#/c/307352/

Comment 22 Steven Hardy 2016-04-19 09:02:27 UTC
To clarify the workaround for anyone hitting this: if you re-add the _member_ role to the admin user in the admin project, all should be OK (you'll have to do this every time you re-run "openstack undercloud install" until we land the patch above).

Comment 23 Steve Baker 2016-05-04 21:24:28 UTC
Assigned to shardy to document the workaround in the Doc Text

Comment 24 Steven Hardy 2016-05-12 14:19:23 UTC
I was expecting us to land https://review.openstack.org/#/c/307352/ instead of just documenting the manual workaround, but we can do both I guess if anyone reviews that patch.

Comment 25 Steven Hardy 2016-05-12 14:43:16 UTC
To clarify, the proposed manual workaround is:

openstack role add _member_ --user admin --project admin

This re-adds the _member_ role that heat needs and that puppet erroneously removes. It will be necessary to re-add it every time "openstack undercloud install" is re-run, until/unless we land the workaround patch referenced above.


[stack@instack ~]$ . stackrc 
[stack@instack ~]$ openstack role list
+----------------------------------+-----------------+
| ID                               | Name            |
+----------------------------------+-----------------+
| 3d4119f0c547490390e7176168d3b9f9 | admin           |
| 495e20aa15ef4abdbaff7082bf75e6fb | heat_stack_user |
| 949f82aa1237461fab542ca916e7f7bf | ResellerAdmin   |
| 9fe2ff9ee4384b1894a90878d3e92bab | _member_        |
| a0b57f52c98844d2a3a977bac8e2ce03 | swiftoperator   |
+----------------------------------+-----------------+
[stack@instack ~]$ openstack role list --user admin --project admin
+----------------------------------+-------+---------+-------+
| ID                               | Name  | Project | User  |
+----------------------------------+-------+---------+-------+
| 3d4119f0c547490390e7176168d3b9f9 | admin | admin   | admin |
+----------------------------------+-------+---------+-------+
[stack@instack ~]$ openstack role add _member_ --user admin --project admin
+-------+----------------------------------+
| Field | Value                            |
+-------+----------------------------------+
| id    | 9fe2ff9ee4384b1894a90878d3e92bab |
| name  | _member_                         |
+-------+----------------------------------+
[stack@instack ~]$ openstack role list --user admin --project admin
+----------------------------------+----------+---------+-------+
| ID                               | Name     | Project | User  |
+----------------------------------+----------+---------+-------+
| 3d4119f0c547490390e7176168d3b9f9 | admin    | admin   | admin |
| 9fe2ff9ee4384b1894a90878d3e92bab | _member_ | admin   | admin |
+----------------------------------+----------+---------+-------+

If this workaround can be confirmed working, we can update the doc-text.

Comment 28 Alexander Chuzhoy 2016-05-13 14:25:08 UTC
Steven,
So I see that after implementing the step in comment #25, the upgrade process advanced to the next step, the one including /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml.

There it failed after 4 hours with:
ERROR: Authentication failed: Authentication required


[stack@instack ~]$ heat resource-list -n5 overcloud|grep -v COMPLE
+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+---------------------+-----------------------------------------------------------------------------------------------+
| resource_name                              | physical_resource_id                          | resource_type                                     | resource_status    | updated_time        | stack_name                                                                                    |
+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+---------------------+-----------------------------------------------------------------------------------------------+
| UpdateWorkflow                             | 78b53603-4d9d-4780-b67b-cce2051a4f5e          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED      | 2016-05-12T22:48:18 | overcloud                                                                                     |
| ControllerPacemakerUpgradeDeployment_Step1 | 6e1c1888-e21d-40ac-9169-3314636f64ac          | OS::Heat::SoftwareDeploymentGroup                 | CREATE_FAILED      | 2016-05-12T22:48:39 | overcloud-UpdateWorkflow-msi5ojo7gp2z                                                         |
| 0                                          | 1635f2c5-d3b0-4653-be66-175cd7263c9d          | OS::Heat::SoftwareDeployment                      | CREATE_IN_PROGRESS | 2016-05-12T22:48:47 | overcloud-UpdateWorkflow-msi5ojo7gp2z-ControllerPacemakerUpgradeDeployment_Step1-4xqf2ykvzqht |
| 0                                          | 3f673a67-3f82-4694-b9e3-fdb0b1e77330          | OS::Heat::StructuredDeployment                    | UPDATE_FAILED      | 2016-05-12T22:48:55 | overcloud-ControllerAllNodesDeployment-jknfu2pzvm33                                           |
| ControllerAllNodesDeployment               | e053de98-a315-4d36-8eb2-d5bb4941a2a3          | OS::Heat::StructuredDeployments                   | UPDATE_FAILED      | 2016-05-12T22:48:55 | overcloud                                                                                     |
+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+---------------------+-----------------------------------------------------------------------------------------------+


The pcs cluster doesn't run on the controllers.
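
For reference, a sketch of how that cluster state can be checked on a controller (standard pcs commands, not output from this report):

# "cluster is not currently running on this node" confirms the symptom
sudo pcs status

# if the failed upgrade step simply left the cluster stopped, it can be restarted
sudo pcs cluster start --all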

Comment 29 Zane Bitter 2016-06-01 16:05:18 UTC
Is there anything in the log to suggest what is causing the timeout now?

Comment 30 Alexander Chuzhoy 2016-06-01 19:53:59 UTC
Zane,
I believe applying the fix from comment #25 should do the trick. We just need to have it landed.

Comment 31 Zane Bitter 2016-06-01 20:14:31 UTC
Sorry, I misunderstood.

Since the workaround is working for you, I've added the doc_text as a Known Issue. The upstream patch hasn't merged yet; I'll see what I can do about pushing that forward.

Comment 32 Steven Hardy 2016-07-13 09:39:01 UTC
The doc text looks good. I'll see if I can push the upstream patch forward, as it automates that workaround; basically it stalled due to lack of review feedback.

Comment 33 Sofer Athlan-Guyot 2016-12-16 12:47:21 UTC
The upstream patch has been merged in 5.0.0.0rc2. Is this open bug still relevant? Moving ON_QA.

Comment 34 Omri Hochman 2016-12-20 20:07:37 UTC
(In reply to Sofer Athlan-Guyot from comment #33)
> The upstream patch has been merged in 5.0.0.0rc2. Is this open bug still
> relevant? Moving ON_QA.



(In reply to Steven Hardy from comment #24)
> I was expecting us to land https://review.openstack.org/#/c/307352/ instead
> of just documenting the manual workaround, but we can do both I guess if
> anyone reviews that patch.



I checked the file:
/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py

The patch from comment #24 doesn't seem to be included in the OSP 8 deployed environment.

Environment: 
------------
instack-undercloud-2.2.7-8.el7ost.noarch
openstack-puppet-modules-7.1.5-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
puppet-3.6.2-4.el7sat.noarch
openstack-heat-engine-5.0.1-9.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-23.el7ost.noarch
openstack-heat-templates-0-0.1.20151019.el7ost.noarch
heat-cfntools-1.2.8-2.el7.noarch
openstack-heat-api-cfn-5.0.1-9.el7ost.noarch
openstack-heat-common-5.0.1-9.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-23.el7ost.noarch
python-heatclient-1.0.0-1.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-9.el7ost.noarch
openstack-heat-api-5.0.1-9.el7ost.noarch

Comment 35 Omri Hochman 2016-12-20 20:10:24 UTC
(In reply to Sofer Athlan-Guyot from comment #33)
> The upstream patch has been merged in 5.0.0.0rc2. Is this open bug still
> relevant? Moving ON_QA.

I'm not sure this bug is even still relevant. It seems unlikely that there are still customers who will update 7.2 -> 7.3 and then upgrade to OSP 8, since these are very old environments.

Comment 40 Sofer Athlan-Guyot 2017-01-25 01:48:46 UTC
Hi,

Closing this issue as it no longer seems relevant. It can always be reopened if needed.

Regards,

Comment 41 Sofer Athlan-Guyot 2017-02-09 17:14:38 UTC
*** Bug 1361746 has been marked as a duplicate of this bug. ***