Bug 1235908 - Heat error when deploying with network isolation enabled
Summary: Heat error when deploying with network isolation enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ga
Target Release: Director
Assignee: Ryan Brown
QA Contact: Marius Cornea
URL:
Whiteboard:
Duplicates: 1237221 1237318 1238133
Depends On:
Blocks:
Reported: 2015-06-26 05:14 UTC by Dan Sneddon
Modified: 2019-11-14 06:46 UTC

Fixed In Version: instack-undercloud-2.1.2-9.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the keystone token expired while Orchestration worked to deploy the Overcloud, resulting in the deployment failing with an authentication error. With this release, the token timeout has been increased, resulting in the successful deployment of the Overcloud.
Clone Of:
Environment:
Last Closed: 2015-08-05 13:56:47 UTC
Target Upstream Version:


Attachments
heat-api.log and "grep -i error /var/log/heat/heat-engine.log" (480.79 KB, application/x-gzip)
2015-06-26 05:14 UTC, Dan Sneddon
keystone.log (11.51 MB, text/plain)
2015-06-26 18:23 UTC, Omri Hochman


Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 237935 None None None Never
Red Hat Product Errata RHEA-2015:1549 normal SHIPPED_LIVE Red Hat Enterprise Linux OpenStack Platform director Release 2015-08-05 17:49:10 UTC

Internal Links: 1236167

Description Dan Sneddon 2015-06-26 05:14:43 UTC
Created attachment 1043383
heat-api.log and "grep -i error /var/log/heat/heat-engine.log"

Description of problem:
I did a bare metal install with network isolation enabled. It got most of the way through deployment, but during the stage where it was going through the network isolation templates I got a weird Heat error.

This happened when I ran "openstack overcloud deploy" with --use-tripleo-heat-templates and with --plan-uuid

heat-engine.log

Version-Release number of selected component (if applicable):
2015-06-25.2 poodle

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with network isolation.

Actual results:

openstack overcloud deploy --plan-uuid=<UUID> returned:
ERROR: openstack ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1


Expected results:
The overcloud should finish deploying.

Additional info:
I attached heat-api.log and every line in heat-engine.log that contained "error".

Comment 3 Dan Sneddon 2015-06-26 06:28:44 UTC
Quick way to enable network isolation:

openstack overcloud deploy -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml --plan-uuid <UUID>

Comment 4 Ryan Brown 2015-06-26 13:13:03 UTC
Seems the initial error is "Unknown Property ControlPlaneNetwork" which indicates that the templates are getting parameters they aren't set up for.

Comment 5 Dan Sneddon 2015-06-26 15:18:29 UTC
(In reply to Ryan Brown from comment #4)
> Seems the initial error is "Unknown Property ControlPlaneNetwork" which
> indicates that the templates are getting parameters they aren't set up for.

No, that is actually a different bug entirely: https://bugzilla.redhat.com/show_bug.cgi?id=1235848

Simply enabling network isolation using the two -e parameters I outlined above does not produce "Unknown Property ControlPlaneNetwork"; it produces the Heat errors I encountered.

Further testing after I filed this bug revealed that I was hitting this Heat error whether or not I was using network isolation. It also happened whether I was using Heat or Tuskar. The poodle I was testing on yesterday was actually 2015-06-25.8, but I understand that today's poodles are successfully deploying so I am going to test again today.

Comment 6 Omri Hochman 2015-06-26 18:18:19 UTC
Reproduced. The deployment command was:

openstack overcloud deploy --plan-uuid db6ec6dc-762a-43ac-a36c-7631baa37996 --control-scale 3 --compute-scale 1 --ceph-storage-scale 1 --block-storage-scale 0 --swift-storage-scale 0 -e network_environment.yaml --debug


console-output : 
----------------
VR": "False", "Compute-1::NeutronL3HA": "True", "Cinder-Storage-1::Image": "overcloud-full", "Controller-1::KeystoneCACertificate": "-----BEGIN CERTIFICATE-----\nMIIDNzCCAh+gAwIBAgIBATANBgkqhkiG9w0BAQUFADBTMQswCQYDVQQGEwJYWDEO\nMAwGA1UECBMFVW5zZXQxDjAMBgNVBAcTBVVuc2V0MQ4wDAYDVQQKEwVVbnNldDEU\nMBIGA1UEAxMLS2V5c3RvbmUgQ0EwHhcNMTUwNjI2MTQ0MDA5WhcNMjUwNjIzMTQ0\nMDA5WjBTMQswCQYDVQQGEwJYWDEOMAwGA1UECBMFVW5zZXQxDjAMBgNVBAcTBVVu\nc2V0MQ4wDAYDVQQKEwVVbnNldDEUMBIGA1UEAxMLS2V5c3RvbmUgQ0EwggEiMA0G\nCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDQiy2dMzVInMWuY3hm34HAkvbt0ruG\nzvSG1IpF2TRGXZrq7mNYbVCXvmuV1DKSEEEjJN4yn0nw8bED80KLRqZJEWTm7aXF\n/CeRSf90SJJFtkaiayRWZU00VAdNIiNfrEYNslwOScux+UKJglWlDEpalCYdZQAm\nJWcEqB40MnyeZkAuSq76XqOxa3qBCRLvd0t8/y6y2A7tsctK8NSYLfdoeK5lFyLk\nEZeNMmGr3SOxiWNTc8d8Ij4XPmpXfiTB6Nl+5uMa7mMultHNBKmEQ1dit/ua2uot\nQlwAv2cEqdg04ZZzV+MNbPtZZRnICGQezdZUOAaN6biSIAoBYnElCispAgMBAAGj\nFjAUMBIGA1UdEwEB/wQIMAYBAf8CAQAwDQYJKoZIhvcNAQEFBQADggEBAC58igSu\naeAkWbGX8QjY5dOoMVvXV2pEO4BVWimvuqjJABinDY/SJRZ/mNE6DsFqLVvGz39t\nDZpnFu/XgK0hTOuXc3M7cxv7KJ6KM2/IGLwoayRsMS6wRCIGIwHPd+jFAMCNvtNX\nEC9HQOpRQnIWZZZp9rV6jgvkiueLex56LNOTeKdnfswIDI7EANqJr0E30mBXwjUN\nUTvTg0MyvIwQ+kw92R9tu6HeDLXxyUHJJXombNGj9V2igd6M/RfmS+Ukqu4m8o99\nBFMoAWoVutpeFDKUlpgwjKrLbwx9cvdAW9p+UWaF4peOsciwGZIjy1uCmp4KCMxI\nEqUR6UN/o6SiPwc=\n-----END CERTIFICATE-----\n", "Compute-1::NeutronPassword": "******", "Cinder-Storage-1::ServiceNetMap": "{\"NovaVncProxyNetwork\": \"internal_api\", \"NeutronTenantNetwork\": \"tenant\", \"NovaApiNetwork\": \"internal_api\", \"CeilometerApiNetwork\": \"internal_api\", \"SwiftMgmtNetwork\": \"storage_mgmt\", \"MemcachedNetwork\": \"internal_api\", \"RabbitMqNetwork\": \"internal_api\", \"KeystoneAdminApiNetwork\": \"internal_api\", \"SwiftProxyNetwork\": \"storage\", \"CinderApiNetwork\": \"internal_api\", \"CephClusterNetwork\": \"storage_mgmt\", \"NovaMetadataNetwork\": \"internal_api\", \"RedisNetwork\": \"internal_api\", \"NeutronApiNetwork\": \"internal_api\", \"GlanceApiNetwork\": 
\"storage\", \"KeystonePublicApiNetwork\": \"internal_api\", \"HeatApiNetwork\": \"internal_api\", \"GlanceRegistryNetwork\": \"internal_api\", \"MysqlNetwork\": \"internal_api\", \"CephPublicNetwork\": \"storage\", \"MongoDbNetwork\": \"internal_api\", \"HorizonNetwork\": \"internal_api\", \"CinderIscsiNetwork\": \"storage\"}", "Controller-1::NeutronPublicInterfaceRawDevice": "", "Controller-1::AdminPassword": "******", "Controller-1::CeilometerMeteringSecret": "******", "Cinder-Storage-1::RabbitPassword": "guest", "Controller-1::EnableCephStorage": "False", "Cinder-Storage-1::CinderPassword": "******", "Controller-1::CinderBackendConfig": "{}", "Compute-1::SnmpdReadonlyUserName": "ro_snmp_user", "OS::stack_name": "overcloud", "Cinder-Storage-1::GlancePort": "9292", "Controller-1::NeutronPublicInterface": "nic1", "Compute-1::Debug": "", "Cinder-Storage-1::NtpServer": "", "Swift-Storage-1::MountCheck": "False", "Ceph-Storage-1::UpdateIdentifier": "", "Compute-1::CeilometerComputeAgent": "", "Compute-1::NeutronBridgeMappings": "datacentre:br-ex", "Cinder-Storage-1::GlanceProtocol": "http", "Controller-1::EnableGalera": "True", "ObjectStorageHostnameFormat": "%stackname%-objectstorage-%index%", "Controller-1::AdminToken": "******", "Controller-1::SwiftMountCheck": "False", "Controller-1::UpdateIdentifier": "", "Controller-1::NeutronNetworkVLANRanges": "datacentre:1:1000", "Controller-1::GlanceBackend": "rbd", "Compute-1::RabbitClientPort": "5672", "Controller-1::ExtraConfig": "{}", "Controller-1::CinderEnableRbdBackend": "True", "Cinder-Storage-1::RabbitClientUseSSL": "False", "Compute-1::NeutronPhysicalBridge": "br-ex", "Compute-1::NeutronEnableTunnelling": "True", "Compute-1::NovaComputeExtraConfig": "{}", "Compute-1::RabbitPassword": "******"}, "id": "2878a87c-f316-4544-ba15-a739ad10e1dc", "template_description": "No description"}}

DEBUG: heatclient.common.http curl -g -i -X GET -H 'X-Auth-Token: {SHA1}f610a16b7b98a767bb3fa7600adf2c0bd1850d1c' -H 'Content-Type: application/json' -H 'X-Auth-Url: http://192.0.2.1:5000/v2.0' -H 'Accept: application/json' -H 'User-Agent: python-heatclient' http://192.0.2.1:8004/v1/517f9f1ef9a34580b74400643938f4f3/stacks/overcloud
DEBUG: heatclient.common.http 
HTTP/1.1 401 Unauthorized
content-length: 23
www-authenticate: Keystone uri='http://192.0.2.1:5000/v2.0'
connection: keep-alive
date: Fri, 26 Jun 2015 18:12:44 GMT
content-type: text/plain
x-openstack-request-id: req-7477d674-f7d6-4645-9d54-645a0f95af5c

Authentication required

ERROR: openstack ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 667, in take_action
    self._deploy_tuskar(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 456, in _deploy_tuskar
    self._heat_deploy(stack, overcloud_yaml, parameters, environments)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 354, in _heat_deploy
    orchestration_client, "overcloud")
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/utils.py", line 147, in wait_for_stack_ready
    stack = orchestration_client.stacks.get(stack_name)
  File "/usr/lib/python2.7/site-packages/heatclient/v1/stacks.py", line 202, in get
    resp, body = self.client.json_request('GET', '/stacks/%s' % stack_id)
  File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 265, in json_request
    resp = self._http_request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 217, in _http_request
    'content': resp.content
HTTPUnauthorized: ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required
DEBUG: openstackclient.shell clean_up DeployOvercloud
DEBUG: openstackclient.shell got an error: ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required
ERROR: openstackclient.shell Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 176, in run
    return super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 230, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 667, in take_action
    self._deploy_tuskar(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 456, in _deploy_tuskar
    self._heat_deploy(stack, overcloud_yaml, parameters, environments)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 354, in _heat_deploy
    orchestration_client, "overcloud")
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/utils.py", line 147, in wait_for_stack_ready
    stack = orchestration_client.stacks.get(stack_name)
  File "/usr/lib/python2.7/site-packages/heatclient/v1/stacks.py", line 202, in get
    resp, body = self.client.json_request('GET', '/stacks/%s' % stack_id)
  File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 265, in json_request
    resp = self._http_request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 217, in _http_request
    'content': resp.content
HTTPUnauthorized: ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required

[stack@instack ~]$

Comment 7 Omri Hochman 2015-06-26 18:23:13 UTC
Created attachment 1043634
keystone.log

Adding the full keystone.log from the instack machine.

Comment 8 Ryan Brown 2015-06-26 20:25:37 UTC
The only denials I see are prefixed with this message: 

2015-06-26 10:11:11.397 28749 WARNING keystone.common.wsgi [-] Authorization failed. Could not find user: ironic (Disable debug mode to suppress these details
.) (Disable debug mode to suppress these details.) from 127.0.0.1


Which makes me think heat somehow has credentials for the ironic user.

Comment 9 Dan Sneddon 2015-06-26 20:35:44 UTC
(In reply to Ryan Brown from comment #8)
> The only denials I see are prefixed with this message: 
> 
> 2015-06-26 10:11:11.397 28749 WARNING keystone.common.wsgi [-] Authorization
> failed. Could not find user: ironic (Disable debug mode to suppress these
> details
> .) (Disable debug mode to suppress these details.) from 127.0.0.1
> 
> 
> Which makes me think heat somehow has credentials for the ironic user.

Those authorization failed messages happened several hours before the failed deployment, though.

I have a sneaking suspicion that this is not really an authorization problem, but is erroneously being reported as one.

Comment 10 Steve Baker 2015-06-26 23:31:14 UTC
Since a 401 is being raised late in the deployment process, it is possible that the token is expiring.

Try raising the default token expiry limit even more. Also I assume "openstack overcloud deploy" is creating a token and using it for multiple heat requests as it polls for progress. It should be made aware of the token timeout and request new tokens as necessary (maybe switching to SessionClient would make this token renewal transparent).
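
A rough sketch of the renewal idea, not the actual client code: the polling loop asks a small wrapper for a token on every request, and the wrapper fetches a new one shortly before the old one expires. `fetch_token` and `get_stack_status` are hypothetical callables standing in for the real keystone and heat calls, and the 60 second margin is an arbitrary assumption.

```python
import time


class RenewingToken:
    """Caches a token and re-fetches it shortly before it expires.

    fetch_token is a hypothetical callable returning (token, lifetime_seconds);
    it stands in for whatever call actually obtains a keystone token.
    """

    def __init__(self, fetch_token, margin=60, clock=time.monotonic):
        self._fetch_token = fetch_token
        self._margin = margin          # renew this many seconds before expiry
        self._clock = clock            # injectable so the logic can be tested
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch_token()
            self._expires_at = now + lifetime
        return self._token


def wait_for_stack(get_stack_status, token_source, poll_interval=5.0,
                   sleep=time.sleep):
    """Poll until the stack leaves an IN_PROGRESS state.

    get_stack_status is a hypothetical callable that takes a token and
    returns a status string such as "CREATE_IN_PROGRESS".
    """
    while True:
        # Ask for a token on every poll so an expiring one is replaced in time.
        status = get_stack_status(token_source.get())
        if not status.endswith("IN_PROGRESS"):
            return status
        sleep(poll_interval)
```

With something like this, a deployment that outlasts one token lifetime keeps polling with a valid token instead of getting a 401 partway through.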

Comment 11 Steve Baker 2015-06-28 23:20:09 UTC
If this is happening one hour into deployment then the problem is token expiry.

The undercloud's /etc/keystone/keystone.conf has [token] expiration set to 3600 (1 hour).

It should be set to 14400 (4 hours) since overcloud stacks often take more than 1 hour to deploy.

It looks like this was set initially in tripleo-image-elements/elements/keystone/os-apply-config/etc/keystone/keystone.conf but the setting regressed when the undercloud switched to puppet deploy.

I will submit a change to instack-undercloud/elements/puppet-stack-config/puppet-stack-config.yaml.template to set keystone::token_expiration: 14400
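
For anyone wanting to check their own undercloud, here is a minimal sketch that reads the [token] expiration value from keystone.conf text and compares it with an expected deployment time. The helper names are made up for illustration, and the 3600 fallback assumes that an unset option means a one hour lifetime.

```python
import configparser


def token_expiration_seconds(conf_text, default=3600):
    """Return [token] expiration from keystone.conf text, in seconds.

    default is an assumption about what applies when the option is unset.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getint("token", "expiration", fallback=default)


def token_outlives_deploy(expiration_seconds, expected_deploy_seconds):
    """True if a token issued at the start is still valid when the deploy ends."""
    return expiration_seconds > expected_deploy_seconds


if __name__ == "__main__":
    with open("/etc/keystone/keystone.conf") as conf_file:
        expiration = token_expiration_seconds(conf_file.read())
    # Two hours is an arbitrary example of a slow bare metal deployment.
    print(expiration, token_outlives_deploy(expiration, 2 * 3600))
```

With expiration at 3600 the two hour example reports False; with the proposed 14400 it reports True.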

Comment 12 Dan Sneddon 2015-06-29 06:43:18 UTC
The latest run where I tested network isolation on bare metal used the 2015-06-26.3 puddle, and the deployment failed because of https://bugzilla.redhat.com/show_bug.cgi?id=1236167. However, I didn't get any Heat errors, and the controller and compute nodes did get network configuration set up correctly. I got to ControllerNodesPostDeployment CREATE_COMPLETE, which is a really good sign that I didn't run into this bug the last time.

That said, increasing the timeout is a good idea, because it isn't hard for a complex bare metal deployment to take over an hour.

Comment 13 Ryan Brown 2015-06-29 18:02:40 UTC
I posted a patch to extend the timeout here https://code.engineering.redhat.com/gerrit/#/c/51898/

Comment 14 Ryan Brown 2015-06-29 20:05:51 UTC
The gerrithub patch is at https://review.gerrithub.io/#/c/237935/

Comment 15 wes hayutin 2015-06-30 18:29:07 UTC
*** Bug 1237221 has been marked as a duplicate of this bug. ***

Comment 16 James Slagle 2015-07-01 13:40:24 UTC
merged https://code.engineering.redhat.com/gerrit/#/c/51898/

Comment 17 Mike Burns 2015-07-01 17:30:16 UTC
*** Bug 1237318 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2015-08-05 13:56:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

Comment 21 Steve Baker 2016-12-13 20:45:34 UTC
*** Bug 1238133 has been marked as a duplicate of this bug. ***

