Bug 1396185

Summary: rhos-director-10: After a successful upgrade from 9 -> 10 of an environment with SSL, failed to scale up and add more Compute nodes to the overcloud.
Product: Red Hat OpenStack
Reporter: Omri Hochman <ohochman>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED NOTABUG
QA Contact: Omri Hochman <ohochman>
Severity: urgent
Priority: urgent
Version: 10.0 (Newton)
CC: dbecker, jcoufal, mandreou, mburns, morazi, rhel-osp-director-maint
Target Milestone: ga
Keywords: Regression
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Last Closed: 2016-11-17 17:48:31 UTC
Type: Bug

Description Omri Hochman 2016-11-17 17:15:18 UTC
rhos-director-10: After a successful upgrade from 9 -> 10 of an environment with SSL, scaling up to add more Compute nodes to the overcloud fails.


Environment: 
-------------
instack-5.0.0-1.el7ost.noarch
instack-undercloud-5.0.0-4.el7ost.noarch
openstack-heat-common-7.0.0-6.el7ost.noarch
openstack-heat-engine-7.0.0-6.el7ost.noarch
python-heatclient-1.5.0-1.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
python-heat-tests-7.0.0-6.el7ost.noarch
python-heat-agent-0-0.7.1e6015dgit.el7ost.noarch
openstack-heat-api-7.0.0-6.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-1.7.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-34.3.el7ost.noarch
openstack-heat-api-cfn-7.0.0-6.el7ost.noarch
puppet-heat-9.4.1-1.el7ost.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch


Steps:
-------
(1) Deploy OSP9 with SSL.
(2) Upgrade from OSP9 to OSP10.
(3) Attempt to scale up (add another Compute node).

Results:
---------
(1) The scale operation fails.
(2) The Heat stack switches to UPDATE_FAILED.
 

Original deployment command : 
------------------------------
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 1   --neutron-network-type vxlan --neutron-tunnel-type
t 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yam
ack/ssl-heat-templates/environments/enable-tls.yaml -e /home/stack/ssl-heat-templates/environments/inject-trust-anchor.yaml

Scale command: 
----------------
#openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --ceph-storage-scale 1   --neutron-network-type vxlan --neutron-tu
 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isol
 /home/stack/ssl-heat-templates/environments/enable-tls.yaml -e /home/stack/ssl-heat-templates/environments/inject-trust-anchor.yaml

heat-engine.log:
-----------------

2016-09-27 01:34:58.146 4404 INFO heat.engine.resources.openstack.heat.software_deployment [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd96
ment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
2016-09-27 01:34:58.147 4404 INFO heat.engine.resource [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -] C
-23e9-4e81-ba5f-4afac226ed8c] Stack "overcloud-AllNodesDeploySteps-tzrutya5rhwu-ControllerDeployment_Step3-iv7noq7iwy2l" [58aa83ff-904e-4b88-a990-4bacdefb1f3f]
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource Traceback (most recent call last):
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 753, in _action_recorder
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     yield
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 855, in _do_action
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 353, in wrapper
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     step = next(subtask)
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 806, in action_handler_task
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     done = check(handler_data)
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/software_deployment.py", line 435, in
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     return self._check_complete()
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/software_deployment.py", line 301, in
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     raise exception.Error(message)
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource Error: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource 
2016-09-27 01:34:58.249 4404 DEBUG heat.engine.scheduler [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -]
thon2.7/site-packages/heat/engine/scheduler.py:280
2016-09-27 01:34:58.250 4404 DEBUG heat.engine.scheduler [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -]
lNodesDeploySteps-tzrutya5rhwu-ControllerDeployment_Step3-iv7noq7iwy2l" [58aa83ff-904e-4b88-a990-4bacdefb1f3f] complete step /usr/lib/python2.7/site-packages/heat/engine/s
2016-09-27 01:34:58.251 4404 INFO heat.engine.service [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -] St
2016-09-27 01:34:58.270 4404 DEBUG oslo_messaging._drivers.amqpdriver [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64
6414995e5a0301776079b NOTIFY exchange 'heat' topic 'notifications.error' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:432
2016-09-27 01:34:58.313 4404 INFO heat.engine.stack [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -] Stac
Steps-tzrutya5rhwu-ControllerDeployment_Step3-iv7noq7iwy2l): Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exit
2016-09-27 01:34:58.469 4405 DEBUG heat.engine.scheduler [req-17911913-11c5-40c2-8b04-1a2224511a83 a0fa1724364a49528abc069e03de18f8 4bd9686112fb45e0a067436e64fc1225 - - -]
[9831c0b9-b2d5-4b0b-bdd6-eb400b5d7c00] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:216
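
The failure message in the heat-engine log above carries the exit code of the failed software deployment. As a rough sketch (not part of the original report), the following Python pulls those exit codes out of a set of log lines. The function name deployment_exit_codes is made up for illustration, and the sample line is copied from the log above.

```python
import re

# Matches the message Heat logs when a software deployment fails.
STATUS_RE = re.compile(r"Deployment exited with non-zero status code: (\d+)")


def deployment_exit_codes(log_lines):
    """Return the deployment exit codes found in the given log lines, in order."""
    codes = []
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match:
            codes.append(int(match.group(1)))
    return codes


# Sample lines taken from the heat-engine.log excerpt in this report.
sample = [
    "2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource     raise exception.Error(message)",
    "2016-09-27 01:34:58.147 4404 ERROR heat.engine.resource Error: Deployment to server failed: "
    "deploy_status_code : Deployment exited with non-zero status code: 6",
]
print(deployment_exit_codes(sample))
```

The exit code only says that the deployment on the node failed; the Puppet output further down is where the actual cause shows up.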





 
Output of ./list_nodes_status &> status.log (full status.log file attached):
----------
Notice: /Stage[main]/Neutron/Neutron_config[nova/tenant_name]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[database/max_retries]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[DEFAULT/router_distributed]/ensure: removed
Notice: /Stage[main]/Nova/Nova_config[DEFAULT/scheduler_host_manager]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[keystone_authtoken/user_domain_name]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[nova/user_domain_id]/ensure: removed
Notice: /Stage[main]/Swift/Package[swift]/ensure: created
Notice: /Stage[main]/Glance::Notify::Rabbitmq/Oslo::Messaging::Rabbit[glance_registry_config]/Glance_registry_config[oslo_messaging_rabbit/rabbit_port]/ensure: created
Notice: /Stage[main]/Glance::Registry::Db/Oslo::Db[glance_registry_config]/Glance_registry_config[database/connection]/value: value changed '[old secret redacted]' to '[ne
Notice: /Stage[main]/Neutron/Neutron_config[nova/auth_url]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[DEFAULT/nova_url]/ensure: removed
Notice: /Stage[main]/Neutron/Neutron_config[keystone_authtoken/auth_url]/ensure: removed
Notice: /Stage[main]/Aodh::Db/Oslo::Db[aodh_config]/Aodh_config[database/connection]/value: value changed '[old secret redacted]' to '[new secret redacted]'
Notice: /Stage[main]/Cinder::Db/Oslo::Db[cinder_config]/Cinder_config[database/connection]/value: value changed '[old secret redacted]' to '[new secret redacted]'
Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/openstack-dashboard.conf]/ensure: removed
Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-horizon_vhost.conf]/ensure: removed
Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-gnocchi_wsgi.conf]/ensure: removed
Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-aodh_wsgi.conf]/ensure: removed
Notice: /Stage[main]/Heat/Oslo::Messaging::Rabbit[heat_config]/Heat_config[oslo_messaging_rabbit/rabbit_ha_queues]/ensure: removed
Notice: /Stage[main]/Heat/Oslo::Messaging::Rabbit[heat_config]/Heat_config[oslo_messaging_rabbit/rabbit_port]/ensure: created
Notice: /Stage[main]/Heat::Db/Oslo::Db[heat_config]/Heat_config[database/connection]/value: value changed '[old secret redacted]' to '[new secret redacted]'
Notice: /Stage[main]/Heat/Oslo::Messaging::Rabbit[heat_config]/Heat_config[oslo_messaging_rabbit/rabbit_hosts]/ensure: removed
Notice: /Stage[main]/Heat/Oslo::Messaging::Notifications[heat_config]/Heat_config[oslo_messaging_notifications/driver]/ensure: removed
Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-ceilometer_wsgi.conf]/ensure: removed
Notice: /Stage[main]/Keystone::Db/Oslo::Db[keystone_config]/Keystone_config[database/connection]/value: value changed '[old secret redacted]' to '[new secret redacted]'
Notice: /Stage[main]/Swift::Deps/Anchor[swift::install::end]: Triggered 'refresh' from 1 events
Notice: /Stage[main]/Swift::Deps/Anchor[swift::service::begin]: Triggered 'refresh' from 1 events
Notice: /Stage[main]/Apache/Concat[/etc/httpd/conf/ports.conf]/File[/etc/httpd/conf/ports.conf]/content: content changed '{md5}96c3e4dee63555ec71d0d6677acdba87' to '{md5}d
Notice: /Stage[main]/Neutron/Neutron_config[DEFAULT/l3_ha]/ensure: removed
Notice: /Stage[main]/Glance::Deps/Anchor[glance::config::end]: Triggered 'refresh' from 7 events
Notice: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql/Exec[galera-ready]/returns: executed successfully
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.655 392783 WARNING stevedore.named [-] Could not load sqlite
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 CRITICAL gnocchi [-] NoMatches: No 'gnocchi.indexer' driver found, loo
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 ERROR gnocchi Traceback (most recent call last):
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 ERROR gnocchi   File "/usr/bin/gnocchi-upgrade", line 10, in <module>
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 ERROR gnocchi     sys.exit(upgrade())
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py"
Notice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2016-11-17 15:22:19.656 392783 ERROR gnocchi     index = indexer.get_driver(conf)



openstack stack failures list --long overcloud (full debug.log file attached)
----------------------------------------------------------------------------
xception: connect failed
Warning: Scope(Class[Cinder::Api]): keystone_enabled is deprecated, use auth_strategy instead.
Warning: Scope(Class[Keystone]): Fernet token is recommended in Mitaka release. The default for token_provider will be changed to 'fernet' in O release.
Warning: Scope(Class[Heat]): keystone_user_domain_id is deprecated, use the name option instead.
Warning: Scope(Class[Heat]): keystone_project_domain_id is deprecated, use the name option instead.
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::cpu_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::ram_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::disk_allocation_ratio'; class ::nova::scheduler::filter has not been evaluate
Warning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::admin_user'; class ::nova::api has not been evaluated
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::admin_password'; class ::nova::api has not been evaluated
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::admin_tenant_name'; class ::nova::api has not been evaluated
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::auth_uri'; class ::nova::api has not been evaluated
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::auth_version'; class ::nova::api has not been evaluated
Warning: Scope(Class[Nova::Keystone::Authtoken]): Could not look up qualified variable '::nova::api::identity_uri'; class ::nova::api has not been evaluated
Warning: Scope(Class[Ceilometer]): Both $metering_secret and $telemetry_secret defined, using $telemetry_secret
Warning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding t
Error: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: Failed to call refresh: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy
 of [0]
Error: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy-resource-types returned
Error: /Stage[main]/Glance::Db::Sync/Exec[glance-manage db_sync]: Failed to call refresh: glance-manage --config-file /etc/glance/glance-registry.conf db_sync returned 1 i
Error: /Stage[main]/Glance::Db::Sync/Exec[glance-manage db_sync]: glance-manage --config-file /etc/glance/glance-registry.conf db_sync returned 1 instead of one of [0]
Error: Could not prefetch keystone_tenant provider 'openstack': Execution of '/usr/bin/openstack project list --quiet --format csv --long' returned 1: An unexpected error 
 request. (HTTP 500) (Request-ID: req-e5a7760f-73b0-4573-acf3-f7fb925951ec) (tried 30, for a total of 170 seconds)
Error: Not managing Keystone_tenant[service] due to earlier Keystone API failures.
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[service]/ensure: change from absent to present failed: Not managing Keystone_tenant[service] due to earlier Keys
Error: Not managing Keystone_tenant[admin] due to earlier Keystone API failures.
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[admin]/ensure: change from absent to present failed: Not managing Keystone_tenant[admin] due to earlier Keystone
Error: Could not prefetch keystone_role provider 'openstack': Execution of '/usr/bin/openstack role list --quiet --format csv' returned 1: An unexpected error prevented th
TTP 500) (Request-ID: req-708dda06-31ef-44e1-8caa-8c103c5c3a53) (tried 35, for a total of 170 seconds)
Error: Not managing Keystone_role[admin] due to earlier Keystone API failures.
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_role[admin]/ensure: change from absent to present failed: Not managing Keystone_role[admin] due to earlier Keystone API
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]: Could not evaluate: Execution of '/usr/bin/openstack domain list --quiet --format csv' returned 1: An unex
lfilling your request. (HTTP 500) (Request-ID: req-d859aa78-0870-452c-9fd4-f3bd4067117f) (tried 35, for a total of 170 seconds)
Warning: /Stage[main]/Keystone::Roles::Admin/Keystone_user_role[admin@admin]: Skipping because of failed dependencies
Error: Could not prefetch keystone_service provider 'openstack': Execution of '/usr/bin/openstack service list --quiet --format csv --long' returned 1: An unexpected error
r request. (HTTP 500) (Request-ID: req-7e574fe3-a6a9-4b5f-8f04-2470fd56dc98) (tried 35, for a total of 170 seconds)
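
With this many warnings and cascading errors, it can help to look at the first "Error:" line and at which Puppet classes fail. Below is a minimal sketch of that triage, assuming the output is available as a list of lines. The helper name summarize_puppet_errors is hypothetical, and the sample lines are copied as they appear in the output above (including one line that is truncated in this report).

```python
import re
from collections import Counter

# Captures the first class segment after /Stage[main]/ in a Puppet error line.
RESOURCE_RE = re.compile(r"^Error: /Stage\[main\]/([^/]+)/")


def summarize_puppet_errors(lines):
    """Return (first error line, Counter of Puppet classes that reported errors)."""
    first_error = None
    per_class = Counter()
    for line in lines:
        if not line.startswith("Error:"):
            continue  # skip Notice: and Warning: lines
        if first_error is None:
            first_error = line
        match = RESOURCE_RE.match(line)
        per_class[match.group(1) if match else "(no resource path)"] += 1
    return first_error, per_class


# Sample lines taken from the output in this report.
sample = [
    "Warning: Scope(Class[Heat]): keystone_user_domain_id is deprecated, use the name option instead.",
    "Error: /Stage[main]/Glance::Db::Sync/Exec[glance-manage db_sync]: glance-manage --config-file "
    "/etc/glance/glance-registry.conf db_sync returned 1 instead of one of [0]",
    "Error: Not managing Keystone_tenant[service] due to earlier Keystone API failures.",
    "Error: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[admin]/ensure: change from absent to "
    "present failed: Not managing Keystone_tenant[admin] due to earlier Keystone",
]
first, counts = summarize_puppet_errors(sample)
print(first)
print(dict(counts))
```

Later errors such as the Keystone_tenant failures are often consequences of an earlier one, so the first error line is usually the better starting point.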

Comment 2 Marios Andreou 2016-11-17 17:24:34 UTC
Is the SSL significant here? I mean, does the same scenario work without the SSL setup? If so, it may point to some issue with the SSL cert/config, especially as we are seeing auth-related issues in comment #0. This is just my first impression; I will revisit tomorrow morning. Thanks, and please update with anything new.

Comment 3 Omri Hochman 2016-11-17 17:48:31 UTC
Closing as not a bug: Ben pointed me to the missing SSL files that need to be added in OSP10 when scaling an SSL environment:
http://docs.openstack.org/developer/tripleo-docs/advanced_deployment/ssl.html#deploying-an-ssl-environment
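
Since the root cause was environment files missing from the scale command, a quick pre-flight check can catch this before running the update. The sketch below is an illustration only. It assumes the files are passed with -e or --environment-file as separate arguments, and the helper names are hypothetical. The path placeholder-extra-ssl.yaml stands in for whichever additional files the linked documentation requires; this report does not name them.

```python
import shlex


def environment_files(command):
    """Return the files passed with -e / --environment-file, in order."""
    args = shlex.split(command)
    files = []
    for i, arg in enumerate(args[:-1]):
        if arg in ("-e", "--environment-file"):
            files.append(args[i + 1])
    return files


def missing_environment_files(command, required):
    """Return the required environment files that the command does not pass."""
    present = set(environment_files(command))
    return [path for path in required if path not in present]


# Shortened, illustrative scale command (not the exact command from this report).
scale_command = (
    "openstack overcloud deploy --templates --compute-scale 2 "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml "
    "-e /home/stack/ssl-heat-templates/environments/enable-tls.yaml "
    "-e /home/stack/ssl-heat-templates/environments/inject-trust-anchor.yaml"
)

# The last entry is a placeholder, not a real file name.
required = [
    "/home/stack/ssl-heat-templates/environments/enable-tls.yaml",
    "/home/stack/ssl-heat-templates/environments/inject-trust-anchor.yaml",
    "/home/stack/ssl-heat-templates/environments/placeholder-extra-ssl.yaml",
]
print(missing_environment_files(scale_command, required))
```

The check does not handle the --environment-file=PATH form; it only compares paths as literal strings.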