Bug 1387344
Summary: | sahara-api services not running after Liberty to Mitaka Upgrade
---|---
Product: | Red Hat OpenStack
Reporter: | Randy Perryman <randy_perryman>
Component: | openstack-tripleo-heat-templates
Assignee: | Marios Andreou <mandreou>
Status: | CLOSED NOTABUG
QA Contact: | mlammon
Severity: | unspecified
Priority: | unspecified
Version: | 8.0 (Liberty)
CC: | achernet, arkady_kanevsky, david_paterson, egafford, jcoufal, jschluet, kasmith, ltoscano, mandreou, mburns, morazi, ohochman, randy_perryman, rhel-osp-director-maint, sathlang, sumedh_sathaye, wayne_allen
Target Milestone: | ---
Flags: | mandreou: needinfo-
Target Release: | 9.0 (Mitaka)
Hardware: | Unspecified
OS: | Unspecified
Clone Of: | 1381628
Last Closed: | 2016-12-02 17:54:20 UTC
Type: | Bug
Bug Depends On: | 1381628
Bug Blocks: | 1305654, 1337794
Description
Randy Perryman
2016-10-20 16:31:54 UTC
I have just completed the Liberty to Mitaka Update and find that Sahara is now installed and trying to run, but the service is not configured and should not be installed.

Hi Randy... indeed the Sahara service was new for Mitaka. It *should* be configured with reasonable defaults (i.e. even if you aren't setting any of the Sahara config explicitly) - for example, a grep for 'sahara' on the current stable/mitaka branch of the tripleo-heat-templates https://github.com/openstack/tripleo-heat-templates/tree/stable/mitaka gives:

./puppet/hieradata/database.yaml:72:sahara::db::mysql::user: sahara
./puppet/hieradata/database.yaml:73:sahara::db::mysql::host: "%{hiera('mysql_virtual_ip')}"
./puppet/controller.yaml:1310: sahara_password: {get_param: SaharaPassword}
./puppet/controller.yaml:1311: sahara_dsn:
./puppet/controller.yaml-1312- list_join:
./puppet/controller.yaml:1735: sahara::admin_password: {get_input: sahara_password}
./puppet/controller.yaml:1736: sahara::auth_uri: {get_input: keystone_auth_uri}
./puppet/controller.yaml:1737: sahara::admin_user: sahara
./puppet/controller.yaml:1738: sahara::identity_uri: {get_input: keystone_identity_uri}
./puppet/controller.yaml:1739: sahara::use_neutron: true
./puppet/hieradata/controller.yaml:59:sahara::admin_tenant_name: 'service'
./puppet/manifests/overcloud_controller_pacemaker.pp:1429: pacemaker::constraint::base { 'sahara-api-then-sahara-engine-constraint':

So are you seeing a Sahara-related error during the Liberty to Mitaka upgrade which you believe to be related to missing configuration? It *could* be a bug and I'm trying to understand more about what is happening here. Are you sure you've used the Mitaka templates during your upgrade (e.g. Liberty templates do *not* set up config for Sahara...)? If so, can we see a trace/more info about the error you get?

Hi, is this a real bug or just a clone of the "9-to-10" upgrade for the "8-to-9" case?

The two scenarios are different:
9 has no composable roles, so Sahara is forcibly on.
Unless Sahara is not working after the upgrade, there is not much to be done (and I would say that this should be closed).
10 has composable roles, but this is in the scope of the other bug (rhbz#1387343).

(In reply to Luigi Toscano from comment #4)
> Hi,
> is this a real bug or just a clone of the "9-to-10" upgrade for the "8-to-9"
> case?
>
> The two scenarios are different:
> 9 has no composable roles, so Sahara is forcibly on. Unless Sahara is not
> working after the upgrade, there is not much to be done (and I would say that
> this should be closed).
> 10 has composable roles, but this is in the scope of the other bug
> (rhbz#1387343).

Correction: the main bug for 10 is the original one (1381628). I'd suggest closing this one as a duplicate of 1387343 and moving the discussion there (I still think that 1387343 should be closed too, for the reasons given above).

*** Bug 1387343 has been marked as a duplicate of this bug. ***

I was not expecting Sahara to be installed and configured, as I have not set up my install to support it. The reason for the bug is that sahara-api failed to start on any of my servers.

openstack-sahara-api_start_0 on overcloud-controller-1 'not running' (7): call=1433, status=complete, exitreason='none', last-rc-change='Fri Oct 21 12:05:11 2016', queued=0ms, exec=2118ms

sosreports are attached to bug https://bugzilla.redhat.com/show_bug.cgi?id=1385143

From the Sahara API log:

2016-10-21 12:06:19.969 25006 INFO keystonemiddleware.auth_token [-] Starting Keystone auth_token middleware
2016-10-21 12:06:19.971 25006 WARNING keystonemiddleware.auth_token [-] Use of the auth_admin_prefix, auth_host, auth_port, auth_protocol, identity_uri, admin_token, admin_user, admin_password, and admin_tenant_name configuration options was deprecated in the Mitaka release in favor of an auth_plugin and its related options. This class may be removed in a future release.
2016-10-21 12:06:19.976 25006 INFO sahara.main [-] Driver distributed successfully loaded
2016-10-21 12:06:19.979 25006 ERROR oslo.service.wsgi [-] Could not bind to :8386
2016-10-21 12:06:19.980 25006 CRITICAL sahara [-] error: [Errno 98] Address already in use
2016-10-21 12:06:19.980 25006 ERROR sahara Traceback (most recent call last):
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/bin/sahara-api", line 10, in <module>
2016-10-21 12:06:19.980 25006 ERROR sahara sys.exit(main())
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib/python2.7/site-packages/sahara/cli/sahara_api.py", line 60, in main
2016-10-21 12:06:19.980 25006 ERROR sahara api_service = server.SaharaWSGIService("sahara-api", app)
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib/python2.7/site-packages/sahara/main.py", line 70, in __init__
2016-10-21 12:06:19.980 25006 ERROR sahara use_ssl=sslutils.is_enabled(CONF))
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib/python2.7/site-packages/oslo_service/wsgi.py", line 115, in __init__
2016-10-21 12:06:19.980 25006 ERROR sahara self.socket = self._get_socket(host, port, backlog)
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib/python2.7/site-packages/oslo_service/wsgi.py", line 143, in _get_socket
2016-10-21 12:06:19.980 25006 ERROR sahara sock = eventlet.listen(bind_addr, family, backlog=backlog)
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib/python2.7/site-packages/eventlet/convenience.py", line 44, in listen
2016-10-21 12:06:19.980 25006 ERROR sahara sock.listen(backlog)
2016-10-21 12:06:19.980 25006 ERROR sahara File "/usr/lib64/python2.7/socket.py", line 224, in meth
2016-10-21 12:06:19.980 25006 ERROR sahara return getattr(self._sock,name)(*args)
2016-10-21 12:06:19.980 25006 ERROR sahara error: [Errno 98] Address already in use
2016-10-21 12:06:19.980 25006 ERROR sahara

haproxy configuration:

listen sahara
bind 192.168.120.136:8386 transparent
bind 192.168.190.125:8386 transparent
server overcloud-controller-0 :8386 check fall 5 inter 2000 rise 2

------------
to fix add the following to haproxy

bind 192.168.120.136:8386 transparent
bind 192.168.190.125:8386 transparent
server overcloud-controller-0 192.168.140.102:8386 check fall 5 inter 2000 rise 2
server overcloud-controller-1 192.168.140.104:8386 check fall 5 inter 2000 rise 2
server overcloud-controller-0 192.168.140.106:8386 check fall 5 inter 2000 rise 2

and in sahara.conf

host = 192.168.140.102 (IP of Server)

So two items:
1. Why is sahara installed when it was not asked for?
2. Why was it not properly configured?

(In reply to Randy Perryman from comment #14)
> So two items:
> 1. Why is sahara installed when it was not asked for?

I can answer this: because OSP9 has no composable roles and Sahara is always installed and enabled. This is no longer the case with OSP10, where it is disabled by default on new installations.

> 2. Why was it not properly configured?

That's the good question.

If Sahara is enabled by default, why isn't the upgrade working with default settings? Please pursue a resolution, and if more information is needed on our end, please specify what you need.

Any update on this bug for why it's not getting enabled successfully on upgrade?

Hi folks, to be clear, the description in comment #0 is wrong (and potentially confusing) for the scenario being tracked here. I am removing the external trackers above as they do not apply to this BZ (they were brought in with the cloning of BZ 1381628, which is specific to the OSP9 to OSP10 upgrade). I think it's worth clarifying that the issue seen here is: "sahara-api is not running on any of the controller nodes after the OSP8 to OSP9 upgrade". Randy has provided a trace of the error in comment #11. In a couple of comments, but notably comment #14, Randy brings up the issue of 'why is sahara installed at all?'

@Randy unfortunately, prior to OSP10 it isn't possible to configure the list of services that are deployed on nodes. In OSP8 there was no Sahara.
In OSP9 there is Sahara. If removal of that service is important enough (I mean other than overcoming the potential misconfiguration here) then I'd suggest filing a new BZ like "remove sahara from the installed services in OSP9 because foo" - it should be tracked as a stand-alone issue if it is to be considered and effected.

Randy, can we please have the templates you used for this deployment? Looking at the trace in comment #11, it seems the culprit is quite clearly "2016-10-21 12:06:19.980 25006 CRITICAL sahara [-] error: [Errno 98] Address already in use". I am not sure why you hit this issue when it wasn't seen in our CI/QE testing - we may well have missed something, but we should start by sanity checking the values you provided on deploy (i.e. the templates and environment files you used for the upgrade process). In the meantime I'll also follow up with the Storage and Deployment teams to have a look once those are available.

thanks, marios

Marios, that install is long gone, so I do not have the templates. I do have the two templates we run, network-environment.yaml and dell-environment.yaml. The rest were all stock from the install. I will attach them. Also, could the "Address already in use" be a by-product of the service starting, moving under PCS and attempting a restart?

Created attachment 1227325 [details]
Network Environment Files Used
Created attachment 1227326 [details]
Dell Specific Environment
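
For readers following the "[Errno 98] Address already in use" trace, here is a minimal Python 3 sketch of how that error arises. It is not Sahara or haproxy code: the helper names `bind_error` and `demo` are hypothetical, and the listener on 127.0.0.1 only stands in for a process (such as haproxy) that already holds the port. One reading of the trace, not confirmed in this thread, is that sahara-api tried to bind port 8386 on all addresses ("Could not bind to :8386") while another listener already held that port on the same node.

```python
import errno
import socket


def bind_error(host, port):
    """Try to bind and listen on (host, port).

    Returns None on success, otherwise the errno of the failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def demo():
    """Hold a port with one listener, then try to bind it again."""
    # Stand-in for the process that already owns the port.
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
    holder.listen(1)
    port = holder.getsockname()[1]
    try:
        same = bind_error("127.0.0.1", port)  # same address, same port
        wildcard = bind_error("", port)       # all addresses, like ":8386"
        return same, wildcard
    finally:
        holder.close()


if __name__ == "__main__":
    same, wildcard = demo()
    print("same address:", errno.errorcode.get(same, same))
    print("all addresses:", errno.errorcode.get(wildcard, wildcard))
```

Binding a service to one specific address, as the `host = ...` setting in sahara.conf does in the workaround above, avoids competing for the same port on every address of the node.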
Using the following:
openstack-tripleo-heat-templates-2.0.0-36.el7ost.noarch
----------------------
-e ~/pilot/templates/overcloud/environments/network-isolation.yaml \
-e ~/pilot/templates/overcloud/environments/storage-environment.yaml \
-e ~/pilot/templates/overcloud/environments/puppet-pacemaker.yaml \
-e ~/pilot/templates/overcloud/environments/major-upgrade-aodh.yaml \
-e ~/pilot/templates/dell-environment.yaml \
-e ~/pilot/templates/network-environment.yaml \
---------------------
The templates are copied to ~/pilot/templates/overcloud.

Created attachment 1227422 [details]
Config File
Sahara
Created attachment 1227423 [details]
api-paste
Just discovered I have not defined:
SaharaApiNetwork: internal_api
in my network-environment.yaml.

Is there a document that explicitly calls out all endpoints that need to be mapped?

Can be closed, not a bug.
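
As a rough aid for the question about which endpoints need to be mapped, the sketch below compares the `...Network:` keys in a stock environment file against those in a customized one and lists the keys the customized file lacks. It is only a sketch under assumptions: the function names and the sample file contents are hypothetical (only `SaharaApiNetwork: internal_api` comes from this report), and it uses a plain regular expression rather than a YAML parser, so it does not understand nesting or anchors.

```python
import re

# Matches lines such as "    SaharaApiNetwork: internal_api".
# Commented-out lines are skipped because they start with "#".
_NETWORK_KEY = re.compile(r"^[ \t]*(\w+Network)[ \t]*:[ \t]*(\S+)", re.MULTILINE)


def network_keys(text):
    """Return a dict of every '<Name>Network: <value>' pair found in text."""
    return {m.group(1): m.group(2) for m in _NETWORK_KEY.finditer(text)}


def missing_mappings(stock_text, custom_text):
    """List the network keys present in the stock text but not the custom one."""
    return sorted(set(network_keys(stock_text)) - set(network_keys(custom_text)))


# Illustrative file contents only; real templates contain many more keys.
STOCK = """
parameter_defaults:
  ServiceNetMap:
    NeutronApiNetwork: internal_api
    GlanceApiNetwork: storage
    SaharaApiNetwork: internal_api
"""

CUSTOM = """
parameter_defaults:
  ServiceNetMap:
    NeutronApiNetwork: internal_api
    GlanceApiNetwork: storage
"""

if __name__ == "__main__":
    for key in missing_mappings(STOCK, CUSTOM):
        print("not mapped in custom environment:", key)
```

In practice the two strings would be read from the new release's stock network environment file and from the customized network-environment.yaml carried over from the previous release.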