Bug 1414502
| Summary: | Overcloud major upgrade from OSP 8 to OSP 9 fails at Installing Aodh step with error message "Could not find declared class ::aodh" in Red Hat OpenStack Platform | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Andreas Karis <akaris> |
| Component: | openstack-tripleo | Assignee: | James Slagle <jslagle> |
| Status: | CLOSED DUPLICATE | QA Contact: | Arik Chernetsky <achernet> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 9.0 (Mitaka) | CC: | aschultz, augol, mburns, mcornea, rhel-osp-director-maint, sathlang |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-02-09 16:04:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Following this one with the other bz.

*** This bug has been marked as a duplicate of bug 1379436 ***

While following the upgrade documentation from OSP 8 to OSP 9 on a Red Hat OpenStack Platform environment that was previously upgraded successfully from 7 to 8, the upgrade fails at step 3.4.2, Installing Aodh, with the following error:

    [stack@undercloud tmp]$ heat deployment-list | grep -v COMPLETE
    WARNING (shell) "heat deployment-list" is deprecated, please use "openstack software deployment list" instead
    +--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+
    | id                                   | config_id                            | server_id                            | action | status   | creation_time       | status_reason                                                       |
    +--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+
    | dcce2f94-d195-48fa-aa3b-9cbadc5b0036 | f54073ec-c8e0-4564-8db9-10f1c75c9861 | f9d0b61e-9576-452c-a666-d281de1db418 | CREATE | FAILED   | 2017-01-13T20:34:57 | deploy_status_code : Deployment exited with non-zero status code: 1 |
    | aea2d4ee-26ff-41da-93d1-2d51ea70cc1d | 12ae463d-166b-42ce-925f-169b78a31cd5 | 952ff3eb-2ce6-4381-9622-86c370f11f51 | CREATE | FAILED   | 2017-01-13T20:34:58 | deploy_status_code : Deployment exited with non-zero status code: 1 |
    | e7282760-351d-446f-a65c-1bb274de2eea | d9acd13f-a6ab-4688-8af4-95a8bb105d1e | d397aa94-f6f1-4cf8-b43a-f6f6e6ad6e15 | CREATE | FAILED   | 2017-01-13T20:34:59 | deploy_status_code : Deployment exited with non-zero status code: 1 |
    +--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+

A more detailed look at one of the failed deployments reveals:

    [stack@undercloud tmp]$ heat deployment-show dcce2f94-d195-48fa-aa3b-9cbadc5b0036
    WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
    {
      "status": "FAILED",
      "server_id": "f9d0b61e-9576-452c-a666-d281de1db418",
      "config_id": "f54073ec-c8e0-4564-8db9-10f1c75c9861",
      "output_values": {
        "deploy_stdout": "",
        "deploy_stderr": "Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\nCould not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\n\u001b[1;31mError: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::aodh at /var/lib/heat-config/heat-config-puppet/f54073ec-c8e0-4564-8db9-10f1c75c9861.pp:30 on node controller1.example.com\nWrapped exception:\nCould not find declared class ::aodh\u001b[0m\n\u001b[1;31mError: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::aodh at /var/lib/heat-config/heat-config-puppet/f54073ec-c8e0-4564-8db9-10f1c75c9861.pp:30 on node controller1.example.com\u001b[0m\n",
        "deploy_status_code": 1
      },
      "creation_time": "2017-01-13T20:34:57",
      "updated_time": "2017-01-13T20:35:36",
      "input_values": {},
      "action": "CREATE",
      "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
      "id": "dcce2f94-d195-48fa-aa3b-9cbadc5b0036"
    }

This can be fixed by applying the workaround described in "AODH migration fails because puppet-aodh module cannot be found by Puppet". We created the following soft link on our controller nodes and reran this step, which then worked:

    # ln -f -s /usr/share/openstack-puppet/modules/* /etc/puppet/modules/

The upgrade reports that it completed:

    Stack overcloud UPDATE_COMPLETE
    Overcloud Endpoint: https://cloud.example.com:13000/v2.0
    Overcloud Deployed

But the aodh service is not up, even after a restart of the pcs cluster:

    Failed Actions:
    * openstack-heat-engine_start_0 on controller1 'not running' (7): call=279, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2082ms
    * openstack-aodh-evaluator_start_0 on controller1 'not running' (7): call=238, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2084ms
    * openstack-heat-engine_start_0 on controller0 'not running' (7): call=282, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2083ms
    * openstack-aodh-evaluator_start_0 on controller0 'not running' (7): call=239, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2091ms
    * openstack-heat-engine_start_0 on controller2 'not running' (7): call=278, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2081ms
    * openstack-aodh-evaluator_start_0 on controller2 'not running' (7): call=241, status=complete, exitreason='none', last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2084ms

Logs for the aodh services on all overcloud controllers show:

    Jan 13 21:02:39 controller0.example.com aodh-notifier[14182]: AttributeError: 'Opt' object has no attribute 'group'

And the service status reveals:

    [root@controller2 aodh]# systemctl status openstack-aodh-evaluator -l
    ● openstack-aodh-evaluator.service - OpenStack Alarm evaluator service
       Loaded: loaded (/usr/lib/systemd/system/openstack-aodh-evaluator.service; disabled; vendor preset: disabled)
       Active: failed (Result: start-limit) since Tue 2017-01-17 21:22:03 UTC; 13min ago
      Process: 31553 ExecStart=/usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log (code=exited, status=1/FAILURE)
     Main PID: 31553 (code=exited, status=1/FAILURE)

    Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service: main process exited, code=exited, status=1/FAILURE
    Jan 17 21:22:03 controller2.example.com systemd[1]: Unit openstack-aodh-evaluator.service entered failed state.
    Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service failed.
    Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service holdoff time over, scheduling restart.
    Jan 17 21:22:03 controller2.example.com systemd[1]: start request repeated too quickly for openstack-aodh-evaluator.service
    Jan 17 21:22:03 controller2.example.com systemd[1]: Failed to start OpenStack Alarm evaluator service.
    Jan 17 21:22:03 controller2.example.com systemd[1]: Unit openstack-aodh-evaluator.service entered failed state.
    Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service failed.

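The Failed Actions entries shown earlier can be reduced to a short list of affected resources and nodes. The following is a minimal sketch, assuming the `pcs status` layout reproduced in this report (one `* <resource>_start_0 on <node> ...` entry per line); the function name `failed_resources` is hypothetical and is not part of pcs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pcs): reads `pcs status` output on stdin
# and prints "<resource> <node>" for every failed start action, de-duplicated.
# Assumes the "* <resource>_start_0 on <node> ..." layout shown in this report.
failed_resources() {
    sed -n 's/^[[:space:]]*\* \(.*\)_start_0 on \([^ ]*\) .*/\1 \2/p' | sort -u
}
```

On a controller this could be used as `pcs status | failed_resources`, which makes it easier to see whether only the aodh resources or other resources as well are affected.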
Running the evaluator directly shows the full traceback:

    [root@controller2 aodh]# /usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log
    No handlers could be found for logger "oslo_config.cfg"
    Traceback (most recent call last):
      File "/usr/bin/aodh-evaluator", line 10, in <module>
        sys.exit(evaluator())
      File "/usr/lib/python2.7/site-packages/aodh/cmd/alarm.py", line 32, in evaluator
        conf = service.prepare_service()
      File "/usr/lib/python2.7/site-packages/aodh/service.py", line 70, in prepare_service
        keystone_client.setup_keystoneauth(conf)
      File "/usr/lib/python2.7/site-packages/aodh/keystone_client.py", line 145, in setup_keystoneauth
        if conf[CFG_GROUP].auth_type == "password-aodh-legacy":
      File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2950, in __getattr__
        return self._conf._get(name, self._group)
      File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2571, in _get
        value = self._do_get(name, group, namespace)
      File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2608, in _do_get
        return convert(opt._get_from_namespace(namespace, group_name))
      File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 811, in _get_from_namespace
        dname, dgroup = opt.name, opt.group
    AttributeError: 'Opt' object has no attribute 'group'

Resolution

This upstream bug report explains that keystoneauth needs to be upgraded: https://bugs.launchpad.net/keystoneauth/+bug/1505906

Update python-keystoneauth1:

    [root@controller2 oslo_config]# rpm -qa | grep keystoneauth
    python-keystoneauth1-1.1.0-4.el7ost.noarch
    [root@controller2 oslo_config]# yum update python-keystoneauth1

Now start all aodh services on the controllers, e.g. for the openstack-aodh-evaluator service:

    [root@controller1 heat-admin]# systemctl start openstack-aodh-evaluator
    [root@controller1 heat-admin]# sleep 120; systemctl status openstack-aodh-evaluator
    ● openstack-aodh-evaluator.service - OpenStack Alarm evaluator service
       Loaded: loaded (/usr/lib/systemd/system/openstack-aodh-evaluator.service; disabled; vendor preset: disabled)
       Active: active (running) since Tue 2017-01-17 22:05:34 UTC; 2min 6s ago
     Main PID: 10361 (aodh-evaluator)
       CGroup: /system.slice/openstack-aodh-evaluator.service
               └─10361 /usr/bin/python2 /usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log

    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...quest
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...ate()
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...icate
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...self)
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...icate
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...ons()
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...tions
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold ...opts)
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold Au..._name
    Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold
    Hint: Some lines were ellipsized, use -l to show in full.

To start all aodh services through the cluster, execute:

    pcs resource cleanup
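
After the cleanup, it may help to confirm on each controller that the aodh units actually stayed up instead of reading the full `systemctl status` output. The following is a minimal sketch under that assumption; `is_active` and `report_inactive` are hypothetical helper names, and the list of units to check is supplied by the caller.

```shell
#!/usr/bin/env bash
# Thin wrapper around systemctl so the check can be swapped out, e.g. in a test.
is_active() {
    systemctl is-active --quiet "$1"
}

# Hypothetical helper: prints every unit from its arguments that is not active.
# Returns 0 if all units are active, 1 if at least one is not.
report_inactive() {
    local svc rc=0
    for svc in "$@"; do
        if ! is_active "$svc"; then
            echo "$svc"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `report_inactive openstack-aodh-evaluator` prints nothing and returns 0 once the evaluator is running; any other aodh unit names deployed on the controllers would need to be passed in the same way.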