Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1414502

Summary: Overcloud major upgrade from OSP 8 to OSP 9 fails at Installing Aodh step with error message "Could not find declared class ::aodh" in Red Hat OpenStack Platform
Product: Red Hat OpenStack
Reporter: Andreas Karis <akaris>
Component: openstack-tripleo
Assignee: James Slagle <jslagle>
Status: CLOSED DUPLICATE
QA Contact: Arik Chernetsky <achernet>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.0 (Mitaka)
CC: aschultz, augol, mburns, mcornea, rhel-osp-director-maint, sathlang
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-09 16:04:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Andreas Karis 2017-01-18 16:36:00 UTC
We are following the OSP 8 to OSP 9 upgrade documentation to upgrade a Red Hat OpenStack Platform environment that was previously upgraded successfully from 7 to 8 and now needs to be upgraded to 9.

The upgrade fails at step 3.4.2, Installing Aodh, with the following error:

[stack@undercloud tmp]$  heat deployment-list | grep -v COMPLETE
WARNING (shell) "heat deployment-list" is deprecated, please use "openstack software deployment list" instead
+--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+
| id                                   | config_id                            | server_id                            | action | status   | creation_time       | status_reason                                                       |
+--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+
| dcce2f94-d195-48fa-aa3b-9cbadc5b0036 | f54073ec-c8e0-4564-8db9-10f1c75c9861 | f9d0b61e-9576-452c-a666-d281de1db418 | CREATE | FAILED   | 2017-01-13T20:34:57 | deploy_status_code : Deployment exited with non-zero status code: 1 |
| aea2d4ee-26ff-41da-93d1-2d51ea70cc1d | 12ae463d-166b-42ce-925f-169b78a31cd5 | 952ff3eb-2ce6-4381-9622-86c370f11f51 | CREATE | FAILED   | 2017-01-13T20:34:58 | deploy_status_code : Deployment exited with non-zero status code: 1 |
| e7282760-351d-446f-a65c-1bb274de2eea | d9acd13f-a6ab-4688-8af4-95a8bb105d1e | d397aa94-f6f1-4cf8-b43a-f6f6e6ad6e15 | CREATE | FAILED   | 2017-01-13T20:34:59 | deploy_status_code : Deployment exited with non-zero status code: 1 |
+--------------------------------------+--------------------------------------+--------------------------------------+--------+----------+---------------------+---------------------------------------------------------------------+
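
With more than a few failures it can help to pull just the IDs out of a listing like the one above and pass them to `heat deployment-show`. A minimal sketch, assuming the table layout shown here (status in the fifth column); the `failed_deployment_ids` function name is made up for this example and is not part of any client:

```shell
# Hypothetical helper: read a "heat deployment-list" table on stdin and
# print the id (first column) of every row whose status column is FAILED.
failed_deployment_ids() {
    awk -F'|' '$6 ~ /FAILED/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Usage on the undercloud (not run here):
#   heat deployment-list | failed_deployment_ids |
#       while read -r id; do heat deployment-show "$id"; done
```

Splitting on `|` makes the id field 2 and the status field 6, because the leading `|` produces an empty first field; header and border rows never match.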

And a more detailed look at the deployments reveals:

[stack@undercloud tmp]$  heat deployment-show dcce2f94-d195-48fa-aa3b-9cbadc5b0036
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "FAILED",
  "server_id": "f9d0b61e-9576-452c-a666-d281de1db418",
  "config_id": "f54073ec-c8e0-4564-8db9-10f1c75c9861",
  "output_values": {
    "deploy_stdout": "",
    "deploy_stderr": "Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\nCould not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\n\u001b[1;31mError: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::aodh at /var/lib/heat-config/heat-config-puppet/f54073ec-c8e0-4564-8db9-10f1c75c9861.pp:30 on node controller1.example.com\nWrapped exception:\nCould not find declared class ::aodh\u001b[0m\n\u001b[1;31mError: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::aodh at /var/lib/heat-config/heat-config-puppet/f54073ec-c8e0-4564-8db9-10f1c75c9861.pp:30 on node controller1.example.com\u001b[0m\n",
    "deploy_status_code": 1
  },
  "creation_time": "2017-01-13T20:34:57",
  "updated_time": "2017-01-13T20:35:36",
  "input_values": {},
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "dcce2f94-d195-48fa-aa3b-9cbadc5b0036"
}

This can be fixed by applying the workaround described in "AODH migration fails because puppet-aodh module cannot be found by Puppet": we created the following symlinks on our controller nodes and reran this step, which then worked.

# ln -f  -s /usr/share/openstack-puppet/modules/* /etc/puppet/modules/
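
Puppet resolves a declared class such as `::aodh` by looking for `aodh/manifests/init.pp` in its module directories, so the symlink can be checked before rerunning the step. A small sketch, assuming `/etc/puppet/modules` is on the module path as the workaround implies; `module_visible` is a name made up for this example:

```shell
# Hypothetical check: succeed if Puppet module NAME has an init.pp under
# module directory DIR (default: /etc/puppet/modules). Symlinks are followed.
module_visible() {
    name="$1"
    dir="${2:-/etc/puppet/modules}"
    [ -e "$dir/$name/manifests/init.pp" ]
}

# Usage on a controller (not run here):
#   module_visible aodh || echo "aodh module missing from /etc/puppet/modules"
```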

The upgrade then reports completion:

Stack overcloud UPDATE_COMPLETE
Overcloud Endpoint: https://cloud.example.com:13000/v2.0
Overcloud Deployed

But the aodh services are not up, even after a restart of the pcs cluster:

Failed Actions:
* openstack-heat-engine_start_0 on controller1 'not running' (7): call=279, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2082ms
* openstack-aodh-evaluator_start_0 on controller1 'not running' (7): call=238, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2084ms
* openstack-heat-engine_start_0 on controller0 'not running' (7): call=282, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2083ms
* openstack-aodh-evaluator_start_0 on controller0 'not running' (7): call=239, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2091ms
* openstack-heat-engine_start_0 on controller2 'not running' (7): call=278, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:21:01 2017', queued=0ms, exec=2081ms
* openstack-aodh-evaluator_start_0 on controller2 'not running' (7): call=241, status=complete, exitreason='none',
    last-rc-change='Fri Jan 13 21:20:28 2017', queued=0ms, exec=2084ms

The logs for the aodh services on all overcloud controllers show:

Jan 13 21:02:39 controller0.example.com aodh-notifier[14182]: AttributeError: 'Opt' object has no attribute 'group'

And the service status reveals:

[root@controller2 aodh]# systemctl status openstack-aodh-evaluator -l
● openstack-aodh-evaluator.service - OpenStack Alarm evaluator service
   Loaded: loaded (/usr/lib/systemd/system/openstack-aodh-evaluator.service; disabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Tue 2017-01-17 21:22:03 UTC; 13min ago
  Process: 31553 ExecStart=/usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log (code=exited, status=1/FAILURE)
 Main PID: 31553 (code=exited, status=1/FAILURE)

Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service: main process exited, code=exited, status=1/FAILURE
Jan 17 21:22:03 controller2.example.com systemd[1]: Unit openstack-aodh-evaluator.service entered failed state.
Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service failed.
Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service holdoff time over, scheduling restart.
Jan 17 21:22:03 controller2.example.com systemd[1]: start request repeated too quickly for openstack-aodh-evaluator.service
Jan 17 21:22:03 controller2.example.com systemd[1]: Failed to start OpenStack Alarm evaluator service.
Jan 17 21:22:03 controller2.example.com systemd[1]: Unit openstack-aodh-evaluator.service entered failed state.
Jan 17 21:22:03 controller2.example.com systemd[1]: openstack-aodh-evaluator.service failed.

[root@controller2 aodh]# /usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log
No handlers could be found for logger "oslo_config.cfg"
Traceback (most recent call last):
  File "/usr/bin/aodh-evaluator", line 10, in <module>
    sys.exit(evaluator())
  File "/usr/lib/python2.7/site-packages/aodh/cmd/alarm.py", line 32, in evaluator
    conf = service.prepare_service()
  File "/usr/lib/python2.7/site-packages/aodh/service.py", line 70, in prepare_service
    keystone_client.setup_keystoneauth(conf)
  File "/usr/lib/python2.7/site-packages/aodh/keystone_client.py", line 145, in setup_keystoneauth
    if conf[CFG_GROUP].auth_type == "password-aodh-legacy":
  File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2950, in __getattr__
    return self._conf._get(name, self._group)
  File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2571, in _get
    value = self._do_get(name, group, namespace)
  File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2608, in _do_get
    return convert(opt._get_from_namespace(namespace, group_name))
  File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 811, in _get_from_namespace
    dname, dgroup = opt.name, opt.group
AttributeError: 'Opt' object has no attribute 'group'

Resolution

The upstream bug report https://bugs.launchpad.net/keystoneauth/+bug/1505906 explains that keystoneauth needs to be upgraded, so update python-keystoneauth1:

[root@controller2 oslo_config]# rpm -qa | grep keystoneauth
python-keystoneauth1-1.1.0-4.el7ost.noarch
[root@controller2 oslo_config]# yum update python-keystoneauth1

Now start all aodh services on the controllers, e.g. for the openstack-aodh-evaluator service:

[root@controller1 heat-admin]#  systemctl start openstack-aodh-evaluator
[root@controller1 heat-admin]# sleep 120; systemctl status openstack-aodh-evaluator
● openstack-aodh-evaluator.service - OpenStack Alarm evaluator service
   Loaded: loaded (/usr/lib/systemd/system/openstack-aodh-evaluator.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-01-17 22:05:34 UTC; 2min 6s ago
 Main PID: 10361 (aodh-evaluator)
   CGroup: /system.slice/openstack-aodh-evaluator.service
           └─10361 /usr/bin/python2 /usr/bin/aodh-evaluator --logfile /var/log/aodh/evaluator.log

Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...quest
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...ate()
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...icate
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...self)
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...icate
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...ons()
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...tions
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold   ...opts)
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold Au..._name
Jan 17 22:07:34 controller1.example.com aodh-evaluator[10361]: 2017-01-17 22:07:34.985 10361 ERROR aodh.evaluator.threshold
Hint: Some lines were ellipsized, use -l to show in full.
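
The same start can be repeated for every aodh unit installed on a controller rather than one service at a time. A rough sketch, assuming the units all match `openstack-aodh-*.service` (only the evaluator and notifier are confirmed in this report); `start_aodh_units` and the `SYSTEMCTL` override are illustrative names, the latter only there so the loop can be dry-run:

```shell
# Hypothetical helper: read "unit state" lines on stdin and start every
# unit named openstack-aodh-*.service. Set SYSTEMCTL=echo for a dry run.
start_aodh_units() {
    while read -r unit state; do
        case "$unit" in
            openstack-aodh-*.service)
                "${SYSTEMCTL:-systemctl}" start "$unit" </dev/null
                ;;
        esac
    done
}

# Usage on a controller as root (not run here):
#   systemctl list-unit-files --no-legend 'openstack-aodh-*' | start_aodh_units
```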

In order to start all aodh services, execute:

pcs resource cleanup

Comment 1 Sofer Athlan-Guyot 2017-02-09 16:04:21 UTC
Following this one with the other bz.

*** This bug has been marked as a duplicate of bug 1379436 ***