Bug 1338954

Summary: openstack-gnocchi-statsd fails to start
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Pradeep Kilambi <pkilambi>
Status: CLOSED ERRATA
QA Contact: Yurii Prokulevych <yprokule>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 9.0 (Mitaka)
CC: apevec, dbecker, fbaudin, jason.dobies, jjoyce, jschluet, lhh, mburns, morazi, rhel-osp-director-maint, sasha, tvignaud, yprokule
Target Milestone: ga
Keywords: Triaged
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-2.0.0-11.el7ost
Last Closed: 2016-08-11 11:31:18 UTC
Type: Bug
Attachments: GnocchiUpdate_To_CEPH.log (flags: none)

Description Marius Cornea 2016-05-23 17:40:17 UTC
Description of problem:
openstack-gnocchi-statsd pacemaker resources fail to start

/var/log/gnocchi/statsd.log shows:

INFO gnocchi.storage.ceph [-] Ceph storage backend use 'rados' python library
CRITICAL gnocchi [-] ObjectNotFound: error connecting to the cluster
ERROR gnocchi Traceback (most recent call last):
ERROR gnocchi   File "/usr/bin/gnocchi-statsd", line 10, in <module>
ERROR gnocchi     sys.exit(statsd())
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 61, in statsd
ERROR gnocchi     statsd_service.start()
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/statsd.py", line 174, in start
ERROR gnocchi     stats = Stats(conf)
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/statsd.py", line 38, in __init__
ERROR gnocchi     self.storage = storage.get_driver(self.conf)
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 155, in get_driver
ERROR gnocchi     return get_driver_class(conf)(conf.storage)
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/storage/ceph.py", line 80, in __init__
ERROR gnocchi     self.rados.connect()
ERROR gnocchi   File "/usr/lib/python2.7/site-packages/rados.py", line 429, in connect
ERROR gnocchi     raise make_ex(ret, "error connecting to the cluster")
ERROR gnocchi ObjectNotFound: error connecting to the cluster
ERROR gnocchi 
INFO gnocchi.storage.ceph [-] Ceph storage backend use 'rados' python library
CRITICAL gnocchi [-] ObjectNotFound: error connecting to the cluster


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-liberty-2.0.0-7.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-7.el7ost.noarch
openstack-tripleo-heat-templates-kilo-2.0.0-7.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. Deploy environment with 3 ctrls, 1 compute and 1 ceph storage node

Additional info:

[root@overcloud-controller-0 heat-admin]# grep -v ^# /etc/gnocchi/gnocchi.conf  | grep -v ^$
[DEFAULT]
log_dir = /var/log/gnocchi
[api]
port = 8041
host = 10.0.0.15
workers = 4
max_limit=1000
[archive_policy]
[database]
[indexer]
url = mysql+pymysql://gnocchi:xrTTwbaPqbaHTKZxbxmNeRetV.0.11/gnocchi
[keystone_authtoken]
auth_uri = http://10.0.0.11:5000/v2.0
identity_uri = http://192.168.0.29:35357
admin_user = gnocchi
admin_password = xrTTwbaPqbaHTKZxbxmNeRetV
admin_tenant_name = service
[oslo_policy]
[statsd]
resource_id = 0a8b55df-f90f-491c-8cb9-7cdecec6fc26
user_id = 27c0d3f8-e7ee-42f0-8317-72237d1c5ae3
project_id = 6c38cd8d-099a-4cb2-aecf-17be688e8616
archive_policy_name = low
flush_delay=10
[storage]
driver = ceph
ceph_pool = metrics
ceph_username = openstack
ceph_keyring = client.openstack
ceph_conffile = /etc/ceph/ceph.conf


When I try to use the client.openstack keyring with the ceph CLI, I get "authentication error (22) Invalid argument":

[root@overcloud-controller-0 ~]# ceph -k /etc/ceph/ceph.client.openstack.keyring  -c /etc/ceph/ceph.conf osd lspools
2016-05-23 17:39:50.778675 7faf00713700  0 librados: client.admin authentication error (22) Invalid argument
Error connecting to cluster: Error
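
Note on the command above: the log line names client.admin, which is the identity the ceph CLI uses when neither --id nor --name is given, so this particular check may have failed because the openstack keyring holds no key for client.admin rather than because of the keyring path. The Python sketch below builds the same command with the user named explicitly. The helper name ceph_argv is hypothetical, and the user "openstack" is taken from ceph_username in the config above.

```python
def ceph_argv(subcommand, user="openstack",
              keyring="/etc/ceph/ceph.client.openstack.keyring",
              conffile="/etc/ceph/ceph.conf"):
    """Build a ceph CLI argument list that names the client explicitly.

    Hypothetical helper for illustration; it only assembles arguments
    and does not contact the cluster.
    """
    # --id takes the user name without the "client." prefix.
    return ["ceph", "--id", user, "-k", keyring, "-c", conffile] + list(subcommand)


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell on a controller.
    print(" ".join(ceph_argv(["osd", "lspools"])))
```

Running the printed command on a controller would show whether client.openstack can authenticate with that keyring file.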

Comment 3 Marius Cornea 2016-05-25 08:41:18 UTC
After setting ceph_keyring = /etc/ceph/ceph.client.openstack.keyring in /etc/gnocchi/gnocchi.conf, the service started OK.


The value configured by the installer is: ceph_keyring = client.openstack
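
The difference between the two values is that one is a client name and the other is a path to a keyring file. A minimal sketch of a pre-flight check for this follows, assuming Python 3 and only the standard library. The function name keyring_problems is hypothetical and this is not something Gnocchi itself is known to run; it only inspects the [storage] section shown in the description.

```python
import configparser
import os


def keyring_problems(conf_text):
    """Return a list of problems found with [storage] ceph_keyring.

    Hypothetical check for illustration: flags a value that is not an
    absolute path, or an absolute path that does not exist on this host.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    problems = []
    # Only the ceph storage driver uses ceph_keyring.
    if parser.get("storage", "driver", fallback=None) != "ceph":
        return problems
    keyring = parser.get("storage", "ceph_keyring", fallback=None)
    if keyring is None:
        return problems
    if not os.path.isabs(keyring):
        problems.append("ceph_keyring %r is not an absolute path" % keyring)
    elif not os.path.isfile(keyring):
        problems.append("ceph_keyring %r does not exist" % keyring)
    return problems


if __name__ == "__main__":
    # The value written by the installer, as reported in this bug.
    sample = "[storage]\ndriver = ceph\nceph_keyring = client.openstack\n"
    for problem in keyring_problems(sample):
        print(problem)
```

With the installer's value, the check reports that client.openstack is not an absolute path; with the value from the workaround it passes as long as the file exists on the controller.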

Comment 7 Jay Dobies 2016-06-08 20:08:21 UTC
Upstream backport still needs to merge: https://review.openstack.org/#/c/322312/

Comment 10 Alexander Chuzhoy 2016-06-30 22:13:43 UTC
Reproduced upon upgrade from 8.0 to 9.0:

[root@overcloud-controller-0 ~]# pcs status|grep -B1 -i stop
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
--
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
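
For automated verification, output like the above can be parsed instead of grepped. The sketch below is a minimal example, assuming Python 3 and the "Clone Set:" / "Stopped:" line layout shown in this comment; the function name stopped_clone_sets is hypothetical, and other pcs versions may format their output differently.

```python
import re


def stopped_clone_sets(pcs_status_text):
    """Map each clone-set name to the list of nodes where it is stopped."""
    stopped = {}
    current = None
    for line in pcs_status_text.splitlines():
        match = re.match(r"\s*Clone Set: (\S+) \[", line)
        if match:
            # Remember which clone set the following lines belong to.
            current = match.group(1)
            continue
        match = re.match(r"\s*Stopped: \[\s*(.*?)\s*\]", line)
        if match and current:
            stopped[current] = match.group(1).split()
    return stopped


SAMPLE = """\
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
--
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
"""

if __name__ == "__main__":
    for name, nodes in sorted(stopped_clone_sets(SAMPLE).items()):
        print(name, "stopped on", ", ".join(nodes))
```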

Comment 16 Yurii Prokulevych 2016-07-22 06:26:23 UTC
Created attachment 1182715
GnocchiUpdate_To_CEPH.log

Comment 17 Yurii Prokulevych 2016-07-22 06:26:44 UTC
Overcloud deployed with the following command:
-------------------------------------
openstack  overcloud deploy --libvirt-type qemu \
--ntp-server clock.redhat.com --templates \
--control-scale 3 --compute-scale 1 --ceph-storage-scale 1

Gnocchi* resources up and running.


Reconfigured Gnocchi to use the Ceph backend (instead of the default file backend) via a stack update:
----------------------------------------------------------
openstack  overcloud deploy --libvirt-type qemu \
--ntp-server clock.redhat.com --templates \
--control-scale 3 --compute-scale 1 --ceph-storage-scale 1 \
-e gnocchi_ceph.yaml

$ cat gnocchi_ceph.yaml 
parameter_defaults:
    GnocchiBackend: 'rbd'


Gnocchi resources up and running.

Packages:
openstack-tripleo-heat-templates-2.0.0-16.el7ost.noarch

Comment 19 errata-xmlrpc 2016-08-11 11:31:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1599.html