Bug 1533875 - Using the Telemetry role with Ceph/RBD as gnocchi backend fails in step 4 of the deployment
Summary: Using the Telemetry role with Ceph/RBD as gnocchi backend fails in step 4 of t...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: z2
Target Release: 12.0 (Pike)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-12 12:56 UTC by daniel parkes
Modified: 2018-03-28 17:17 UTC

Fixed In Version: openstack-tripleo-heat-templates-7.0.9-1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-28 17:16:42 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:0602 None None None 2018-03-28 17:17:11 UTC
OpenStack gerrit 541034 None stable/pike: MERGED tripleo-heat-templates: Add CephClient and CephExternal to the Telemetry role (I0644d028c269afce4c561bbf5b8ca1f2c4addda2... 2018-02-16 17:52:16 UTC
Launchpad 1746525 None None None 2018-01-31 14:20:35 UTC

Description daniel parkes 2018-01-12 12:56:25 UTC
Description of problem:

Deploying the Telemetry role with Ceph/RBD as the gnocchi backend fails in step 4 of the deployment, while running the gnocchi_db_sync container.

I used the openstack overcloud role command to configure the roles_data.yaml file with the provided Telemetry role:


(undercloud) [stack@under12 environments]$ tail -35 roles_data.yaml 
# Role: Telemetry                                                             #
###############################################################################
- name: Telemetry
  description: |
    Telemetry role that has all the telemetry services.
  CountDefault: 1
  networks:
    - External
    - InternalApi
    - Storage
  HostnameFormatDefault: '%stackname%-telemetry-%index%'
  ServicesDefault:
#    - OS::TripleO::Services::CephClient
    - OS::TripleO::Services::AodhApi
    - OS::TripleO::Services::AodhEvaluator
    - OS::TripleO::Services::AodhListener
    - OS::TripleO::Services::AodhNotifier
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CeilometerAgentCentral
    - OS::TripleO::Services::CeilometerAgentNotification
    - OS::TripleO::Services::CeilometerApi
    - OS::TripleO::Services::CeilometerCollector
    - OS::TripleO::Services::CeilometerExpirer
    - OS::TripleO::Services::CertmongerUser
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::GnocchiApi
    - OS::TripleO::Services::GnocchiMetricd
    - OS::TripleO::Services::GnocchiStatsd
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::ContainersLogrotateCrond
    - OS::TripleO::Services::PankoApi
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned

And selecting RBD for gnocchi:

(undercloud) [stack@under12 ~]$ cat templates/environments/storage-environment.yaml | grep -i gnocch
  ## Gnocchi backend can be either 'rbd' (Ceph), 'swift' or 'file'.
  GnocchiBackend: rbd


I get an error during step 4 of the TripleO deployment:

overcloud.AllNodesDeploySteps.TelemetryDeployment_Step4.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: ac47855a-b1bf-420f-85b1-545a7401b2b8
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2


            "Error running ['docker', 'run', '--name', 'gnocchi_db_sync', '--label', 'config_id=tripleo_step4', '--label', 'container_name=gnocchi_db_sync', '--label', 'managed_by=paunch', '--label', 'config_data={\"command\": \"/usr/bin/bootstrap_host_exec gnocchi_api su gnocchi -s /bin/bash -c \\'/usr/bin/gnocchi-upgrade --sacks-number=128\\'\", \"user\": \"root\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/var/lib/config-data/gnocchi/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro\", \"/var/lib/config-data/gnocchi/etc/gnocchi/:/etc/gnocchi/:ro\", \"/var/log/containers/gnocchi:/var/log/gnocchi\", \"/var/log/containers/httpd/gnocchi-api:/var/log/httpd\", \"/etc/ceph:/etc/ceph:ro\"], \"image\": \"10.10.11.2:8787/rhosp12/openstack-gnocchi-api:12.0-20171201.1\", \"detach\": false, \"net\": \"host\", \"privileged\": false}', '--net=host', '--privileged=false', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/var/lib/config-data/gnocchi/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro', 
'--volume=/var/lib/config-data/gnocchi/etc/gnocchi/:/etc/gnocchi/:ro', '--volume=/var/log/containers/gnocchi:/var/log/gnocchi', '--volume=/var/log/containers/httpd/gnocchi-api:/var/log/httpd', '--volume=/etc/ceph:/etc/ceph:ro', '10.10.11.2:8787/rhosp12/openstack-gnocchi-api:12.0-20171201.1', '/usr/bin/bootstrap_host_exec', 'gnocchi_api', 'su', 'gnocchi', '-s', '/bin/bash', '-c', \"'/usr/bin/gnocchi-upgrade\", \"--sacks-number=128'\"]. [1]", 


Looking on the Telemetry node:

[root@telemetry0 ~]# docker ps -a | grep -i gnocc
212aaa976372        10.10.11.2:8787/rhosp12/openstack-gnocchi-api:12.0-20171201.1               "/usr/bin/bootstrap_h"   About an hour ago   Exited (1) About an hour ago                       gnocchi_db_sync

gnocchi_db_sync is failing with this error:

2018-01-12 10:08:03,118 [10] CRITICAL root: Traceback (most recent call last):
  File "/usr/bin/gnocchi-upgrade", line 10, in <module>
    sys.exit(upgrade())
  File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 66, in upgrade
    s = storage.get_driver(conf)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 163, in get_driver
    conf.storage, incoming, coord)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/ceph.py", line 48, in __init__
    self.rados, self.ioctx = ceph.create_rados_connection(conf)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/common/ceph.py", line 67, in create_rados_connection
    conf=options)
  File "cradox.pyx", line 545, in cradox.Rados.__init__ (cradox.c:6512)
  File "cradox.pyx", line 445, in cradox.requires.wrapper.validate_func (cradox.c:4724)
  File "cradox.pyx", line 588, in cradox.Rados.__setup (cradox.c:7481)
  File "cradox.pyx", line 445, in cradox.requires.wrapper.validate_func (cradox.c:4724)
  File "cradox.pyx", line 651, in cradox.Rados.conf_read_file (cradox.c:8658)
Error: error calling conf_read_file: error code 22
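
The numeric code in the last line of the traceback can be decoded with the Python standard library. This is a minimal sketch, assuming that cradox reports the librados return value as a standard errno number, which this report does not confirm; describe_errno is a hypothetical helper name. On Linux, 22 maps to EINVAL.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a positive errno value."""
    # Assumption: the "error code" printed by cradox is a plain errno number.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(22)
    print("error code 22 -> %s (%s)" % (name, message))
```
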


The Telemetry nodes don't have the ceph keys created in /etc/ceph:

[root@telemetry0 ~]# ls -l /etc/ceph/
total 4
-rw-r--r--. 1 root root 92 Oct 12 00:04 rbdmap

So the directory mapped into the container doesn't have the keys available either:

[root@telemetry0 gnocchi]# docker inspect gnocchi_db_sync | grep ceph
                "/etc/ceph:/etc/ceph:ro"
                "Source": "/etc/ceph",
                "Destination": "/etc/ceph",


I fixed the problem by adding the CephClient service to the Telemetry role:

- OS::TripleO::Services::CephClient

With this modification of the role, the ceph keys get created in /etc/ceph and the deployment completes successfully.




Version-Release number of selected component (if applicable):

OSP12.

(undercloud) [stack@under12 ~]$ rpm -qa | grep -i heat-templ
openstack-tripleo-heat-templates-7.0.3-18.el7ost.noarch
(undercloud) [stack@under12 ~]$ docker images | grep -i gnocchi
10.10.11.2:8787/rhosp12/openstack-gnocchi-metricd               12.0-20171201.1     62c74841422b        5 weeks ago         958.6 MB
10.10.11.2:8787/rhosp12/openstack-gnocchi-statsd                12.0-20171201.1     98ce78f956eb        5 weeks ago         958.6 MB
10.10.11.2:8787/rhosp12/openstack-gnocchi-api                   12.0-20171201.1     a5e647422823        5 weeks ago         958.6 MB



How reproducible:

Every time we run a deployment using the custom Telemetry role with Ceph as the gnocchi backend.


Steps to Reproduce:
1. Use the Telemetry role provided in the heat templates
2. Configure the templates so that gnocchi uses RBD as a backend
3. Deploy the overcloud.

Actual results:

Deploy fails in step 4.

Expected results:

Deploy finishes ok. 

Additional info:

Comment 4 Yogev Rabl 2018-03-16 15:00:14 UTC
Verified on openstack-tripleo-heat-templates-7.0.9-6.el7ost.noarch

Comment 7 errata-xmlrpc 2018-03-28 17:16:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0602

