Bug 1635346

Summary: Cinder fails to connect to internal Ceph cluster with customized name
Product: Red Hat OpenStack
Component: puppet-cinder
Version: 14.0 (Rocky)
Target Release: 14.0 (Rocky)
Target Milestone: beta
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Keywords: Triaged
Type: Bug
Reporter: Yogev Rabl <yrabl>
Assignee: Giulio Fidente <gfidente>
QA Contact: Yogev Rabl <yrabl>
CC: abishop, gfidente, jjoyce, johfulto, jschluet, mburns, slinaber, tvignaud
Fixed In Version: puppet-cinder-13.3.1-0.20181013114720.25b1ba3.el7ost
Related: 1646398
Bug Blocks: 1410195, 1646398
Last Closed: 2019-01-11 11:53:35 UTC

Description Yogev Rabl 2018-10-02 17:10:57 UTC
Description of problem:
Cinder fails to connect to an internal Ceph cluster with a customized name. On the first attempt, the cinder-volume log shows:

2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd [req-f7b133c8-f050-45e5-9bd9-a5d0f19d956f - - - - -] Error connecting to ceph cluster.: OSError: [errno 95] error connecting to the cluster
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 349, in _do_conn
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd     client.connect()
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd   File "rados.pyx", line 886, in rados.Rados.connect (/builddir/build/BUILD/ceph-12.2.5/build/src/pybind/rados/pyrex/rados.c:9784)
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd OSError: [errno 95] error connecting to the cluster
2018-10-01 19:53:06.141 60 ERROR cinder.volume.drivers.rbd

As a result, Cinder is not functional at all, even though Glance and Nova work fine with the same cluster and the cluster itself is healthy.

The Ceph configuration directory inside the container contains:
# docker exec -it openstack-cinder-volume-docker-0 ls /etc/ceph
CephSite1.client.admin.keyring      CephSite1.mgr.controller-0.keyring
CephSite1.client.manila.keyring     CephSite1.mgr.controller-1.keyring
CephSite1.client.openstack.keyring  CephSite1.mgr.controller-2.keyring
CephSite1.client.radosgw.keyring    CephSite1.mon.keyring
CephSite1.conf                      rbdmap
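
For reference, the cluster can also be queried manually from inside the container using the custom cluster name; this is a sketch assuming the openstack keyring and the volumes pool shown in this deployment:

# docker exec -it openstack-cinder-volume-docker-0 rbd --cluster CephSite1 --id openstack -p volumes ls

If that command succeeds while cinder-volume still fails, the cluster and keyring themselves are fine and the problem is in how the driver is told about the cluster name.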

The backend configuration in cinder.conf is

[tripleo_ceph]
backend_host=hostgroup
volume_backend_name=tripleo_ceph
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf=/etc/ceph/CephSite1.conf
rbd_user=openstack
rbd_pool=volumes
rbd_secret_uuid=<secret key>
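
One thing that stands out is that the section sets rbd_ceph_conf but not rbd_cluster_name. The Cinder RBD driver defaults the cluster name to "ceph", so librados presumably looks for keyrings such as ceph.client.openstack.keyring instead of CephSite1.client.openstack.keyring, which would be consistent with the errno 95 failure above. A working backend section would be expected to look roughly like the sketch below; rbd_cluster_name is a standard Cinder RBD driver option, but whether the fixed puppet-cinder renders exactly this is an assumption here:

[tripleo_ceph]
backend_host=hostgroup
volume_backend_name=tripleo_ceph
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf=/etc/ceph/CephSite1.conf
rbd_cluster_name=CephSite1
rbd_user=openstack
rbd_pool=volumes
rbd_secret_uuid=<secret key>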

Version-Release number of selected component (if applicable):
rhosp14/openstack-cinder-volume:2018-09-26.1
$ docker exec -it mistral_executor rpm -qa | grep ceph
ceph-ansible-3.1.5-1.el7cp.noarch

How reproducible:


Steps to Reproduce:
1. Deploy a Ceph cluster with a customized name (see the deployment sketch after these steps):
parameter_defaults:
    CephClusterName: CephSite1
2. Create a volume.
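
The environment file referenced below contains the parameter_defaults block from step 1; this is only a sketch, where the file name ceph-custom-name.yaml and the trailing "..." (the rest of the usual deploy arguments) are placeholders, and the standard ceph-ansible environment is assumed:

$ openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e ceph-custom-name.yaml \
    ...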


Actual results:
Cinder fails to connect to the Ceph cluster using the RBD driver.

Expected results:
Cinder connects to the Ceph cluster with the RBD driver, using the parameters provided by TripleO.

Additional info:

Comment 1 Giulio Fidente 2018-10-02 17:30:26 UTC
We test custom Ceph cluster names upstream ... this looks more like an issue with rbd_secret_uuid; is it empty?

Comment 2 Yogev Rabl 2018-10-03 13:50:36 UTC
(In reply to Giulio Fidente from comment #1)
> We test custom Ceph cluster names upstream ... this looks more like an issue
> with rbd_secret_uuid; is it empty?

No, the secret key is set, and set properly, because Nova is able to run VMs with ephemeral disks on the Ceph cluster.
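
For reference, this can be confirmed on a compute node by comparing the libvirt secret against rbd_secret_uuid from cinder.conf; a sketch, assuming the standard nova_libvirt container name:

# docker exec -it nova_libvirt virsh secret-list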

Comment 21 Yogev Rabl 2018-11-29 13:16:36 UTC
Verified.

Comment 23 errata-xmlrpc 2019-01-11 11:53:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045