Bug 1480305 - ceph: Error: error connecting to the cluster: errno ENOTSUP in nova-compute.log
Summary: ceph: Error: error connecting to the cluster: errno ENOTSUP in nova-compute.log
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: 3.0
Assignee: Sébastien Han
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks: 1479596
 
Reported: 2017-08-10 16:08 UTC by Giulio Fidente
Modified: 2018-06-26 23:46 UTC (History)
20 users

Fixed In Version: RHEL: ceph-ansible-3.0.0-0.1.rc3.el7cp Ubuntu: ceph-ansible_3.0.0~rc3-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1479596
Environment:
Last Closed: 2017-12-05 23:39:05 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible issues 1755 0 None closed Allow user to specify the mode of the openstack keys to be created 2021-01-27 15:56:32 UTC
Github ceph ceph-ansible pull 1756 0 None closed Allow user to specify the mode of the openstack keys 2021-01-27 15:56:32 UTC
Github ceph ceph-ansible pull 1759 0 None closed Set the permissions mode on all of the OpenStack keys 2021-01-27 15:56:32 UTC
Red Hat Product Errata RHBA-2017:3387 0 normal SHIPPED_LIVE Red Hat Ceph Storage 3.0 bug fix and enhancement update 2017-12-06 03:03:45 UTC

Description Giulio Fidente 2017-08-10 16:08:36 UTC
+++ This bug was initially created as a clone of Bug #1479596 +++

ceph: Error: error connecting to the cluster: errno ENOTSUP in nova-compute.log


Environment:
python-cephfs-10.2.7-28.el7cp.x86_64
openstack-nova-scheduler-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-mon-10.2.7-28.el7cp.x86_64
python-nova-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-console-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-mds-10.2.7-28.el7cp.x86_64
libcephfs1-10.2.7-28.el7cp.x86_64
puppet-nova-11.3.0-0.20170805105252.30a205c.el7ost.noarch
puppet-ceph-2.3.1-0.20170805094345.868e6d6.el7ost.noarch
ceph-common-10.2.7-28.el7cp.x86_64
openstack-nova-conductor-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-novncproxy-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-osd-10.2.7-28.el7cp.x86_64
ceph-selinux-10.2.7-28.el7cp.x86_64
openstack-nova-compute-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-migration-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-api-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-base-10.2.7-28.el7cp.x86_64
openstack-nova-common-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-radosgw-10.2.7-28.el7cp.x86_64
python-novaclient-9.1.0-0.20170804194758.0a53d19.el7ost.noarch
openstack-nova-placement-api-16.0.0-0.20170805120344.5971dde.el7ost.noarch



Steps to reproduce:
1. Deploy the overcloud with Ceph.
2. Try to launch an instance.

Result:
The instance ends up in the ERROR state.

Looking for errors in nova-compute.log on the compute node:

2017-08-08 23:22:16.425 1 ERROR nova.compute.manager 
2017-08-08 23:23:16.404 1 ERROR nova.compute.manager [req-0dadfb2e-4919-4d04-a859-57198dba71e3 - - - - -] No compute node record for host compute-0.localdomain: ComputeHostNotFound_Remote: Compute host compute-0.localdomain could not be found.
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager [req-0dadfb2e-4919-4d04-a859-57198dba71e3 - - - - -] Error updating resources for node compute-0.localdomain.: Error: error connecting to the cluster: errno ENOTSUP
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager Traceback (most recent call last):
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6563, in update_available_resource_for_node
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 610, in update_available_resource
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5769, in get_available_resource
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     disk_info_dict = self._get_local_gb_info()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5336, in _get_local_gb_info
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     info = LibvirtDriver._get_rbd_driver().get_pool_info()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 369, in get_pool_info
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     with RADOSClient(self) as client:
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 103, in __init__
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     self.cluster, self.ioctx = driver._connect_to_rados(pool)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 134, in _connect_to_rados
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     client.connect()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/rados.py", line 429, in connect
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     raise make_ex(ret, "error connecting to the cluster")
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager Error: error connecting to the cluster: errno ENOTSUP
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager

--- Additional comment from Giulio Fidente on 2017-08-08 19:29:14 EDT ---

The /etc/ceph directory inside the compute container appears to be empty, while the libvirt container has it correctly populated.
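A minimal sketch of the suspected failure mode, assuming (based on the linked ceph-ansible PRs 1756/1759) that the root cause is an OpenStack keyring written with a mode the nova user cannot read. The file path mirrors the real /etc/ceph/ceph.client.openstack.keyring, but the snippet uses a throwaway temp file; the key material is a placeholder:

```shell
#!/bin/sh
# Simulate the keyring that ceph-ansible distributes to compute nodes.
tmp=$(mktemp -d)
keyring="$tmp/ceph.client.openstack.keyring"
printf '[client.openstack]\n    key = <redacted>\n' > "$keyring"

# Owner-only mode: an unprivileged nova process in the compute
# container cannot open the file, and librados surfaces a generic
# connection error like the ENOTSUP seen in the traceback above.
chmod 0600 "$keyring"
stat -c '%a' "$keyring"   # prints 600

# With the PRs above, the mode becomes explicit and configurable,
# so deployers can grant the consuming service read access.
chmod 0640 "$keyring"
stat -c '%a' "$keyring"   # prints 640
rm -r "$tmp"
```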

--- Additional comment from Giulio Fidente on 2017-08-09 13:59:28 EDT ---

Probably due to https://bugs.launchpad.net/tripleo/+bug/1709683

--- Additional comment from John Fulton on 2017-08-09 17:33:29 EDT ---

This seems to require a change to THT and ceph-ansible:

- https://review.openstack.org/#/c/492303
- https://github.com/ceph/ceph-ansible/pull/1756

--- Additional comment from Ken Dreyer (Red Hat) on 2017-08-10 12:05:27 EDT ---

ceph-ansible PR 1756 tagged upstream as v3.0.0rc2.

Comment 3 Ken Dreyer (Red Hat) 2017-08-10 18:06:38 UTC
Looks like this also requires a second fix: https://github.com/ceph/ceph-ansible/pull/1759
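Together, the two PRs make the permissions of the generated OpenStack keys explicit and user-configurable via a `mode` field on each `openstack_keys` entry. A sketch of the ceph-ansible group_vars, assuming the upstream variable layout; the key name and caps below are illustrative, only the `mode` field is the knob added by these PRs:

```yaml
# group_vars/all.yml (sketch)
openstack_config: true
openstack_keys:
  - name: client.openstack
    key: "$(ceph-authtool --gen-print-key)"
    mon_cap: "allow r"
    osd_cap: "allow class-read object_prefix rbd_children, allow rwx pool=vms"
    mode: "0600"
```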

Comment 5 Harish NV Rao 2017-09-16 10:03:05 UTC
Giulio, please provide the qa_ack if you or your team are going to test this fix.

Comment 10 errata-xmlrpc 2017-12-05 23:39:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387

