Bug 1479596

Summary: ceph: Error: error connecting to the cluster: errno ENOTSUP in nova-compute.log
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: openstack-tripleo-heat-templates
Assignee: John Fulton <johfulto>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 12.0 (Pike)
CC: gfidente, jdurgin, johfulto, jschluet, kdreyer, lhh, mburns, mcornea, nlevine, rhel-osp-director-maint, scohen, srevivo
Target Milestone: rc
Keywords: AutomationBlocker, Triaged
Target Release: 12.0 (Pike)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-7.0.0-0.20170805163048.el7ost.noarch.rpm
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1480305
Environment:
Last Closed: 2017-12-13 21:51:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1480305    
Bug Blocks:    

Description Alexander Chuzhoy 2017-08-08 23:25:59 UTC
ceph: Error: error connecting to the cluster: errno ENOTSUP in nova-compute.log


Environment:
python-cephfs-10.2.7-28.el7cp.x86_64
openstack-nova-scheduler-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-mon-10.2.7-28.el7cp.x86_64
python-nova-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-console-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-mds-10.2.7-28.el7cp.x86_64
libcephfs1-10.2.7-28.el7cp.x86_64
puppet-nova-11.3.0-0.20170805105252.30a205c.el7ost.noarch
puppet-ceph-2.3.1-0.20170805094345.868e6d6.el7ost.noarch
ceph-common-10.2.7-28.el7cp.x86_64
openstack-nova-conductor-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-novncproxy-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-osd-10.2.7-28.el7cp.x86_64
ceph-selinux-10.2.7-28.el7cp.x86_64
openstack-nova-compute-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-migration-16.0.0-0.20170805120344.5971dde.el7ost.noarch
openstack-nova-api-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-base-10.2.7-28.el7cp.x86_64
openstack-nova-common-16.0.0-0.20170805120344.5971dde.el7ost.noarch
ceph-radosgw-10.2.7-28.el7cp.x86_64
python-novaclient-9.1.0-0.20170804194758.0a53d19.el7ost.noarch
openstack-nova-placement-api-16.0.0-0.20170805120344.5971dde.el7ost.noarch



Steps to reproduce:
Deploy the overcloud (OC) with Ceph.
Try to launch an instance.

Result:
The instance goes into the ERROR state.

Looking for errors in nova-compute.log on compute node:

2017-08-08 23:22:16.425 1 ERROR nova.compute.manager 
2017-08-08 23:23:16.404 1 ERROR nova.compute.manager [req-0dadfb2e-4919-4d04-a859-57198dba71e3 - - - - -] No compute node record for host compute-0.localdomain: ComputeHostNotFound_Remote: Compute host compute-0.localdomain could not be found.
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager [req-0dadfb2e-4919-4d04-a859-57198dba71e3 - - - - -] Error updating resources for node compute-0.localdomain.: Error: error connecting to the cluster: errno ENOTSUP
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager Traceback (most recent call last):
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6563, in update_available_resource_for_node
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 610, in update_available_resource
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5769, in get_available_resource
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     disk_info_dict = self._get_local_gb_info()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5336, in _get_local_gb_info
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     info = LibvirtDriver._get_rbd_driver().get_pool_info()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 369, in get_pool_info
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     with RADOSClient(self) as client:
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 103, in __init__
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     self.cluster, self.ioctx = driver._connect_to_rados(pool)
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 134, in _connect_to_rados
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     client.connect()
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/rados.py", line 429, in connect
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager     raise make_ex(ret, "error connecting to the cluster")
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager Error: error connecting to the cluster: errno ENOTSUP
2017-08-08 23:23:16.429 1 ERROR nova.compute.manager
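
When scanning a long nova-compute.log for this failure, something like the following may help. It is only a sketch that assumes the log line layout shown in the excerpt above; the function name find_ceph_connect_errors and the regular expression are illustrative and are not part of nova or Ceph.

```python
import re

# Matches lines that end in the Ceph connection failure, for example:
# "... ERROR nova.compute.manager Error: error connecting to the cluster: errno ENOTSUP"
# The pattern is an assumption derived from the log excerpt in this report.
CONNECT_ERROR = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"error connecting to the cluster: errno (?P<errno>[A-Z0-9_]+)"
)


def find_ceph_connect_errors(lines):
    """Return (timestamp, errno_name) pairs for each matching log line."""
    hits = []
    for line in lines:
        match = CONNECT_ERROR.search(line)
        if match:
            hits.append((match.group("ts"), match.group("errno")))
    return hits
```

The function accepts any iterable of strings, so an open file object for nova-compute.log can be passed directly.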

Comment 1 Giulio Fidente 2017-08-08 23:29:14 UTC
The /etc/ceph directory inside the compute container appears to be empty, while the libvirt container has it correctly populated.
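
A quick way to confirm this from inside a container is to check whether the directory holds a configuration file and a keyring. The sketch below is illustrative only: the helper name missing_ceph_files is made up for this example, and the assumption that a populated directory contains at least one *.conf file and one *.keyring file follows the usual Ceph naming convention rather than anything stated in this report.

```python
import os


def missing_ceph_files(conf_dir="/etc/ceph"):
    """Return a list of problems found in conf_dir; an empty list means it looks populated."""
    if not os.path.isdir(conf_dir):
        return ["%s does not exist" % conf_dir]
    entries = os.listdir(conf_dir)
    problems = []
    # Assumption: the cluster config is named like ceph.conf.
    if not any(name.endswith(".conf") for name in entries):
        problems.append("no *.conf file in %s" % conf_dir)
    # Assumption: client keys are named like ceph.client.<name>.keyring.
    if not any(name.endswith(".keyring") for name in entries):
        problems.append("no *.keyring file in %s" % conf_dir)
    return problems
```

Running it in both the compute and libvirt containers and comparing the results would show the difference described here.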

Comment 2 Giulio Fidente 2017-08-09 17:59:28 UTC
Probably due to https://bugs.launchpad.net/tripleo/+bug/1709683

Comment 3 John Fulton 2017-08-09 21:33:29 UTC
This seems to require a change to THT and ceph-ansible:

- https://review.openstack.org/#/c/492303
- https://github.com/ceph/ceph-ansible/pull/1756

Comment 4 Ken Dreyer (Red Hat) 2017-08-10 16:05:27 UTC
ceph-ansible PR 1756 tagged upstream as v3.0.0rc2.

Comment 9 John Fulton 2017-08-16 12:44:51 UTC
The upstream patch has merged: https://review.openstack.org/#/c/492303

Comment 13 Yogev Rabl 2017-11-15 18:33:33 UTC
Verified on openstack-tripleo-heat-templates-7.0.3-0.20171024200825.el7ost.noarch

Comment 16 errata-xmlrpc 2017-12-13 21:51:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462