Bug 1389723

Summary: nova volume detach hangs in "detaching" status even after the detach command is run; running on a Ceph 2.1 cluster
Product: Red Hat OpenStack
Reporter: rakesh <rgowdege>
Component: ceph
Assignee: Sébastien Han <shan>
Status: CLOSED NOTABUG
QA Contact: Warren <wusui>
Severity: medium
Priority: unspecified
Version: 9.0 (Mitaka)
CC: berrange, dasmith, dcadzow, eglynn, hnallurv, jdillama, jdurgin, kchamart, lhh, nlevine, rgowdege, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2018-01-05 21:29:21 UTC
Type: Bug

Description rakesh 2016-10-28 10:18:10 UTC
Description of problem:

The Nova volume does not get detached even after running the detach command; the volume status stays in "detaching". This is on a Ceph 2.1 cluster.

Version-Release number of selected component (if applicable):

1. ceph version 10.2.3-10.el7cp

2. openstack-cinder-8.1.1-1.el7ost.noarch

3. openstack-nova-api-13.1.1-7.el7ost.noarch


Logs (cinder-volume):
-----------------
2016-10-28 07:30:22.495 1253 WARNING oslo_db.sqlalchemy.engines [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] SQL connection failed. 10 attempts left.
2016-10-28 07:30:32.785 1253 INFO cinder.rpc [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] Automatically selected cinder-scheduler objects version 1.3 as minimum service version.
2016-10-28 07:30:32.799 1253 INFO cinder.rpc [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] Automatically selected cinder-scheduler RPC version 2.0 as minimum service version.
2016-10-28 07:30:33.125 1253 INFO cinder.volume.manager [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] Determined volume DB was not empty at startup.
2016-10-28 07:30:33.169 1253 INFO cinder.volume.manager [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] Image-volume cache disabled for host magna083@rbd.
2016-10-28 07:30:33.172 1253 INFO oslo_service.service [req-14335e37-bfb6-4354-95a3-1bd46f45efb4 - - - - -] Starting 1 workers
2016-10-28 07:30:33.190 10880 INFO cinder.service [-] Starting cinder-volume node (version 8.1.1)
2016-10-28 07:30:33.192 10880 INFO cinder.volume.manager [req-f70b5ba6-6fcc-4e98-b0d2-0989da8e2577 - - - - -] Starting volume driver RBDDriver (1.2.0)
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd [req-f70b5ba6-6fcc-4e98-b0d2-0989da8e2577 - - - - -] Error connecting to ceph cluster.
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 338, in _connect_to_rados
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd     client.connect()
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd   File "rados.pyx", line 785, in rados.Rados.connect (rados.c:8969)
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd TimedOut: error connecting to the cluster
2016-10-28 07:35:33.308 10880 ERROR cinder.volume.drivers.rbd
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd [req-f70b5ba6-6fcc-4e98-b0d2-0989da8e2577 - - - - -] Error connecting to ceph cluster.
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 338, in _connect_to_rados
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd     client.connect()
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd   File "rados.pyx", line 785, in rados.Rados.connect (rados.c:8969)
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd TimedOut: error connecting to the cluster
2016-10-28 07:40:43.356 10880 ERROR cinder.volume.drivers.rbd
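
A quick way to confirm the root cause hinted at by the trace above is to run the same rados connect call by hand from the cinder-volume host. The sketch below uses the python-rados binding; the ceph.conf path, keyring path, and the 'cinder' user are assumptions based on a typical RBD backend and should be adjusted to match the [rbd] settings in cinder.conf.

import rados

# Build a cluster handle roughly the way the RBD driver does; paths/user are assumed.
client = rados.Rados(
    conffile='/etc/ceph/ceph.conf',
    rados_id='cinder',
    conf={'keyring': '/etc/ceph/ceph.client.cinder.keyring'},
)
try:
    # This is the call that times out in the traceback above.
    client.connect(timeout=5)
    print("connected, cluster fsid: %s" % client.get_fsid())
    client.shutdown()
except rados.TimedOut as exc:
    print("connection timed out: %s" % exc)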

Comment 2 rakesh 2016-11-07 08:42:39 UTC
Workaround:

1. Open the dashboard in a browser.
2. Change the status of the volume from "In-use" to "Available" (this detaches it from the server); a rough API equivalent is sketched below.
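
For reference, a rough API equivalent of the dashboard workaround using python-cinderclient against the v2 API. The credentials, auth URL, and volume name are placeholders; note that this only resets the status recorded in the Cinder database, it does not clean up the attachment on the compute side.

from cinderclient.v2 import client

# Placeholders: fill in real credentials / endpoint for the environment.
cinder = client.Client('admin', 'ADMIN_PASSWORD', 'admin',
                       'http://keystone.example.com:5000/v2.0')

vol = cinder.volumes.find(name='my-volume')   # placeholder volume name
cinder.volumes.reset_state(vol, 'available')  # force the status back to "available"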

Comment 3 Derek 2016-11-16 13:46:34 UTC
Is the detach command issued in Console, Horizon, or Director?  (Trying to determine which document would be impacted.)

Comment 4 Harish NV Rao 2016-11-16 14:37:17 UTC
I checked with Rakesh. He says it's from Console.

Comment 5 Artom Lifshitz 2018-01-05 21:29:21 UTC
Hello,

I'm going to close this as NOTABUG. The trace clearly shows a timeout connecting to the ceph cluster. If by any chance this is still a problem, please reopen the bug and attach sosreports, including from the ceph machine(s).

Cheers!