Bug 1219559

Summary: cinder-volume keeps opening Ceph clients until the maximum number of open files is reached
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tupper Cole <tcole>
Component: Distribution
Assignee: Josh Durgin <jdurgin>
Status: CLOSED ERRATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high
Priority: high
Version: 1.2.3
CC: bhubbard, flucifre, hnallurv, kdreyer, smanjara, vumrao
Target Milestone: rc
Target Release: 1.3.0
Hardware: x86_64
OS: Linux
Fixed In Version: ceph-0.94.1-11.el7cp
Doc Type: Bug Fix
Clones: 1220496 (view as bug list)
Last Closed: 2015-06-24 15:52:55 UTC
Type: Bug
Bug Blocks: 1220496

Description Tupper Cole 2015-05-07 15:22:44 UTC
Description of problem: A change in 1.2.3 (0.80.8) exposed a bug in cinder, resulting in the maximum open files limit being hit. This is described here:

https://bugs.launchpad.net/cinder/+bug/1446682

Version-Release number of selected component (if applicable): 1.2.3


How reproducible: Consistently


Expected results: The admin socket is torn down when the client is shut down.


Additional info:
This has been addressed by https://github.com/ceph/ceph/pull/4541/files
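
To make the failure mode concrete: each librados client that cinder-volume spins up binds a local admin socket file descriptor when one is configured, and those descriptors were not being released, so a service that churns short-lived clients eventually hits the open files limit. A minimal sketch of the client lifecycle involved, using the python-rados bindings (the conffile path and pool name here are illustrative; the cinder RBD driver's own wrapper differs):

    import rados

    # Each connected client opens sockets to the cluster and, when
    # 'admin socket' is set in ceph.conf, a local .asok listener too.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('volumes')  # illustrative pool name
        try:
            pass  # ... RADOS/RBD work ...
        finally:
            ioctx.close()
    finally:
        # The fix referenced above ensures the admin socket is torn down
        # when the client shuts down; before it, those descriptors could
        # accumulate across repeated client create/shutdown cycles.
        cluster.shutdown()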

Comment 2 Ken Dreyer (Red Hat) 2015-05-07 16:06:43 UTC
Looks like the above PR is still undergoing review upstream.

Josh, if I'm reading the above Launchpad bug and PR correctly, I'm guessing this bug applies to both v0.80.8 and v0.94.1?

Comment 3 Josh Durgin 2015-05-07 16:19:14 UTC
Yes, this applies to all ceph versions.

Comment 4 Josh Durgin 2015-05-08 15:45:00 UTC
This only occurs when the admin socket is enabled for clients in ceph.conf, which is not the default, so this shouldn't be a release blocker IMO. A simple workaround is to not enable the admin socket on the nova and cinder nodes.
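
For reference, the trigger is a client-side setting along these lines in ceph.conf (the path shown is just the conventional one); the workaround is to remove or comment it out on the nova and cinder nodes:

    [client]
        admin socket = /var/run/ceph/$cluster-$type.$id.$pid.asok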

Comment 10 Harish NV Rao 2015-05-13 16:02:59 UTC
Hi Josh,

Should QE test with admin socket disabled or enabled? Please let us know.

Regards,
Harish

Comment 11 Brad Hubbard 2015-05-13 16:21:47 UTC
(In reply to Harish NV Rao from comment #10)
> Hi Josh,
> 
> Should QE test with admin socket disabled or enabled? Please let us know.

The leak *only* happens when admin sockets are enabled. So to test whether there is a leak you will need admin sockets enabled.
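
A simple way to observe it (a sketch assuming Linux /proc and a known cinder-volume pid; run as the same user or root) is to poll the process's open descriptor count while cycling volume operations; a count that only ever grows points at the leak:

    import os
    import time

    def open_fd_count(pid):
        # /proc/<pid>/fd holds one entry per open file descriptor
        return len(os.listdir('/proc/%d/fd' % pid))

    pid = 12345  # hypothetical cinder-volume pid (e.g. from pgrep)
    for _ in range(10):
        print(open_fd_count(pid))
        time.sleep(30)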

HTH.

Comment 12 Harish NV Rao 2015-05-13 16:29:59 UTC
Thanks Brad. I understand that's how we have to test this fix. 

I was not clear in my earlier question.

We are currently testing RHEL OSP 6 with 1.3.0. We would like to know whether our OSP interoperability testing should be done with admin_socket enabled or disabled.

Comment 13 Josh Durgin 2015-05-14 03:36:38 UTC
Test with admin socket disabled until this bug is fixed. Any further testing after the fix is included in a build can have admin socket enabled.

Comment 14 Ken Dreyer (Red Hat) 2015-05-19 19:50:36 UTC
Josh replied via email that https://github.com/ceph/ceph/pull/4657 is good to pull in downstream for 1.3.0. I will do that today.

Comment 21 shilpa 2015-06-04 18:29:14 UTC
Thanks Josh. Verified on ceph-0.94.1-11.el7cp. I don't see the reported issue anymore.

Comment 23 errata-xmlrpc 2015-06-24 15:52:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:1183