Bug 1380482 - The CephFS Native Manila Driver will Flood the Share Log with Errors when it Cannot Connect to Backing CephFS Cluster
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-manila
Version: 10.0 (Newton)
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: 10.0 (Newton)
Assigned To: Jan Provaznik
QA Contact: Dustin Schoenbrun
Docs Contact: Don Domingo
Keywords: Triaged
Depends On:
Blocks:
Reported: 2016-09-29 13:49 EDT by Dustin Schoenbrun
Modified: 2017-02-20 19:19 EST (History)

See Also:
Fixed In Version: openstack-manila-3.0.0-5.el7ost
Doc Type: Bug Fix
Doc Text:
Prior to this update, the Manila CephFS driver did not check whether it could connect to the Ceph server. Consequently, if the connection to the Ceph server did not work, the `manila-share` service kept crashing or respawning without any timeout. With this update, the Manila CephFS driver checks the Ceph connection during initialization. If the check fails, the driver is not initialized and no further steps are performed.
Last Closed: 2016-12-14 11:06:11 EST
Type: Bug


Attachments
head of /var/log/manila/share.log after native cephfs driver deployed w/o actual cephfs backend (63.23 KB, text/plain)
2016-11-21 18:20 EST, Tom Barron


External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1640169 None None None 2016-11-17 14:54 EST
OpenStack gerrit 397744 None None None 2016-11-17 14:56 EST
Red Hat Product Errata RHEA-2016:2948 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 enhancement update 2016-12-14 14:55:27 EST

Description Dustin Schoenbrun 2016-09-29 13:49:59 EDT
Description of problem:
When the CephFS Native Driver cannot connect to the backing CephFS cluster, it reports an error to the Manila Share log saying that it cannot connect. It then immediately attempts to reconnect to the CephFS cluster, where it will most likely fail again. There is seemingly no limit on the number of connection retries, which causes the Manila Share log to grow exceptionally quickly.

Version-Release number of selected component (if applicable):
openstack-manila-3.0.0-0.20160903135125.7a16eb6.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Set up OSP-10 using Packstack, ensuring that Manila is installed.
2. Configure the CephFS Native driver but ensure that the driver cannot connect to the backing CephFS cluster.
3. Observe that the driver cannot connect to the backing CephFS cluster and that the Manila Share log is flooded with error messages. 

Actual results:
The driver appears to attempt to reconnect continuously and will flood the Manila Share log with error messages.

Expected results:
The driver should only retry a certain number of times before giving up or should space out the retries over a longer period of time.
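
For illustration only, the expected behavior described above (a bounded number of retries, spaced out over time) could be sketched roughly as follows. This is not the Manila driver's code: `connect_with_backoff`, its parameters, and the use of `ConnectionError` are all hypothetical names chosen for the sketch.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=1.0,
                         max_delay=60.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is reached.

    Hypothetical helper: `connect` is any callable that raises
    ConnectionError on failure. The delay doubles after each failed
    attempt and is capped at max_delay.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                # Give up: let the caller log the failure once.
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With a cap on attempts and a growing delay, a permanently unreachable cluster would produce only a handful of log entries rather than an unbounded stream.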
Comment 1 Tom Barron 2016-09-30 10:31:47 EDT
We should investigate whether this is a CephFS driver-specific issue or whether any manila backend that fails to connect to external storage will do the same thing.  And if the latter, is this a problem also in cinder?
Comment 2 Paul Grist 2016-10-14 14:05:29 EDT
Targeting 10z, but if this is very problematic then consider bringing it back.
Comment 3 Jan Provaznik 2016-11-08 08:58:15 EST
upstream bug: https://bugs.launchpad.net/manila/+bug/1640169
Comment 5 Tom Barron 2016-11-21 18:20 EST
Created attachment 1222498
head of /var/log/manila/share.log after native cephfs driver deployed w/o actual cephfs backend
Comment 6 Tom Barron 2016-11-21 18:27:33 EST
I used OSPd to deploy the native cephfs backend for manila via '-e /usr/share/openstack-tripleo-heat-templates/environments/manila-cephfsnative-config.yaml' using the latest rhos10 puddle, core_puddle=2016-11-19.4.

Results are in https://bugzilla.redhat.com/attachment.cgi?id=1222498, where one can
readily see that the manila share service correctly determines that it cannot
interact with the backend. Instead of retrying in a quick loop as reported in this
BZ and in https://bugs.launchpad.net/manila/+bug/1640169, the share service declares:

2016-11-21 22:39:54.682 113290 ERROR oslo_service.periodic_task DriverNotInitialized: Share driver 'CephFSNativeDriver' not initialized.

This message is seen again on periodic task updates that require interaction
with the driver:

2016-11-21 22:40:54.682 113290 ERROR oslo_service.periodic_task DriverNotInitialized: Share driver 'CephFSNativeDriver' not initialized.

In other words, the current log shows behavior consistent with other backends,
and not the tight infinite loop of retries to connect to the CephFS cluster
as reported in this bug.
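
A rough sketch of the pattern the log suggests: the connection is checked once at driver setup, and later periodic tasks fail fast with a "not initialized" error instead of reconnecting. The class and method names below (`SketchShareDriver`, `SketchShareManager`, `init_host`, `periodic_update`) are hypothetical and are not taken from the Manila source; only the error message format is copied from the log above.

```python
class DriverNotInitialized(Exception):
    """Raised when a task needs a driver that failed its setup check."""


class SketchShareDriver:
    """Hypothetical driver: verifies the backend connection once at setup."""

    def __init__(self, check_connection):
        self._check_connection = check_connection
        self.initialized = False

    def do_setup(self):
        # Raises if the backend is unreachable; no retry loop here.
        self._check_connection()
        self.initialized = True


class SketchShareManager:
    """Hypothetical manager that guards periodic tasks on driver state."""

    def __init__(self, driver):
        self.driver = driver
        self.setup_error = None

    def init_host(self):
        try:
            self.driver.do_setup()
        except Exception as exc:
            # Record the failure once and leave the driver uninitialized.
            self.setup_error = exc

    def periodic_update(self):
        if not self.driver.initialized:
            name = type(self.driver).__name__
            raise DriverNotInitialized(
                "Share driver '%s' not initialized." % name)
        return "stats"
```

In this arrangement the backend is contacted a single time, and each periodic task run yields one error line, which matches the one-minute spacing of the two log entries quoted above.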
Comment 8 Dustin Schoenbrun 2016-11-22 17:22:58 EST
Thanks for having a look at this, Tom! Looks good to me. Marking the bug as VERIFIED.
Comment 10 errata-xmlrpc 2016-12-14 11:06:11 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html
