Bug 1651828

Summary: [RFE] Client should have an option to reconnect to MDS after being evicted/blocklisted
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Patrick Donnelly <pdonnell>
Component: CephFS
Assignee: Patrick Donnelly <pdonnell>
Status: CLOSED ERRATA
QA Contact: Yogesh Mane <ymane>
Severity: low
Docs Contact: Amrita <asakthiv>
Priority: high
Version: 3.0
CC: agunn, asakthiv, ceph-eng-bugs, gfarnum, jlayton, kdreyer, mhackett, owasserm, pasik, rmandyam, sweil, vereddy, ymane
Target Milestone: ---
Keywords: FutureFeature
Target Release: 5.0
Hardware: All
OS: All
Whiteboard:
Fixed In Version: ceph-16.0.0-8633.el8cp
Doc Type: Enhancement
Doc Text:
.CephFS clients can now reconnect after being blocklisted by Metadata Servers (MDS)
Previously, Ceph File System (CephFS) clients that were blocklisted by MDS because of network partitions or other transient errors had to be remounted manually. With this release, a CephFS client can reconnect to the mount when the appropriate configuration is turned on for that client, so a manual remount is no longer needed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-30 08:22:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1929727, 1959686

Description Patrick Donnelly 2018-11-21 00:42:15 UTC
Description of problem:

The client_reconnect_stale config option no longer works because the blacklist/eviction logic has changed significantly since the option was introduced.

One option for a more robust solution is to create a new configuration option that allows the client to acquire a new cluster id (client.1234...), reconnect to the MDSs, and re-acquire all caps. In-flight ops should be retried. Cached reads and buffered writes should be dropped. Open file handles should return EIO.
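
The proposed recovery sequence can be illustrated with a small, self-contained simulation. This is only a sketch of the behavior described above, not Ceph code: the `Client` and `FileHandle` classes, the `reconnect_after_blocklist` option name, and the id counter are all hypothetical and exist purely for illustration.

```python
import errno
from dataclasses import dataclass, field
from itertools import count

# Hypothetical stand-in for the cluster handing out client ids (client.1234...).
_next_id = count(1234)


@dataclass
class FileHandle:
    path: str
    stale: bool = False  # set once the owning client has been blocklisted

    def read(self) -> str:
        if self.stale:
            # Handles opened before the eviction are no longer usable.
            raise OSError(errno.EIO, "file handle predates blocklisting")
        return f"data from {self.path}"


@dataclass
class Client:
    # Hypothetical option name; the report does not name the new option.
    reconnect_after_blocklist: bool = False
    client_id: int = field(default_factory=lambda: next(_next_id))
    caps: set = field(default_factory=set)
    read_cache: dict = field(default_factory=dict)
    write_buffers: dict = field(default_factory=dict)
    in_flight: list = field(default_factory=list)
    handles: list = field(default_factory=list)

    def on_blocklisted(self) -> list:
        """Simulate recovery; return the in-flight ops that get retried."""
        if not self.reconnect_after_blocklist:
            return []  # option off: stay evicted until a manual remount
        wanted = set(self.caps)
        self.client_id = next(_next_id)   # acquire a new cluster id
        self.read_cache.clear()           # drop cached reads
        self.write_buffers.clear()        # drop buffered writes
        for handle in self.handles:
            handle.stale = True           # old handles now return EIO
        self.caps = wanted                # re-acquire caps after reconnecting
        retried, self.in_flight = list(self.in_flight), []
        return retried
```

In this sketch the option defaults to off, so a client that has not opted in stays evicted, which matches the idea of enabling the behavior per client.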

Comment 8 Giridhar Ramaraju 2019-08-05 13:06:14 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri

Comment 9 Giridhar Ramaraju 2019-08-05 13:08:55 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri

Comment 21 Amrita 2021-05-12 09:58:25 UTC
Hi Patrick,

Please set the rdt flag and provide the doc text for inclusion in the Release Notes 5.0. Regarding comment 20: the specified BZ is the doc BZ and is not included in the errata.

Thanks
Amrita

Comment 33 errata-xmlrpc 2021-08-30 08:22:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294