Bug 2269664 - mds: disable defer_client_eviction_on_laggy_osds by default
Summary: mds: disable defer_client_eviction_on_laggy_osds by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.1
Assignee: Venky Shankar
QA Contact: Amarnath
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2267614 2298578 2298579
Reported: 2024-03-15 07:31 UTC by Venky Shankar
Modified: 2024-09-03 20:07 UTC (History)
7 users (show)

Fixed In Version: ceph-18.2.1-74.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of: 2269663
Environment:
Last Closed: 2024-06-13 14:29:42 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 58023 0 None None None 2024-03-15 07:31:49 UTC
Ceph Project Bug Tracker 64685 0 None None None 2024-03-15 07:31:49 UTC
Red Hat Issue Tracker RHCEPH-8538 0 None None None 2024-03-15 07:34:21 UTC
Red Hat Knowledge Base (Solution) 7083060 0 None None None 2024-09-03 20:07:37 UTC
Red Hat Product Errata RHSA-2024:3925 0 None None None 2024-06-13 14:29:47 UTC

Description Venky Shankar 2024-03-15 07:31:49 UTC
+++ This bug was initially created as a clone of Bug #2269663 +++

This config can result in a single client holding up the MDS from servicing other clients: once a client's eviction is deferred due to a laggy OSD, a new client's cap acquisition request can be blocked until the deferred (laggy) client resumes operation, i.e., when the laggy OSD is no longer considered laggy.
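As a sketch of how an operator would apply the change described here on an existing cluster (before upgrading to a build where the default is already `false`), the setting can be inspected and disabled with the standard `ceph config` commands:

```shell
# Check the current value of the MDS option; on affected builds this
# returns "true", meaning client eviction is deferred on laggy OSDs.
ceph config get mds defer_client_eviction_on_laggy_osds

# Disable it explicitly, matching the new default this bug introduces,
# so one client on a laggy OSD cannot block cap acquisition by others.
ceph config set mds defer_client_eviction_on_laggy_osds false
```

These commands require a running cluster with admin credentials; the option name is taken verbatim from this bug's summary.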

Comment 5 Amarnath 2024-04-01 11:58:48 UTC
Hi All,

[ceph: root@mero017 /]#  ceph config get mds defer_client_eviction_on_laggy_osds
true
[ceph: root@mero017 /]# ceph versions
{
    "mon": {
        "ceph version 18.2.1-76.el9cp (2517f8a5ef5f5a6a22013b2fb11a591afd474668) reef (stable)": 3
    },
    "mgr": {
        "ceph version 18.2.1-76.el9cp (2517f8a5ef5f5a6a22013b2fb11a591afd474668) reef (stable)": 3
    },
    "osd": {
        "ceph version 18.2.1-76.el9cp (2517f8a5ef5f5a6a22013b2fb11a591afd474668) reef (stable)": 33
    },
    "mds": {
        "ceph version 18.2.1-76.el9cp (2517f8a5ef5f5a6a22013b2fb11a591afd474668) reef (stable)": 6
    },
    "overall": {
        "ceph version 18.2.1-76.el9cp (2517f8a5ef5f5a6a22013b2fb11a591afd474668) reef (stable)": 45
    }
}
[ceph: root@mero017 /]# 
We are also validating client eviction itself: if clients are not evicted, and the mount points are still accessible on the client, we fail the test case.

Ref: https://github.com/red-hat-storage/cephci/blob/e01ff9a132697422bf8d320385aceed5140db553/tests/cephfs/cephfs_bugs/test_defer_client_evict_on_laggy_osd.py#L172
Log : http://magna002.ceph.redhat.com/cephci-jenkins/test-runs/18.2.1-77/Regression/cephfs/84/tier-2_cephfs_test-clients/Client_eviction_deferred_if_OSD_is_laggy_0.log
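A minimal sketch of the manual verification flow behind that test case, assuming a live cluster and an MDS rank 0 (the session ID below is a placeholder an operator would read from the `client ls` output):

```shell
# Confirm deferral is disabled, so eviction proceeds even with laggy OSDs.
ceph config get mds defer_client_eviction_on_laggy_osds

# List client sessions on MDS rank 0 and note the session id to evict.
ceph tell mds.0 client ls

# Evict the chosen client session (replace <session-id> with a real id).
ceph tell mds.0 client evict id=<session-id>

# An evicted client is blocklisted; its address should appear here,
# and its mount point should no longer be accessible.
ceph osd blocklist ls
```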

Regards,
Amarnath

Comment 8 errata-xmlrpc 2024-06-13 14:29:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

