Bug 2160855

Summary: ceph --admin-daemon $socketfile client evict [-h|--help] evicts ALL clients
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Kimberly Lazarski <klazarsk>
Component: RADOS Assignee: Neeraj Pratap Singh <neesingh>
Status: ASSIGNED --- QA Contact: julpark
Severity: medium Docs Contact:
Priority: high    
Version: 5.2 CC: bhubbard, ceph-eng-bugs, cephqe-warriors, neesingh, nojha, pdhiran, vshankar, vumrao
Target Milestone: ---   
Target Release: 6.1z2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Kimberly Lazarski 2023-01-13 23:48:16 UTC
Description of problem:

# ceph --admin-daemon $socketfile client evict [-h|--help] evicts ALL clients


Customer (large financial institution) reported that passing -h or --help to "client evict" evicts ALL clients. This was discovered when the customer fat-fingered a command, received an error, and observed that "-h" was the first option listed in the ceph command output. The customer then added "--help" to "client evict" to review the command's help details. Instead, all active client connections were terminated and all clients were added to the blocklist. To restore production, the blocklist had to be cleared and all clients rebooted (which was faster than manually remounting everything and restarting services).


Version-Release number of selected component (if applicable):

[root@serverc ~]# ceph versions 
{
    "mon": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
    },
    "mgr": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
    },
    "osd": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 9
    },
    "mds": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 20
    }
}


How reproducible:

Very consistently.

Used the cl260v5.2 enablement lab cluster to exhibit the defect (and also to use a known environment).

Steps to Reproduce:
1. Connect clients to cephfs server
2. Run: # ceph --admin-daemon $socketfile client evict [-h|--help]
3. Observe that all active sessions using cephfs are lost and all active clients have been added to the blocklist
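
The check in step 3 can be scripted by comparing the "client ls" output captured before and after step 2. This is a minimal sketch, assuming "client ls" prints a JSON array of session objects that each carry an integer "id" field (as the grep output in this report suggests). The helper names session_ids and lost_sessions are hypothetical, and the sample strings only mirror the ids shown under Actual results.

```python
import json


def session_ids(client_ls_output):
    """Extract session ids from the JSON text printed by `client ls`."""
    return {session["id"] for session in json.loads(client_ls_output)}


def lost_sessions(before, after):
    """Return ids that were present before the command under test but not after."""
    return session_ids(before) - session_ids(after)


# Sample captures: three sessions before step 2, none after it.
before = '[{"id": 65008}, {"id": 65000}, {"id": 64996}]'
after = "[]"
print(sorted(lost_sessions(before, after)))
```

A non-empty result after running a command that was only meant to print help indicates the defect.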

Actual results:

[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
        "id": 65008,
        "id": 65000,
        "id": 64996,
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client evict --help
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
[root@serverc ~]# ceph osd blocklist ls 
172.25.250.10:0/689758715 2023-01-14T00:33:34.375368+0000
172.25.250.13:0/2724339963 2023-01-14T00:20:41.340339+0000
172.25.250.12:0/3828871504 2023-01-14T00:33:34.375311+0000
172.25.250.12:0/1056983785 2023-01-14T00:20:41.340431+0000
172.25.250.13:0/831930938 2023-01-14T00:33:34.375206+0000
172.25.250.10:0/2110124697 2023-01-14T00:20:41.340482+0000
listed 6 entries
[root@serverc ~]# 
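
To see which client hosts are affected, the blocklist listing above can be grouped by IP address. This is a rough sketch that only parses the text copied from this report. It assumes IPv4 entries of the form "<ip>:<port>/<nonce> <timestamp>" and treats the timestamp as an opaque string. The function name group_by_host is hypothetical.

```python
from collections import defaultdict

# Text copied from the `ceph osd blocklist ls` output in this report.
BLOCKLIST_OUTPUT = """\
172.25.250.10:0/689758715 2023-01-14T00:33:34.375368+0000
172.25.250.13:0/2724339963 2023-01-14T00:20:41.340339+0000
172.25.250.12:0/3828871504 2023-01-14T00:33:34.375311+0000
172.25.250.12:0/1056983785 2023-01-14T00:20:41.340431+0000
172.25.250.13:0/831930938 2023-01-14T00:33:34.375206+0000
172.25.250.10:0/2110124697 2023-01-14T00:20:41.340482+0000
listed 6 entries
"""


def group_by_host(output):
    """Map each client IP to the list of (address, timestamp) entries for it."""
    hosts = defaultdict(list)
    for line in output.splitlines():
        parts = line.split()
        # Skip anything that is not "<addr> <timestamp>", such as the summary line.
        if len(parts) != 2 or "/" not in parts[0]:
            continue
        addr, stamp = parts
        ip = addr.split(":", 1)[0]  # assumes IPv4 "<ip>:<port>/<nonce>"
        hosts[ip].append((addr, stamp))
    return dict(hosts)


for ip, entries in sorted(group_by_host(BLOCKLIST_OUTPUT).items()):
    print(ip, len(entries))
```

On this sample the six entries fall under three client addresses, each listed twice.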


Expected results:

Expect the same result as when the --admin-daemon command is fat-fingered: exit with a nonzero status and dump the help screen to guide the user.

[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
        "id": 65008,
        "id": 65000,
        "id": 64996,
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client evict --help

 General usage: 
 ==============
usage: ceph [-h] [-c CEPHCONF] [-i INPUT_FILE] [-o OUTPUT_FILE]
            [--setuser SETUSER] [--setgroup SETGROUP] [--id CLIENT_ID]
            [--name CLIENT_NAME] [--cluster CLUSTER]
            [--admin-daemon ADMIN_SOCKET] [-s] [-w] [--watch-debug]
            [--watch-info] [--watch-sec] [--watch-warn] [--watch-error]
            [-W WATCH_CHANNEL] [--version] [--verbose] [--concise]
            [-f {json,json-pretty,xml,xml-pretty,plain,yaml}]
            [--connect-timeout CLUSTER_TIMEOUT] [--block] [--period PERIOD]
[...]
[root@serverc ~]# echo $?
1 (or other nonzero value to indicate failure)
[root@serverc ~]# _

Additional info:

-h | --help should either be treated as a request for help, as it is nearly everywhere else in userland, or cause the command to exit with a failure status and an error message and/or dump the help screen. If "client evict" receives bad or ambiguous input, it must not terminate all connections. If this is intended to be a wildcard, it must be clearly documented and must not reuse the switches which are otherwise nearly universally treated as a request for online help.
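
The requested behavior can be illustrated with a small validation step that runs before any eviction logic. This is only a sketch of the idea, not the Ceph implementation. The names validate_evict_args and UsageError are hypothetical, and the key=value filter shape (for example "id=65008") is an assumption made for illustration.

```python
import re

HELP_FLAGS = {"-h", "--help"}
# Assumed filter shape: key=value, for example "id=65008".
FILTER_RE = re.compile(r"^[A-Za-z_][\w.]*=.+$")


class UsageError(Exception):
    """Raised when the arguments must not reach the evict logic."""


def validate_evict_args(args):
    """Return the filters if every argument is an explicit filter, else raise."""
    if not args:
        raise UsageError("client evict requires at least one filter")
    for arg in args:
        if arg in HELP_FLAGS:
            raise UsageError("help requested; no clients were evicted")
        if not FILTER_RE.match(arg):
            raise UsageError(f"unrecognized argument: {arg!r}")
    return list(args)


# Only the first candidate passes; the other two are rejected before eviction.
for candidate in (["id=65008"], ["--help"], []):
    try:
        print("ok:", validate_evict_args(candidate))
    except UsageError as err:
        print("rejected:", err)
```

The sketch fails closed: anything that is not an explicit filter is rejected, so ambiguous input never selects every session. Whether an empty argument list should also be rejected is a design decision for the maintainers.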

Comment 1 RHEL Program Management 2023-01-13 23:48:25 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.