Bug 2160855

Summary: ceph --admin-daemon $socketfile client evict [-h|--help] evicts ALL clients
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Kimberly Lazarski <klazarsk>
Component: CephFS
Assignee: Neeraj Pratap Singh <neesingh>
Status: CLOSED ERRATA
QA Contact: Hemanth Kumar <hyelloji>
Severity: medium
Docs Contact: Rivka Pollack <rpollack>
Priority: high
Version: 5.2
CC: bhubbard, ceph-eng-bugs, cephqe-warriors, neesingh, nojha, pdhange, pdhiran, rpollack, rzarzyns, tserlin, vshankar, vumrao
Target Milestone: ---   
Target Release: 6.1z7   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ceph-17.2.6-228
Doc Type: Bug Fix
Doc Text:
.Daemon commands now recognize Ceph implemented flags
Previously, daemon commands were run while Ceph flags such as ‘--help’ and ‘--status’ were ignored. As a result, commands that were meant only to print help or status information were executed instead. For example, running the ‘ceph --admin-daemon $socketfile client evict --help’ command would stop all active connections instead of showing help for the command. With this release, client connections are not stopped when commands are run with the ‘--help’ or ‘--status’ flags, and a valid error message is emitted.
Story Points: ---
Clone Of:
: 2299169 2299170 2344299
Environment:
Last Closed: 2024-08-28 17:57:12 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 2299169, 2299170, 2344299    

Description Kimberly Lazarski 2023-01-13 23:48:16 UTC
Description of problem:

# ceph --admin-daemon $socketfile client evict [-h|--help] evicts ALL clients


Customer (large financial institution) reported that passing -h or --help to "client evict" evicts ALL clients. This was discovered when the customer mistyped a command, received an error, and observed that "-h" was the first option listed in the ceph command output. The customer then added "--help" to "client evict" to review the command's help details; instead, all active client connections were terminated and all clients were added to the blocklist. The blocklist needed to be cleared and all clients rebooted (faster than manually remounting everything and restarting services) to restore production.


Version-Release number of selected component (if applicable):

[root@serverc ~]# ceph versions 
{
    "mon": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
    },
    "mgr": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
    },
    "osd": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 9
    },
    "mds": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 20
    }
}


How reproducible:

Very consistently.

Used the cl260v5.2 enablement lab cluster to exhibit the defect (and also to use a known environment).

Steps to Reproduce:
1. Connect clients to cephfs server
2. Run: # ceph --admin-daemon $socketfile client evict [-h|--help]
3. Observe all active sessions using cephfs are lost and all active clients have been added to blocklist
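
The observation in step 3 can be checked mechanically by comparing the session IDs reported by "client ls" before and after the command. The following is a minimal Python sketch, assuming the "client ls" output is a JSON array of session objects that each carry an "id" field (as the grep output in this report suggests). The helper names and the trimmed sample data are hypothetical.

```python
import json
from typing import List, Set


def session_ids(client_ls_json: str) -> Set[int]:
    """Extract the "id" of every session from `client ls` JSON output."""
    return {session["id"] for session in json.loads(client_ls_json)}


def evicted(before: str, after: str) -> List[int]:
    """Return the sorted ids present before the command but missing afterwards."""
    return sorted(session_ids(before) - session_ids(after))


# Trimmed, hypothetical samples: real `client ls` entries carry many more fields.
BEFORE = '[{"id": 65008}, {"id": 65000}, {"id": 64996}]'
AFTER = '[]'

print(evicted(BEFORE, AFTER))  # [64996, 65000, 65008]
```

An empty result means no session disappeared between the two snapshots.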

Actual results:

[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
        "id": 65008,
        "id": 65000,
        "id": 64996,
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client evict --help
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
[root@serverc ~]# ceph osd blocklist ls 
172.25.250.10:0/689758715 2023-01-14T00:33:34.375368+0000
172.25.250.13:0/2724339963 2023-01-14T00:20:41.340339+0000
172.25.250.12:0/3828871504 2023-01-14T00:33:34.375311+0000
172.25.250.12:0/1056983785 2023-01-14T00:20:41.340431+0000
172.25.250.13:0/831930938 2023-01-14T00:33:34.375206+0000
172.25.250.10:0/2110124697 2023-01-14T00:20:41.340482+0000
listed 6 entries
[root@serverc ~]# 
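
Because recovery required clearing the blocklist, it may help to pull the client addresses out of the "ceph osd blocklist ls" text shown above. The sketch below is a hedged illustration, not a supported tool: the function name is hypothetical, and it assumes every entry line has the two-column "address timestamp" shape seen in this transcript.

```python
from typing import List, Tuple


def parse_blocklist(output: str) -> List[Tuple[str, str]]:
    """Return (address, timestamp) pairs from `ceph osd blocklist ls` text."""
    entries = []
    for line in output.splitlines():
        parts = line.split()
        # Entry lines look like "<ip>:<port>/<nonce> <timestamp>"; this skips
        # the trailing "listed N entries" summary and any blank lines.
        if len(parts) == 2 and ":" in parts[0] and "/" in parts[0]:
            entries.append((parts[0], parts[1]))
    return entries


# Two entries copied from the transcript above, plus the summary line.
SAMPLE = """\
172.25.250.10:0/689758715 2023-01-14T00:33:34.375368+0000
172.25.250.13:0/2724339963 2023-01-14T00:20:41.340339+0000
listed 2 entries
"""

for addr, stamp in parse_blocklist(SAMPLE):
    print(addr, stamp)
```

Each returned address could then be reviewed by an operator before it is removed from the blocklist; the sketch deliberately does not run any cluster command itself.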


Expected results:

Expect same result as if the --admin-daemon were fat-fingered (exit [value != 0], and dump help screen to guide user)

[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client ls | grep '"id"'
        "id": 65008,
        "id": 65000,
        "id": 64996,
[root@serverc ~]# ceph --admin-daemon /var/run/ceph/2ae6d05a-229a-11ec-925e-52540000fa0c/ceph-mds.mycephfs.serverc.esusgp.asok client evict --help

 General usage: 
 ==============
usage: ceph [-h] [-c CEPHCONF] [-i INPUT_FILE] [-o OUTPUT_FILE]
            [--setuser SETUSER] [--setgroup SETGROUP] [--id CLIENT_ID]
            [--name CLIENT_NAME] [--cluster CLUSTER]
            [--admin-daemon ADMIN_SOCKET] [-s] [-w] [--watch-debug]
            [--watch-info] [--watch-sec] [--watch-warn] [--watch-error]
            [-W WATCH_CHANNEL] [--version] [--verbose] [--concise]
            [-f {json,json-pretty,xml,xml-pretty,plain,yaml}]
            [--connect-timeout CLUSTER_TIMEOUT] [--block] [--period PERIOD]
[...]
[root@serverc ~]# echo $?
1 (or other nonzero value to indicate failure)
[root@serverc ~]# _

Additional info:

-h | --help should either be treated as a help request, as it is nearly everywhere else in userland, or cause the command to exit with a failure status and an error message and/or dump the help screen. If "client evict" receives bad or ambiguous input, it must not terminate all connections. If this is intended to be a wildcard, it must be clearly documented and shall not reuse the switches which are otherwise nearly universally treated as a request for online help.
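
The requested behavior can be illustrated with a small guard that rejects help switches before anything is forwarded to the admin socket. This is only a sketch of the idea in Python, not the fix shipped in ceph-17.2.6-228. The names check_daemon_args and main are hypothetical, and the actual forwarding step is left out.

```python
import sys
from typing import Sequence

HELP_FLAGS = frozenset({"-h", "--help"})


def check_daemon_args(args: Sequence[str]) -> None:
    """Raise ValueError if a help switch appears among the daemon command words."""
    bad = [arg for arg in args if arg in HELP_FLAGS]
    if bad:
        raise ValueError("unexpected flag(s) in daemon command: " + " ".join(bad))


def main(argv: Sequence[str]) -> int:
    """Return a nonzero status instead of forwarding a command that carries a help switch."""
    try:
        check_daemon_args(argv)
    except ValueError as exc:
        print(exc, file=sys.stderr)
        return 1
    # A real wrapper would forward argv to the admin socket here (omitted).
    return 0


print(main(["client", "evict", "--help"]))  # 1
print(main(["client", "ls"]))  # 0
```

Failing closed in this way matches the expected result above: a nonzero exit status and no change to client sessions.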

Comment 1 RHEL Program Management 2023-01-13 23:48:25 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 19 errata-xmlrpc 2024-08-28 17:57:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 6.1 security, bug fix, and enhancement updates.), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:5960