
Bug 2225891

Summary: Ceph FS down flag is not working
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: CephFS
Version: 6.1
Target Milestone: ---
Target Release: 7.1z3
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: unspecified
Status: CLOSED NOTABUG
Reporter: Amarnath <amk>
Assignee: Rishabh Dave <ridave>
QA Contact: Hemanth Kumar <hyelloji>
CC: ceph-eng-bugs, cephqe-warriors, gfarnum, mchangir, pdonnell, ridave, vshankar
Type: Bug
Last Closed: 2024-11-20 13:50:45 UTC

Description Amarnath 2023-07-25 18:00:29 UTC
Description of problem:
The CephFS "down" flag is not working: after setting it back to false, the filesystem does not come back up.

Test steps followed:
1. Created a filesystem.
2. Mounted the filesystem and wrote data to it.
3. Set the down flag to true, which took the FS down as expected:
    ceph fs set cephfs-down-flag down true
4. Unset the flag; the FS did not come back up:
    ceph fs set cephfs-down-flag down false
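
The toggle-and-verify sequence above can be automated with a small polling helper. A minimal sketch in Python, assuming `ceph fs status <fs> --format json` reports per-daemon states in an `mdsmap` list (the exact JSON field layout is an assumption and may vary by Ceph release):

```python
import json
import subprocess
import time

def mds_states(status_json: str) -> list:
    """Extract MDS daemon states from `ceph fs status --format json` output.

    The `mdsmap` field layout is an assumption; adjust for your release.
    """
    status = json.loads(status_json)
    return [d.get("state", "") for d in status.get("mdsmap", [])]

def wait_for_state(fs_name: str, want_active: bool, timeout: int = 300) -> bool:
    """Poll until the filesystem has (or lacks) an active MDS, or time out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.check_output(
            ["ceph", "fs", "status", fs_name, "--format", "json"])
        has_active = any(s == "active" for s in mds_states(out.decode()))
        if has_active == want_active:
            return True
        time.sleep(5)
    return False

# Reproduction outline (requires a running cluster):
# subprocess.run(["ceph", "fs", "set", "cephfs-down-flag", "down", "true"], check=True)
# assert wait_for_state("cephfs-down-flag", want_active=False)
# subprocess.run(["ceph", "fs", "set", "cephfs-down-flag", "down", "false"], check=True)
# assert wait_for_state("cephfs-down-flag", want_active=True)  # reported bug: times out
```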

Test run Logs : http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-907CTT/

MDS logs : http://magna002.ceph.redhat.com/ceph-qe-logs/amar/ceph-mds.cephfs-down-flag.ceph-amk-recovery-6-1-jiiduj-node2.whnvfw.log

Ran `ceph fs status` in parallel:
http://magna002.ceph.redhat.com/ceph-qe-logs/amar/mds_logs.txt


[root@ceph-amk-recovery-6-1-jiiduj-node8 cephfs_kerneljw67otoull_1]# ceph versions
{
    "mon": {
        "ceph version 17.2.6-99.el9cp (6869830013a8878a3930e23c75d8b990f6b0c491) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-99.el9cp (6869830013a8878a3930e23c75d8b990f6b0c491) quincy (stable)": 2
    },
    "osd": {
        "ceph version 17.2.6-99.el9cp (6869830013a8878a3930e23c75d8b990f6b0c491) quincy (stable)": 12
    },
    "mds": {
        "ceph version 17.2.6-99.el9cp (6869830013a8878a3930e23c75d8b990f6b0c491) quincy (stable)": 7
    },
    "overall": {
        "ceph version 17.2.6-99.el9cp (6869830013a8878a3930e23c75d8b990f6b0c491) quincy (stable)": 24
    }
}

 



Comment 1 Patrick Donnelly 2023-07-25 19:52:10 UTC
Please attach the mon logs when running these commands. Please also include `ceph fs dump` before and after these commands are run.

Comment 3 Patrick Donnelly 2023-08-14 22:24:19 UTC
The rank became damaged, but the MDS log you attached is from a different time period, so I cannot see what happened. Can you reproduce with

debug_mds = 20
debug_ms = 1

and attach the MDS logs please.

Comment 12 Amarnath 2023-09-17 11:42:55 UTC
Hi Venky,

Results for these tests are not consistent; however, they fail most of the time (6 out of 10 runs). The latest run also failed:

http://magna002.ceph.redhat.com/cephci-jenkins/test-runs/18.2.0-20/Weekly/cephfs/9/tier-4_cephfs_recovery/Taking_the_cephfs_down_with_down_flag_0.log

Regards,
Amarnath

Comment 20 Rishabh Dave 2024-06-10 11:53:38 UTC
I tried reproducing this with upstream Quincy and could not reproduce it. I also tried Reef and main; the bug did not reproduce on either.

Eventually, I also wrote 3 tests for this. The first sets "down" to true, confirms the FS is down, sets "down" back to "false", and then waits for the MDS to reach the "up:active" state.

The second test does the same, but creates and uses an FS named "cephfs-down-flag" (since that name was mentioned in the reproduction recipe in the BZ description). The third repeats the first test 100 times. I ran all of these a few times and all of them passed every time, meaning this bug could not be reproduced.
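
In outline, the first and third of those tests boil down to a toggle-and-wait loop. A rough sketch with hypothetical helper names (the real tests live in the upstream test suites; `wait_down` and `wait_active` stand in for pollers that block until the FS is down or has an up:active MDS):

```python
import subprocess

def down_flag_arg(value: bool) -> str:
    """Render a Python bool the way `ceph fs set <fs> down` expects it."""
    return str(value).lower()

def set_down_flag(fs_name: str, value: bool) -> None:
    """Run `ceph fs set <fs> down true|false` (requires a live cluster)."""
    subprocess.run(
        ["ceph", "fs", "set", fs_name, "down", down_flag_arg(value)],
        check=True)

def down_flag_cycle(fs_name, wait_down, wait_active, repeats=1):
    """Test 1 (repeats=1) and test 3 (repeats=100): toggle the down flag
    and verify the MDS comes back to up:active after each cycle.

    `wait_down` / `wait_active` are hypothetical pollers returning True
    once the FS is down / has an up:active MDS, False on timeout."""
    for i in range(repeats):
        set_down_flag(fs_name, True)
        assert wait_down(fs_name), f"cycle {i}: FS never went down"
        set_down_flag(fs_name, False)
        assert wait_active(fs_name), f"cycle {i}: MDS not up:active again"
```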

Comment 36 Red Hat Bugzilla 2025-03-27 04:25:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days