Bug 2294478 - [cephfs][scrub][8.x] Assertions observed on standby-replay MDS nodes when executing the cephfs scrub command.
Summary: [cephfs][scrub][8.x] Assertions observed on standby-replay MDS nodes when executing the cephfs scrub command.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 8.0z2
Assignee: Neeraj Pratap Singh
QA Contact: julpark
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-06-27 07:27 UTC by julpark
Modified: 2025-03-06 14:22 UTC (History)
CC List: 9 users

Fixed In Version: ceph-19.2.0-57.el9cp
Doc Type: Bug Fix
Doc Text:
.File System scrub is no longer allowed for standby-replay MDS
Previously, Ceph File System scrub was attempted for the standby-replay MDSs, even though it was not required. As a result, assertion errors were occurring. With this fix, Ceph File System scrub is no longer allowed for standby-replay MDSs, and Ceph File System provides the proper error message when a user tries to run scrub for these MDSs.
Clone Of:
Environment:
Last Closed: 2025-03-06 14:21:56 UTC
Embargoed:
julpark: needinfo+
gfarnum: needinfo? (neesingh)
hyelloji: needinfo+




Links:
Ceph Project Bug Tracker 66869 (last updated 2024-07-31 06:46:14 UTC)
Red Hat Issue Tracker RHCEPH-9405 (last updated 2024-07-25 05:32:06 UTC)
Red Hat Product Errata RHBA-2025:2457 (last updated 2025-03-06 14:21:59 UTC)

Description julpark 2024-06-27 07:27:06 UTC
Description of problem:

The standby-replay MDS crashes with an assertion failure when a scrub is run against it.

Version-Release number of selected component (if applicable):

18.2.1-196.el9cp

How reproducible:


Steps to Reproduce:
1. Enable standby-replay on the file system (allow_standby_replay set to true).
2. Set max_mds to 1.
3. Identify the standby-replay MDS daemon.
4. Run the scrub command against the standby-replay MDS (see the command sketch after this list).
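
As a rough illustration of the steps above, the standard Ceph CLI calls look roughly like this; the file system name "cephfs" and the MDS daemon name are placeholders, not values from this report:

# Enable standby-replay and limit the file system to a single active MDS
ceph fs set cephfs allow_standby_replay true
ceph fs set cephfs max_mds 1

# Find the daemon reported in state up:standby-replay
ceph fs status cephfs

# Direct scrub at the standby-replay daemon by name (placeholder daemon name)
ceph tell mds.cephfs.node2.abcdef scrub start / recursive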

Actual results:

2024-06-27 00:26:57,755 (cephci.cephfs_bugs.scrub_replay) [INFO] - cephci.cephci.tests.cephfs.cephfs_bugs.scrub_replay.py:84 - scrub result:{
    "return_code": 0,
    "scrub_tag": "9d65189a-1b73-478c-8503-0125557d96a3",
    "mode": "asynchronous"
}

The scrub command was accepted and ran against the standby-replay MDS.

Expected results:

The scrub command should be rejected for a standby-replay MDS with a proper error message.
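
For comparison, scrub is intended to be directed at an active MDS rank rather than at a standby-replay daemon; a minimal sketch of that invocation, again with "cephfs" as a placeholder file system name:

# Rank 0 is the only active MDS when max_mds is 1
ceph tell mds.cephfs:0 scrub start / recursive
ceph tell mds.cephfs:0 scrub status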

Additional info:


ceph version 18.2.1-196.el9cp (9837cda65eab52427ad7599909a70007ef0a72aa) reef (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x7f092cf4d076]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x163234) [0x7f092cf4d234]
 3: (MDLog::_submit_entry(LogEvent*, MDSLogContextBase*)+0x40) [0x556e1d800a50]
 4: /usr/bin/ceph-mds(+0x1bd19e) [0x556e1d5a519e]
 5: (Locker::scatter_writebehind(ScatterLock*)+0x563) [0x556e1d710373]
 6: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x62f) [0x556e1d6ea91f]
 7: (Locker::drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x81) [0x556e1d6eaa41]
 8: (MDCache::request_cleanup(boost::intrusive_ptr<MDRequestImpl> const&)+0x21a) [0x556e1d6a250a]
 9: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl> const&, int)+0x11b) [0x556e1d5bc13b]
 10: (MDCache::rdlock_dirfrags_stats_work(boost::intrusive_ptr<MDRequestImpl> const&)+0x1f1) [0x556e1d6ba111]
 11: (MDCache::rdlock_dirfrags_stats(CInode*, MDSInternalContext*)+0x6d) [0x556e1d6ba3bd]
 12: /usr/bin/ceph-mds(+0x3c75e6) [0x556e1d7af5e6]
 13: /usr/bin/ceph-mds(+0x399936) [0x556e1d781936]
 14: /usr/bin/ceph-mds(+0x399a24) [0x556e1d781a24]
 15: /usr/bin/ceph-mds(+0x1434ad) [0x556e1d52b4ad]
 16: (MDSContext::complete(int)+0x5c) [0x556e1d7f2adc]
 17: (CInode::_fetched(ceph::buffer::v15_2_0::list&, ceph::buffer::v15_2_0::list&, Context*)+0x43b) [0x556e1d7914ab]
 18: (MDSContext::complete(int)+0x5c) [0x556e1d7f2adc]
 19: (MDSIOContextBase::complete(int)+0x31c) [0x556e1d7f2f2c]
 20: (Finisher::finisher_thread_entry()+0x175) [0x7f092d0052a5]
 21: /lib64/libc.so.6(+0x89c02) [0x7f092c903c02]
 22: /lib64/libc.so.6(+0x10ec40) [0x7f092c988c40]

Comment 15 errata-xmlrpc 2025-03-06 14:21:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fixes, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2025:2457

