Bug 2228339

Summary: mds: MDLog::_recovery_thread: handle the errors gracefully
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Jos Collin <jcollin>
Component: CephFS
Assignee: Jos Collin <jcollin>
Status: VERIFIED ---
QA Contact: Hemanth Kumar <hyelloji>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 5.3
CC: ceph-eng-bugs, cephqe-warriors, tserlin, vshankar
Target Milestone: ---
Target Release: 5.3z5
Hardware: All
OS: All
Whiteboard:
Fixed In Version: ceph-16.2.10-203.el8cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jos Collin 2023-08-02 06:28:55 UTC
Description of problem:
A write in MDLog::_recovery_thread fails if the MDS has already been blocklisted by the 'fs fail' command that the QA tests issue.
Those write failures should be handled gracefully, even when the MDS is stopping.

Version-Release number of selected component (if applicable):
5.3

How reproducible:
test_rebuild_moved_file (tasks/data-scan) fails because the MDS crashes:
https://tracker.ceph.com/issues/61201

Steps to Reproduce:
https://tracker.ceph.com/issues/61201

Actual results:
The MDS asserts when the write fails in MDLog::_recovery_thread.

Expected results:
The MDS handles those write failures gracefully instead of asserting.
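
For illustration only, a minimal sketch of what "gracefully" could mean here: check the write result and choose a recovery action rather than asserting on it. This is not the actual patch. RecoveryAction, kBlocklistedErr, and handle_write_result are hypothetical names, and the numeric error value is an assumption made for the example.

```cpp
#include <iostream>

// Hypothetical set of outcomes for a failed recovery write.
// These are illustrative names, not real Ceph types.
enum class RecoveryAction { Continue, Respawn, StopQuietly, MarkDamaged };

// Assumption: a placeholder errno-style value meaning "this MDS is blocklisted".
constexpr int kBlocklistedErr = 108;

// Decide what to do with a write result instead of asserting that it is 0.
// 'r' is 0 on success or a negative error code; 'stopping' is true when the
// daemon is already shutting down.
RecoveryAction handle_write_result(int r, bool stopping) {
  if (r >= 0) {
    return RecoveryAction::Continue;      // write succeeded
  }
  if (stopping) {
    // Already shutting down: log and leave the thread, do not crash.
    std::cerr << "write failed (" << r << ") while stopping, giving up\n";
    return RecoveryAction::StopQuietly;
  }
  if (r == -kBlocklistedErr) {
    // Blocklisted (for example after 'fs fail'): restart cleanly.
    std::cerr << "blocklisted during recovery write, respawning\n";
    return RecoveryAction::Respawn;
  }
  // Any other error: report it through the normal failure path.
  std::cerr << "recovery write failed (" << r << ")\n";
  return RecoveryAction::MarkDamaged;
}
```

The point of the sketch is that every failure path returns a decision to the caller, so a blocklisted or stopping MDS no longer hits an assertion.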

Comment 9 Venky Shankar 2023-08-10 06:17:16 UTC
Jos - please rebase https://gitlab.cee.redhat.com/ceph/ceph/-/merge_requests/319

Comment 10 Jos Collin 2023-08-12 00:21:51 UTC
(In reply to Venky Shankar from comment #9)
> Jos - please rebase
> https://gitlab.cee.redhat.com/ceph/ceph/-/merge_requests/319

rebased.