Bug 2192479

Summary: [cee/sd][ceph-mgr]ceph-mgr daemon got lost from ceph status
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Bipin Kunal <bkunal>
Component: RADOS Assignee: Brad Hubbard <bhubbard>
Status: CLOSED ERRATA QA Contact: Pawan <pdhiran>
Severity: high Docs Contact:
Priority: unspecified    
Version: 5.2 CC: akraj, amanzane, bhubbard, bhull, ceph-eng-bugs, cephqe-warriors, mcaldeir, ngangadh, nojha, pdhange, rmandyam, rzarzyns, sostapov, tserlin, vumrao
Target Milestone: ---   
Target Release: 5.3z4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.2.10-187.el8cp Doc Type: Bug Fix
Doc Text:
.Manager continues to send beacons in the event of an error during authentication check
Previously, if an error was encountered when performing an authentication check with a monitor, the manager would get into a state where it would no longer have an active connection. Due to this, the manager could no longer send beacons and the monitor would mark it as lost. With this fix, a session (active con) is reopened in the event of an error and the manager is able to continue to send beacons and is no longer marked as lost.
Story Points: ---
Clone Of: 2171847 Environment:
Last Closed: 2023-07-19 16:19:10 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2171847    
Bug Blocks: 2224323, 2210690    

Description Bipin Kunal 2023-05-02 06:42:10 UTC
This bug was initially created as a copy of Bug #2171847

Description of problem:

There were three ceph-mgr daemons running in the cluster. The active mgr became unresponsive and was replaced by a standby daemon, after which it was lost from ceph status.

Version-Release number of selected component (if applicable):
RHCS 5.2 async (16.2.8-85.el8cp)


Actual results:
- The ceph-mgr daemon was lost from ceph status.

Expected results:
- The ceph-mgr daemon shouldn't be lost.

- I suspect this is the same issue as reported in BZ 2106031.
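
Illustration (not part of the original report): the failure mode summarized in the Doc Text field above can be sketched as a toy model. This is a minimal sketch, not Ceph's actual code. The class names, method names, tick-based clock, and the beacon_grace value are all hypothetical and chosen only to show why a manager that loses its active connection stops sending beacons and ends up marked as lost, and why reopening the session avoids that.

```python
# Toy model only: every name here is hypothetical and is NOT a Ceph API.

class ToyMonitor:
    """Tracks the last beacon time and decides whether the manager is lost."""

    def __init__(self, beacon_grace):
        self.beacon_grace = beacon_grace  # assumed grace period, in ticks
        self.last_beacon = 0

    def receive_beacon(self, now):
        self.last_beacon = now

    def mgr_is_lost(self, now):
        # The manager is considered lost once no beacon has arrived
        # for longer than the grace period.
        return now - self.last_beacon > self.beacon_grace


class ToyManager:
    """Sends a beacon on every tick, but only while it has an active connection."""

    def __init__(self, monitor, reopen_on_error):
        self.monitor = monitor
        self.reopen_on_error = reopen_on_error  # True models the fixed behavior
        self.active_con = True

    def auth_check(self, fails):
        if not fails:
            return
        # An error during the authentication check drops the active connection.
        self.active_con = False
        if self.reopen_on_error:
            # Fixed behavior: reopen a session so beacons can continue.
            self.active_con = True

    def tick(self, now):
        if self.active_con:
            self.monitor.receive_beacon(now)


def simulate(reopen_on_error, ticks=10, fail_at=3, beacon_grace=3):
    """Return True if the monitor considers the manager lost after `ticks` ticks."""
    monitor = ToyMonitor(beacon_grace)
    manager = ToyManager(monitor, reopen_on_error)
    for now in range(1, ticks + 1):
        manager.auth_check(fails=(now == fail_at))
        manager.tick(now)
    return monitor.mgr_is_lost(ticks)
```

In this model, simulate(reopen_on_error=False) corresponds to the behavior before the fix, where beacons stop after the failed check, and simulate(reopen_on_error=True) corresponds to the behavior after the fix.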

Comment 1 RHEL Program Management 2023-05-02 06:42:21 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 3 Scott Ostapovicz 2023-06-14 15:34:42 UTC
Kicking the can down the road as this has AGAIN missed the z-stream timeline.  Moved from z4 to z5.

Comment 5 amansan 2023-06-20 15:02:16 UTC
*** Bug 2215880 has been marked as a duplicate of this bug. ***

Comment 21 errata-xmlrpc 2023-07-19 16:19:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.3 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4213