Bug 2208456

Summary: mgr pod restarts
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Elvir Kuric <ekuric>
Component: RBD Assignee: Ilya Dryomov <idryomov>
Status: NEW --- QA Contact: Manasa <mgowri>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1 CC: ceph-eng-bugs, cephqe-warriors, vereddy
Target Milestone: ---   
Target Release: 6.1z2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Elvir Kuric 2023-05-19 07:04:35 UTC
Created attachment 1965611
mgr pod log prior to restart

Description of problem:

We noticed that the mgr pod restarts while tests are running.

Version-Release number of selected component (if applicable):

ceph versions
{
    "mon": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 1
    },
    "osd": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 21
    },
    "mds": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 2
    },
    "rbd-mirror": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 1
    },
    "rgw": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 1
    },
    "overall": {
        "ceph version 17.2.6-47.el9cp (6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)": 29
    }
}
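
For anyone re-checking a cluster before attempting the reproducer, a small sketch like the following could confirm that every daemon type reports the same build, as in the output above. The helper name distinct_builds and the trimmed SAMPLE string are illustrative only; in practice the input would be the full `ceph versions` JSON.

```python
import json

# Trimmed copy of the `ceph versions` output quoted in this report
# (only three daemon types kept, for brevity).
BUILD = ("ceph version 17.2.6-47.el9cp "
         "(6add4f24d1eff88e1db808ecdc16fd5b2db96dd4) quincy (stable)")
SAMPLE = json.dumps({
    "mon": {BUILD: 3},
    "mgr": {BUILD: 1},
    "osd": {BUILD: 21},
})


def distinct_builds(versions_json: str) -> set:
    """Return the set of version strings seen across all daemon types."""
    data = json.loads(versions_json)
    # Each section maps a version string to a daemon count.
    return {build for section in data.values() for build in section}


if __name__ == "__main__":
    builds = distinct_builds(SAMPLE)
    if len(builds) == 1:
        print("all daemons run the same build")
    else:
        print("mixed builds:", sorted(builds))
```

More than one entry in the returned set would indicate a partially upgraded cluster, which would be worth noting in the bug.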



How reproducible:
The issue was observed while running a mixed read/write test (fio randrw) with 70% write and 30% read.


Steps to Reproduce:
Our environment was ODF DR (RBD mirroring). These are the steps we followed:
1. Create an ODF DR setup with the ceph version listed above.
2. On the primary cluster, start the test (randrw, 70% write, 30% read).
3. Let the test run for about 12 hours.

Note: This issue has been seen only once so far (although we have been testing this version for a couple of days), so the reproducer may not trigger it.
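
The workload in the steps above could be approximated with a fio invocation along these lines. This is only a sketch: the report does not give the exact fio job, so the job name and target path below are hypothetical placeholders, and block size, I/O engine and queue depth are left at fio defaults. The helper only builds the argument list; it does not run fio.

```python
def build_fio_args(target: str, hours: int = 12) -> list:
    """Build an fio argv for a time-based randrw run, 70% write / 30% read."""
    return [
        "fio",
        "--name=mgr-restart-repro",    # hypothetical job name
        f"--filename={target}",        # placeholder target, not from the report
        "--rw=randrw",                 # mixed random read/write
        "--rwmixwrite=70",             # 70% writes, so 30% reads
        "--time_based",                # keep running until the runtime expires
        f"--runtime={hours * 3600}",   # runtime in seconds
    ]


if __name__ == "__main__":
    # /mnt/rbd-test/fio.dat is an assumed path on an RBD-backed volume.
    print(" ".join(build_fio_args("/mnt/rbd-test/fio.dat")))
```

The resulting list can be passed to subprocess.run once the target has been adjusted to a real RBD-backed file or device.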

Actual results:
The mgr pod restarts. See the attached logs.

Expected results:
The mgr pod does not restart.
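
To tell actual from expected results during a long run, the restart count of the mgr pod can be read from the pod status. The sketch below parses a Kubernetes pod list in JSON form (the shape produced by `oc get pods -o json`); the pod name and the sample document are made up for illustration and are not taken from the affected cluster. In a Rook-based ODF deployment the mgr pods are usually selected with the label app=rook-ceph-mgr in the openshift-storage namespace, but that is an assumption about the environment.

```python
import json

# Hand-written example of a pod list; NOT output from the affected cluster.
SAMPLE_PODS = json.dumps({
    "items": [{
        "metadata": {"name": "rook-ceph-mgr-a-example"},
        "status": {"containerStatuses": [
            {"name": "mgr", "restartCount": 1},
        ]},
    }]
})


def restart_counts(pod_list_json: str) -> dict:
    """Map each pod name to the sum of its containers' restart counts."""
    pods = json.loads(pod_list_json).get("items", [])
    counts = {}
    for pod in pods:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[pod["metadata"]["name"]] = sum(
            s.get("restartCount", 0) for s in statuses
        )
    return counts


if __name__ == "__main__":
    for name, count in restart_counts(SAMPLE_PODS).items():
        print(f"{name}: {count} restart(s)")
```

A non-zero count after the 12-hour run would match the actual result described in this report.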

Additional info:

Attached logs

Comment 1 RHEL Program Management 2023-05-19 07:04:47 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.