Bug 2224671 - [Tracker for Ceph BZ #2231784] /builddir/build/BUILD/ceph-17.2.6/src/osd/osd_types.h: 4882: FAILED ceph_assert(it != missing.end()) [NEEDINFO]
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Radoslaw Zarzynski
QA Contact: Elad
URL:
Whiteboard:
Depends On: 2231784
Blocks:
Reported: 2023-07-21 23:11 UTC by Alexander Chuzhoy
Modified: 2024-09-10 11:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2231784
Environment:
Last Closed:
Embargoed:
rzarzyns: needinfo? (sasha)



Description Alexander Chuzhoy 2023-07-21 23:11:41 UTC
Versions:
mcg-operator.v4.13.0-rhodf
odf-operator.v4.13.0-rhodf
ocs-operator.v4.13.0-rhodf
OCP: 4.13.0


The cluster had been running for 51 days without any issues.
Today I was checking the API performance dashboard and selected a 2-week period.

Apparently this is a resource-intensive operation.



oc get pod -A|grep -v Run|grep -v Comple
NAMESPACE                                          NAME                                                                          READY   STATUS             RESTARTS         AGE
openshift-storage                                  rook-ceph-osd-1-88fc6f54d-xxfzt                                               1/2     CrashLoopBackOff   20 (4m43s ago)   85m
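For anyone repeating this check: the grep filter above drops any line containing "Run" or "Comple" anywhere, including in a pod name. A minimal Python sketch that filters on the STATUS column instead is shown below. The function name non_healthy_pods and the set of "healthy" statuses are assumptions made for illustration; the input is assumed to be the plain-text output of `oc get pod -A` with the header row present.

```python
def non_healthy_pods(listing, healthy=("Running", "Completed")):
    """Return (namespace, name, status) for pods whose STATUS is not in `healthy`.

    `listing` is assumed to be the text printed by `oc get pod -A`,
    including its header line.
    """
    lines = [ln for ln in listing.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Locate columns by header name instead of hard-coding positions.
    ns_idx = header.index("NAMESPACE")
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    result = []
    for ln in lines[1:]:
        cols = ln.split()
        if len(cols) <= status_idx:
            continue  # skip malformed or wrapped lines
        status = cols[status_idx]
        if status not in healthy:
            result.append((cols[ns_idx], cols[name_idx], status))
    return result
```

The listing text can be captured with, for example, subprocess.run(["oc", "get", "pod", "-A"], capture_output=True, text=True).stdout and passed to the function. The RESTARTS column may contain spaces, as in "20 (4m43s ago)", but it comes after STATUS, so whitespace splitting still finds the status correctly.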


oc logs -n openshift-storage rook-ceph-osd-1-88fc6f54d-xxfzt|grep FAIL
Defaulted container "osd" out of: osd, log-collector, blkdevmapper (init), activate (init), expand-bluefs (init), chown-container-data-dir (init)
/builddir/build/BUILD/ceph-17.2.6/src/osd/osd_types.h: 4882: FAILED ceph_assert(it != missing.end())
/builddir/build/BUILD/ceph-17.2.6/src/osd/osd_types.h: 4882: FAILED ceph_assert(it != missing.end())
/builddir/build/BUILD/ceph-17.2.6/src/osd/osd_types.h: 4882: FAILED ceph_assert(it != missing.end())
/builddir/build/BUILD/ceph-17.2.6/src/osd/osd_types.h: 4882: FAILED ceph_assert(it != missing.end())
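
The same assertion appears several times in the log. A small Python sketch for collapsing such lines into one entry per distinct assertion, with a count, is given below. The regular expression is an assumption based only on the line format shown in the grep output above, and summarize_asserts is a hypothetical helper name, not part of Ceph or Rook.

```python
import re
from collections import Counter

# Assumed format, taken from the log lines in this report:
#   <source path>: <line number>: FAILED ceph_assert(<expression>)
ASSERT_RE = re.compile(
    r"(?P<path>\S+): (?P<line>\d+): FAILED ceph_assert\((?P<expr>.*)\)"
)

def summarize_asserts(log_text):
    """Count occurrences of each distinct (path, line, expression) assertion."""
    counts = Counter()
    for ln in log_text.splitlines():
        m = ASSERT_RE.search(ln)
        if m:
            key = (m.group("path"), int(m.group("line")), m.group("expr"))
            counts[key] += 1
    return counts
```

Feeding it the full output of `oc logs` for the crashing pod shows whether every crash hits the same assertion, as seems to be the case here, or whether several different assertions are involved.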

Comment 3 Alexander Chuzhoy 2023-07-24 14:27:55 UTC
Note: After I rebooted all 3 nodes in this compact cluster (3 controllers and 0 workers), the issue did not reproduce.

Comment 4 Blaine Gardner 2023-07-25 15:28:15 UTC
Since the issue has been resolved, I don't think this is urgent. It still seems wise to leave this open until we have someone available who can take a look at the must-gather to see if there are any clear error indications. It's possible this could have been a random issue with a memory block becoming corrupt in RAM or on disk.

