Bug 2230067
| Summary: | [GSS] ceph crash rocksdb::port::Mutex::Mutex | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | kelwhite |
| Component: | ceph | Assignee: | Radoslaw Zarzynski <rzarzyns> |
| ceph sub component: | RADOS | QA Contact: | Elad <ebenahar> |
| Status: | NEW --- | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | bniver, mcaldeir, muagarwa, nojha, odf-bz-bot, rzarzyns, sostapov |
| Version: | 4.11 | Flags: | rzarzyns: needinfo? (kelwhite) |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem (please be as detailed as possible and provide log snippets):

ceph mon-b is crashing with the following assert:

    "archived": "2023-08-03 19:41:43.584662",
    "backtrace": [
        "[0x3ffc9df8fde]",
        "gsignal()",
        "abort()",
        "(rocksdb::port::Mutex::Mutex(bool)+0) [0x2aa1715ac68]",
        "ceph-mon(+0x75adba) [0x2aa1715adba]",
        "(rocksdb::InstrumentedMutex::Lock()+0xda) [0x2aa1709ddba]",
        "ceph-mon(+0x55e748) [0x2aa16f5e748]",
        "(rocksdb::Cleanable::~Cleanable()+0x2a) [0x2aa17108552]",
        "(rocksdb::DBIter::~DBIter()+0x520) [0x2aa16fd6120]",
        "(rocksdb::ArenaWrappedDBIter::~ArenaWrappedDBIter()+0x30) [0x2aa171718a0]",
        "(std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x5a) [0x2aa16c4e122]",
        "(std::_Sp_counted_ptr<MonitorDBStore::WholeStoreIteratorImpl*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x62) [0x2aa16cabe0a]",
        "(std::_Rb_tree<unsigned long, std::pair<unsigned long const, Monitor::SyncProvider>, std::_Select1st<std::pair<unsigned long const, Monitor::SyncProvider> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, Monitor::SyncProvider> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, Monitor::SyncProvider> >*)+0xe6) [0x2aa16cb3646]",
        "(Monitor::~Monitor()+0x394) [0x2aa16c971a4]",
        "(Monitor::~Monitor()+0x16) [0x2aa16c979ce]",
        "main()",
        "__libc_start_main()",
        "ceph-mon(+0x247404) [0x2aa16c47404]",
        "[(nil)]"
    ],
    "ceph_version": "16.2.8-84.el8cp",
    "crash_id": "2023-07-28T20:42:49.295902Z_db4ed366-a88e-47f8-bfef-de0eb3c90660",
    "entity_name": "mon.b",
    "os_id": "rhel",
    "os_name": "Red Hat Enterprise Linux",
    "os_version": "8.6 (Ootpa)",
    "os_version_id": "8.6",
    "process_name": "ceph-mon",
    "stack_sig": "2a56dfb3ea296f126d13a277cc531950fd2f183e2c4a986b67436b8cbea6dba7",
    "timestamp": "2023-07-28T20:42:49.295902Z",
    "utsname_hostname": "rook-ceph-mon-b-bfbc4c9fd-xhtv8",
    "utsname_machine": "s390x",
    "utsname_release": "4.18.0-372.52.1.el8_6.s390x",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Fri Mar 31 06:14:27 EDT 2023"

Is there a way to prevent this crash? What does this crash mean? It seems we're asking for another mutex when we already have one?

Version of all relevant components (if applicable):
ODF 4.11
Ceph 5.2

Is there any workaround available to the best of your knowledge?
No

Additional info:
Found an upstream tracker that seems to be the same issue: https://tracker.ceph.com/issues/60268
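
For anyone triaging this, the backtrace is easier to read from the outermost caller inward. The following is a minimal sketch in Python, assuming the frames are pasted as plain strings from the crash report; `Frame` and `parse_frame` are hypothetical helpers written for this comment, not part of Ceph or RocksDB. It only splits each frame into symbol, module, offset and address and makes no claim about the root cause.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    symbol: Optional[str]  # function name, if the frame was symbolized
    module: Optional[str]  # binary name for unsymbolized "binary(+0xOFF)" frames
    offset: Optional[str]  # offset inside the symbol or module
    addr: Optional[str]    # absolute address, if printed


# Trailing "[0x...]" or "[(nil)]" address, as seen in the report.
ADDR_RE = re.compile(r"\[(0x[0-9a-f]+|\(nil\))\]$")


def parse_frame(text: str) -> Frame:
    """Split one backtrace string into its parts (hypothetical helper)."""
    text = text.strip()
    addr = None
    m = ADDR_RE.search(text)
    if m:
        addr = m.group(1)
        text = text[:m.start()].strip()
    if not text:
        return Frame(None, None, None, addr)
    if text.startswith("(") and text.endswith(")"):
        # Symbolized frame: "(symbol+offset)"
        inner = text[1:-1]
        symbol, plus, offset = inner.rpartition("+")
        if not plus:
            return Frame(inner, None, None, addr)
        return Frame(symbol, None, offset, addr)
    if "(+" in text and text.endswith(")"):
        # Unsymbolized frame: "binary(+offset)"
        module, _, offset = text[:-1].partition("(+")
        return Frame(None, module, offset, addr)
    # Bare names such as "gsignal()" or "main()"
    return Frame(text, None, None, addr)


# A subset of the frames from the report above, innermost first.
BACKTRACE = [
    "gsignal()",
    "abort()",
    "(rocksdb::port::Mutex::Mutex(bool)+0) [0x2aa1715ac68]",
    "ceph-mon(+0x75adba) [0x2aa1715adba]",
    "(rocksdb::InstrumentedMutex::Lock()+0xda) [0x2aa1709ddba]",
    "(rocksdb::Cleanable::~Cleanable()+0x2a) [0x2aa17108552]",
    "(rocksdb::DBIter::~DBIter()+0x520) [0x2aa16fd6120]",
    "(Monitor::~Monitor()+0x394) [0x2aa16c971a4]",
    "main()",
]

frames = [parse_frame(f) for f in BACKTRACE]

# The report lists the innermost frame first; reverse to read caller -> callee.
for fr in reversed(frames):
    if fr.symbol:
        print(fr.symbol)
```

Read this way, the symbolized frames run from `main()` through `Monitor::~Monitor()` and the RocksDB iterator destructors to `rocksdb::InstrumentedMutex::Lock()` and then `abort()`, which is the ordering to compare against the upstream tracker.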