Description of problem (please be as detailed as possible and provide log snippets):

- The OSD pods are in CrashLoopBackOff (CLBO) state with several container restarts:
-----------------------------------------------
rook-ceph-osd-0-588b7db67b-r9hnr   1/2   CrashLoopBackOff   329 (2m2s ago)   5d14h   10.130.2.33   dvtslocnw03-data.nbsdev.co.uk   <none>   <none>
rook-ceph-osd-2-767b5c54f5-rdwrj   1/2   CrashLoopBackOff   330 (88s ago)    5d14h   10.128.4.40   dvtslocnw01-data.nbsdev.co.uk   <none>   <none>
-----------------------------------------------

- The devices are attached to the nodes and there are no issues with the disks.
- The Ceph OSD pods crash with the below error:
-----------------------------------------------
2022-05-17T08:22:47.459126416Z debug -3> 2022-05-17T08:22:47.416+0000 7f10dcdd6080 1 bluefs _allocate unable to allocate 0x400000 on bdev 1, allocator name block, allocator type hybrid, capacity 0x4b00000000, block size 0x1000, free 0xd8e089000, fragmentation 0.586552, allocated 0x0
2022-05-17T08:22:47.459126416Z debug -2> 2022-05-17T08:22:47.416+0000 7f10dcdd6080 -1 bluefs _allocate allocation failed, needed 0x400000
2022-05-17T08:22:47.459139711Z
2022-05-17T08:22:47.459139711Z debug -1> 2022-05-17T08:22:47.424+0000 7f10dcdd6080 -1 /builddir/build/BUILD/ceph-16.2.7/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, uint64_t, uint64_t)' thread 7f10dcdd6080 time 2022-05-17T08:22:47.417513+0000
2022-05-17T08:22:47.459139711Z /builddir/build/BUILD/ceph-16.2.7/src/os/bluestore/BlueFS.cc: 2554: FAILED ceph_assert(r == 0)
2022-05-17T08:22:47.459139711Z
2022-05-17T08:22:47.459139711Z ceph version 16.2.7-98.el8cp (b20d33c3b301e005bed203d3cad7245da3549f80) pacific (stable)
2022-05-17T08:22:47.459139711Z 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55cd61fc2e3c]
2022-05-17T08:22:47.459139711Z 2: ceph-osd(+0x56b056) [0x55cd61fc3056]
2022-05-17T08:22:47.459139711Z 3: (BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, unsigned long, unsigned long)+0x1c93) [0x55cd626c24f3]
2022-05-17T08:22:47.459139711Z 4: (BlueFS::_fsync(BlueFS::FileWriter*, std::unique_lock<std::mutex>&)+0x322) [0x55cd626c2f22]
2022-05-17T08:22:47.459139711Z 5: (BlueRocksWritableFile::Sync()+0x6c) [0x55cd626ea79c]
2022-05-17T08:22:47.459139711Z 6: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55cd62b83aef]
2022-05-17T08:22:47.459139711Z 7: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x55cd62c95262]
2022-05-17T08:22:47.459139711Z 8: (rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x55cd62c968a8]
2022-05-17T08:22:47.459139711Z 9: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase<rocksdb::Slice>*, std::vector<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> >, std::allocator<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint, unsigned long)+0x2ddb) [0x55cd62d63a0b]
-----------------------------------------------
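For reference, the hex fields in the "bluefs _allocate" line above decode as follows (a quick sketch using shell arithmetic; the values are copied from the log, but the reading that this looks like a fragmentation problem rather than a plain out-of-space condition is only my interpretation, not a confirmed root cause):
-----------------------------------------------
# Decode the fields reported by "bluefs _allocate" (values copied from the log above)
printf 'capacity:  %d bytes (%d GiB)\n'  $((0x4b00000000)) $((0x4b00000000 / 1024 / 1024 / 1024))   # 300 GiB on bdev 1
printf 'free:      %d bytes (~%d GiB)\n' $((0xd8e089000))  $((0xd8e089000 / 1024 / 1024 / 1024))    # ~54 GiB still free
printf 'needed:    %d bytes (%d MiB)\n'  $((0x400000))     $((0x400000 / 1024 / 1024))              # 4 MiB requested by bluefs
printf 'block sz:  %d bytes (%d KiB)\n'  $((0x1000))       $((0x1000 / 1024))                       # 4 KiB allocation unit
-----------------------------------------------
In other words, bdev 1 still reports roughly 54 GiB of its 300 GiB capacity as free, yet with fragmentation at ~0.59 the hybrid allocator returns allocated 0x0 for the 4 MiB request, and the subsequent FAILED ceph_assert(r == 0) in BlueFS::_flush_and_sync_log() aborts the OSD.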
Version of all relevant components (if applicable):
- v4.10

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
- Two OSD pods are restarting, which makes the cluster unstable.

Is there any workaround available to the best of your knowledge?
N/A

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
N/A

Can this issue be reproduced?
No, it is specific to the customer environment.

Can this issue be reproduced from the UI?
N/A

If this is a regression, please provide more details to justify this:
N/A

Steps to Reproduce:
N/A

Actual results:
- The OSD pods are in CLBO (CrashLoopBackOff) state.

Expected results:
- The OSD pods should be running without crashes or restarts.

Additional info:
In the next comments
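If it helps with the analysis, one possible way to confirm the allocator state on the affected OSDs could be the following (a sketch only: the OSD id, data path, and availability of these subcommands in this build are assumptions, and none of this was run in the customer environment):
-----------------------------------------------
# Offline, with the crashing OSD container stopped (OSD id and data path are examples):
ceph-bluestore-tool free-score --path /var/lib/ceph/osd/ceph-0 --allocator block

# Or, while the OSD daemon is briefly up, via its admin socket inside the OSD pod:
ceph daemon osd.0 bluefs stats
ceph daemon osd.0 bluestore allocator score block
-----------------------------------------------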