Bug 2037279
| Summary: | [Azure] OSDs go into CLBO state while mounting an RBD PVC | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Rachael <rgeorge> | |
| Component: | rook | Assignee: | Subham Rai <srai> | |
| Status: | CLOSED ERRATA | QA Contact: | Rachael <rgeorge> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.9 | CC: | akupczyk, assingh, bniver, kramdoss, madam, mhackett, mmuench, muagarwa, nojha, ocs-bugs, odf-bz-bot, owasserm, shan, sheggodu, srai, tnielsen | |
| Target Milestone: | --- | Keywords: | Reopened | |
| Target Release: | ODF 4.10.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | 4.10.0-115 | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2041964 2043179 2043185 2043186 (view as bug list) | Environment: | ||
| Last Closed: | 2022-04-13 18:51:24 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2041964, 2043179, 2043185, 2043186 | |||
|
Description
Rachael
2022-01-05 11:18:15 UTC
The OSD.2 log [1] shows the following bluestore assert. Neha, who could take a look?
2021-12-28T18:56:21.713686028Z debug 2021-12-28T18:56:21.712+0000 7f4e5efec700 1 osd.2 pg_epoch: 475 pg[2.8( v 443'248574 (290'246134,443'248574] local-lis/les=442/443 n=527 ec=12/12 lis/c=462/239 les/c/f=463/240/0 sis=475) [1,2,0] r=1 lpr=475 pi=[239,475)/4 crt=443'248574 lcod 0'0 mlcod 0'0 unknown NOTIFY mbc={}] state<Start>: transitioning to Stray
2021-12-28T18:56:22.766209808Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, uint64_t, int64_t, PExtentVector*)' thread 7f4e5ffee700 time 2021-12-28T18:56:22.766157+0000
2021-12-28T18:56:22.766209808Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: FAILED ceph_assert(want % unit == 0)
2021-12-28T18:56:22.766415811Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, uint64_t, int64_t, PExtentVector*)' thread 7f4e5b7e5700 time 2021-12-28T18:56:22.766379+0000
2021-12-28T18:56:22.766415811Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: FAILED ceph_assert(want % unit == 0)
2021-12-28T18:56:22.767038720Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, uint64_t, int64_t, PExtentVector*)' thread 7f4e5dfea700 time 2021-12-28T18:56:22.766990+0000
2021-12-28T18:56:22.767038720Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: FAILED ceph_assert(want % unit == 0)
2021-12-28T18:56:22.767359025Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, uint64_t, int64_t, PExtentVector*)' thread 7f4e607ef700 time 2021-12-28T18:56:22.767327+0000
2021-12-28T18:56:22.767359025Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: FAILED ceph_assert(want % unit == 0)
2021-12-28T18:56:22.772643302Z ceph version 16.2.0-146.el8cp (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable)
2021-12-28T18:56:22.772643302Z 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x5639c40de27a]
2021-12-28T18:56:22.772643302Z 2: ceph-osd(+0x567494) [0x5639c40de494]
2021-12-28T18:56:22.772643302Z 3: (HybridAllocator::allocate(unsigned long, unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> >*)+0x97f) [0x5639c481663f]
2021-12-28T18:56:22.772643302Z 4: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) [0x5639c46e8a8e]
2021-12-28T18:56:22.772643302Z 5: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04]
2021-12-28T18:56:22.772643302Z 6: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71]
2021-12-28T18:56:22.772643302Z 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x2077) [0x5639c4726cb7]
2021-12-28T18:56:22.772643302Z 8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6]
2021-12-28T18:56:22.772643302Z 9: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058]
2021-12-28T18:56:22.772643302Z 10: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd]
2021-12-28T18:56:22.772643302Z 11: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570]
2021-12-28T18:56:22.772643302Z 12: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d]
2021-12-28T18:56:22.772643302Z 13: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) [0x5639c4353fd2]
2021-12-28T18:56:22.772643302Z 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c]
2021-12-28T18:56:22.772643302Z 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x5639c41e4789]
2021-12-28T18:56:22.772643302Z 16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968]
2021-12-28T18:56:22.772643302Z 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738]
2021-12-28T18:56:22.772643302Z 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5639c486bc74]
2021-12-28T18:56:22.772643302Z 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914]
2021-12-28T18:56:22.772643302Z 20: /lib64/libpthread.so.0(+0x817a) [0x7f4e838e317a]
2021-12-28T18:56:22.772643302Z 21: clone()
2021-12-28T18:56:22.772676103Z debug 2021-12-28T18:56:22.771+0000 7f4e5ffee700 -1 /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, uint64_t, int64_t, PExtentVector*)' thread 7f4e607ef700 time 2021-12-28T18:56:22.767327+0000
2021-12-28T18:56:22.772676103Z /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: FAILED ceph_assert(want % unit == 0)
2021-12-28T18:56:22.772676103Z
2021-12-28T18:56:22.772676103Z ceph version 16.2.0-146.el8cp (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable)
2021-12-28T18:56:22.772676103Z 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x5639c40de27a]
2021-12-28T18:56:22.772676103Z 2: ceph-osd(+0x567494) [0x5639c40de494]
2021-12-28T18:56:22.772676103Z 3: (HybridAllocator::allocate(unsigned long, unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> >*)+0x97f) [0x5639c481663f]
2021-12-28T18:56:22.772676103Z 4: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) [0x5639c46e8a8e]
2021-12-28T18:56:22.772676103Z 5: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04]
2021-12-28T18:56:22.772676103Z 6: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71]
2021-12-28T18:56:22.772676103Z 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x2077) [0x5639c4726cb7]
2021-12-28T18:56:22.772676103Z 8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6]
2021-12-28T18:56:22.772676103Z 9: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058]
2021-12-28T18:56:22.772676103Z 10: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd]
2021-12-28T18:56:22.772676103Z 11: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570]
2021-12-28T18:56:22.772676103Z 12: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d]
2021-12-28T18:56:22.772676103Z 13: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) [0x5639c4353fd2]
2021-12-28T18:56:22.772676103Z 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c]
2021-12-28T18:56:22.772676103Z 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x5639c41e4789]
2021-12-28T18:56:22.772676103Z 16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968]
2021-12-28T18:56:22.772676103Z 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738]
2021-12-28T18:56:22.772676103Z 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5639c486bc74]
2021-12-28T18:56:22.772676103Z 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914]
2021-12-28T18:56:22.772676103Z 20: /lib64/libpthread.so.0(+0x817a) [0x7f4e838e317a]
2021-12-28T18:56:22.772676103Z 21: clone()
2021-12-28T18:56:22.772676103Z
2021-12-28T18:56:22.772692403Z *** Caught signal (Aborted) **
2021-12-28T18:56:22.772692403Z in thread 7f4e5ffee700 thread_name:tp_osd_tp
2021-12-28T18:56:22.779175698Z ceph version 16.2.0-146.el8cp (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable)
2021-12-28T18:56:22.779175698Z 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x5639c40de27a]
2021-12-28T18:56:22.779175698Z 2: ceph-osd(+0x567494) [0x5639c40de494]
2021-12-28T18:56:22.779175698Z 3: (HybridAllocator::allocate(unsigned long, unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> >*)+0x97f) [0x5639c481663f]
2021-12-28T18:56:22.779175698Z 4: (BlueStore::_do_alloc_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) [0x5639c46e8a8e]
2021-12-28T18:56:22.779175698Z 5: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04]
2021-12-28T18:56:22.779175698Z 6: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71]
2021-12-28T18:56:22.779175698Z 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x2077) [0x5639c4726cb7]
2021-12-28T18:56:22.779175698Z 8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6]
2021-12-28T18:56:22.779175698Z 9: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058]
2021-12-28T18:56:22.779175698Z 10: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd]
2021-12-28T18:56:22.779175698Z 11: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570]
2021-12-28T18:56:22.779175698Z 12: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d]
2021-12-28T18:56:22.779175698Z 13: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) [0x5639c4353fd2]
2021-12-28T18:56:22.779175698Z 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c]
2021-12-28T18:56:22.779175698Z 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x5639c41e4789]
2021-12-28T18:56:22.779175698Z 16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968]
2021-12-28T18:56:22.779175698Z 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738]
2021-12-28T18:56:22.779175698Z 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5639c486bc74]
2021-12-28T18:56:22.779175698Z 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914]
2021-12-28T18:56:22.779175698Z 20: /lib64/libpthread.so.0(+0x817a) [0x7f4e838e317a]
2021-12-28T18:56:22.779175698Z 21: clone()
2021-12-28T18:56:22.779214398Z debug
[1] http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-067zi3c33-uo/j-067zi3c33-uo_20211228T093525/logs/failed_testcase_ocs_logs_1640694899/test_multiple_sc_one_pool_rep2_comp_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-71a98052bdd86ccc10967af82739b04d6b2305eb6721cb0744470de2c284fd15/namespaces/openshift-storage/pods/rook-ceph-osd-2-c95644c47-sz89g/osd/osd/logs/current.log
(In reply to Travis Nielsen from comment #4) > The OSD.2 log [1] shows the following bluestore assert. Neha, who could take > a look? This assert does not ring a bell for me, but @akupczyk is the best person to take a look at it. How reproducible is this? Can we capture logs with higher debug levels, like debug_bluestore=20? > > 2021-12-28T18:56:21.713686028Z debug 2021-12-28T18:56:21.712+0000 > 7f4e5efec700 1 osd.2 pg_epoch: 475 pg[2.8( v 443'248574 > (290'246134,443'248574] local-lis/les=442/443 n=527 ec=12/12 lis/c=462/239 > les/c/f=463/240/0 sis=475) [1,2,0] r=1 lpr=475 pi=[239,475)/4 crt=443'248574 > lcod 0'0 mlcod 0'0 unknown NOTIFY mbc={}] state<Start>: transitioning to > Stray > 2021-12-28T18:56:22.766209808Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In > function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, > uint64_t, int64_t, PExtentVector*)' thread 7f4e5ffee700 time > 2021-12-28T18:56:22.766157+0000 > 2021-12-28T18:56:22.766209808Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: > FAILED ceph_assert(want % unit == 0) > 2021-12-28T18:56:22.766415811Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In > function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, > uint64_t, int64_t, PExtentVector*)' thread 7f4e5b7e5700 time > 2021-12-28T18:56:22.766379+0000 > 2021-12-28T18:56:22.766415811Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: > FAILED ceph_assert(want % unit == 0) > 2021-12-28T18:56:22.767038720Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In > function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, > uint64_t, int64_t, PExtentVector*)' thread 7f4e5dfea700 time > 2021-12-28T18:56:22.766990+0000 > 2021-12-28T18:56:22.767038720Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: > FAILED ceph_assert(want % unit == 0) > 2021-12-28T18:56:22.767359025Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In > function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, > uint64_t, int64_t, PExtentVector*)' thread 7f4e607ef700 time > 2021-12-28T18:56:22.767327+0000 > 2021-12-28T18:56:22.767359025Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: > FAILED ceph_assert(want % unit == 0) > 2021-12-28T18:56:22.772643302Z ceph version 16.2.0-146.el8cp > (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable) > 2021-12-28T18:56:22.772643302Z 1: (ceph::__ceph_assert_fail(char const*, > char const*, int, char const*)+0x158) [0x5639c40de27a] > 2021-12-28T18:56:22.772643302Z 2: ceph-osd(+0x567494) [0x5639c40de494] > 2021-12-28T18:56:22.772643302Z 3: (HybridAllocator::allocate(unsigned long, > unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, > mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> > >*)+0x97f) [0x5639c481663f] > 2021-12-28T18:56:22.772643302Z 4: > (BlueStore::_do_alloc_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>, > boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) > [0x5639c46e8a8e] > 2021-12-28T18:56:22.772643302Z 5: > (BlueStore::_do_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04] > 2021-12-28T18:56:22.772643302Z 6: > (BlueStore::_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71] > 2021-12-28T18:56:22.772643302Z 7: > (BlueStore::_txc_add_transaction(BlueStore::TransContext*, > ceph::os::Transaction*)+0x2077) [0x5639c4726cb7] > 2021-12-28T18:56:22.772643302Z 8: > (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore:: > CollectionImpl>&, std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, > ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6] > 2021-12-28T18:56:22.772643302Z 9: (non-virtual thunk to > PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, > boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058] > 2021-12-28T18:56:22.772643302Z 10: > (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t > const&, eversion_t const&, std::unique_ptr<PGTransaction, > std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t > const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, > std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, > boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd] > 2021-12-28T18:56:22.772643302Z 11: > (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, > PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570] > 2021-12-28T18:56:22.772643302Z 12: > (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d] > 2021-12-28T18:56:22.772643302Z 13: > (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) > [0x5639c4353fd2] > 2021-12-28T18:56:22.772643302Z 14: > (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, > ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c] > 2021-12-28T18:56:22.772643302Z 15: > (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, > ThreadPool::TPHandle&)+0x309) [0x5639c41e4789] > 2021-12-28T18:56:22.772643302Z 16: > (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, > boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968] > 2021-12-28T18:56:22.772643302Z 17: (OSD::ShardedOpWQ::_process(unsigned > int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738] > 2021-12-28T18:56:22.772643302Z 18: > (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) > [0x5639c486bc74] > 2021-12-28T18:56:22.772643302Z 19: > (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914] > 2021-12-28T18:56:22.772643302Z 20: /lib64/libpthread.so.0(+0x817a) > [0x7f4e838e317a] > 2021-12-28T18:56:22.772643302Z 21: clone() > 2021-12-28T18:56:22.772676103Z debug 2021-12-28T18:56:22.771+0000 > 7f4e5ffee700 -1 > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: In > function 'virtual int64_t HybridAllocator::allocate(uint64_t, uint64_t, > uint64_t, int64_t, PExtentVector*)' thread 7f4e607ef700 time > 2021-12-28T18:56:22.767327+0000 > 2021-12-28T18:56:22.772676103Z > /builddir/build/BUILD/ceph-16.2.0/src/os/bluestore/HybridAllocator.cc: 31: > FAILED ceph_assert(want % unit == 0) > 2021-12-28T18:56:22.772676103Z > 2021-12-28T18:56:22.772676103Z ceph version 16.2.0-146.el8cp > (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable) > 2021-12-28T18:56:22.772676103Z 1: (ceph::__ceph_assert_fail(char const*, > char const*, int, char const*)+0x158) [0x5639c40de27a] > 2021-12-28T18:56:22.772676103Z 2: ceph-osd(+0x567494) [0x5639c40de494] > 2021-12-28T18:56:22.772676103Z 3: (HybridAllocator::allocate(unsigned long, > unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, > mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> > >*)+0x97f) [0x5639c481663f] > 2021-12-28T18:56:22.772676103Z 4: > (BlueStore::_do_alloc_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>, > boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) > [0x5639c46e8a8e] > 2021-12-28T18:56:22.772676103Z 5: > (BlueStore::_do_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04] > 2021-12-28T18:56:22.772676103Z 6: > (BlueStore::_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71] > 2021-12-28T18:56:22.772676103Z 7: > (BlueStore::_txc_add_transaction(BlueStore::TransContext*, > ceph::os::Transaction*)+0x2077) [0x5639c4726cb7] > 2021-12-28T18:56:22.772676103Z 8: > (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore:: > CollectionImpl>&, std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, > ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6] > 2021-12-28T18:56:22.772676103Z 9: (non-virtual thunk to > PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, > boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058] > 2021-12-28T18:56:22.772676103Z 10: > (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t > const&, eversion_t const&, std::unique_ptr<PGTransaction, > std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t > const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, > std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, > boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd] > 2021-12-28T18:56:22.772676103Z 11: > (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, > PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570] > 2021-12-28T18:56:22.772676103Z 12: > (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d] > 2021-12-28T18:56:22.772676103Z 13: > (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) > [0x5639c4353fd2] > 2021-12-28T18:56:22.772676103Z 14: > (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, > ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c] > 2021-12-28T18:56:22.772676103Z 15: > (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, > ThreadPool::TPHandle&)+0x309) [0x5639c41e4789] > 2021-12-28T18:56:22.772676103Z 16: > (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, > boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968] > 2021-12-28T18:56:22.772676103Z 17: (OSD::ShardedOpWQ::_process(unsigned > int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738] > 2021-12-28T18:56:22.772676103Z 18: > (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) > [0x5639c486bc74] > 2021-12-28T18:56:22.772676103Z 19: > (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914] > 2021-12-28T18:56:22.772676103Z 20: /lib64/libpthread.so.0(+0x817a) > [0x7f4e838e317a] > 2021-12-28T18:56:22.772676103Z 21: clone() > 2021-12-28T18:56:22.772676103Z > 2021-12-28T18:56:22.772692403Z *** Caught signal (Aborted) ** > 2021-12-28T18:56:22.772692403Z in thread 7f4e5ffee700 thread_name:tp_osd_tp > 2021-12-28T18:56:22.779175698Z ceph version 16.2.0-146.el8cp > (56f5e9cfe88a08b6899327eca5166ca1c4a392aa) pacific (stable) > 2021-12-28T18:56:22.779175698Z 1: (ceph::__ceph_assert_fail(char const*, > char const*, int, char const*)+0x158) [0x5639c40de27a] > 2021-12-28T18:56:22.779175698Z 2: ceph-osd(+0x567494) [0x5639c40de494] > 2021-12-28T18:56:22.779175698Z 3: (HybridAllocator::allocate(unsigned long, > unsigned long, unsigned long, long, std::vector<bluestore_pextent_t, > mempool::pool_allocator<(mempool::pool_index_t)5, bluestore_pextent_t> > >*)+0x97f) [0x5639c481663f] > 2021-12-28T18:56:22.779175698Z 4: > (BlueStore::_do_alloc_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>, > boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*)+0x70e) > [0x5639c46e8a8e] > 2021-12-28T18:56:22.779175698Z 5: > (BlueStore::_do_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0x694) [0x5639c4721b04] > 2021-12-28T18:56:22.779175698Z 6: > (BlueStore::_write(BlueStore::TransContext*, > boost::intrusive_ptr<BlueStore::Collection>&, > boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, > ceph::buffer::v15_2_0::list&, unsigned int)+0xd1) [0x5639c4722c71] > 2021-12-28T18:56:22.779175698Z 7: > (BlueStore::_txc_add_transaction(BlueStore::TransContext*, > ceph::os::Transaction*)+0x2077) [0x5639c4726cb7] > 2021-12-28T18:56:22.779175698Z 8: > (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore:: > CollectionImpl>&, std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, > ThreadPool::TPHandle*)+0x316) [0x5639c4727ff6] > 2021-12-28T18:56:22.779175698Z 9: (non-virtual thunk to > PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, > std::allocator<ceph::os::Transaction> >&, > boost::intrusive_ptr<OpRequest>)+0x58) [0x5639c4384058] > 2021-12-28T18:56:22.779175698Z 10: > (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t > const&, eversion_t const&, std::unique_ptr<PGTransaction, > std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t > const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, > std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, > boost::intrusive_ptr<OpRequest>)+0xc9d) [0x5639c456dbdd] > 2021-12-28T18:56:22.779175698Z 11: > (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, > PrimaryLogPG::OpContext*)+0xce0) [0x5639c42ef570] > 2021-12-28T18:56:22.779175698Z 12: > (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d) [0x5639c434b45d] > 2021-12-28T18:56:22.779175698Z 13: > (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2) > [0x5639c4353fd2] > 2021-12-28T18:56:22.779175698Z 14: > (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, > ThreadPool::TPHandle&)+0xd1c) [0x5639c435b31c] > 2021-12-28T18:56:22.779175698Z 15: > (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, > ThreadPool::TPHandle&)+0x309) [0x5639c41e4789] > 2021-12-28T18:56:22.779175698Z 16: > (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, > boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5639c4441968] > 2021-12-28T18:56:22.779175698Z 17: (OSD::ShardedOpWQ::_process(unsigned > int, ceph::heartbeat_handle_d*)+0xa58) [0x5639c4204738] > 2021-12-28T18:56:22.779175698Z 18: > (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) > [0x5639c486bc74] > 2021-12-28T18:56:22.779175698Z 19: > (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5639c486e914] > 2021-12-28T18:56:22.779175698Z 20: /lib64/libpthread.so.0(+0x817a) > [0x7f4e838e317a] > 2021-12-28T18:56:22.779175698Z 21: clone() > 2021-12-28T18:56:22.779214398Z debug > > > [1] > http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-067zi3c33- > uo/j-067zi3c33-uo_20211228T093525/logs/failed_testcase_ocs_logs_1640694899/ > test_multiple_sc_one_pool_rep2_comp_ocs_logs/ocs_must_gather/quay-io-rhceph- > dev-ocs-must-gather-sha256- > 71a98052bdd86ccc10967af82739b04d6b2305eb6721cb0744470de2c284fd15/namespaces/ > openshift-storage/pods/rook-ceph-osd-2-c95644c47-sz89g/osd/osd/logs/current. > log Rachael, please go into the toolbox and run ceph config set osd debug_bluestore 20 Thanks Rachel, please try again and reopen if you still see the issue. Yeah, looks like we need a fix in rook. Travis, please take a look. Also, I can see that the same is present in 4.6/4.7/4.8 as well so this is not a regression. Probably something which we never tested? https://github.com/red-hat-storage/rook/blob/d1788e1616734109b104bfae687833a7ee3ca747/pkg/operator/ceph/cluster/osd/spec.go#L158 (In reply to Mudit Agarwal from comment #15) > Yeah, looks like we need a fix in rook. > Travis, please take a look. > > Also, I can see that the same is present in 4.6/4.7/4.8 as well so this is > not a regression. Probably something which we never tested? > https://github.com/red-hat-storage/rook/blob/ > d1788e1616734109b104bfae687833a7ee3ca747/pkg/operator/ceph/cluster/osd/spec. > go#L158 I think it was me who did the typo while doing the changes in rook a long time back. I'll modify it to 8192. Assigning it to myself. Already in the latest sync of 4.10. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372 |