Description of problem:

When adding source info from another OSD, check whether each object that needs recovery is present in that OSD's own missing set. If it is, do not record the OSD as a missing loc; otherwise we can hit the following failure:

/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.1-2590-gea1c8ca/rpm/el7/BUILD/ceph-14.0.1-2590-gea1c8ca/src/osd/ECBackend.cc: 1547: FAILED ceph_assert(!(*m).is_missing(hoid))

 ceph version 14.0.1-2590-gea1c8ca (ea1c8caf95758ce122b97d7d708086b9eff3187f) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x55bc80abc070]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x55bc80abc23e]
 3: (ECBackend::get_all_avail_shards(hobject_t const&, std::set<pg_shard_t, std::less<pg_shard_t>, std::allocator<pg_shard_t> > const&, std::set<int, std::less<int>, std::allocator<int> >&, std::map<shard_id_t, pg_shard_t, std::less<shard_id_t>, std::allocator<std::pair<shard_id_t const, pg_shard_t> > >&, bool)+0xc1c) [0x55bc80f0a5cc]
 4: (ECBackend::get_min_avail_to_read_shards(hobject_t const&, std::set<int, std::less<int>, std::allocator<int> > const&, bool, bool, std::map<pg_shard_t, std::vector<std::pair<int, int>, std::allocator<std::pair<int, int> > >, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, std::vector<std::pair<int, int>, std::allocator<std::pair<int, int> > > > > >*)+0x104) [0x55bc80f0a734]
 5: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)+0x313) [0x55bc80f0f433]
 6: (ECBackend::run_recovery_op(PGBackend::RecoveryHandle*, int)+0x1457) [0x55bc80f137e7]
 7: (PrimaryLogPG::maybe_kick_recovery(hobject_t const&)+0x27d) [0x55bc80d66fad]
 8: (PrimaryLogPG::wait_for_degraded_object(hobject_t const&, boost::intrusive_ptr<OpRequest>)+0x48) [0x55bc80d673a8]
 9: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x1949) [0x55bc80da7169]
 10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xbd4) [0x55bc80daac14]
 11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1a9) [0x55bc80bf61d9]
 12: (PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0x55bc80e80832]
 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa0c) [0x55bc80c0f7cc]
 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x433) [0x55bc8120ab53]
 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55bc8120dbf0]
 16: (()+0x7e25) [0x7f61fea2ee25]
 17: (clone()+0x6d) [0x7f61fd8f7bad]
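For clarity, below is a minimal, self-contained C++ sketch of the check the description calls for. The names used here (MissingLoc, add_source_info, ObjectId, OsdId) are simplified stand-ins for illustration, not the actual Ceph types or the exact upstream patch; the idea is only that a peer is skipped as a recovery source for any object that appears in that peer's own missing set.

// Simplified model of the missing-loc bookkeeping described above.
// Types and names are illustrative, not the real Ceph internals.
#include <iostream>
#include <map>
#include <set>
#include <string>

using ObjectId = std::string;
using OsdId = int;

struct MissingLoc {
  // For each object needing recovery, the OSDs that hold a usable copy.
  std::map<ObjectId, std::set<OsdId>> locations;

  // Record 'osd' as a source for every object it can actually serve.
  // The key point of the fix: skip any object that is also in the
  // peer's own missing set, so we never try to read a shard the
  // peer does not have.
  void add_source_info(OsdId osd,
                       const std::set<ObjectId>& needs_recovery,
                       const std::set<ObjectId>& peer_missing) {
    for (const auto& oid : needs_recovery) {
      if (peer_missing.count(oid)) {
        // The peer is missing this object itself; without this check
        // it could be chosen as a read source and the recovery read
        // would trip ceph_assert(!(*m).is_missing(hoid)).
        continue;
      }
      locations[oid].insert(osd);
    }
  }
};

int main() {
  MissingLoc ml;
  std::set<ObjectId> needs_recovery = {"obj_a", "obj_b"};
  std::set<ObjectId> peer_missing = {"obj_b"};  // peer lacks obj_b too

  ml.add_source_info(/*osd=*/3, needs_recovery, peer_missing);

  // obj_a gets osd.3 as a location; obj_b does not.
  for (const auto& [oid, osds] : ml.locations)
    std::cout << oid << " has " << osds.size() << " source(s)\n";
}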
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911