
Bug 1661713

Summary: [support] OSD asserts in DBObjectMap.cc: 662: FAILED assert(state.legacy)
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Mike Hackett <mhackett>
Component: RADOS
Assignee: Brad Hubbard <bhubbard>
Status: CLOSED WONTFIX
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.1
CC: bhubbard, ceph-eng-bugs, dzafman, jdurgin, kchai, mhackett, nojha, torkil, tpetr, vumrao
Target Milestone: rc
Target Release: 3.*
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-08 15:52:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
OSD log (flags: none)

Description Mike Hackett 2018-12-22 15:45:14 UTC
Description of problem:

During leveldb compaction, the OSD asserts with the following:

2018-12-22 11:02:18.538671 7fe8c7e2b700  1 leveldb: Generated table #509857: 44245 keys, 2142602 bytes
2018-12-22 11:02:18.663360 7fe8c7e2b700  1 leveldb: Generated table #509858: 58140 keys, 2135933 bytes
2018-12-22 11:02:18.751228 7fe8c7e2b700  1 leveldb: Generated table #509859: 59005 keys, 2136061 bytes
2018-12-22 11:02:18.850778 7fe8c7e2b700  1 leveldb: Generated table #509860: 58279 keys, 2135710 bytes
2018-12-22 11:02:18.964761 7fe8c7e2b700  1 leveldb: Generated table #509861: 55658 keys, 2084112 bytes
2018-12-22 11:02:18.964771 7fe8c7e2b700  1 leveldb: Compacted 1@2 + 9@3 files => 19192019 bytes
2018-12-22 11:02:18.981652 7fe8c7e2b700  1 leveldb: compacted to: files[ 0 3 51 138 0 0 0 ]
2018-12-22 11:02:18.982213 7fe8c7e2b700  1 leveldb: Delete type=2 #509815

2018-12-22 11:02:19.006828 7fe8c7e2b700  1 leveldb: Delete type=2 #509816

2018-12-22 11:02:19.045234 7fe8c7e2b700  1 leveldb: Delete type=2 #509817

2018-12-22 11:02:19.045766 7fe8c7e2b700  1 leveldb: Delete type=2 #509838

2018-12-22 11:02:19.046293 7fe8c7e2b700  1 leveldb: Delete type=2 #507765

2018-12-22 11:02:19.048877 7fe8c7e2b700  1 leveldb: Delete type=2 #509818

2018-12-22 11:02:19.049309 7fe8c7e2b700  1 leveldb: Delete type=2 #507803

2018-12-22 11:02:19.059027 7fe8c7e2b700  1 leveldb: Delete type=2 #507804

2018-12-22 11:02:19.059440 7fe8c7e2b700  1 leveldb: Delete type=2 #507805

2018-12-22 11:02:19.060090 7fe8c7e2b700  1 leveldb: Delete type=2 #509814

2018-12-22 11:03:35.732313 7fe8e6e9f700 -1 /builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: In function 'virtual int DBObjectMap::rm_keys(const ghobject_t&, const std::set<std::basic_string<char> >&, const SequencerPosition*)' thread 7fe8e6e9f700 time 2018-12-22 11:03:35.706007
/builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: 662: FAILED assert(state.legacy)

 ceph version 12.2.5-42.el7cp (82d52d7efa6edec70f6a0fc306f40b89265535fb) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5610b5b27880]
 2: (DBObjectMap::rm_keys(ghobject_t const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, SequencerPosition const*)+0xcc1) [0x5610b5a6ca71]
 3: (FileStore::_omap_rmkeys(coll_t const&, ghobject_t const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, SequencerPosition const&)+0xac) [0x5610b58e121c]
 4: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x2185) [0x5610b590b585]
 5: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x5610b590f0cb]
 6: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x3fa) [0x5610b590f4fa]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x5610b5b2e46e]
 8: (ThreadPool::WorkThread::entry()+0x10) [0x5610b5b2f350]
 9: (()+0x7dd5) [0x7fe8f8429dd5]
 10: (clone()+0x6d) [0x7fe8f751ab3d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
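
When triaging several OSD logs for the same failure, it can help to pull out just the assertion lines. The following is a minimal sketch, assuming the assertion line format matches the one shown in the log above; the function name `find_failed_asserts` and the regular expression are illustrative and are not part of Ceph.

```python
import re

# Matches lines shaped like the one in this report:
#   <source path>: <line number>: FAILED assert(<expression>)
# The pattern is an assumption based on this log excerpt only.
ASSERT_RE = re.compile(
    r"(?P<path>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)


def find_failed_asserts(log_text):
    """Return (source_path, line_number, expression) for each FAILED assert line."""
    hits = []
    for raw in log_text.splitlines():
        m = ASSERT_RE.search(raw)
        if m:
            hits.append((m.group("path"), int(m.group("line")), m.group("expr")))
    return hits


if __name__ == "__main__":
    sample = (
        "/builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: "
        "662: FAILED assert(state.legacy)"
    )
    for path, line, expr in find_failed_asserts(sample):
        print(path, line, expr)
```

Running the same function over a full OSD log would show whether every crash hits the same assertion at DBObjectMap.cc line 662 or whether other assertions are also firing.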


We need to confirm whether the OSD media is bad, which could have corrupted leveldb and in turn led to the assertion when compacting (deleting).
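
One quick check while investigating is to compare the thread IDs and timestamps of the leveldb lines against the assertion line. The sketch below assumes the log prefix layout seen in the excerpt above (timestamp, thread ID, log level, message); `parse_line` is a hypothetical helper, not a Ceph tool. In the excerpt, the assertion is logged by a different thread and over a minute after the last leveldb line, which by itself neither confirms nor rules out leveldb corruption.

```python
import re
from datetime import datetime

# Assumed prefix layout, taken from the log excerpt in this report:
#   YYYY-MM-DD HH:MM:SS.ffffff <thread id> <level> <message>
PREFIX_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) ([0-9a-f]+)\s+(-?\d+) (.*)$"
)


def parse_line(line):
    """Split one OSD log line into (timestamp, thread_id, level, message), or None."""
    m = PREFIX_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group(2), int(m.group(3)), m.group(4)


if __name__ == "__main__":
    # Two lines copied from the log above (the second is shortened with "...").
    last_leveldb = parse_line(
        "2018-12-22 11:02:19.060090 7fe8c7e2b700  1 leveldb: Delete type=2 #509814"
    )
    assert_line = parse_line(
        "2018-12-22 11:03:35.732313 7fe8e6e9f700 -1 /builddir/build/BUILD/"
        "ceph-12.2.5/src/os/filestore/DBObjectMap.cc: In function ..."
    )
    gap_seconds = (assert_line[0] - last_leveldb[0]).total_seconds()
    same_thread = assert_line[1] == last_leveldb[1]
    print("seconds between last leveldb line and assert:", gap_seconds)
    print("same thread:", same_thread)
```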


Version-Release number of selected component (if applicable):
RHCS 3.1z1 
ceph version 12.2.5-59.el7cp

How reproducible:
In the customer environment, restarting the OSD causes it to crash.

Upstream tracker issue with a similar failure but no action taken:

http://tracker.ceph.com/issues/34321

Comment 3 Mike Hackett 2018-12-22 18:20:04 UTC
Created attachment 1516272
OSD log