Bug 1661713 - [support] OSD asserts in DBObjectMap.cc: 662: FAILED assert(state.legacy)
Summary: [support] OSD asserts in DBObjectMap.cc: 662: FAILED assert(state.legacy)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RADOS
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 3.*
Assignee: Brad Hubbard
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-12-22 15:45 UTC by Mike Hackett
Modified: 2019-01-08 15:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-08 15:52:57 UTC
Target Upstream Version:


Attachments
OSD log (3.30 MB, text/plain)
2018-12-22 18:20 UTC, Mike Hackett


Links
System ID Priority Status Summary Last Updated
Ceph Project Bug Tracker 34321 None None None 2018-12-22 15:45:14 UTC

Description Mike Hackett 2018-12-22 15:45:14 UTC
Description of problem:

During leveldb compaction, the OSD asserts with the following:

2018-12-22 11:02:18.538671 7fe8c7e2b700  1 leveldb: Generated table #509857: 44245 keys, 2142602 bytes
2018-12-22 11:02:18.663360 7fe8c7e2b700  1 leveldb: Generated table #509858: 58140 keys, 2135933 bytes
2018-12-22 11:02:18.751228 7fe8c7e2b700  1 leveldb: Generated table #509859: 59005 keys, 2136061 bytes
2018-12-22 11:02:18.850778 7fe8c7e2b700  1 leveldb: Generated table #509860: 58279 keys, 2135710 bytes
2018-12-22 11:02:18.964761 7fe8c7e2b700  1 leveldb: Generated table #509861: 55658 keys, 2084112 bytes
2018-12-22 11:02:18.964771 7fe8c7e2b700  1 leveldb: Compacted 1@2 + 9@3 files => 19192019 bytes
2018-12-22 11:02:18.981652 7fe8c7e2b700  1 leveldb: compacted to: files[ 0 3 51 138 0 0 0 ]
2018-12-22 11:02:18.982213 7fe8c7e2b700  1 leveldb: Delete type=2 #509815
2018-12-22 11:02:19.006828 7fe8c7e2b700  1 leveldb: Delete type=2 #509816
2018-12-22 11:02:19.045234 7fe8c7e2b700  1 leveldb: Delete type=2 #509817
2018-12-22 11:02:19.045766 7fe8c7e2b700  1 leveldb: Delete type=2 #509838
2018-12-22 11:02:19.046293 7fe8c7e2b700  1 leveldb: Delete type=2 #507765
2018-12-22 11:02:19.048877 7fe8c7e2b700  1 leveldb: Delete type=2 #509818
2018-12-22 11:02:19.049309 7fe8c7e2b700  1 leveldb: Delete type=2 #507803
2018-12-22 11:02:19.059027 7fe8c7e2b700  1 leveldb: Delete type=2 #507804
2018-12-22 11:02:19.059440 7fe8c7e2b700  1 leveldb: Delete type=2 #507805
2018-12-22 11:02:19.060090 7fe8c7e2b700  1 leveldb: Delete type=2 #509814

2018-12-22 11:03:35.732313 7fe8e6e9f700 -1 /builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: In function 'virtual int DBObjectMap::rm_keys(const ghobject_t&, const std::set<std::basic_string<char> >&, const SequencerPosition*)' thread 7fe8e6e9f700 time 2018-12-22 11:03:35.706007
/builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: 662: FAILED assert(state.legacy)

 ceph version 12.2.5-42.el7cp (82d52d7efa6edec70f6a0fc306f40b89265535fb) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5610b5b27880]
 2: (DBObjectMap::rm_keys(ghobject_t const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, SequencerPosition const*)+0xcc1) [0x5610b5a6ca71]
 3: (FileStore::_omap_rmkeys(coll_t const&, ghobject_t const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, SequencerPosition const&)+0xac) [0x5610b58e121c]
 4: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x2185) [0x5610b590b585]
 5: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x5610b590f0cb]
 6: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x3fa) [0x5610b590f4fa]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x5610b5b2e46e]
 8: (ThreadPool::WorkThread::entry()+0x10) [0x5610b5b2f350]
 9: (()+0x7dd5) [0x7fe8f8429dd5]
 10: (clone()+0x6d) [0x7fe8f751ab3d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
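
For triage across several OSD logs, the assertion line and the backtrace frames above can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report: the function name summarize_crash and both regular expressions are hypothetical and assume the log keeps the exact line shapes shown above.

```python
import re

# Matches lines such as:
#   /path/to/File.cc: 662: FAILED assert(state.legacy)
ASSERT_RE = re.compile(
    r"^\s*(?P<path>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)

# Matches backtrace frames such as:
#   2: (DBObjectMap::rm_keys(...)+0xcc1) [0x5610b5a6ca71]
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+): \((?P<symbol>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]\s*$"
)


def summarize_crash(lines):
    """Return (assertions, frames) found in an iterable of log lines."""
    assertions, frames = [], []
    for raw in lines:
        m = ASSERT_RE.match(raw)
        if m:
            assertions.append(
                (m.group("path"), int(m.group("line")), m.group("expr"))
            )
            continue
        m = FRAME_RE.match(raw)
        if m:
            # Keep only the function name, dropping the argument list.
            name = m.group("symbol").split("(", 1)[0]
            frames.append((int(m.group("index")), name))
    return assertions, frames


# A few lines copied from the log excerpt in this report.
sample = [
    "/builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: 662: FAILED assert(state.legacy)",
    " 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5610b5b27880]",
    " 3: (FileStore::_omap_rmkeys(coll_t const&, ghobject_t const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, SequencerPosition const&)+0xac) [0x5610b58e121c]",
    " 10: (clone()+0x6d) [0x7fe8f751ab3d]",
]
assertions, frames = summarize_crash(sample)
print(assertions)
print(frames)
```

To scan a full log, pass an open file object to summarize_crash instead of the sample list.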


We need to confirm whether the OSD media is bad and whether this has caused leveldb corruption that leads to the assertion when compacting (deleting).
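
One thing that can be checked directly from the excerpt is how the assertion relates in time and thread to the compaction messages. The sketch below is an illustration only: parse_line and LINE_RE are hypothetical names, the second sample message is shortened, and the result is just an observation about the log, not a confirmation or rejection of the corruption theory.

```python
import re
from datetime import datetime

# Assumed OSD log line shape: "<date> <time> <thread id> <level> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+) (?P<msg>.*)$"
)


def parse_line(line):
    """Return (timestamp, thread id, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group("thread"), m.group("msg")


# Last leveldb compaction message and the assertion line from the excerpt
# (the assertion message is shortened here for readability).
last_compaction = parse_line(
    "2018-12-22 11:02:19.060090 7fe8c7e2b700  1 leveldb: Delete type=2 #509814"
)
assertion = parse_line(
    "2018-12-22 11:03:35.732313 7fe8e6e9f700 -1 "
    "/builddir/build/BUILD/ceph-12.2.5/src/os/filestore/DBObjectMap.cc: In function"
)

gap_seconds = (assertion[0] - last_compaction[0]).total_seconds()
same_thread = assertion[1] == last_compaction[1]
print(gap_seconds, same_thread)
```

On these two lines the assertion is logged about 76.7 seconds after the last compaction message and on a different thread (7fe8e6e9f700 versus 7fe8c7e2b700).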


Version-Release number of selected component (if applicable):
RHCS 3.1z1 
ceph version 12.2.5-59.el7cp

How reproducible:
In the customer environment, restarting the OSD causes it to crash.

Upstream tracker issue with a similar failure but no action taken:

http://tracker.ceph.com/issues/34321

Comment 3 Mike Hackett 2018-12-22 18:20:04 UTC
Created attachment 1516272
OSD log

