Bug 1594760 - mds, multimds: failed assertion in mds post failover
Summary: mds, multimds: failed assertion in mds post failover
Keywords:
Status: CLOSED DUPLICATE of bug 1559749
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: z5
Target Release: 3.0
Assignee: Patrick Donnelly
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-25 11:23 UTC by Venky Shankar
Modified: 2018-06-29 23:58 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-29 23:58:06 UTC
Embargoed:




Links
Ceph Project Bug Tracker 23154 (last updated 2018-06-25 11:22:59 UTC)

Description Venky Shankar 2018-06-25 11:23:00 UTC
Description of problem:

In a multimds setup, a failover of mds rank 0 leaves the replacement MDS unresponsive for a period of time, followed by the usual client reconnect and eviction of unresponsive clients. At times the now-active MDS hits a failed assertion:

 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f02ab91f850]
 2: (MDCache::request_get(metareqid_t)+0x267) [0x7f02ab6d7967]
 3: (Server::handle_slave_request_reply(MMDSSlaveRequest*)+0x314) [0x7f02ab68d6c4]
 4: (Server::handle_slave_request(MMDSSlaveRequest*)+0x9ab) [0x7f02ab68edfb]
 5: (Server::dispatch(Message*)+0x633) [0x7f02ab68fad3]
 6: (MDSRank::handle_deferrable_message(Message*)+0x804) [0x7f02ab6068f4]
 7: (MDSRank::_dispatch(Message*, bool)+0x1e3) [0x7f02ab614573]
 8: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x7f02ab6153b5]
 9: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x7f02ab5fdff3]
 10: (DispatchQueue::entry()+0x792) [0x7f02abc03be2]
 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f02ab9a4fbd]
 12: (()+0x7e25) [0x7f02a93f9e25]
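
Judging from the backtrace, the assertion fires in MDCache::request_get(), which looks up the metareqid carried by the slave request reply in the MDS's table of active requests and asserts that a matching entry exists. Below is a minimal, self-contained C++ sketch of that lookup pattern; the names MetaReqId, MDRequest and MDCacheModel are simplified stand-ins, not the actual Ceph source, and are only meant to illustrate why a reply referencing a request unknown to the newly active rank 0 would trip the assert.

// Minimal, self-contained C++ model of the failing lookup. The class and
// member names below are simplified stand-ins (not the real MDCache/Server
// code) and only illustrate the assertion pattern visible in the backtrace.
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Stand-in for metareqid_t: (originating mds rank, request tid).
struct MetaReqId {
    int origin_mds;
    uint64_t tid;
    bool operator<(const MetaReqId &o) const {
        return std::tie(origin_mds, tid) < std::tie(o.origin_mds, o.tid);
    }
};

struct MDRequest {
    MetaReqId reqid;
};

struct MDCacheModel {
    // Requests this MDS is tracking. A newly active MDS taking over rank 0
    // starts with this table empty (or only partially rebuilt from replay).
    std::map<MetaReqId, std::shared_ptr<MDRequest>> active_requests;

    // Mirrors the call in frame 2 of the backtrace: look up the request id
    // referenced by a slave reply and assert that it is still tracked here.
    std::shared_ptr<MDRequest> request_get(const MetaReqId &rid) {
        auto p = active_requests.find(rid);
        assert(p != active_requests.end());  // the assertion that fails
        return p->second;
    }
};

int main() {
    MDCacheModel cache;
    // A slave request reply arrives for an id the replacement MDS never
    // registered (it belonged to the failed instance), so the lookup asserts.
    MetaReqId stale{0, 42};
    cache.request_get(stale);  // expected to abort via assert()
    return 0;
}

Under that model, a stale or unregistered request id arriving at the newly active MDS after failover is enough to hit the assert, consistent with upstream tracker issue 23154 and bug 1559749, which this report was closed as a duplicate of.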

