Bug 1594760 - mds, multimds: failed assertion in mds post failover
Summary: mds, multimds: failed assertion in mds post failover
Keywords:
Status: CLOSED DUPLICATE of bug 1559749
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: z5
Target Release: 3.0
Assignee: Patrick Donnelly
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-25 11:23 UTC by Venky Shankar
Modified: 2018-06-29 23:58 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-29 23:58:06 UTC
Embargoed:




Links
Ceph Project Bug Tracker 23154 (last updated 2018-06-25 11:22:59 UTC)

Description Venky Shankar 2018-06-25 11:23:00 UTC
Description of problem:

In a multimds setup, a failover of mds rank 0 leaves the replacement MDS unresponsive for a period of time, followed by the usual client reconnect and eviction of unresponsive clients. At times the now-active MDS hits a failed assertion:

 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f02ab91f850]
 2: (MDCache::request_get(metareqid_t)+0x267) [0x7f02ab6d7967]
 3: (Server::handle_slave_request_reply(MMDSSlaveRequest*)+0x314) [0x7f02ab68d6c4]
 4: (Server::handle_slave_request(MMDSSlaveRequest*)+0x9ab) [0x7f02ab68edfb]
 5: (Server::dispatch(Message*)+0x633) [0x7f02ab68fad3]
 6: (MDSRank::handle_deferrable_message(Message*)+0x804) [0x7f02ab6068f4]
 7: (MDSRank::_dispatch(Message*, bool)+0x1e3) [0x7f02ab614573]
 8: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x7f02ab6153b5]
 9: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x7f02ab5fdff3]
 10: (DispatchQueue::entry()+0x792) [0x7f02abc03be2]
 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f02ab9a4fbd]
 12: (()+0x7e25) [0x7f02a93f9e25]
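
Judging from the backtrace, the assertion fires in MDCache::request_get(), which looks up the metareqid carried by the slave request reply in the MDS's table of active requests and asserts that a matching entry exists. Below is a minimal, self-contained C++ sketch of that lookup pattern; the names MetaReqId, MDRequest and MDCacheModel are simplified stand-ins, not the actual Ceph source, and are only meant to illustrate why a reply referencing a request unknown to the newly active rank 0 would trip the assert.

// Minimal, self-contained C++ model of the failing lookup. The class and
// member names below are simplified stand-ins (not the real MDCache/Server
// code) and only illustrate the assertion pattern visible in the backtrace.
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Stand-in for metareqid_t: (originating mds rank, request tid).
struct MetaReqId {
    int origin_mds;
    uint64_t tid;
    bool operator<(const MetaReqId &o) const {
        return std::tie(origin_mds, tid) < std::tie(o.origin_mds, o.tid);
    }
};

struct MDRequest {
    MetaReqId reqid;
};

struct MDCacheModel {
    // Requests this MDS is tracking. A newly active MDS taking over rank 0
    // starts with this table empty (or only partially rebuilt from replay).
    std::map<MetaReqId, std::shared_ptr<MDRequest>> active_requests;

    // Mirrors the call in frame 2 of the backtrace: look up the request id
    // referenced by a slave reply and assert that it is still tracked here.
    std::shared_ptr<MDRequest> request_get(const MetaReqId &rid) {
        auto p = active_requests.find(rid);
        assert(p != active_requests.end());  // the assertion that fails
        return p->second;
    }
};

int main() {
    MDCacheModel cache;
    // A slave request reply arrives for an id the replacement MDS never
    // registered (it belonged to the failed instance), so the lookup asserts.
    MetaReqId stale{0, 42};
    cache.request_get(stale);  // expected to abort via assert()
    return 0;
}

Under that model, a stale or unregistered request id arriving at the newly active MDS after failover is enough to hit the assert, consistent with upstream tracker issue 23154 and bug 1559749, which this report was closed as a duplicate of.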

