Created attachment 1496682
Failed MDS log

Description of problem:
The MDS crashed when running the scrub_path command from the admin daemon.

Command and console output:

[root@host083 ceph]# ceph --admin-daemon ceph-mds.magna083.asok scrub_path /kernel2/test/file_dstdir/localhost.localdomain/thrd_25
admin_socket: exception: exception: no data returned from admin socket

     0> 2018-10-23 11:34:56.051305 7fce51417700 -1 /builddir/build/BUILD/ceph-12.2.5/src/mds/CDir.cc: In function 'void CDir::fetch(MDSInternalContextBase*, boost::string_view, bool)' thread 7fce51417700 time 2018-10-23 11:34:56.047450
/builddir/build/BUILD/ceph-12.2.5/src/mds/CDir.cc: 1473: FAILED assert(is_auth())

 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x562d0af01210]
 2: (CDir::fetch(MDSInternalContextBase*, boost::basic_string_view<char, std::char_traits<char> >, bool)+0x8a8) [0x562d0adc8258]
 3: (CDir::fetch(MDSInternalContextBase*, bool)+0x30) [0x562d0adc8350]
 4: (()+0x4f5666) [0x562d0adf8666]
 5: (()+0x5003be) [0x562d0ae033be]
 6: (Continuation::_continue_function(int, int)+0x1aa) [0x562d0ae11aba]
 7: (Continuation::Callback::finish(int)+0x10) [0x562d0ae11ba0]
 8: (Context::complete(int)+0x9) [0x562d0abc8589]
 9: (MDSIOContextBase::complete(int)+0xa4) [0x562d0ae4b824]
 10: (Finisher::finisher_thread_entry()+0x198) [0x562d0af00188]
 11: (()+0x7dd5) [0x7fce5c259dd5]
 12: (clone()+0x6d) [0x7fce5b336ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2018-10-23 11:34:56.100199 7fce51417700 -1 *** Caught signal (Aborted) **
 in thread 7fce51417700 thread_name:fn_anonymous

 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (()+0x5bd6f1) [0x562d0aec06f1]
 2: (()+0xf5d0) [0x7fce5c2615d0]
 3: (gsignal()+0x37) [0x7fce5b26f207]
 4: (abort()+0x148) [0x7fce5b2708f8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x562d0af01384]
 6: (CDir::fetch(MDSInternalContextBase*, boost::basic_string_view<char, std::char_traits<char> >, bool)+0x8a8) [0x562d0adc8258]
 7: (CDir::fetch(MDSInternalContextBase*, bool)+0x30) [0x562d0adc8350]
 8: (()+0x4f5666) [0x562d0adf8666]
 9: (()+0x5003be) [0x562d0ae033be]
 10: (Continuation::_continue_function(int, int)+0x1aa) [0x562d0ae11aba]
 11: (Continuation::Callback::finish(int)+0x10) [0x562d0ae11ba0]
 12: (Context::complete(int)+0x9) [0x562d0abc8589]
 13: (MDSIOContextBase::complete(int)+0xa4) [0x562d0ae4b824]
 14: (Finisher::finisher_thread_entry()+0x198) [0x562d0af00188]
 15: (()+0x7dd5) [0x7fce5c259dd5]
 16: (clone()+0x6d) [0x7fce5b336ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---

Version-Release number of selected component (if applicable):
ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)

How reproducible:
1/1

Steps to Reproduce:
1. Create a cluster with active-active MDS
2. Get a directory path from the "get subtree" command
3. Run scrub_path on that path

Actual results:
The MDS crashed and the standby MDS became active.

Expected results:
The MDS should not crash.

Additional info:
NA
Scrub does not work properly in a multi-MDS setup. I'm working on this issue.
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri
No. I haven't finished the code yet.