Bug 1642015
| Summary: | MDS crashed when running scrub_path command in admin-daemon |
|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage |
| Component: | CephFS |
| Version: | 3.1 |
| Status: | CLOSED WONTFIX |
| Severity: | high |
| Priority: | high |
| Target Milestone: | rc |
| Target Release: | 4.1 |
| Hardware: | All |
| OS: | All |
| Whiteboard: | NeedsDev |
| Reporter: | Ramakrishnan Periyasamy <rperiyas> |
| Assignee: | Yan, Zheng <zyan> |
| QA Contact: | Hemanth Kumar <hyelloji> |
| CC: | ceph-eng-bugs, hnallurv, hyelloji, jbrier, pasik, pdonnell, tserlin, zyan |
| Doc Type: | Known Issue |
| Doc Text: | The Ceph Metadata Server might crash during scrub with multiple MDS. This issue is triggered when the `scrub_path` command is run in an environment with multiple Ceph Metadata Servers. There is no workaround at this time. |
| Story Points: | --- |
| Type: | Bug |
| Last Closed: | 2020-02-28 00:35:22 UTC |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |
| Bug Blocks: | 1629656 |
| Attachments: | Failed MDS log (attachment 1496682) |
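For readers checking whether their environment matches the known-issue condition described in the Doc Text (multiple active Ceph Metadata Servers), a minimal sketch follows. The file system name `cephfs` is an assumption; substitute the local name.

```sh
# Hedged sketch: confirm whether more than one MDS rank is active
# (the condition under which scrub_path reportedly triggers the crash).
# "cephfs" is a placeholder file system name.
ceph fs get cephfs | grep max_mds   # a value greater than 1 means multiple active ranks
ceph fs status cephfs               # shows which MDS daemons are active vs. standby
```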
Scrub does not work properly in a multi-MDS setup. I'm working on this issue.

Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri

No. I haven't finished the code.
Created attachment 1496682 [details]
Failed MDS log

Description of problem:
MDS crashed when running the scrub_path command from the admin daemon.

Command and console output:

```
[root@host083 ceph]# ceph --admin-daemon ceph-mds.magna083.asok scrub_path /kernel2/test/file_dstdir/localhost.localdomain/thrd_25
admin_socket: exception: exception: no data returned from admin socket

    0> 2018-10-23 11:34:56.051305 7fce51417700 -1 /builddir/build/BUILD/ceph-12.2.5/src/mds/CDir.cc: In function 'void CDir::fetch(MDSInternalContextBase*, boost::string_view, bool)' thread 7fce51417700 time 2018-10-23 11:34:56.047450
/builddir/build/BUILD/ceph-12.2.5/src/mds/CDir.cc: 1473: FAILED assert(is_auth())

 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x562d0af01210]
 2: (CDir::fetch(MDSInternalContextBase*, boost::basic_string_view<char, std::char_traits<char> >, bool)+0x8a8) [0x562d0adc8258]
 3: (CDir::fetch(MDSInternalContextBase*, bool)+0x30) [0x562d0adc8350]
 4: (()+0x4f5666) [0x562d0adf8666]
 5: (()+0x5003be) [0x562d0ae033be]
 6: (Continuation::_continue_function(int, int)+0x1aa) [0x562d0ae11aba]
 7: (Continuation::Callback::finish(int)+0x10) [0x562d0ae11ba0]
 8: (Context::complete(int)+0x9) [0x562d0abc8589]
 9: (MDSIOContextBase::complete(int)+0xa4) [0x562d0ae4b824]
 10: (Finisher::finisher_thread_entry()+0x198) [0x562d0af00188]
 11: (()+0x7dd5) [0x7fce5c259dd5]
 12: (clone()+0x6d) [0x7fce5b336ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2018-10-23 11:34:56.100199 7fce51417700 -1 *** Caught signal (Aborted) **
 in thread 7fce51417700 thread_name:fn_anonymous

 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (()+0x5bd6f1) [0x562d0aec06f1]
 2: (()+0xf5d0) [0x7fce5c2615d0]
 3: (gsignal()+0x37) [0x7fce5b26f207]
 4: (abort()+0x148) [0x7fce5b2708f8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x562d0af01384]
 6: (CDir::fetch(MDSInternalContextBase*, boost::basic_string_view<char, std::char_traits<char> >, bool)+0x8a8) [0x562d0adc8258]
 7: (CDir::fetch(MDSInternalContextBase*, bool)+0x30) [0x562d0adc8350]
 8: (()+0x4f5666) [0x562d0adf8666]
 9: (()+0x5003be) [0x562d0ae033be]
 10: (Continuation::_continue_function(int, int)+0x1aa) [0x562d0ae11aba]
 11: (Continuation::Callback::finish(int)+0x10) [0x562d0ae11ba0]
 12: (Context::complete(int)+0x9) [0x562d0abc8589]
 13: (MDSIOContextBase::complete(int)+0xa4) [0x562d0ae4b824]
 14: (Finisher::finisher_thread_entry()+0x198) [0x562d0af00188]
 15: (()+0x7dd5) [0x7fce5c259dd5]
 16: (clone()+0x6d) [0x7fce5b336ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
```

Version-Release number of selected component (if applicable):
ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)

How reproducible:
1/1

Steps to Reproduce:
1. Create a cluster with active-active MDS.
2. Get a directory path from the "get subtrees" command.
3. Run scrub_path on that path (see the reproduction sketch at the end of this report).

Actual results:
MDS crashed and a standby MDS became active.

Expected results:
There should not be a crash.

Additional info:
NA
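As a convenience, here is a hedged sketch of the reproduction steps above on a Luminous-era cluster. The file system name `cephfs` and the MDS daemon name `magna083` are assumptions modeled on the reporter's console output; the scrub target path is the one from the report. Depending on the release, `allow_multimds` may need to be enabled before raising `max_mds`.

```sh
# Step 1: run the file system with two active MDS daemons (active-active).
# "cephfs" is a placeholder name; on some Luminous builds allow_multimds
# must be enabled before max_mds can be raised above 1.
ceph fs set cephfs allow_multimds true
ceph fs set cephfs max_mds 2

# Step 2: list the subtrees known to one active MDS and pick a directory path.
# "mds.magna083" follows the daemon name seen in the reporter's output.
ceph daemon mds.magna083 get subtrees

# Step 3: scrub one of the returned paths through the same admin socket.
# In the reported run this call crashed the MDS with
# "FAILED assert(is_auth())" in CDir::fetch().
ceph daemon mds.magna083 scrub_path /kernel2/test/file_dstdir/localhost.localdomain/thrd_25
```

When running directly on the MDS host, the equivalent `ceph --admin-daemon <path-to-ceph-mds.<name>.asok> scrub_path <path>` form from the original report can be used instead of `ceph daemon`.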