Bug 1485783
| Summary: | [CephFS] Standby-Replay daemon is hanging in "resolve" state while trying to take over rank | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Ramakrishnan Periyasamy <rperiyas> |
| Component: | CephFS | Assignee: | Patrick Donnelly <pdonnell> |
| Status: | CLOSED ERRATA | QA Contact: | Ramakrishnan Periyasamy <rperiyas> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.0 | CC: | ceph-eng-bugs, hnallurv, icolle, john.spray, kdreyer, rperiyas, zyan |
| Target Milestone: | rc | | |
| Target Release: | 3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | RHEL: ceph-12.2.1-1.el7cp Ubuntu: ceph_12.2.1-2redhat1xenial | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-05 23:41:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Ramakrishnan Periyasamy
2017-08-28 05:49:57 UTC
Since you have two active MDS daemons and one standby-replay, the standby-replay daemon will arbitrarily pick one of the active daemons to follow. If the other active daemon is killed, the standby-replay will not replace it.

John, the standby-replay daemon is configured for the rank 1 MDS. When the existing MDS node goes for a reboot, the standby-replay MDS does not replace the failed MDS: it goes replay --> resolve but never reaches up:active, and the fs stays degraded. Please check this pastebin link: http://pastebin.test.redhat.com/511697

Ramakrishnan, do you have logs for the MDS daemons?

The reason is that there were about 8k subtrees in directory /. The MDS calls MDCache::try_subtree_merge("root dirfrag") while processing the resolve message, and try_subtree_merge() calls MDCache::try_subtree_merge_at() for each subtree. try_subtree_merge_at() calls MDCache::show_subtrees(15) just before it returns, and each show_subtrees() call dumps the whole subtree map, so with about 8k subtrees each call prints on the order of 8k lines. As a result, MDCache::try_subtree_merge("root dirfrag") printed roughly 8k x 8k, about 64M, lines of messages (when debug_mds >= 15), and printing them took several minutes.
This issue happens only when there are lots of subtrees and debug_mds >= 10.
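For illustration, the slowdown is roughly quadratic in the number of subtrees: each try_subtree_merge_at() call ends by dumping the entire subtree map. The sketch below mimics that pattern with simplified stand-ins; the Subtree struct and the one-line-per-subtree output are assumptions for illustration, not the actual MDCache code.

```cpp
#include <cstdio>
#include <vector>

// Simplified stand-in for one MDCache subtree entry.
struct Subtree { int id = 0; };

// Mimics MDCache::show_subtrees(15): dumps the whole subtree map,
// i.e. roughly one line per existing subtree.
void show_subtrees(const std::vector<Subtree>& subtrees) {
    for (const auto& st : subtrees)
        std::printf("subtree %d\n", st.id);
}

// Mimics MDCache::try_subtree_merge_at(): does its merge work, then
// dumps the full subtree map just before returning (at debug_mds >= 15).
void try_subtree_merge_at(const Subtree& st, const std::vector<Subtree>& subtrees) {
    (void)st;                 // merge logic elided in this sketch
    show_subtrees(subtrees);
}

// Mimics MDCache::try_subtree_merge("root dirfrag") during resolve:
// called once per subtree, so the total output is ~N * N lines.
void try_subtree_merge(const std::vector<Subtree>& subtrees) {
    for (const auto& st : subtrees)
        try_subtree_merge_at(st, subtrees);
}

int main() {
    std::vector<Subtree> subtrees(8000);   // ~8k subtrees under /
    for (int i = 0; i < (int)subtrees.size(); ++i)
        subtrees[i].id = i;
    try_subtree_merge(subtrees);           // ~8000 * 8000 = 64M log lines
    return 0;
}
```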
Did you use 'ceph.dir.pin', or were these subtrees created automatically by the balancer?
FYI: please set debug_mds to 10 during MDS QE tests. 'debug_mds == 20' is too verbose; it significantly slows down the MDS.
Opened upstream ticket: http://tracker.ceph.com/issues/21221

Yes, I've used "ceph.dir.pin". There are a total of 40k directories pinned across the 2 active MDSs (i.e. 20k pins on each MDS).

Please don't use ceph.dir.pin this way. It's better to create dir0 and dir1, set ceph.dir.pin on dir0 and dir1, and then create lots of sub-directories inside dir0 and dir1 (see the sketch at the end of this report).

Moving this bz to verified state. Verified in ceph version 12.2.1-10.el7cp (5ba1c3fa606d7bf16f72756b0026f04a40297673) luminous (stable).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387
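To make the pinning recommendation above concrete, here is a minimal sketch of the suggested layout: two top-level directories, each pinned to one MDS rank, with the many test sub-directories created underneath them instead of being pinned individually. The /mnt/cephfs mount point, the dir0/dir1/subdir_N names, the pin_dir() helper, and the 20k-per-rank count are assumptions for illustration; setting the ceph.dir.pin extended attribute is the documented way to pin a directory to a rank.

```cpp
#include <cstdio>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/xattr.h>

// Pin a CephFS directory to an MDS rank by writing the ceph.dir.pin
// virtual extended attribute (the value is the rank as a decimal string).
static int pin_dir(const std::string& path, int rank) {
    const std::string value = std::to_string(rank);
    return setxattr(path.c_str(), "ceph.dir.pin",
                    value.c_str(), value.size(), 0);
}

int main() {
    const std::string base = "/mnt/cephfs";   // assumed kernel-client mount point

    // One top-level directory per active rank, pinned once.
    for (int rank = 0; rank < 2; ++rank) {
        const std::string dir = base + "/dir" + std::to_string(rank);
        mkdir(dir.c_str(), 0755);
        if (pin_dir(dir, rank) != 0)
            std::perror("setxattr ceph.dir.pin");

        // The many test sub-directories live under the pinned parent and
        // fall under its pin, rather than carrying 40k individual pins.
        for (int i = 0; i < 20000; ++i) {
            const std::string sub = dir + "/subdir_" + std::to_string(i);
            mkdir(sub.c_str(), 0755);
        }
    }
    return 0;
}
```

From a shell, the same pin can be set with setfattr -n ceph.dir.pin -v <rank> <dir> on the mounted filesystem.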