Bug 1554593
Summary: | Kernel and fuse client mount hangs forever | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | liuwei <wliu> |
Component: | CephFS | Assignee: | Yan, Zheng <zyan> |
Status: | CLOSED ERRATA | QA Contact: | Ramakrishnan Periyasamy <rperiyas> |
Severity: | high | Docs Contact: | Aron Gunn <agunn> |
Priority: | high | ||
Version: | 3.0 | CC: | agunn, bhubbard, ceph-eng-bugs, ceph-qe-bugs, dzafman, jbiao, john.spray, jquinn, kchai, kdreyer, linuxkidd, pdonnell, rperiyas, tchandra, tserlin, vumrao, wliu, zyan |
Target Milestone: | z3 | ||
Target Release: | 3.0 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Fixed In Version: | RHEL: ceph-12.2.4-10.el7cp Ubuntu: ceph_12.2.4-14redhat1xenial | Doc Type: | Bug Fix |
Doc Text: |
.Sending large amounts of metadata to another MDS will not cause an exporting mount to fail
Previously, sending large amounts of metadata to another MDS would cause client mounts to fail, resulting in the failure of heartbeat beacons sent to the Ceph Monitors. The exporting MDS would be marked by the Ceph Monitors as laggy/unavailable and then be removed, allowing the standby MDS to take over. In this release, the MDS limits the time it spends exporting metadata, allowing mounts to be processed promptly.
|
Last Closed: | 2018-05-15 18:20:29 UTC | Type: | Bug |
Description
liuwei
2018-03-13 00:53:05 UTC
Search the logs for when the first pg went to state "unknown" and compare that to entries in the other logs (cluster, mon, etc.) around that time. Something major happened for all pgs to go to state "unknown", and we should have logged something about it.

pgs: 100.000% pgs unknown
     2176 unknown

The first issue (not able to mount cephfs) can be related to the second issue. To mount a client, the mds needs to record session information in the object store. The second issue prevents the osds from handling any request, so the mount hangs.

(In reply to Yan, Zheng from comment #19)
> The first issue (not able to mount cephfs) can be related to the second
> issue. To mount a client, the mds needs to record session information in
> the object store. The second issue prevents the osds from handling any
> request, so the mount hangs.

Hi Yan,

The timeline we have says the first issue (not able to mount cephfs) was seen well before the second issue (MGR cannot communicate with OSDs).

"1. The original issue, where the cephFS was unmountable / intermittent mounts even when the ceph status showed OK, i.e. all GOOD."

Yes, it looks like a balancer issue. It exports too much at a time.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1563
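
The fix summarized in the Doc Text (the MDS limits the time it spends exporting metadata so that mounts are processed promptly) can be pictured with a small, generic sketch. This is not the Ceph MDS code; the function name `drain_with_budget`, the `budget` value, and the `cost` hook are all hypothetical and exist only to illustrate the idea of a time-limited work slice that alternates with pending client requests.

```python
import time
from collections import deque


def drain_with_budget(export_queue, request_queue, budget,
                      clock=time.monotonic, cost=None):
    """Alternate time-limited export slices with pending client requests.

    Illustrative only: every name here is hypothetical, not a Ceph API.
    Returns the order in which items were handled.
    """
    handled = []
    while export_queue or request_queue:
        deadline = clock() + budget
        # Export at least one item, then stop once the slice budget is spent.
        while export_queue:
            item = export_queue.popleft()
            if cost is not None:
                cost(item)  # hypothetical hook: stands in for real export work
            handled.append(("export", item))
            if clock() >= deadline:
                break
        # Serve whatever is waiting (for example mounts) before exporting more.
        while request_queue:
            handled.append(("request", request_queue.popleft()))
    return handled
```

Without the deadline check, a long export queue would be drained completely before any waiting request is served, which mirrors the "exports too much at a time" behaviour described in the comments above.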