Bug 1364551
| Summary: | GlusterFS lost track of 7,800+ file paths preventing self-heal | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Peter Portante <pportant> |
| Component: | replicate | Assignee: | Krutika Dhananjay <kdhananj> |
| Status: | CLOSED ERRATA | QA Contact: | storage-qa-internal <storage-qa-internal> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.1 | CC: | amukherj, asrivast, nchilaka, pkarampu, rabhat, rhinduja, rhs-bugs, sankarshan, smohan |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.2.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-11 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-23 05:44:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1351515 | | |
Description
Peter Portante
2016-08-05 17:00:44 UTC
We were running RHGS 3.1.2 with a set of patches from Vijay, and then upgraded to 3.1.3 while the self-heal process was in progress.

Hi Peter, based on the information so far, this appears to have happened because of a feature called optimistic changelog for directory operations, where entries are marked bad only after a failure happens, i.e. there is no pre-operation marking. So if the failure happens in such a way that we lose network connectivity to both bricks before the marking is done, then we lose track of which directory needs healing. We would need sosreports from the machines, and sample GFIDs that went into this state, to confirm the theory. If we confirm the theory, we will provide a volume set option to turn the optimistic changelog off. There is also a way to turn this off at mount time, which can be used until that patch is merged.

Pranith

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
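For reference, a volume-level toggle of the kind described above would look roughly like the following. This is a sketch, not taken from the bug report: the volume name `repvol` is hypothetical, and `cluster.optimistic-change-log` is the AFR option name as it appears in upstream glusterfs, so verify it against your installed release before applying.

```shell
# Hypothetical volume name "repvol"; the option name is assumed from
# upstream glusterfs AFR documentation -- verify it exists on your release:
#   gluster volume set help | grep -i optimistic
gluster volume set repvol cluster.optimistic-change-log off

# Confirm the setting took effect
gluster volume get repvol cluster.optimistic-change-log
```

Turning the option off trades the performance benefit of skipping pre-operation changelog marking for the guarantee that a directory needing heal is tracked even if both bricks become unreachable mid-operation.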