Bug 1745026
Summary: | endless heal gluster volume; incrementing number of files to heal when all peers in volume are up | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | tvanberlo <tvanberlo> |
Component: | fuse | Assignee: | bugs <bugs> |
Status: | CLOSED NOTABUG | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.1 | CC: | bugs |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-09-06 14:05:59 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
tvanberlo@vangenechten.com
2019-08-23 13:52:36 UTC
What I forgot to mention, and the reason I opened this as a bug report against gluster fuse: when this happens, we ran a test on a gluster mount to see where files were written, and the data (metadata in the case of the arbiter) was written on only 2 of the 3 members. The file was not found on the third member of the gluster cluster. When we remounted our gluster mounts (as described in the opening post), the healing finished and no further files were added to the 'files to heal' list.

Yesterday a gluster member node was rebooted, and today I repeated the test to see where the data is written. Now all data is written to all members, but the heal is still in progress. How long does it take for a gluster mount to notice that a member has come back, and to resume writing to a node that was down?

The cluster has been reinstalled with the latest version (oVirt + HCI). Now the heal is fast and reliable. I won't be able to provide more info.

Apparently the issue reappeared when I rebooted a second node (the arbiter), so the issue is also present on a freshly installed system.

I tracked the issue to the firewall. When the firewall is disabled, the gluster volume heals quickly, but when the firewall is enabled the heal never finishes. Disabling the firewall causes other issues in oVirt, so this is not advisable (all VMs went offline and moved to one hypervisor, and migration could not be done).

I'm closing this bug report, because it seems to be an oVirt-related issue.
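
For anyone who wants to check whether the heal backlog is really growing, here is a minimal Python sketch that parses text in the style printed by `gluster volume heal <VOLNAME> info` and compares successive samples. The sample text, the brick names and the helper names `pending_heals` and `is_growing` are hypothetical and only for illustration; the parser assumes each brick section starts with a `Brick ` line and ends with a `Number of entries:` line, and it treats a non-numeric count (for example on an unreachable brick) as unknown.

```python
# Hypothetical sample in the style of `gluster volume heal <VOLNAME> info`.
# Brick names and file paths are made up for illustration.
SAMPLE = """\
Brick node1:/bricks/data
/images/disk1
/images/disk2
Status: Connected
Number of entries: 2

Brick node2:/bricks/data
Status: Connected
Number of entries: 0

Brick node3:/bricks/data
Status: Transport endpoint is not connected
Number of entries: -
"""


def pending_heals(heal_info_text):
    """Return {brick: pending count, or None if the count is not a number}."""
    result = {}
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Remember which brick the following lines belong to.
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            result[brick] = int(value) if value.isdigit() else None
    return result


def is_growing(totals):
    """True if the total pending count strictly increases sample after sample."""
    return len(totals) > 1 and all(b > a for a, b in zip(totals, totals[1:]))


if __name__ == "__main__":
    counts = pending_heals(SAMPLE)
    print(counts)
    # Sum only the bricks that reported a number.
    print(sum(c for c in counts.values() if c is not None))
```

Collecting the total at intervals and passing the list to `is_growing` gives a rough signal for the "incrementing number of files to heal" symptom; a brick whose count comes back as unknown is a hint to check connectivity (for example firewall rules) between the nodes before looking further.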