Bug 1196433
| Summary: | [RFE] [HC] entry into maintenance mode should consider whether self-heal is ongoing | ||
|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Paul Cuzner <pcuzner> |
| Component: | RFEs | Assignee: | Ramesh N <rnachimu> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | RamaKasturi <knarra> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | --- | CC: | bmcclain, bugs, eheftman, gklein, lsurette, pkarampu, rbalakri, rnachimu, sabose, srevivo, ykaul |
| Target Milestone: | ovirt-4.1.0-beta | Keywords: | FutureFeature, Improvement |
| Target Release: | 4.1.0.2 | Flags: | rule-engine: ovirt-4.1+, bmcclain: planning_ack+, sabose: devel_ack+, rule-engine: testing_ack+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ovirt-engine-4.1.0-0.4 | Doc Type: | Enhancement |
| Doc Text: | Previously, in GlusterFS, if a node went down and then returned, GlusterFS would automatically initiate a self-heal operation. During this operation, which could be time-consuming, a subsequent maintenance mode action within the same GlusterFS replica set could result in a split-brain scenario. In this release, if a Gluster host is performing a self-heal activity, administrators will not be able to move it into maintenance mode. In extreme cases, administrators can use the force option to forcefully move a host into maintenance mode. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-03-27 11:07:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Gluster | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1205641 | ||
| Bug Blocks: | 1177771, 1277939, 1415664 | ||
Description
Paul Cuzner
2015-02-26 00:16:19 UTC
Is there a way to know when self-heal is going on for a volume? Using "gluster status all tasks", we know when rebalance/remove-brick is going on. Is there something similar for self-heal?

Added Pranith's reply: "I think gluster volume heal statistics command tells whether self-heal is going on or not. But once every 10 minutes it will show in-progress."

Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

*** Bug 1196438 has been marked as a duplicate of this bug. ***

Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

oVirt 4.0 beta has been released, moving to RC milestone.

Hi Ramesh,
Will this feature affect the UI in any way? Does this change need to be described in the Administration Guide, possibly in the Gluster chapter:
https://access.redhat.com/documentation/en/red-hat-virtualization/4.0/single/administration-guide/#sect-Cluster_Utilization

(In reply to emma heftman from comment #8)
> Hi Ramesh
> Will this feature affect the UI in any way? Does this change need to be
> described in the Administration Guide, possibly in the Gluster chapter:
>
> https://access.redhat.com/documentation/en/red-hat-virtualization/4.0/single/administration-guide/#sect-Cluster_Utilization

Yes, it does affect the UI. You will see the following new options in the host maintenance dialog box. They are shown only when a host supports Gluster services.

1. Ignore Gluster Quorum and Self-Heal validations
By default, oVirt/RHEV-M checks that Gluster quorum is not lost when you move the host to maintenance. It also checks that there is no self-heal activity that would be affected by moving the host to maintenance. The user can skip these checks by selecting this option. This should be used only in rare situations when there is no other way to do maintenance activity on the node.

2. Stop Gluster service
This option can be used if the user wants to stop all Gluster services while moving the host to maintenance.

Verified and works fine with build Red Hat Virtualization Manager Version: 4.1.1.2-0.1.el7

oVirt does not allow the host to be moved to maintenance if there are any unsynced entries present in the brick. It throws the following error: "Error while executing action: Cannot switch the following Host(s) to Maintenance mode: host_name. Unsynced entries present in following gluster bricks: [<gluster_ip>:/gluster_bricks/data/data, <gluster_ip>:/gluster_bricks/engine/engine, <gluster_ip>:/gluster_bricks/vmstore/vmstore]."

When one of the bricks in the volume is down and the user tries to move another node to maintenance while stopping glusterd services, oVirt displays the error: "Error while executing action: Cannot switch the following Host(s) to Maintenance mode: <hostname>.Gluster quorum will be lost for the following Volumes: data,vmstore,engine."

When one of the nodes is already in maintenance with glusterd services stopped, oVirt does not allow you to move another node into maintenance, since quorum would be lost for the volumes.

oVirt allows the user to move more than one node to maintenance without stopping glusterd services, as all the bricks stay up on the nodes and quorum for the volumes is not lost in this case.

oVirt allows the user to move a node to maintenance even though self-heal is going on if the user skips the quorum and self-heal validations by checking "Ignore quorum and self-heal validations".
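The verified behaviour above can be summarised in a small sketch. This is not the ovirt-engine implementation; it is a hypothetical Python model of the checks as described in the comments. The names `Brick`, `Volume`, `maintenance_blockers`, `stop_gluster` and `ignore_checks` are made up for illustration, quorum is assumed to be a simple majority of bricks, and the self-heal check is assumed to look only at bricks on the host being moved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Brick:
    host: str
    path: str
    up: bool = True
    # Hypothetical field: pending heal entries reported for this brick.
    unsynced_entries: int = 0


@dataclass
class Volume:
    name: str
    bricks: List[Brick]

    def quorum_count(self) -> int:
        # Assumption: quorum is a simple majority of the volume's bricks.
        return len(self.bricks) // 2 + 1


def maintenance_blockers(host: str, volumes: List[Volume],
                         stop_gluster: bool = False,
                         ignore_checks: bool = False) -> List[str]:
    """Return reasons that block maintenance; an empty list means allowed."""
    # "Ignore Gluster Quorum and Self-Heal validations" skips both checks.
    if ignore_checks:
        return []

    blockers = []

    # Self-heal check (assumed to cover only bricks on the host being moved).
    unsynced = [f"{b.host}:{b.path}"
                for v in volumes for b in v.bricks
                if b.host == host and b.unsynced_entries > 0]
    if unsynced:
        blockers.append("Unsynced entries present in bricks: "
                        + ", ".join(unsynced))

    # Quorum check: bricks on the host only go down if Gluster is stopped.
    if stop_gluster:
        lost = []
        for v in volumes:
            if not any(b.host == host for b in v.bricks):
                continue  # this host holds no brick of the volume
            remaining = sum(1 for b in v.bricks if b.up and b.host != host)
            if remaining < v.quorum_count():
                lost.append(v.name)
        if lost:
            blockers.append("Gluster quorum will be lost for volumes: "
                            + ", ".join(lost))

    return blockers
```

For a three-brick volume with one brick already down, `maintenance_blockers("h1", volumes, stop_gluster=True)` reports a quorum blocker under these assumptions, while the same call with `stop_gluster=False` or `ignore_checks=True` returns an empty list.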