Bug 1516313
Summary: | Bringing down data bricks in cyclic order results in arbiter brick becoming the source for heal. | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Karthik U S <ksubrahm> |
Component: | arbiter | Assignee: | Karthik U S <ksubrahm> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | high | | |
Version: | 3.13 | CC: | amukherj, asrivast, bugs, knarra, nchilaka, ravishankar, rcyriac, rhinduja, rhs-bugs, sabose, sasundar, storage-qa-internal |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | glusterfs-3.13.2 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1482064 | Environment: | |
Last Closed: | 2018-01-23 21:37:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1482064, 1566131 | | |
Bug Blocks: | 1401969, 1530334 | | |
Comment 1
Worker Ant
2017-11-22 13:05:02 UTC
COMMIT: https://review.gluster.org/18843 committed in release-3.13 by "Karthik U S" <ksubrahm> with a commit message- cluster/afr: Fix for arbiter becoming source

Problem:
When eager-lock is on and two writes happen in parallel on an FD, we were observing the following behaviour:
- First write fails on one data brick
- Since the post-op has not yet happened, the inode refresh will get both the data bricks as readable and set it in the inode context
- In flight split brain check sees both the data bricks as readable and allows the second write
- Second write fails on the other data brick
- Now the post-op happens and marks both the data bricks as bad, and the arbiter will become source for healing

Fix:
Adding one more variable called write_subvol in the inode context, which holds the in-memory representation of the writable subvols. Inode refresh will not update this value, and its lifetime is pre-op through unlock in the afr transaction. Initially the pre-op will set this value to be the same as read_subvol in the inode context, and then the in flight split brain check will use this value instead of read_subvol. After all the checks we will update write_subvol and set read_subvol to the same value, to avoid leaving an incorrect value in read_subvol.

Change-Id: I2ef6904524ab91af861d59690974bbc529ab1af3
BUG: 1516313
Signed-off-by: karthik-us <ksubrahm>
(cherry picked from commit 19f9bcff4aada589d4321356c2670ed283f02c03)

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailinglists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.
[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/

Found some issues in the patch which can again lead to the same problem of arbiter becoming source. Changing the state back to Post.

REVIEW: https://review.gluster.org/19192 (cluster/afr: Fixing the flaws in arbiter becoming source patch) posted (#1) for review on release-3.13 by Karthik U S

COMMIT: https://review.gluster.org/19192 committed in release-3.13 by "Karthik U S" <ksubrahm> with a commit message- cluster/afr: Fixing the flaws in arbiter becoming source patch

Problem:
Setting the write_subvol value to read_subvol in case of metadata transaction during pre-op (commit 19f9bcff4aada589d4321356c2670ed283f02c03) might lead to the original problem of arbiter becoming source.

Scenario:
1) All bricks are up and good
2) 2 writes w1 and w2 are in progress in parallel
3) ctx->read_subvol is good for all the subvolumes
4) w1 succeeds on brick0 and fails on brick1, yet to do post-op on the disk
5) read/lookup comes on the same file and refreshes read_subvols back to all good
6) metadata transaction happens, which makes ctx->write_subvol be assigned with ctx->read_subvol, which is all good
7) w2 succeeds on brick1 and fails on brick0, and this will update the brick in reverse order, leading to arbiter becoming source

Fix:
Instead of setting ctx->write_subvol to ctx->read_subvol in the pre-op stage if there is a metadata transaction, check in the function __afr_set_in_flight_sb_status() whether it is a data or metadata transaction. Use the value of ctx->write_subvol if it is a data transaction and the value of ctx->read_subvol for other transactions. With this patch we assign the value of ctx->write_subvol in afr_transaction_perform_fop() with the on disk value, instead of assigning it in afr_changelog_pre_op() with the in memory value.
Change-Id: Id2025a7e965f0578af35b1abaac793b019c43cc4
BUG: 1516313
Signed-off-by: karthik-us <ksubrahm>
(cherry picked from commit ba149bac92d169ae2256dbc75202dc9e5d06538e)

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.2, please open a new bug report.

glusterfs-3.13.2 has been announced on the Gluster mailinglists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-January/000089.html
[2] https://www.gluster.org/pipermail/gluster-users/
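
For readers following the commit messages above, the race can be modelled in a few lines. This is only a toy sketch, not AFR code: the names `InodeCtx`, `inode_refresh`, `finish_write` and `simulate` are hypothetical, the real check lives in __afr_set_in_flight_sb_status(), and the sketch assumes a replica with two data bricks (0 and 1) plus an arbiter, where a write is failed in flight if accepting it would leave no good data brick.

```python
# Toy model of the in flight split brain check (assumption: heavily
# simplified; none of these names exist in GlusterFS).
DATA_BRICKS = frozenset({0, 1})  # brick 2 would be the arbiter (holds no data)


class InodeCtx:
    """In-memory view of which data bricks are considered good."""

    def __init__(self):
        self.read_subvol = set(DATA_BRICKS)   # overwritten by inode refresh
        self.write_subvol = set(DATA_BRICKS)  # NOT touched by inode refresh


def inode_refresh(ctx):
    # The post-op has not reached disk yet, so a lookup/read still sees
    # every data brick as good and resets read_subvol accordingly.
    ctx.read_subvol = set(DATA_BRICKS)


def finish_write(ctx, failed, use_write_subvol):
    """Return True if the write is accepted, False if it is failed in flight."""
    current = ctx.write_subvol if use_write_subvol else ctx.read_subvol
    remaining = current - failed
    if not remaining:
        # Accepting this write would leave no good data brick: fail it.
        return False
    ctx.write_subvol = set(remaining)
    ctx.read_subvol = set(remaining)
    return True


def simulate(use_write_subvol):
    """Replay w1 (fails on brick1), a refresh, then w2 (fails on brick0)."""
    ctx = InodeCtx()
    blamed = set()  # data bricks the post-op would mark as bad
    if finish_write(ctx, {1}, use_write_subvol):
        blamed |= {1}
    inode_refresh(ctx)
    if finish_write(ctx, {0}, use_write_subvol):
        blamed |= {0}
    return blamed


# Checking against read_subvol: both data bricks end up blamed, so only
# the arbiter is left as a heal source.
print(simulate(use_write_subvol=False))
# Checking against write_subvol: w2 is rejected and brick0 stays good.
print(simulate(use_write_subvol=True))
```

In this model the only difference between the two runs is which set the check consults. Because the refresh cannot reset `write_subvol`, the second write is failed instead of being allowed to blame the last good data brick.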