Bug 1426508
| Summary: | Application VM paused after add brick operation and VM didn't come up after power cycle. | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Krutika Dhananjay <kdhananj> |
| Component: | sharding | Assignee: | Krutika Dhananjay <kdhananj> |
| Status: | CLOSED EOL | QA Contact: | bugs <bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.10 | CC: | amukherj, bugs, csaba, kdhananj, pasik, ravishankar, rcyriac, rgowdapp, rhs-bugs, rtalur, sabose, sasundar, srangana, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1420623 | Environment: | |
| Last Closed: | 2018-06-20 18:25:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1387878, 1427207, 1440635 | | |
Description
Krutika Dhananjay 2017-02-24 05:25:45 UTC
REVIEW: https://review.gluster.org/16747 (features/shard: Put onus of choosing the inode to resolve on individual fops) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

REVIEW: https://review.gluster.org/16748 (features/shard: Fix EIO error on add-brick) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/16747 committed in release-3.10 by Shyamsundar Ranganathan (srangana)

------
commit 0d797ff57e78dd841768b5cd03a2bc1315404d81
Author: Krutika Dhananjay <kdhananj>
Date: Wed Feb 22 14:43:46 2017 +0530

    features/shard: Put onus of choosing the inode to resolve on individual fops

    Backport of: https://review.gluster.org/16709

    ... as opposed to adding checks in "common" functions to choose the inode to
    resolve based on local->fop, which is rather ugly and prone to errors.

    Change-Id: I55ede087b6ff8e9a76276c2636410c69f567bc0f
    BUG: 1426508
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/16747
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    CentOS-regression: Gluster Build System <jenkins.org>

REVIEW: https://review.gluster.org/16748 (features/shard: Fix EIO error on add-brick) posted (#2) for review on release-3.10 by Shyamsundar Ranganathan (srangana)

COMMIT: https://review.gluster.org/16748 committed in release-3.10 by Shyamsundar Ranganathan (srangana)

------
commit d9d357c328ee84f939a88e25a44dc0c4038f1b20
Author: Krutika Dhananjay <kdhananj>
Date: Tue May 17 15:37:18 2016 +0530

    features/shard: Fix EIO error on add-brick

    Backport of: https://review.gluster.org/14419

    DHT seems to link the inode during lookup even before initializing the
    inode ctx with layout information, which comes after directory healing.

    Consider two parallel writes. As part of the first write, shard sends a
    lookup on .shard, which in its return path causes DHT to link the .shard
    inode. At this point, when the second write is wound, inode_find() on
    .shard succeeds, and as a result shard proceeds to create the participant
    shards by issuing MKNODs under .shard. Since the layout is yet to be
    initialized, the MKNOD fails in the DHT call path with EIO, leading to
    VM pauses.

    The fix involves shard maintaining a flag to denote whether a fresh
    lookup on .shard has completed one network trip. If it hasn't, every
    inode_find() in the fop path is followed by a lookup before proceeding
    with the next stage of the fop.

    Big thanks to Raghavendra G and Pranith Kumar K for the RCA and
    subsequent inputs and feedback on the patch.

    Change-Id: Ibe59f6804a9c2ec95fbeaef1dc26858f16b8fcb5
    BUG: 1426508
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/16748
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Smoke: Gluster Build System <jenkins.org>

I am moving this bug back to the ASSIGNED state, as there still seem to be issues after the provided fix(es); see http://lists.gluster.org/pipermail/gluster-users/2017-March/030323.html. Although the issues may not pertain to the shard component, this bug deals with the same/similar use case, so I am changing the status to reflect that. Once the root causes are confirmed, we can branch this bug as we see fit to provide the right component and other details.
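To make the race and the flag in the "Fix EIO error on add-brick" commit above concrete, here is a minimal, self-contained C model of the fixed resolution path. It is only a sketch: the struct and function names are hypothetical stand-ins, not the shard xlator's actual internals.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool linked;       /* inode already linked, so inode_find() succeeds    */
    bool lookup_done;  /* a fresh LOOKUP on .shard completed a network trip */
} dot_shard_state_t;   /* hypothetical name, for illustration only          */

/* Stand-in for winding a LOOKUP on .shard: gives DHT the chance to heal
 * the directory and initialize its layout before shards are created. */
static void lookup_dot_shard(dot_shard_state_t *s)
{
    printf("LOOKUP .shard -> DHT layout initialized\n");
    s->lookup_done = true;
}

/* Called in each fop path before issuing MKNODs for participant shards. */
static void resolve_dot_shard(dot_shard_state_t *s)
{
    if (!s->linked) {
        /* First fop: the lookup itself links the inode. */
        s->linked = true;
        lookup_dot_shard(s);
    } else if (!s->lookup_done) {
        /* inode_find() succeeded only because a parallel fop's lookup
         * linked the inode early; refresh before trusting it. */
        lookup_dot_shard(s);
    }
    printf("MKNOD participant shards under .shard (no EIO)\n");
}

int main(void)
{
    /* The racy state from the commit message: inode linked by the first
     * write's in-flight lookup, layout not yet initialized. */
    dot_shard_state_t s = { .linked = true, .lookup_done = false };
    resolve_dot_shard(&s); /* the second write refreshes before MKNOD */
    return 0;
}
```

The design point the sketch tries to capture is that a positive inode_find() alone is never trusted until a full LOOKUP round trip has initialized DHT's layout for .shard.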
(In reply to Shyamsundar from comment #7)
> I am moving this bug back to the ASSIGNED state, as there still seem to be
> issues after the provided fix(es); see
> http://lists.gluster.org/pipermail/gluster-users/2017-March/030323.html.
>
> Although the issues may not pertain to the shard component, this bug deals
> with the same/similar use case, so I am changing the status to reflect that.
>
> Once the root causes are confirmed, we can branch this bug as we see fit to
> provide the right component and other details.

Thanks for moving it back to ASSIGNED; it makes sense. Earlier, the problem was that the add-brick operation corrupted the VM images; now the fix-layout + rebalance operation is causing the problem.

*** Bug 1440637 has been marked as a duplicate of this bug. ***

REVIEW: https://review.gluster.org/17021 (features/shard: Fix vm corruption upon fix-layout) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

REVIEW: https://review.gluster.org/17022 (features/shard: Initialize local->fop in readv) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17022 committed in release-3.10 by Shyamsundar Ranganathan (srangana)

------
commit 1d98b9b1197ec6e2b5229d20a28ebb551ae41a14
Author: Krutika Dhananjay <kdhananj>
Date: Mon Apr 10 11:04:31 2017 +0530

    features/shard: Initialize local->fop in readv

    Backport of:
    > Change-Id: I9008ca9960df4821636501ae84f93a68f370c67f
    > BUG: 1440051
    > Reviewed-on: https://review.gluster.org/17014
    > (cherry-picked from commit a4bb716be1f27be50e44d8167300e8b078a1f862)

    Change-Id: I9008ca9960df4821636501ae84f93a68f370c67f
    BUG: 1426508
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17022
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

COMMIT: https://review.gluster.org/17021 committed in release-3.10 by Shyamsundar Ranganathan (srangana)

------
commit 6e3054b42f9aef1e35b493fbb002ec47e1ba27ce
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 6 18:10:41 2017 +0530

    features/shard: Fix vm corruption upon fix-layout

    Backport of:
    > Change-Id: I8a2e97d91ba3275fbc7174a008c7234fa5295d36
    > BUG: 1440051
    > Reviewed-on: https://review.gluster.org/17010
    > (cherry-picked from commit 99c8c0b03a3368d81756440ab48091e1f2430a5f)

    shard's writev implementation, as part of identifying the presence of
    participant shards that aren't in memory, first sends an MKNOD on these
    shards and, upon EEXIST error, looks up the shards before proceeding
    with the writes.

    The VM corruption was caused when the following happened:
    1. DHT had n subvolumes initially.
    2. Upon add-brick + fix-layout, the layout of .shard changed, although
       the existing shards under it were yet to be migrated to their new
       hashed subvolumes.
    3. During this time, there were writes on the VM falling in regions of
       the file whose corresponding shards already existed under .shard.
    4. The sharding xl sent MKNOD on these shards, now creating them in
       their new hashed subvolumes, although shard blocks for this region
       with valid data already existed.
    5. All subsequent writes were wound on these newly created copies.

    The net outcome is that neither copy of the shard had the correct data,
    which caused the affected VMs to be unbootable.

    FIX: For want of better alternatives in DHT, the fix changes shard fops
    to do a LOOKUP before the MKNOD and, upon EEXIST error, perform another
    lookup.

    Change-Id: I8a2e97d91ba3275fbc7174a008c7234fa5295d36
    BUG: 1426508
    RCA'd-by: Raghavendra Gowdappa <rgowdapp>
    Reported-by: Mahdi Adnan <mahdi.adnan>
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17021
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
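The ordering change described in the "Fix vm corruption upon fix-layout" commit above can be modeled with two subvolumes. This is a simplified sketch under invented names; it only illustrates why LOOKUP-before-MKNOD avoids creating a duplicate shard after a layout change, not how DHT actually routes fops.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define NSUBVOLS 2

static bool present[NSUBVOLS] = { true, false }; /* shard data lives on subvol 0 */
static int  hashed_subvol     = 1;               /* fix-layout re-hashed it to subvol 1 */

/* LOOKUP consults every subvolume, so it finds the pre-existing copy. */
static int lookup_shard(void)
{
    for (int i = 0; i < NSUBVOLS; i++)
        if (present[i])
            return i;
    return -ENOENT;
}

/* MKNOD is routed only to the (new) hashed subvolume. Before the fix,
 * calling this first created an empty duplicate on subvol 1, and all
 * subsequent writes landed there -- the corruption described above. */
static int mknod_shard(void)
{
    if (present[hashed_subvol])
        return -EEXIST;
    present[hashed_subvol] = true;
    return hashed_subvol;
}

/* Fixed order: LOOKUP first, MKNOD only on ENOENT, and another LOOKUP
 * if a racing fop created the shard in between (EEXIST). */
static int resolve_shard_fixed(void)
{
    int ret = lookup_shard();
    if (ret != -ENOENT)
        return ret;
    ret = mknod_shard();
    if (ret == -EEXIST)
        return lookup_shard();
    return ret;
}

int main(void)
{
    /* Writes resolve to subvol 0, the copy with valid data; no duplicate. */
    printf("writes go to subvol %d\n", resolve_shard_fixed());
    return 0;
}
```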
REVIEW: https://review.gluster.org/17119 (cluster/dht: Pass the req dict instead of NULL in dht_attr2()) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17119 committed in release-3.10 by Raghavendra Talur (rtalur)

------
commit cb961e3beb70542d8aced3e33e5902fbd2ae69ae
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 20 10:08:02 2017 +0530

    cluster/dht: Pass the req dict instead of NULL in dht_attr2()

    Backport of:
    > Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0
    > BUG: 1440051
    > Reviewed-on: https://review.gluster.org/17085
    > (cherry-picked from commit d60ca8e96bbc16b13f8f3456f30ebeb16d0d1e47)

    This bug was causing VMs to pause during rebalance. When qemu winds down
    a STAT, shard fills the trusted.glusterfs.shard.file-size attribute in
    the req dict, which DHT doesn't wind its STAT fop with upon detecting
    that the file has undergone migration. As a result, shard doesn't find
    the value for this key in the unwind path, causing it to fail the STAT
    with EINVAL.

    The same bug exists in other fops too, which is also fixed in this patch.

    Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0
    BUG: 1426508
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17119
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>

REVIEW: https://review.gluster.org/17134 (cluster/dht: Pass the correct xdata in fremovexattr fop) posted (#1) for review on release-3.10 by Krutika Dhananjay (kdhananj)

REVIEW: https://review.gluster.org/17134 (cluster/dht: Pass the correct xdata in fremovexattr fop) posted (#2) for review on release-3.10 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17134 committed in release-3.10 by Raghavendra Talur (rtalur)

------
commit 0a98c72dc0a6a00161bdc0a714e52e648b69cf24
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 27 11:53:24 2017 +0530

    cluster/dht: Pass the correct xdata in fremovexattr fop

    Backport of:
    > Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
    > BUG: 1440051
    > Reviewed-on: https://review.gluster.org/17126
    > (cherry-picked from ab88f655e6423f51e2f2fac9265ff4d4f5c3e579)

    Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
    BUG: 1426508
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17134
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Smoke: Gluster Build System <jenkins.org>

Two more patches related to this bug have been merged in the 3.10 branch. However, we have not yet established that the bug is completely fixed. As Shyam explained in comment 7 for 3.10.1, it would be good to keep this bug open and close it only after thorough testing.
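For the dht_attr2() fix above, the essential plumbing is that the request dict shard fills must travel with the re-wound STAT. The following self-contained C sketch models that contract with a deliberately simplified one-key "dict"; dict_t and these helpers are hypothetical stand-ins, not GlusterFS's real dict API.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define SHARD_SIZE_KEY "trusted.glusterfs.shard.file-size"

typedef struct {
    const char *requested_key; /* one-key "dict", for illustration only */
} dict_t;

/* Brick side: answers only the keys the caller actually asked for. */
static int brick_stat(const dict_t *req, dict_t *rsp)
{
    if (req && req->requested_key)
        rsp->requested_key = req->requested_key;
    return 0;
}

/* DHT side: winds the STAT a second time after noticing migration.
 * The bug was passing NULL here instead of the original request dict. */
static int dht_attr2(const dict_t *xdata, dict_t *rsp)
{
    return brick_stat(xdata, rsp);
}

/* Shard side: without the size key in the unwind path, the STAT fails
 * with EINVAL -- which is what made qemu pause the VM. */
static int shard_stat_cbk(const dict_t *rsp)
{
    if (!rsp->requested_key || strcmp(rsp->requested_key, SHARD_SIZE_KEY) != 0)
        return -EINVAL;
    return 0;
}

int main(void)
{
    dict_t req = { SHARD_SIZE_KEY };
    dict_t rsp = { 0 };

    dht_attr2(&req, &rsp); /* fixed: forward req; passing NULL reproduces EINVAL */
    printf("stat -> %d\n", shard_stat_cbk(&rsp)); /* prints 0 */
    return 0;
}
```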
Leaving it in the POST state and not adding it to the release 3.10.2 tracker.

Krutika, I have provided the reasons for not closing this bug with the 3.10.2 release in comment 19. If the root-cause analysis is complete and it has been established that the bug is fixed, please let me know and I will add it to the release notes.

(In reply to Raghavendra Talur from comment #20)
> Krutika, I have provided the reasons for not closing this bug with the
> 3.10.2 release in comment 19. If the root-cause analysis is complete and it
> has been established that the bug is fixed, please let me know and I will
> add it to the release notes.

Sure, I will get the users' feedback on whether the fixes worked for them before we even think of closing this bug.

-Krutika

This bug was reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained. As a result, this bug is being closed. If the bug persists on a maintained version of Gluster or against the mainline Gluster repository, please request that it be reopened and that the Version field be set appropriately.