Bug 1440635
| Summary: | Application VMs with their disk images on sharded-replica 3 volume are unable to boot after performing rebalance | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Krutika Dhananjay <kdhananj> |
| Component: | distribute | Assignee: | Krutika Dhananjay <kdhananj> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.8 | CC: | amukherj, bugs, kdhananj, knarra, ndevos, rcyriac, rgowdapp, rhinduja, rhs-bugs, sasundar, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Reopened, Triaged |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.12 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1440051 | Environment: | |
| Last Closed: | 2017-05-29 04:58:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1426508, 1440051, 1440637 | | |
| Bug Blocks: | 1431410, 1440754 | | |
Description
Krutika Dhananjay
2017-04-10 07:26:19 UTC
REVIEW: https://review.gluster.org/17019 (features/shard: Fix vm corruption upon fix-layout) posted (#1) for review on release-3.8 by Krutika Dhananjay (kdhananj)

REVIEW: https://review.gluster.org/17020 (features/shard: Initialize local->fop in readv) posted (#1) for review on release-3.8 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17020 committed in release-3.8 by jiffin tony Thottan (jthottan)

------

commit d5d599abaa598062885abc7ad8226faf26d11e64
Author: Krutika Dhananjay <kdhananj>
Date: Mon Apr 10 11:04:31 2017 +0530

features/shard: Initialize local->fop in readv

Backport of: https://review.gluster.org/17014

Change-Id: I4d2f0a3f533009038d48579db5a8a2a048b77ca1
BUG: 1440635
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: https://review.gluster.org/17020
Smoke: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu>
CentOS-regression: Gluster Build System <jenkins.org>

REVIEW: https://review.gluster.org/17019 (features/shard: Fix vm corruption upon fix-layout) posted (#2) for review on release-3.8 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17019 committed in release-3.8 by jiffin tony Thottan (jthottan)

------

commit d71ec72b981d110199c3376f39f91b704241975c
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 6 18:10:41 2017 +0530

features/shard: Fix vm corruption upon fix-layout

Backport of: https://review.gluster.org/17010

Shard's writev implementation, as part of identifying which participant shards are not yet in memory, first sends an MKNOD on these shards and, upon an EEXIST error, looks the shards up before proceeding with the writes.

The VM corruption was caused when the following happened:
1. DHT had n subvolumes initially.
2. Upon add-brick + fix-layout, the layout of .shard changed, although the existing shards under it were yet to be migrated to their new hashed subvolumes.
3. During this time, there were writes on the VM falling in regions of the file whose corresponding shards already existed under .shard.
4. The sharding xlator sent MKNOD on these shards, creating them in their new hashed subvolumes even though shard blocks with valid data already existed for this region.
5. All subsequent writes were wound on these newly created copies.

The net outcome is that neither copy of the shard held the correct data, which left the affected VMs unbootable.

FIX: For want of better alternatives in DHT, the fix changes shard fops to do a LOOKUP before the MKNOD and, upon EEXIST error, perform another lookup.

Change-Id: I1a5d3515b42e2e5583c407d1b4aff44d7ce472eb
BUG: 1440635
RCA'd-by: Raghavendra Gowdappa <rgowdapp>
Reported-by: Mahdi Adnan <mahdi.adnan>
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: https://review.gluster.org/17019
CentOS-regression: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: jiffin tony Thottan <jthottan>
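To make the corruption mechanism above concrete, here is a minimal sketch of the lookup-before-create ordering the fix describes. It models the idea with plain POSIX calls on a local file rather than the actual shard xlator FOPs; `resolve_shard` and the demo path are illustrative assumptions, not names from the patch.

```c
/* Minimal sketch, NOT the actual shard xlator code: the
 * lookup-before-create ordering from the fix, modelled with
 * POSIX stat()/open() on a local file. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Resolve a shard block: prefer finding an existing file over
 * creating a fresh one, so a layout change cannot trick us into
 * creating a second, empty copy of a block that already holds data. */
static int resolve_shard(const char *path)
{
    struct stat st;

    /* 1. LOOKUP first: if the shard already exists anywhere, use it. */
    if (stat(path, &st) == 0)
        return 0;               /* found existing shard */
    if (errno != ENOENT)
        return -1;              /* real lookup failure */

    /* 2. Only on ENOENT attempt the create (MKNOD in gluster). */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd >= 0) {
        close(fd);
        return 0;               /* created fresh shard */
    }

    /* 3. EEXIST here means someone raced us into creating it;
     *    look the shard up again instead of treating it as fatal. */
    if (errno == EEXIST)
        return stat(path, &st);

    return -1;
}

int main(void)
{
    if (resolve_shard("/tmp/demo-shard.1") == 0)
        puts("shard resolved");
    return 0;
}
```

The essential design point is that an existing shard always wins over creating a new one, so a changed layout can no longer produce a second, empty copy of a block that already holds valid data.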
Fix is not yet complete, as there are still issues around this use case. Moving the bug status back to POST.

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.11, please open a new bug report.

glusterfs-3.8.11 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/packaging/2017-April/000289.html
[2] https://www.gluster.org/pipermail/gluster-users/

I'm moving this bug back to the ASSIGNED state, as Satheesaran is seeing a VM pause issue post rebalance. On first look at the logs, it seems DHT is looking up the VM image in the wrong subvolume, leading to fop failure with ENOENT, which qemu acts on by pausing the VM.

-Krutika

REVIEW: https://review.gluster.org/17121 (cluster/dht: Pass the req dict instead of NULL in dht_attr2()) posted (#1) for review on release-3.8 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17121 committed in release-3.8 by Niels de Vos (ndevos)

------

commit ba17362ea9eb642614a69c4f8a6ea2c2648cb5d8
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 20 10:08:02 2017 +0530

cluster/dht: Pass the req dict instead of NULL in dht_attr2()

Backport of: https://review.gluster.org/17085

This bug was causing VMs to pause during rebalance. When qemu winds down a STAT, shard fills the trusted.glusterfs.shard.file-size attribute in the req dict; but upon detecting that the file has undergone migration, DHT winds its follow-up STAT fop without this dict. As a result, shard doesn't find the value for this key in the unwind path and fails the STAT with EINVAL. The same bug exists in other fops too; those are also fixed in this patch. (A minimal sketch of this bug class appears at the end of this report.)

Change-Id: I56273b1a65347dabd38bc6bdd12d618f68287a00
BUG: 1440635
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: https://review.gluster.org/17121
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Raghavendra G <rgowdapp>
CentOS-regression: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>

REVIEW: https://review.gluster.org/17148 (cluster/dht: Pass the correct xdata in fremovexattr fop) posted (#1) for review on release-3.8 by Krutika Dhananjay (kdhananj)

REVIEW: https://review.gluster.org/17148 (cluster/dht: Pass the correct xdata in fremovexattr fop) posted (#2) for review on release-3.8 by Krutika Dhananjay (kdhananj)

COMMIT: https://review.gluster.org/17148 committed in release-3.8 by Niels de Vos (ndevos)

------

commit 5dbe4fa649b8c486b2abdba660a53f7ae1198ef0
Author: Krutika Dhananjay <kdhananj>
Date: Thu Apr 27 11:53:24 2017 +0530

cluster/dht: Pass the correct xdata in fremovexattr fop

Backport of: https://review.gluster.org/17126

Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
BUG: 1440635
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: https://review.gluster.org/17148
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Niels de Vos <ndevos>

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.12, please open a new bug report.

glusterfs-3.8.12 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-May/000072.html
[2] https://www.gluster.org/pipermail/gluster-users/
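As referenced in the dht_attr2() commit above, here is a minimal, self-contained sketch of that bug class: when a fop is retried on the file's new location after migration, the caller's original request dict (xdata) must be forwarded rather than NULL, or upper translators never get their requested keys back. All types and function names below are illustrative assumptions, not the actual DHT code.

```c
/* Minimal sketch, not the actual DHT code: why passing NULL
 * instead of the original request dict on a post-migration retry
 * loses keys that upper translators (e.g. shard) asked for. */
#include <stdio.h>

typedef struct { const char *requested_key; } dict_t;

/* Server side: fills only the values the caller asked for via xdata. */
static const char *stat_on_subvol(const dict_t *xdata)
{
    if (xdata && xdata->requested_key)
        return xdata->requested_key;  /* e.g. shard's file-size key */
    return NULL;                      /* nothing requested, nothing filled */
}

/* Retry after detecting migration. The buggy version did the
 * equivalent of stat_on_subvol(NULL); the fix forwards the dict. */
static const char *stat_retry(const dict_t *orig_xdata)
{
    return stat_on_subvol(orig_xdata);  /* fixed: req dict, not NULL */
}

int main(void)
{
    dict_t req = { "trusted.glusterfs.shard.file-size" };
    const char *filled = stat_retry(&req);
    /* Without the fix, filled would be NULL: shard would then fail
     * the STAT with EINVAL, and qemu would pause the VM. */
    printf("filled key: %s\n", filled ? filled : "(missing)");
    return 0;
}
```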