Bug 1410425
Summary: | [GNFS+EC] Cthon failures/issues with Lock/Special Test cases on disperse volume with GNFS mount | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Pranith Kumar K <pkarampu>
Component: | disperse | Assignee: | Pranith Kumar K <pkarampu>
Status: | CLOSED CURRENTRELEASE | QA Contact: |
Severity: | unspecified | Docs Contact: |
Priority: | urgent | |
Version: | mainline | CC: | amukherj, aspandey, bugs, jahernan, jthottan, kkeithle, msaini, nchilaka, ndevos, pkarampu, rcyriac, rhinduja, rhs-bugs, sarumuga, skoduri, storage-qa-internal
Target Milestone: | --- | Keywords: | Triaged
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | glusterfs-3.11.0 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1408705 | Environment: |
Last Closed: | 2017-05-30 18:38:06 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1408705 | |
Bug Blocks: | | |
Comment 1
Pranith Kumar K
2017-01-05 13:17:30 UTC
The problem is that the locked range is treated exactly as it's done in inodelk (i.e. the range is aligned to multiples of 512). However, in this case it doesn't make sense to do that transformation, because the range itself doesn't have any meaning for ec (if a later write tries to modify any region of the file, the proper inodelk will be taken).

I think we should remove the transformation and simply pass the input values to the lower subvolumes.

(In reply to Xavier Hernandez from comment #2)
> The problem is that the locked range is treated exactly as it's done in
> inodelk (i.e. the range is aligned to multiples of 512). However in this
> case it doesn't make sense to do that transformation because the range
> itself doesn't have any meaning for ec (if a later write tries to modify any
> region of the file, the proper inodelk will be taken).
>
> I think we should remove the transformation and simply pass the input values
> to the lower subvolumes.

Good that we are on the same page on this. I have the patch ready; I cloned this bug to send that patch :-). I wonder what happens for mandatory locking, where a write will be rejected if there is a lock on a region. Will post the patch as soon as I am done finding the answer.

Mandatory locks will need some additional work.

If mandatory locks are handled by the features/locks xlator, it will need some additional info for each write to know the real offset/length of that write. Otherwise I don't see a way to allow fine-grained mandatory lock support for ec.

(In reply to Xavier Hernandez from comment #4)
> Mandatory locks will need some additional work.
>
> If mandatory locks are handled by the features/locks xlator, it will need
> some additional info for each write to know the real offset/length of each
> write. Otherwise I don't see a way to allow a fine grained mandatory lock
> support for ec.

Yay! I was thinking of passing that in xdata as well. Okay, for now let's send this off; we can work on mandatory locking as part of another bug. The only other bug I saw is doing dispatch_all for nonblocking locks. Posix locks does lock merging, so when we lock and then unlock, it may truncate the lock range.

i.e. if we already have a lock from 0-10 and then take a lock for 5-15 with the same fd/owner, it becomes a single lock 0-15. Now, if we unlock 5-15 because some other node hit EAGAIN due to a parallel conflicting lock on, say, range 11-12, the resulting unlock of 5-15 leaves only 0-5 locked.

In afr the lock is always wound incrementally, one node after the other.

(In reply to Pranith Kumar K from comment #5)
> Yay! I was thinking of passing that in xdata as well. Okay for now let's
> send this off, we can work on that as part of another bug.

Yes, that can be done in another bug.

> Only other bug I
> saw is doing dispatch_all for nonblocking locks. Posix locks does this lock
> merging etc right, so when we lock and unlock it may truncate the lock range.
>
> i.e. If we already have a lock from 0-10 and then we do a lock for 5-15 by
> same fd/owner it will become a single lock 0-15. Now if we do unlock from
> 5-15 because some other node hit EAGAIN because a parallel conflicting lock
> on say range 11-12. Then the resulting unlock on 5-15 will lead to 0-5
>
> In afr it is always wound incrementally one node after the other.

It seems OK to me to do incremental locking for the lk fop.
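The merge-and-truncate behaviour described above is standard POSIX record-lock semantics, so it can be reproduced entirely outside Gluster. Below is a minimal sketch under stated assumptions: Linux only (it dumps /proc/locks to show the kernel's view), and /tmp/lktest is an arbitrary scratch file, not anything from the gluster tree.

```c
/* Demonstrates POSIX record-lock merging and splitting:
 * lock 0-10, lock 5-15 (same fd/owner) -> one merged lock 0-15;
 * unlock 5-15 -> only bytes 0-4 remain locked. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

static void set_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = start, .l_len = len };
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        perror("fcntl");
        exit(EXIT_FAILURE);
    }
}

static void show_locks(const char *tag)
{
    printf("--- %s ---\n", tag);
    /* Crude but effective: dump the kernel's POSIX lock table. */
    (void)system("grep POSIX /proc/locks");
}

int main(void)
{
    int fd = open("/tmp/lktest", O_RDWR | O_CREAT, 0644);
    if (fd == -1) { perror("open"); return 1; }

    set_lock(fd, F_WRLCK, 0, 11);  /* bytes 0-10 */
    set_lock(fd, F_WRLCK, 5, 11);  /* bytes 5-15: merged with the first */
    show_locks("after two overlapping locks (one region, 0-15)");

    set_lock(fd, F_UNLCK, 5, 11);  /* unlock 5-15 */
    show_locks("after unlocking 5-15 (only 0-4 survives)");

    close(fd);
    return 0;
}
```

This is exactly why a rollback unlock, issued after one brick returns EAGAIN, can silently shrink the range on bricks that had already granted the lock, and it is the motivation for winding the lk fop incrementally, as the comment suggests.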
REVIEW: http://review.gluster.org/16445 (cluster/ec: Fix cthon failures observed with ec volumes) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

REVIEW: https://review.gluster.org/16445 (cluster/ec: Fix cthon failures observed with ec volumes) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

COMMIT: https://review.gluster.org/16445 committed in master by Pranith Kumar Karampuri (pkarampu)

------

commit f2406fa6155267fa747d9342092ee7709a2531a9
Author: Pranith Kumar K <pkarampu>
Date:   Fri Jan 27 16:17:49 2017 +0530

    cluster/ec: Fix cthon failures observed with ec volumes

    Since EC already winds one write after other there is no need
    to align application fcntl locks with ec blocks. Also added this
    locking to be done as a transaction to prevent partial
    upgrade/downgrade of locks happening.

    BUG: 1410425
    Change-Id: I7ce8955c2174f62b11e5cb16140e30ff0f7c4c31
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: https://review.gluster.org/16445
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/
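For context on what the committed change removed: the sketch below illustrates the kind of stripe alignment ec applies to its internal inodelk ranges, which (per the discussion above) was wrongly being applied to application fcntl locks as well. This is a hedged illustration, not the actual ec source; the struct, helper name, and EC_STRIPE_SIZE constant are invented here, with only the 512-byte granularity taken from the comments.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative granularity; the discussion mentions multiples of 512. */
#define EC_STRIPE_SIZE 512u

struct lk_range {
    uint64_t offset;
    uint64_t len;
};

/* Widen [offset, offset + len) outward to stripe boundaries, the way ec
 * aligns its internal inodelk ranges. Applying this kind of widening to
 * application fcntl locks is what broke the cthon lock tests. */
static struct lk_range align_to_stripe(struct lk_range in)
{
    uint64_t start = in.offset / EC_STRIPE_SIZE * EC_STRIPE_SIZE;
    uint64_t end = in.offset + in.len;

    end = (end + EC_STRIPE_SIZE - 1) / EC_STRIPE_SIZE * EC_STRIPE_SIZE;
    return (struct lk_range){ .offset = start, .len = end - start };
}

int main(void)
{
    /* An application lock on bytes 5-15 would have been widened to 0-511;
     * the fix winds the original 5-15 range to the subvolumes unchanged. */
    struct lk_range app = { .offset = 5, .len = 11 };
    struct lk_range old = align_to_stripe(app);

    printf("application lock: offset=%llu len=%llu\n",
           (unsigned long long)app.offset, (unsigned long long)app.len);
    printf("pre-fix aligned:  offset=%llu len=%llu\n",
           (unsigned long long)old.offset, (unsigned long long)old.len);
    return 0;
}
```

The second half of the commit message, doing the lk as a transaction, addresses the merge/truncate hazard demonstrated earlier: if an upgrade or downgrade fails partway across the subvolumes, the transaction prevents them from being left with mismatched lock state.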