Bug 1578325 - Input/Output errors on a disperse volume with concurrent reads and writes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1582056 1582057 1582058
Reported: 2018-05-15 09:35 UTC by Xavi Hernandez
Modified: 2018-10-23 15:08 UTC (History)

Fixed In Version: glusterfs-5.0
Clone Of:
: 1582056 1582057 1582058
Environment:
Last Closed: 2018-10-23 15:08:41 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2018-05-15 09:35:55 UTC
Description of problem:

When parallel-writes is enabled and multiple non-overlapping reads and writes are sent concurrently, writes sometimes fail with an Input/Output error or a file-not-found error.

Version-Release number of selected component (if applicable): mainline


How reproducible:

Randomly on a Ganesha mount. I've been unable to reproduce it on FUSE.

Steps to Reproduce:
1. Create a disperse volume
2. Create a Ganesha cluster using gfapi
3. Mount three clients to three different NFS servers
4. From one client run Bonnie++
5. From second client run ls -la in a loop
6. From third client run du -sh in a loop
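
Steps 5 and 6 can be driven by a small helper such as the sketch below. It is only an illustration: the mount point /mnt/nfs and the helper name run_in_loop are hypothetical, and the iteration count is left to the caller because the report does not say how long the loops ran.

```python
import subprocess
import time


def run_in_loop(argv, iterations, delay=0.0):
    """Run the command in argv repeatedly and return the list of exit codes."""
    codes = []
    for _ in range(iterations):
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes.append(result.returncode)
        if delay:
            time.sleep(delay)
    return codes


# Hypothetical NFS mount point; each client would use its own.
MOUNT = "/mnt/nfs"
LS_CMD = ["ls", "-la", MOUNT]  # second client (step 5)
DU_CMD = ["du", "-sh", MOUNT]  # third client (step 6)
```

On the second client one would call run_in_loop(LS_CMD, n) and on the third run_in_loop(DU_CMD, n) while Bonnie++ runs on the first.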

Actual results:

Bonnie++ fails in the rewriting test.

Expected results:

Bonnie++ shouldn't fail

Additional info:

As a workaround, disabling parallel-writes hides the problem.
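
A minimal sketch of applying the workaround from a script follows. It assumes the volume option is named disperse.parallel-writes, which this report does not state explicitly, and it uses a hypothetical volume name; check the option name against `gluster volume set help` on the installed version before relying on it.

```python
import subprocess


def parallel_writes_cmd(volume, enabled):
    """Build the gluster CLI command that toggles parallel writes.

    The option name disperse.parallel-writes is an assumption.
    """
    value = "on" if enabled else "off"
    return ["gluster", "volume", "set", volume, "disperse.parallel-writes", value]


def set_parallel_writes(volume, enabled):
    """Run the command; raises CalledProcessError if the CLI reports a failure."""
    subprocess.run(parallel_writes_cmd(volume, enabled), check=True)
```

For example, set_parallel_writes("testvol", False) would disable the option on a volume called testvol (a hypothetical name).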

Comment 1 Worker Ant 2018-05-15 09:45:13 UTC
REVIEW: https://review.gluster.org/20024 (cluster/ec: Fix pre-op xattrop management) posted (#1) for review on master by Xavi Hernandez

Comment 2 Worker Ant 2018-05-24 04:59:53 UTC
COMMIT: https://review.gluster.org/20024 committed in master by "Xavi Hernandez" <xhernandez> with a commit message- cluster/ec: Fix pre-op xattrop management

Multiple pre-op xattrop requests can be in progress at the same time. The
callback (cbk) checked whether the fop was waiting for some specific data
(such as size and version) and, if so, assumed that the answer it received
must contain that data.

This assumption is wrong: a fop can be waiting for data that arrives in the
answer to another fop's xattrop.

This patch differentiates between needing some information and providing it.

This is related to parallel writes. Disabling them fixed the problem, but
also prevented concurrent reads. A change has been made so that disabling
parallel writes still allows parallel reads.

Fixes: bz#1578325
Change-Id: I74772ad6b80b7b37805da93d5ec3ae099e96b041
Signed-off-by: Xavi Hernandez <xhernandez>
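
The distinction drawn in the commit message between needing data and providing it can be illustrated with a toy model. This is not the cluster/ec code: the names LockState, Fop, needs_size, provides_size, cbk_before and cbk_after are all hypothetical, and reporting a missing value as EIO is an assumption chosen to mirror the reported symptom.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockState:
    """Shared state that every fop holding the lock can read."""
    size: Optional[int] = None


@dataclass
class Fop:
    name: str
    needs_size: bool     # the fop cannot continue until the size is known
    provides_size: bool  # the fop's own xattrop requested the size


def cbk_before(state, fop, answer):
    # Behaviour described as faulty: a fop that is waiting for the size
    # is assumed to receive it in the answer to its own xattrop.
    if fop.needs_size:
        if "size" not in answer:
            return -errno.EIO
        state.size = answer["size"]
    return 0


def cbk_after(state, fop, answer):
    # Behaviour after the fix: only an answer to an xattrop that
    # requested the size is expected to contain it.
    if fop.provides_size:
        if "size" not in answer:
            return -errno.EIO
        state.size = answer["size"]
    return 0


first = Fop("write-1", needs_size=True, provides_size=True)
second = Fop("write-2", needs_size=True, provides_size=False)

state = LockState()
# write-2's xattrop did not request the size, so its answer lacks it.
rc_before = cbk_before(state, second, {"dirty": 1})
rc_after = cbk_after(state, second, {"dirty": 1})
# write-1's answer supplies the size that both fops were waiting for.
rc_first = cbk_after(state, first, {"dirty": 1, "size": 4096})
```

In this model the old callback fails write-2 even though the size it needs is on its way in write-1's answer, while the fixed callback lets write-2 wait for it.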

Comment 3 Shyamsundar 2018-10-23 15:08:41 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/

