Bug 1424805

Summary: Gluster returns EINVAL for close() when linux-aio and use-compound-fops are both enabled
Product: [Community] GlusterFS
Reporter: nh2 <nh2-redhatbugzilla>
Component: libgfapi
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED EOL
QA Contact: Sudhir D <sdharane>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.10
CC: bugs, jbyers, nh2-redhatbugzilla
Target Milestone: ---
Keywords: Reopened, Triaged
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-20 18:26:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description nh2 2017-02-19 15:33:10 UTC
Description of problem:

I have a program that does, in this order, `open()`, `sendfile()`, `close()` syscalls to copy a file.

When both `storage.linux-aio on` and `cluster.use-compound-fops on` are set, there is a race condition that makes `close()` fail with `errno = EINVAL`.

This is especially suspicious because `close()` cannot result in `EINVAL` according to `man 2 close`.

`EINVAL` is also returned for other operations besides `close()`, e.g. for `fsync()`.

Turning off either of `storage.linux-aio` or `cluster.use-compound-fops` fixes the problem, so I assume that the bug only exists if both are enabled.

Version-Release number of selected component (if applicable):

3.9.1

How reproducible:

Always

Steps to Reproduce:
1. Take a list of local files (you need several, because the race is nondeterministic).
2. Copy each file to a gluster volume using `open()` + `sendfile()` + `close()`, checking the return code and errno of `close()`.
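
The steps above can be sketched roughly as follows. This is not the program from the original report, only a minimal illustration assuming Linux and a gluster volume mounted at some local path (for example via FUSE). The names `copy_with_sendfile` and `copy_all` are hypothetical helpers made up for this sketch.

```python
import errno
import os


def copy_with_sendfile(src_path, dst_path):
    """Copy src_path to dst_path with open + sendfile + close.

    Returns 0 if close() on the destination succeeded, otherwise the
    errno reported by close(). Hypothetical helper for illustration.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        close_errno = 0
        try:
            remaining = os.fstat(src_fd).st_size
            offset = 0
            # sendfile() may transfer fewer bytes than requested, so loop.
            while remaining > 0:
                sent = os.sendfile(dst_fd, src_fd, offset, remaining)
                if sent == 0:
                    break
                offset += sent
                remaining -= sent
        finally:
            # This is the call whose result the report is about.
            try:
                os.close(dst_fd)
            except OSError as exc:
                close_errno = exc.errno
        return close_errno
    finally:
        os.close(src_fd)


def copy_all(dst_dir, sources):
    """Copy every source file into dst_dir; return [(path, errno), ...] for failed closes."""
    failures = []
    for src in sources:
        dst = os.path.join(dst_dir, os.path.basename(src))
        err = copy_with_sendfile(src, dst)
        if err != 0:
            failures.append((src, err))
            print(f"{src}: close() failed with errno {err} "
                  f"({errno.errorcode.get(err, 'unknown')})")
    return failures
```

Calling `copy_all` with a directory on the gluster mount and a list of local files should return an empty list when every `close()` succeeds. On an affected setup, some entries would be expected to carry errno 22 (`EINVAL`).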

Actual results:

After a few files, close will return -1 and errno will be set to 22 (EINVAL).

Expected results:

All files copy without problem.

Additional info:

I found this commit https://github.com/gluster/glusterfs/commit/88d772c05c45c467bfccebfc51f6a0e0ea9ca287
which seems to deal with a similar issue; skimming over it, it seems to suggest that somehow the close is issued before the write operation is done.

Comment 1 Shyamsundar 2017-02-21 13:53:11 UTC
The combination of aio and compound fops seems to be causing the problem. If the replicate folks could take a first look (to my knowledge, compound fops are consumed only by them so far), it would probably help move this issue along faster.

Reassigning to Pranith for further consideration.

Comment 2 Kaushal 2017-03-08 12:32:17 UTC
This bug is getting closed because GlusterFS-3.9 has reached its end-of-life [1].

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please open a new bug against the newer release.

[1]: https://www.gluster.org/community/release-schedule/

Comment 3 nh2 2017-03-08 19:56:38 UTC
Reopening for 3.10.

Comment 4 Shyamsundar 2018-06-20 18:26:08 UTC
This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.