Bug 1424805

Summary:	Gluster returns EINVAL for close() when linux-aio and use-compound-fops are both enabled
Product:	[Community] GlusterFS	Reporter:	nh2 <nh2-redhatbugzilla>
Component:	libgfapi	Assignee:	Pranith Kumar K <pkarampu>
Status:	CLOSED EOL	QA Contact:	Sudhir D <sdharane>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	3.10	CC:	bugs, jbyers, nh2-redhatbugzilla
Target Milestone:	---	Keywords:	Reopened, Triaged
Target Release:	---
Hardware:	Unspecified
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-06-20 18:26:08 UTC	Type:	Bug

Description nh2 2017-02-19 15:33:10 UTC

Description of problem:

I have a program that does, in this order, `open()`, `sendfile()`, `close()` syscalls to copy a file.

When both `storage.linux-aio on` and `cluster.use-compound-fops on` are set, there is a race condition that makes `close()` fail with `errno = EINVAL`.

This is especially suspicious because `close()` cannot result in `EINVAL` according to `man 2 close`.

`EINVAL` is also returned for other operations besides `close()`, e.g. for `fsync()`.

Turning off either `storage.linux-aio` or `cluster.use-compound-fops` fixes the problem, so I assume the bug only occurs when both are enabled.

Version-Release number of selected component (if applicable):

3.9.1

How reproducible:

Always

Steps to Reproduce:
1. Take a list of local files (you need several because it's a nondeterministic race).
2. Copy each file to a gluster volume using `open + sendfile + close`, properly checking the return code and errno of `close()`.
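
The steps above could be sketched roughly as follows. This is not the reporter's original program, only a minimal Python illustration of the same `open + sendfile + close` sequence; the function names `copy_with_sendfile` and `find_close_failures` are hypothetical, and it assumes Linux, where `sendfile()` can write to a regular file.

```python
import errno
import os


def copy_with_sendfile(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path via open + sendfile + close.

    Returns None if close() on the destination succeeds,
    otherwise the errno value that close() reported.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        close_errno = None
        try:
            offset = 0
            remaining = os.fstat(src_fd).st_size
            while remaining > 0:
                sent = os.sendfile(dst_fd, src_fd, offset, min(chunk_size, remaining))
                if sent == 0:  # unexpected end of input
                    break
                offset += sent
                remaining -= sent
        finally:
            # This close() is the call the report is about.
            try:
                os.close(dst_fd)
            except OSError as exc:
                close_errno = exc.errno
        return close_errno
    finally:
        os.close(src_fd)


def find_close_failures(src_paths, dst_dir):
    """Copy every file into dst_dir; collect (path, errno name) for failed close() calls."""
    failures = []
    for src in src_paths:
        dst = os.path.join(dst_dir, os.path.basename(src))
        err = copy_with_sendfile(src, dst)
        if err is not None:
            failures.append((src, errno.errorcode.get(err, str(err))))
    return failures
```

Calling `find_close_failures(paths, mount_dir)` with `mount_dir` pointing at a mounted gluster volume (the mount path is an assumption, it is not given in this report) should return an empty list when every copy succeeds; an entry such as `(path, 'EINVAL')` would correspond to the failure described here.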

Actual results:

After a few files, close will return -1 and errno will be set to 22 (EINVAL).

Expected results:

All files copy without problem.

Additional info:

I found this commit https://github.com/gluster/glusterfs/commit/88d772c05c45c467bfccebfc51f6a0e0ea9ca287
which seems to deal with a similar issue; skimming over it, it seems to suggest that somehow the close is issued before the write operation is done.

Comment 1 Shyamsundar 2017-02-21 13:53:11 UTC

The combination of aio and compound fops seems to be causing the problem. If the replicate folks could take a first look (as compound fops are only being consumed by them so far, to my knowledge), it would help move this issue along faster.

Reassigning to Pranith for further consideration.

Comment 2 Kaushal 2017-03-08 12:32:17 UTC

This bug is getting closed because GlusterFS-3.9 has reached its end-of-life [1].

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please open a new bug against the newer release.

[1]: https://www.gluster.org/community/release-schedule/

Comment 3 nh2 2017-03-08 19:56:38 UTC

Reopening for 3.10.

Comment 4 Shyamsundar 2018-06-20 18:26:08 UTC

This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.