Bug 1301474

Summary: [GSS] Intermittent file creation failure while doing concurrent writes on a distributed volume that has more than 40 bricks
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Riyas Abdulrasak <rnalakka>
Component: distribute
Assignee: Raghavendra G <rgowdapp>
Status: CLOSED ERRATA
QA Contact: Prasad Desala <tdesala>
Severity: medium
Docs Contact:
Priority: medium
Version: rhgs-3.1
CC: amukherj, bkunal, kramdoss, nbalacha, rabhat, rcyriac, rgowdapp, rhinduja, rhs-bugs, sheggodu, smohan, storage-qa-internal
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.4.0   
Hardware: All   
OS: All   
Whiteboard: dht-directory-consistency, dht-gss-ask, dht-gss, dht-3.2.0-proposed, rebase
Fixed In Version: glusterfs-3.12.2-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1304962
Environment:
Last Closed: 2018-09-04 06:27:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1369312    
Bug Blocks: 1304962, 1369781, 1472361, 1474007, 1503135    

Description Riyas Abdulrasak 2016-01-25 08:04:36 UTC
Description of problem:
File creation fails intermittently on a distributed volume when two clients write to it concurrently.


Version-Release number of selected component (if applicable):

glusterfs-3.7.1-11.el6rhs.x86_64
glusterfs-server-3.7.1-11.el6rhs.x86_64

Red Hat Gluster Storage Server 3.1

How reproducible:

Not always; intermittent.


Steps to Reproduce:

1. Create a distributed volume with 60 bricks, or a similarly high number of bricks, spread across different nodes.

2. Mount the volume on two different clients. 

3. Create directories and files simultaneously from both clients using the script below.

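#!/bin/bash
# Run this script on both clients at the same time.
for i in {1..9}
do
    # The client that loses the race gets "File exists" here and carries on.
    mkdir "/mnt/test/$i"
    cd "/mnt/test/$i"
    # The intermittent failure shows up in this step.
    touch "$HOSTNAME.$i"
done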
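
Because the failure is intermittent, a single pass of the script may not hit it. The following is a minimal sketch, not part of the original report, that repeats the same mkdir/touch sequence for several rounds and prints how many touch steps failed. The function name repeat_create, the round<N> subdirectories, and the argument order are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): repeat_create BASE ROUNDS
# Repeats the mkdir/touch sequence under BASE and prints the number of
# failed touch steps on stdout. Error messages still go to stderr.
repeat_create() (
    base=$1
    rounds=$2
    host=${HOSTNAME:-$(uname -n)}
    failures=0
    r=1
    while [ "$r" -le "$rounds" ]; do
        # Each round uses fresh directories so the mkdir race happens again.
        mkdir -p "$base/round$r"
        for i in 1 2 3 4 5 6 7 8 9; do
            dir="$base/round$r/$i"
            # "File exists" is expected when the other client wins the race.
            mkdir "$dir" 2>/dev/null
            # Count the step as failed if either cd or touch fails.
            if ! (cd "$dir" && touch "$host.$i"); then
                failures=$((failures + 1))
            fi
        done
        r=$((r + 1))
    done
    echo "$failures"
)
```

Assuming the volume is mounted so that /mnt/test is on it, as in the script above, running `repeat_create /mnt/test 100` on both clients at the same time would report a non-zero count whenever the problem is hit.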

Actual results:

For some files, creation fails from one of the two hosts. At times it fails with "No such file or directory", as shown below.

[root@server2 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/1': File exists
mkdir: cannot create directory `/mnt/test/2': File exists
mkdir: cannot create directory `/mnt/test/5': File exists
mkdir: cannot create directory `/mnt/test/6': File exists
touch: cannot touch `server2.6': No such file or directory
mkdir: cannot create directory `/mnt/test/7': File exists
mkdir: cannot create directory `/mnt/test/9': File exists

[root@server1 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/3': File exists
mkdir: cannot create directory `/mnt/test/4': File exists
mkdir: cannot create directory `/mnt/test/8': File exists


On another run, it failed with "Stale file handle".

[root@server2 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/4': File exists
touch: cannot touch `server2.4': Stale file handle
mkdir: cannot create directory `/mnt/test/6': File exists
mkdir: cannot create directory `/mnt/test/7': File exists
mkdir: cannot create directory `/mnt/test/8': File exists
mkdir: cannot create directory `/mnt/test/9': File exists



[root@server1 ~]# ./test.sh
mkdir: cannot create directory `/mnt/test/1': File exists
mkdir: cannot create directory `/mnt/test/2': File exists
mkdir: cannot create directory `/mnt/test/3': File exists
mkdir: cannot create directory `/mnt/test/5': File exists


Expected results:

The file creation should work without any issue. 


Additional info:

[2016-01-16 02:19:01.799335] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-01-16 03:00:36.570750] W [fuse-bridge.c:484:fuse_entry_cbk] 0-glusterfs-fuse: 936606817: MKDIR() /pathtofile/0006/897 => -1 (File exists)
[2016-01-16 03:00:36.667744] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 29-volname-dht: Found anomalies in /pathtofile/0006/897 (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2016-01-16 03:00:36.679985] W [MSGID: 114031] [client-rpc-fops.c:2322:client3_3_setattr_cbk] 29-volname-client-62: remote operation failed [Stale file handle]
[2016-01-16 03:00:36.679982] W [MSGID: 114031] [client-rpc-fops.c:2322:client3_3_setattr_cbk] 29-volname-client-63: remote operation failed [Stale file handle]
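
To check whether a client log contains the same two signatures as the excerpt above, a small filter along these lines could be used. This is only a sketch: the function name summarize_dht_log is hypothetical, and the location of the client log depends on the deployment, so it is passed in as an argument. The match strings are copied from the log lines above.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): summarize_dht_log LOGFILE
# Counts the two message types seen in the excerpt above.
summarize_dht_log() {
    log=$1
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps that from aborting callers that run with "set -e".
    anomalies=$(grep -F -c 'Found anomalies in' "$log" || true)
    stale=$(grep -F -c 'remote operation failed [Stale file handle]' "$log" || true)
    echo "layout_anomalies=$anomalies stale_file_handle=$stale"
}
```

A non-zero count for both, with timestamps close to a failed touch, would suggest the same pattern as in this report.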

Comment 28 Prasad Desala 2018-05-21 07:14:11 UTC
Verified this BZ on glusterfs version 3.12.2-11.el7rhgs.x86_64.

The script from the description was tested on a pure distribute volume with 120 bricks and on an 8 x 3 volume. I ran the script for 1 hour from 4 different clients and did not see any file creation failures during this time.

Hence, moving this BZ to Verified state.

Comment 30 errata-xmlrpc 2018-09-04 06:27:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607