Bug 1114010 - DHT : - In case of race between two mkdir(creating same Directory) from different mount, both are failing with error even though Directory is created. FUSE mount gave "Input/output error"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Susant Kumar Palai
QA Contact: Matt Zywusko
URL:
Whiteboard:
Depends On:
Blocks: 1114557 1122886
Reported: 2014-06-27 12:53 UTC by Rachana Patel
Modified: 2016-08-01 06:32 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1114557 (view as bug list)
Environment:
Last Closed: 2016-08-01 06:32:36 UTC
Embargoed:


Description Rachana Patel 2014-06-27 12:53:13 UTC
Description of problem:
=======================
When the same directory is created in parallel with mkdir from two different mount points, both calls report failure even though the directory is created on all sub-volumes with the same gfid on each.

mount 1:-

[root@OVM1 2]# mkdir abc
mkdir: cannot create directory `abc': Input/output error

mount 2:-
[root@OVM3 nfs2]# mkdir abc
mkdir: cannot create directory `abc': File exists

--> Both calls fail, and the FUSE mount reports "Input/output error".


Version-Release number of selected component (if applicable):
=============================================================
3.6.0.19-1.el6rhs.x86_64

How reproducible:
=================
always

Steps to Reproduce:
===================
1. Create a distributed volume (3 bricks) and mount it on multiple clients (NFS and FUSE).
[root@OVM3 nfs]# gluster v status snap
Status of volume: snap
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.35.198:/brick2/b1				49157	Y	13611
Brick 10.70.35.198:/brick2/b2				49158	Y	13345
Brick 10.70.35.198:/brick2/b3				49159	Y	13356
NFS Server on localhost					2049	Y	13623
NFS Server on 10.70.35.240				2049	Y	19185
NFS Server on 10.70.35.172				2049	Y	13465
 
Task Status of Volume snap
------------------------------------------------------------------------------
There are no active volume tasks

[root@OVM1 2]# mount | grep snap
10.70.35.198:/snap on /mnt/2 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

[root@OVM3 nfs2]# mount | grep snap
10.70.35.198:/snap on /mnt/nfs2 type nfs (rw,addr=10.70.35.198)

[To reproduce the race, a breakpoint is set in the FUSE client process.]

2. From one client (FUSE mount), start creating a directory:
[root@OVM1 2]# mkdir abc
(with a breakpoint set at dht_mkdir_hashed_cbk)

3. From another mount point (NFS mount), create the same directory:
[root@OVM3 nfs]# pwd
/mnt/nfs2
[root@OVM3 nfs2]# mkdir abc
mkdir: cannot create directory `abc': File exists

4. Continue from the breakpoint.

5. The mkdir on the FUSE mount fails with an error:
[root@OVM1 2]# mkdir abc
mkdir: cannot create directory `abc': Input/output error


6. Verify the gfid and the directory on the backend bricks:
[root@OVM3 ~]# getfattr -d -m . -e hex /brick2/*/abc
getfattr: Removing leading '/' from absolute path names
# file: brick2/b1/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick2/b2/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

# file: brick2/b3/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff
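
The check in step 6 can also be scripted. The sketch below is illustrative only: it parses the getfattr output quoted above, confirms that the gfid is identical on every brick, and confirms that the three layout ranges cover the 32-bit hash space with no holes or overlaps. It assumes that the trusted.glusterfs.dht value is four big-endian 32-bit words whose last two words are the range start and stop; that reading is inferred from the values in this report, not taken from the Gluster sources. The helper names (parse_getfattr, layout_range, check_directory) are hypothetical.

```python
import struct

def parse_getfattr(dump):
    """Parse `getfattr -d -e hex` output into {path: {xattr_name: hex_string}}."""
    result, current = {}, None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("# file: "):
            current = line[len("# file: "):]
            result[current] = {}
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            result[current][name] = value
    return result

def layout_range(hex_value):
    """Return (start, stop) from a 16-byte trusted.glusterfs.dht value.

    Assumption: four big-endian 32-bit words, with start and stop last.
    The first two words are ignored here.
    """
    raw = bytes.fromhex(hex_value[2:])  # drop the leading "0x"
    _, _, start, stop = struct.unpack(">4I", raw)
    return start, stop

def check_directory(dump):
    """Return (gfids_match, covers_hash_space) for one directory's dump."""
    bricks = parse_getfattr(dump)
    gfids = {attrs["trusted.gfid"] for attrs in bricks.values()}
    ranges = sorted(layout_range(a["trusted.glusterfs.dht"])
                    for a in bricks.values())
    contiguous = all(prev[1] + 1 == nxt[0]
                     for prev, nxt in zip(ranges, ranges[1:]))
    covers = (bool(ranges) and ranges[0][0] == 0
              and ranges[-1][1] == 0xFFFFFFFF and contiguous)
    return len(gfids) == 1, covers

# Values copied from the getfattr output in this report.
DUMP = """\
# file: brick2/b1/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick2/b2/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

# file: brick2/b3/abc
trusted.gfid=0x298825511bdf413e9d18dfef7bddef2b
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff
"""

gfids_match, covers_hash_space = check_directory(DUMP)
print("gfid identical on all bricks:", gfids_match)
print("layout covers full hash space:", covers_hash_space)
```

Under the stated assumption, both checks pass for the values above, which matches the observation that the directory itself was created correctly despite the errors returned to the clients.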


Actual results:
===============
Both mount points report failure, and the FUSE mount reports "Input/output error".

Expected results:
=================
When two mkdir calls run in parallel and the directory is created successfully, both should not report failure.
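
For comparison only, here is how the same race behaves on a local POSIX file system: one caller creates the directory and the other gets EEXIST, and neither sees an I/O error. This is a minimal sketch that runs against a temporary local directory, not a Gluster mount; race_mkdir is a hypothetical helper name.

```python
import errno
import os
import tempfile
import threading

def race_mkdir(parent, name, workers=2):
    """Have several threads call mkdir on the same path; return sorted outcomes."""
    barrier = threading.Barrier(workers)
    lock = threading.Lock()
    outcomes = []

    def worker():
        barrier.wait()  # release all threads together to provoke the race
        try:
            os.mkdir(os.path.join(parent, name))
            result = "created"
        except OSError as exc:
            # Map the errno to its symbolic name, e.g. EEXIST or EIO.
            result = errno.errorcode.get(exc.errno, str(exc.errno))
        with lock:
            outcomes.append(result)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(outcomes)

with tempfile.TemporaryDirectory() as parent:
    print(race_mkdir(parent, "abc"))
```

Pointing the parent argument at two different mounts of the same volume would need two processes or hosts rather than threads, so this sketch only illustrates the baseline semantics.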


Additional info:
===============
NFS log:

[2014-06-27 08:08:59.461588] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in /abc (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2014-06-27 10:21:46.216573] I [dht-layout.c:663:dht_layout_normalize] 0-snap-dht: Found anomalies in /abc (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2014-06-27 10:21:46.219462] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-snap-client-0: remote operation failed: File exists. Path: /abc
[2014-06-27 10:21:46.219496] W [nfs3.c:2722:nfs3svc_mkdir_cbk] 0-nfs: a95db38b: /abc => -1 (File exists)


FUSE mount log:

[2014-06-27 10:21:54.269831] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-snap-client-1: remote operation failed: File exists. Path: /abc
[2014-06-27 10:21:54.269913] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 0-snap-client-2: remote operation failed: File exists. Path: /abc
[2014-06-27 10:21:54.270980] W [fuse-bridge.c:416:fuse_entry_cbk] 0-glusterfs-fuse: Received NULL gfid for /abc. Forcing EIO    <----------------
[2014-06-27 10:21:54.271025] W [fuse-bridge.c:481:fuse_entry_cbk] 0-glusterfs-fuse: 172: MKDIR() /abc => -1 (Input/output error)
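
When triaging similar reports, the relevant warnings can be pulled out of a client log with a short script. The sketch below is illustrative: the line pattern ([timestamp] LEVEL [file:line:function] origin: message) is inferred from the log lines quoted in this report rather than from a documented format, and parse_line and warnings_for are hypothetical helper names.

```python
import re

# Pattern inferred from the log lines quoted in this report:
# [timestamp] LEVEL [file:line:function] origin: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<origin>\S+): (?P<msg>.*)$"
)

def parse_line(line):
    """Split one log line into named fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None

def warnings_for(lines, path):
    """Return (origin, function, message) for each W-level line mentioning path."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "W" and path in entry["msg"]:
            hits.append((entry["origin"], entry["func"], entry["msg"]))
    return hits

# Lines copied from the FUSE mount log in this report.
FUSE_LOG = [
    "[2014-06-27 10:21:54.269913] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] "
    "0-snap-client-2: remote operation failed: File exists. Path: /abc",
    "[2014-06-27 10:21:54.270980] W [fuse-bridge.c:416:fuse_entry_cbk] "
    "0-glusterfs-fuse: Received NULL gfid for /abc. Forcing EIO",
    "[2014-06-27 10:21:54.271025] W [fuse-bridge.c:481:fuse_entry_cbk] "
    "0-glusterfs-fuse: 172: MKDIR() /abc => -1 (Input/output error)",
]

for origin, func, msg in warnings_for(FUSE_LOG, "/abc"):
    print(origin, func, msg)
```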

Comment 3 Susant Kumar Palai 2014-07-18 11:48:08 UTC
Upstream patch : http://review.gluster.org/8203

Comment 6 Susant Kumar Palai 2015-12-24 07:02:39 UTC
Triage-update: This is already fixed.

Comment 7 Prasad Desala 2016-07-27 11:32:47 UTC
This issue is not seen on glusterfs build 3.7.9-10.el7rhgs.x86_64.

The following steps were followed:

1) Created a distributed volume and mounted it on two clients (NFS and FUSE).
2) From the FUSE client, started creating a directory:
mkdir test_dir
(with a breakpoint set at dht_mkdir_hashed_cbk)
3) From the NFS mount, created the same directory; the creation failed with the error message below:

mkdir: cannot create directory `test_dir': File exists

4) Continued from the breakpoint.
5) The mkdir command from the FUSE client completed successfully without any errors.

Comment 8 Nithya Balachandran 2016-08-01 06:32:36 UTC
Thanks Prasad. As per comment#7, I am closing this BZ.

