Bug 1416450

Summary: hardlink creation fails saying "file exists" during brick down scenario on fuse mount
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: md-cache
Assignee: Poornima G <pgurusid>
Status: CLOSED WONTFIX
QA Contact: Vivek Das <vdas>
Severity: low
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: rhs-bugs, storage-qa-internal
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-19 06:16:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1444509

Description Nag Pavan Chilakam 2017-01-25 14:32:22 UTC
Description of problem:
======================
Note: I was working on reproducing split-brain situations for verifying automatic split-brain resolution.


1) I created a 1x2 volume
2) Set the favorite-child policy
3) Created a file f1 from the mount
4) Brought down brick b1
5) Created 1000 hardlinks of f1, say hlink.1, hlink.2, ..., hlink.1000
6) Killed b2 and brought b1 up (was trying to create split-brains)
7) Checked the fuse mount; it shows only f1
8) Tried to create hardlinks with the same names as in step 5,
i.e. ln f1 hlink.{1..1000}
(a rough command sketch of this sequence follows the error output below)

I get the following error:
ln: failed to create hard link ‘link.994’: File exists
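
For clarity, here is a rough command sketch of the steps above. The host names, brick paths, mount point and the exact kill ordering in step 6 are my assumptions; brick PIDs would come from 'gluster volume status'.

# steps 1-2: create a 1x2 replica volume and set the favorite-child policy
gluster volume create 1x2 replica 2 node1:/rhs/brick1/1x2 node2:/rhs/brick1/1x2
gluster volume start 1x2
gluster volume set 1x2 cluster.favorite-child-policy mtime

# step 3: mount via fuse and create the base file
mount -t glusterfs node1:/1x2 /mnt/1x2
cd /mnt/1x2 && touch f1

# step 4: bring down brick b1 by killing its brick process
kill -9 <pid-of-b1-brick-process>

# step 5: create the hardlinks while only b2 is up
ln f1 hlink.{1..1000}

# step 6: bring b1 back and kill b2 (one possible ordering)
gluster volume start 1x2 force      # restarts the offline brick b1
kill -9 <pid-of-b2-brick-process>

# steps 7-8: only f1 is listed, yet recreating the links fails with "File exists"
ls
ln f1 hlink.{1..1000}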

If I check the mount after the above command, only about 3 of the links were actually created, as shown below:
[root@dhcp35-196 dir2]# ll
total 0
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.1
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.166
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.800
-rw-r--r--. 4 root root 0 Jan 25 19:52 y1
[root@dhcp35-196 dir2]# ln y1 link.1001



This issue is not seen on gNFS, i.e. the hardlink creation passes there.

Why is this happening? Is it some caching issue?

Also, no split-brains or other issues are observed otherwise.

Version-Release number of selected component (if applicable):
==========
glusterfs-server-3.8.4-13.el7rhgs.x86_64


How reproducible:
always

Comment 2 Nag Pavan Chilakam 2017-01-25 14:36:57 UTC
Volume Name: 1x2
Type: Replicate
Volume ID: ceecc137-d06f-438a-b43f-8836fa8a348d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick1/1x2
Brick2: 10.70.35.116:/rhs/brick1/1x2
Options Reconfigured:
nfs.disable: off
performance.readdir-ahead: on
transport.address-family: inet
cluster.favorite-child-policy: mtime
cluster.self-heal-daemon: enable
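
For reference, a minimal sketch of how the "Options Reconfigured" entries above map to gluster volume set commands (volume name taken from this output; some entries may simply reflect values recorded at create time):

gluster volume set 1x2 nfs.disable off
gluster volume set 1x2 performance.readdir-ahead on
gluster volume set 1x2 transport.address-family inet
gluster volume set 1x2 cluster.favorite-child-policy mtime
gluster volume set 1x2 cluster.self-heal-daemon enable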

Comment 9 Poornima G 2018-11-19 06:16:59 UTC
To fix this, we would have to clear the cache in md-cache every time an AFR brick goes down or comes back up. This would result in a large performance impact, and this use case is not something that needs to be fixed at the cost of performance. As a workaround, simply toggling stat-prefetch off and back on resolves the issue (see the sketch below). Hence closing as WONTFIX.
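
A minimal sketch of the workaround described above, assuming the volume name from Comment 2 (md-cache is controlled by the performance.stat-prefetch option, so toggling it off and back on drops the stale cached entries):

gluster volume set 1x2 performance.stat-prefetch off
gluster volume set 1x2 performance.stat-prefetch on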