Bug 1416450 - hardlink creation fails saying "file exists" during brick down scenario on fuse mount
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: md-cache
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Poornima G
QA Contact: Vivek Das
Depends On:
Blocks: 1444509
Reported: 2017-01-25 14:32 UTC by Nag Pavan Chilakam
Modified: 2018-11-19 06:17 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-11-19 06:16:59 UTC
Target Upstream Version:


Description Nag Pavan Chilakam 2017-01-25 14:32:22 UTC
Description of problem:
Note: I was trying to reproduce split-brain situations in order to verify automatic split-brain resolution.

1) Created a 1x2 volume.
2) Set the favorite-child policy.
3) Created a file f1 from the mount.
4) Brought down b1.
5) Created 1000 hardlinks of f1, say hlink.1, hlink.2, ..., hlink.1000.
6) Killed b2 and brought b1 up (I was trying to create split-brains).
7) Checked the fuse mount; it shows only f1.
8) Tried to create hardlinks with the same names as in step 5,
i.e. ln f1 hlink.{1..1000}

I get the following error:
ln: failed to create hard link ‘link.994’: File exists

When I check the mount after the above command, I find about 3 files created, as below:
[root@dhcp35-196 dir2]# ll
total 0
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.1
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.166
-rw-r--r--. 4 root root 0 Jan 25 19:52 link.800
-rw-r--r--. 4 root root 0 Jan 25 19:52 y1
[root@dhcp35-196 dir2]# ln y1 link.1001

This issue is not seen on gNFS, i.e. the hlink creation passes on b2.

Why is this happening? Is it some caching issue?

Otherwise, no split-brains or other issues were observed.
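
The hardlink creation in steps 5 and 8 can be sketched as an explicit loop, which reports each failing name instead of stopping at the first one. This is only an illustration: the MOUNT_DIR and COUNT variables are assumptions and not part of the original report, and when MOUNT_DIR is unset the script falls back to a scratch directory rather than a Gluster mount.

```shell
#!/bin/sh
# Hypothetical sketch of steps 5 and 8.
# MOUNT_DIR and COUNT are illustrative names, not from the report.
dir="${MOUNT_DIR:-$(mktemp -d)}"   # set MOUNT_DIR to a directory on the fuse mount
count="${COUNT:-1000}"

cd "$dir" || exit 1
[ -e f1 ] || : > f1                # step 3: make sure the source file exists

failed=0
i=1
while [ "$i" -le "$count" ]; do
    # One ln call per link; count failures such as "File exists".
    ln f1 "hlink.$i" || failed=$((failed + 1))
    i=$((i + 1))
done

echo "created=$((count - failed)) failed=$failed"
```

Running it once after step 4 and again after step 6 should show whether the second pass fails with "File exists" for names that are not visible on the mount.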

Version-Release number of selected component (if applicable):

How reproducible:

Comment 2 Nag Pavan Chilakam 2017-01-25 14:36:57 UTC
Volume Name: 1x2
Type: Replicate
Volume ID: ceecc137-d06f-438a-b43f-8836fa8a348d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Options Reconfigured:
nfs.disable: off
performance.readdir-ahead: on
transport.address-family: inet
cluster.favorite-child-policy: mtime
cluster.self-heal-daemon: enable

Comment 9 Poornima G 2018-11-19 06:16:59 UTC
To fix this we would have to clear the md-cache cache every time an AFR brick goes down or comes back up. This would result in a significant performance impact, and this use case is not something that needs to be fixed at the cost of performance. As a workaround, turning stat-prefetch off and then on again will solve the issue. Hence closing as WONTFIX.
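
A minimal sketch of the workaround, assuming the comment refers to the performance.stat-prefetch volume option (the option that enables md-cache). The volume name defaults to the one from this report; the GLUSTER variable and the --run flag are illustrative additions so the commands can be previewed (GLUSTER=echo) before touching a cluster.

```shell
#!/bin/sh
# Sketch: toggle stat-prefetch (md-cache) off and back on for one volume.
# VOLNAME defaults to the volume in this report; GLUSTER and --run are
# hypothetical conveniences, not part of the gluster CLI.
VOLNAME="${VOLNAME:-1x2}"
GLUSTER="${GLUSTER:-gluster}"

toggle_stat_prefetch() {
    # Turn the option off, then on again only if the first command succeeded.
    "$GLUSTER" volume set "$VOLNAME" performance.stat-prefetch off &&
    "$GLUSTER" volume set "$VOLNAME" performance.stat-prefetch on
}

# Only act when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    toggle_stat_prefetch
fi
```

The toggle should be run on a node with the gluster CLI; this sketch does not verify that the stale entries are gone afterwards.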
