+++ This bug was initially created as a clone of Bug #1369077 +++
Description of problem:
Killed the data bricks that held the directory and its data, then renamed the directory from the mount point. The rename was successful.
Note: See the steps below for more information.
Version-Release number of selected component (if applicable):
glusterfs 3.8.2 built on Aug 10 2016 15:34:37
[root@dhcp43-223 new]# gluster vol info
Volume Name: arbiter
Volume ID: 70c7113e-2223-4cd2-acfd-b08b1c376ea4
Number of Bricks: 4 x (2 + 1) = 12
Brick3: 10.70.43.142:/bricks/brick0/abc (arbiter)
Brick6: 10.70.43.142:/bricks/brick1/abc (arbiter)
Brick9: 10.70.43.142:/bricks/brick2/abc (arbiter)
Brick12: 10.70.43.142:/bricks/brick3/abc (arbiter)
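To make the "4 x (2 + 1) = 12" layout concrete, here is a small Python sketch that maps brick numbers to replica sets and roles, then applies a simple majority rule with brick10 and brick11 down. It is only an illustration: the function names are hypothetical, it assumes bricks are grouped into replica sets in the order listed, and it assumes a plain "more than half the bricks up" quorum rule rather than reproducing GlusterFS logic.

```python
# Toy illustration only; not GlusterFS code.
# Assumption: bricks are grouped into replica sets in listing order,
# with the last brick of each set acting as the arbiter.
DATA_PER_SET = 2
ARBITER_PER_SET = 1
SET_SIZE = DATA_PER_SET + ARBITER_PER_SET


def describe_brick(brick_no):
    """Return (replica_set_index, role) for a 1-based brick number."""
    set_index = (brick_no - 1) // SET_SIZE + 1
    position = (brick_no - 1) % SET_SIZE
    role = "arbiter" if position >= DATA_PER_SET else "data"
    return set_index, role


def has_quorum(up_bricks, set_index):
    """Assumed majority rule: more than half of the set must be up."""
    first = (set_index - 1) * SET_SIZE + 1
    members = range(first, first + SET_SIZE)
    up = sum(1 for b in members if b in up_bricks)
    return up * 2 > SET_SIZE


layout = {b: describe_brick(b) for b in range(1, 13)}
arbiters = [b for b, (_, role) in layout.items() if role == "arbiter"]

# Bricks 10 and 11 are killed; everything else stays online.
up = set(range(1, 13)) - {10, 11}
quorum_by_set = {s: has_quorum(up, s) for s in range(1, 5)}

print(arbiters)        # [3, 6, 9, 12]
print(quorum_by_set)   # {1: True, 2: True, 3: True, 4: False}
```

Under these assumptions only the fourth replica set loses quorum, while the other three sets can still accept writes.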
Steps to Reproduce:
1. Create a 4 x (2 + 1) arbiter volume (volume name: arbiter) and mount it using FUSE.
2. On the mount point, create a directory "dir1" and create a file "abc" inside it.
3. Write 100M to the file using dd:
dd if=/dev/urandom of=abc bs=1M count=100
4. Now kill the data bricks of the volume on which the data (i.e. the "abc" file) is present.
In my case, brick10 and brick11 were the data bricks and brick12 was the arbiter brick.
5. All the remaining bricks were online.
6. Now rename the directory from dir1 to dir2 from the mount point using "mv dir1 dir2".
The directory got renamed in spite of the file system being in read-only mode:
#mv dir1 dir2
mv: cannot move ‘dir1’ to ‘dir2’: Read-only file system
The directory shouldn't be renamed.
Tried the same on a plain distribute volume and a plain replicate 1 x 3 volume; the issue was not reproducible.
Reproduced the same issue on a 2 x (2 + 1) volume.
Observed the following after renaming the directory:
[root@dhcp43-165 super]# mv new one
mv: cannot move ‘new’ to ‘one’: Read-only file system
[root@dhcp43-165 super]# ls
ls: cannot access new: No such file or directory
Two directories are created.
--- Additional comment from Karan Sandha on 2016-08-22 08:47 EDT ---
--- Additional comment from Karan Sandha on 2016-08-22 08:48 EDT ---
--- Additional comment from Karan Sandha on 2016-08-22 08:49 EDT ---
--- Additional comment from Karan Sandha on 2016-08-22 08:53 EDT ---
--- Additional comment from Ravishankar N on 2016-08-24 10:02:28 EDT ---
Changing the component to replicate as it also occurs on distribute-replicate volumes. (Karan, feel free to correct me if I am wrong.) Also assigning it to Pranith as he said he'd work on the fix.
Relevant technical discussions on IRC:
<itisravi> pranithk1: are you free to talk about the bug Karan raised?
<itisravi> its a day one issue IMO and not specific to afr.
<pranithk1> itisravi: He said the bug is not recreatable in 3-way replication?
<itisravi> pranithk1: It is..I've requested him to check again.
<itisravi> pranithk1: so if mkdir fails on one replica subvol due to quorum not met etc , dht has no roll back
<itisravi> thats the issue.
<pranithk1> itisravi: Does it happen on plain replicate?
<itisravi> pranithk1: no
<itisravi> pranithk1: its dht renamedir thing..
<pranithk1> itisravi: okay, assign the bug to DHT giving the reason
<itisravi> pranithk1: nithya was saying if afr_inodelk can also have quorum checks, then renamedir will not happen
<itisravi> so we will be good.
<itisravi> instead of partially creating it on the up subvols of DHT
<pranithk1> itisravi: That is not a bad idea, send out a patch. Please tell her it only prevents the odds, won't fix the problem completely
<itisravi> pranithk1: we can do it for afr_entrylk also then no?
<pranithk1> itisravi: Actually the inodelk/finodelk needs to be reworked. I will send the patch
<pranithk1> itisravi: yeah, that too
<itisravi> pranithk1: I see , okay.
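The failure mode discussed above can be sketched with a toy model in Python. This is not GlusterFS code: the `Subvol` class and the `rename_no_rollback` and `rename_with_lock_check` functions are hypothetical names, and the model assumes the rename is applied to each replica subvolume in turn. It only shows why a rename with no rollback leaves the directory under both names, and how a quorum check at lock time would fail the operation before any subvolume is changed.

```python
import errno


class Subvol:
    """Hypothetical stand-in for one replica subvolume under DHT."""

    def __init__(self, name, quorum_met=True):
        self.name = name
        self.quorum_met = quorum_met
        self.dirs = {"dir1"}

    def lock(self):
        # Models the proposed quorum check in the lock path.
        if not self.quorum_met:
            raise OSError(errno.EROFS, "Read-only file system")

    def rename(self, src, dst):
        # A subvolume without quorum rejects the modification.
        if not self.quorum_met:
            raise OSError(errno.EROFS, "Read-only file system")
        self.dirs.remove(src)
        self.dirs.add(dst)


def rename_no_rollback(subvols, src, dst):
    """Rename on each subvolume in turn; on error, undo nothing."""
    for sv in subvols:
        sv.rename(src, dst)


def rename_with_lock_check(subvols, src, dst):
    """Take every lock first, so a quorum failure aborts before any rename."""
    for sv in subvols:
        sv.lock()
    for sv in subvols:
        sv.rename(src, dst)


def run(rename_fn):
    # Four replica subvolumes; the last one has lost quorum.
    subvols = [Subvol("replicate-0"), Subvol("replicate-1"),
               Subvol("replicate-2"), Subvol("replicate-3", quorum_met=False)]
    try:
        rename_fn(subvols, "dir1", "dir2")
        err = None
    except OSError as e:
        err = e.errno
    return err, [sorted(sv.dirs) for sv in subvols]


err_a, state_a = run(rename_no_rollback)
err_b, state_b = run(rename_with_lock_check)
print(err_a == errno.EROFS, state_a)
print(err_b == errno.EROFS, state_b)
```

In this model both variants return EROFS to the caller, but only the first leaves "dir2" on three subvolumes and "dir1" on the fourth. As noted in the discussion, the lock-time check only reduces the odds: a subvolume that loses quorum after the locks are taken would still produce the partial state.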
--- Additional comment from Niels de Vos on 2016-09-12 01:39:42 EDT ---
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
--- Additional comment from Worker Ant on 2016-11-08 07:21:51 EST ---
REVIEW: http://review.gluster.org/15802 (cluster/afr: Fix bugs in [f]inodelk/[f]entrylk) posted (#1) for review on master by Pranith Kumar Karampuri (email@example.com)
Unfortunately, the fix mentioned in comment 5 introduced a regression, and upstream mainline patch http://review.gluster.org/#/c/15984/ has been posted to address it. Moving this BZ back to Post.
Downstream patch merged: https://code.engineering.redhat.com/gerrit/#/c/92316/
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.