+++ This bug was initially created as a clone of Bug #1369077 +++

Description of problem:
Killed the data bricks that held the directory and its data, then renamed the directory from the mount point. The rename succeeded even though the client reported a read-only file system.
Note: see the Steps to Reproduce below for details.

Version-Release number of selected component (if applicable):
# gluster --version
glusterfs 3.8.2 built on Aug 10 2016 15:34:37

How reproducible:
3/3

[root@dhcp43-223 new]# gluster vol info

Volume Name: arbiter
Type: Distributed-Replicate
Volume ID: 70c7113e-2223-4cd2-acfd-b08b1c376ea4
Status: Started
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.43.223:/bricks/brick0/abc
Brick2: 10.70.42.58:/bricks/brick0/abc
Brick3: 10.70.43.142:/bricks/brick0/abc (arbiter)
Brick4: 10.70.43.223:/bricks/brick1/abc
Brick5: 10.70.42.58:/bricks/brick1/abc
Brick6: 10.70.43.142:/bricks/brick1/abc (arbiter)
Brick7: 10.70.43.223:/bricks/brick2/abc
Brick8: 10.70.42.58:/bricks/brick2/abc
Brick9: 10.70.43.142:/bricks/brick2/abc (arbiter)
Brick10: 10.70.43.223:/bricks/brick3/abc
Brick11: 10.70.42.58:/bricks/brick3/abc
Brick12: 10.70.43.142:/bricks/brick3/abc (arbiter)
Options Reconfigured:
client.event-threads: 4
server.event-threads: 4
cluster.lookup-optimize: on
transport.address-family: inet
performance.readdir-ahead: on

Steps to Reproduce:
1. Create a 4 x (2 + 1) arbiter volume (volume name "arbiter") and mount it using FUSE.
2. On the mount point, create a directory "dir1" and, inside it, a file "abc".
3. Write 100M to the file using dd: dd if=/dev/urandom of=abc bs=1M count=100
4. Kill the data bricks of the replica subvolume that holds the file "abc"; in my case brick10 and brick11 were the data bricks and brick12 was the arbiter brick.
5. Keep all the remaining bricks online.
6. Rename the directory from the mount point: mv dir1 dir2
(A command-line sketch of these steps is included after the Additional info section below.)

Actual results:
The directory got renamed in spite of the mount reporting a read-only file system:
# mv dir1 dir2
mv: cannot move ‘dir1’ to ‘dir2’: Read-only file system
# ls
dir2

Expected results:
The directory should not be renamed.

Additional info:
Tried the same on a plain distribute volume and a plain 1 x 3 replicate volume; the issue was not reproducible there.

Reproduced the same issue on a 2 x (2 + 1) volume and observed that after renaming the directory, two directories are created:
[root@dhcp43-165 super]# mv new one
mv: cannot move ‘new’ to ‘one’: Read-only file system
[root@dhcp43-165 super]#
[root@dhcp43-165 super]# ls
ls: cannot access new: No such file or directory
new  one
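For reference, here is a rough command-line sketch of the reproduction steps above. It is a minimal sketch under a few assumptions, not a transcript from the report: the mount point /mnt/arbiter is assumed, and the brick process IDs have to be looked up with "gluster volume status arbiter" on the nodes that host the data bricks of the affected replica set (10.70.43.223 and 10.70.42.58 for brick3 in the layout above).

# Create and start the 4 x (2 + 1) arbiter volume (same bricks as in the vol info above)
gluster volume create arbiter replica 3 arbiter 1 \
    10.70.43.223:/bricks/brick0/abc 10.70.42.58:/bricks/brick0/abc 10.70.43.142:/bricks/brick0/abc \
    10.70.43.223:/bricks/brick1/abc 10.70.42.58:/bricks/brick1/abc 10.70.43.142:/bricks/brick1/abc \
    10.70.43.223:/bricks/brick2/abc 10.70.42.58:/bricks/brick2/abc 10.70.43.142:/bricks/brick2/abc \
    10.70.43.223:/bricks/brick3/abc 10.70.42.58:/bricks/brick3/abc 10.70.43.142:/bricks/brick3/abc
gluster volume start arbiter

# FUSE-mount the volume (the mount point is an assumption)
mount -t glusterfs 10.70.43.223:/arbiter /mnt/arbiter

# Create the directory and the file, and write 100M to it
mkdir /mnt/arbiter/dir1
dd if=/dev/urandom of=/mnt/arbiter/dir1/abc bs=1M count=100

# Kill only the two data bricks of the replica set that holds "abc"
# (brick10 and brick11 here); the arbiter brick and all other bricks stay up.
# Look up the brick PIDs with "gluster volume status arbiter" and run, on the
# respective nodes:
#   kill -9 <pid of 10.70.43.223:/bricks/brick3/abc>
#   kill -9 <pid of 10.70.42.58:/bricks/brick3/abc>

# Rename the directory from the mount point
mv /mnt/arbiter/dir1 /mnt/arbiter/dir2
# mv reports "Read-only file system", yet "ls /mnt/arbiter" shows dir2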
--- Additional comment from Karan Sandha on 2016-08-22 08:47 EDT ---

--- Additional comment from Karan Sandha on 2016-08-22 08:48 EDT ---

--- Additional comment from Karan Sandha on 2016-08-22 08:49 EDT ---

--- Additional comment from Karan Sandha on 2016-08-22 08:53 EDT ---

--- Additional comment from Ravishankar N on 2016-08-24 10:02:28 EDT ---

Changing the component to replicate as it occurs on distribute-replicate also. (Karan, feel free to correct me if I am wrong.) Also assigning it to Pranith as he said he'd work on the fix.

Relevant technical discussion from IRC:

<itisravi> pranithk1: are you free to talk about the bug Karan raised?
<itisravi> its a day one issue IMO and not specific to afr.
<itisravi> s/afr/arbiter
<pranithk1> itisravi: He said the bug is not recreatable in 3-way replication?
<itisravi> pranithk1: It is..I've requested him to check again.
<itisravi> pranithk1: so if mkdir fails on one replica subvol due to quorum not met etc , dht has no roll back
<itisravi> thats the issue.
<pranithk1> itisravi: Does it happen on plain replicate?
<itisravi> pranithk1: no
<itisravi> pranithk1: its dht renamedir thing..
<pranithk1> itisravi: okay, assign the bug to DHT giving the reason
<itisravi> pranithk1: nithya was saying if afr_inodelk can also have quorum checks, then renamedir will not happen
<itisravi> so we will be good.
<itisravi> instead of partially creating it on the up subvols of DHT
<pranithk1> itisravi: That is not a bad idea, send out a patch. Please tell her it only prevents the odds, won't fix the problem completely
<itisravi> pranithk1: we can do it for afr_entrylk also then no?
<pranithk1> itisravi: Actually the inodelk/finodelk needs to be reworked. I will send the patch
<pranithk1> itisravi: yeah, that too
<itisravi> pranithk1: I see , okay.

--- Additional comment from Niels de Vos on 2016-09-12 01:39:42 EDT ---

All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html

--- Additional comment from Worker Ant on 2016-11-08 07:21:51 EST ---

REVIEW: http://review.gluster.org/15802 (cluster/afr: Fix bugs in [f]inodelk/[f]entrylk) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
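The partial rename that DHT cannot roll back (as discussed in the IRC log above) can also be observed directly on the brick back-ends. The following is a hedged illustration against the original 4 x (2 + 1) layout, assuming the brick3 data bricks were the ones killed; the exact per-brick contents depend on which replica subvolumes had quorum at the time of the rename:

# On a replica set whose data bricks stayed up, the rename went through:
ls /bricks/brick0/abc        # expected to show dir2

# On a data brick that was killed, the old directory name is still present:
ls /bricks/brick3/abc        # expected to still show dir1

From the mount point this shows up as both the old and the new directory name being listed, as in the 2 x (2 + 1) output in the description above.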
Unfortunately the fix mentioned in comment 5 has introduced a regression, and the upstream mainline patch http://review.gluster.org/#/c/15984/ has been posted to address it. Moving this BZ back to POST.
Downstream patch merged: https://code.engineering.redhat.com/gerrit/#/c/92316/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html