Bug 1387494 - Files not deleted from arbiter brick after deletion from the mount point.
Summary: Files not deleted from arbiter brick after deletion from the mount point.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: arbiter
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ravishankar N
QA Contact: Karan Sandha
URL:
Whiteboard:
Duplicates: 1455034 (view as bug list)
Depends On: 1335470
Blocks: 1351530
 
Reported: 2016-10-21 05:48 UTC by Ravishankar N
Modified: 2018-11-19 05:33 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
If the data bricks of the arbiter volume get filled up, further creation of new entries might succeed in the arbiter brick despite failing on the data bricks with ENOSPC and the application (client) itself receiving an error on the mount point. Thus the arbiter bricks might have more entries. Now when an rm -rf is performed from the client, if the readdir (as a part of rm -rf) gets served on the data brick, it might delete only those entries and not the ones present only in the arbiter. When the rmdir on the parent dir of these entries comes, it won't succeed on the arbiter (errors out with ENOTEMPTY), leading to it not being removed from arbiter. Workaround: If the deletion from the mount did not complain but the bricks still contain the directories, we would need to remove the directory and its associated gfid symlink from the back end. If the directory contains files, they (file + its gfid hardlink) would need to be removed too.
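The workaround above involves the brick's internal .glusterfs gfid store: a directory's backend entry is a symlink and a file's is a hardlink, both kept under a two-level path derived from the gfid. As a hedged illustration (the brick path and gfid below are made-up examples, not from this bug), the mapping works like this:

```shell
#!/bin/sh
# Sketch only: maps a gfid (canonical uuid form, as decoded from
# "getfattr -n trusted.gfid -e hex") to its entry under the brick's
# .glusterfs directory. /bricks/arb is a hypothetical brick root.

gfid_to_backend_path() {
    brick=$1
    gfid=$2                                      # e.g. d2b7d6e1-19be-...
    b1=$(printf '%s' "$gfid" | cut -c1-2)        # first hex byte
    b2=$(printf '%s' "$gfid" | cut -c3-4)        # second hex byte
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$b1" "$b2" "$gfid"
}

gfid_to_backend_path /bricks/arb d2b7d6e1-19be-4dd2-a4ca-3b0f195bda0a
# -> /bricks/arb/.glusterfs/d2/b7/d2b7d6e1-19be-4dd2-a4ca-3b0f195bda0a
```

Per the workaround, one would remove both the stale directory (or file) under the brick root and this corresponding .glusterfs entry.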
Clone Of: 1335470
Environment:
Last Closed: 2018-11-19 05:33:10 UTC
Target Upstream Version:



Description Ravishankar N 2016-10-21 05:48:44 UTC
+++ This bug was initially created as a clone of Bug #1335470 +++

Description of problem:
The files were deleted from the arbiter brick and brick1 after being removed from the mount point, but they still remained on the third brick.


Version-Release number of selected component (if applicable):
Logs placed at qe@rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports

How reproducible: 1


Steps to Reproduce:
1. Mounted the volume on the client using a FUSE mount on server3 (43.192).
2. Created files using:

for ((i=1; i<=100; i++))
do
    mkdir -pv directory$i
    cd directory$i
    dd if=/dev/urandom of=file$i bs=1M count=500 status=progress
    cd ..
    cd directory$i    # back into directory$i, so the next mkdir nests one level deeper
    echo "File Renamed"
    mv file$i renamed$i
done

3. After the disk is full, delete the two files from directory1 and directory2.
4. Shut down server2 (43.157) for 1 minute.
5. Switch the server back on and check for the files on all the bricks.


Actual results:
Files were deleted from the arbiter brick and server2 (43.157), but remained undeleted on server3 (43.192).

Files left undeleted:

[root@dhcp43-192 directory1]# ls
directory2  directory82  file82
[root@dhcp43-192 directory1]# cd directory2/
[root@dhcp43-192 directory2]# ls
directory3  directory81  file81
[root@dhcp43-192 directory2]# ls -ll
total 0
drwxr-xr-x. 4 root root 54 May 12 19:49 directory3
drwxr-xr-x. 2 root root  6 May 12 19:16 directory81
-rw-r--r--. 2 root root  0 May 12 19:16 file81

Expected results:
All files should be deleted from all the bricks.

Additional info:
Found this error while deleting the files from the mount point.

[root@dhcp42-93 fuse]# rm -rf directory1/
rm: cannot remove ‘directory1/directory2/directory3/directory4/directory5/directory6/directory7/directory8/directory9/directory10/directory11/directory12/directory13/directory14/directory15/directory16/directory17/directory18/directory19/directory20/directory21/directory22/directory23/directory24/directory25/directory26/directory27/directory28/directory29/directory30/directory31/directory32/directory33/directory34/directory35/directory36/directory37/directory38/directory39/directory40/directory41’: Directory not empty

Comment 5 Karan Sandha 2016-11-10 13:35:27 UTC
I have recreated the issue again and placed all the client and server logs in the bug folder:
rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/1335470/repro

Gluster version:-

[root@dhcp47-141 tmp]# gluster --version
glusterfs 3.8.4 built on Oct 24 2016 11:13:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Comment 6 Ravishankar N 2016-11-12 07:52:40 UTC
RCA:
The script creates 500 MB files and fills up the two data bricks midway. Though the subsequent mkdirs/creates fail on the mount with ENOTCONN, the arbiter brick still gets filled up with the dirs and 0-byte files. Thus the arbiter has more files and dirs than the data bricks. Also, since the "cd .." in the script now gets ENOTCONN (the fop succeeded only on the arbiter), files/dirs are created on the arbiter in a haphazard manner (i.e. not in the order the script intended).

Now when brick2 is brought down and rm -rf is done from the mount, the readdir is served from brick1. The dentries present in b1 and b3 are deleted. But when the rmdir comes on the parent, it fails on the arbiter brick with ENOTEMPTY because it has extra files/dirs. Thus at the end of rm -rf, the dirs/files are still present on the arbiter.
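One way to observe the inconsistency the RCA describes is to diff the entry lists of a data brick and the arbiter brick. The sketch below is illustrative only: it simulates two brick roots under a temp directory (on a real setup the actual brick paths would be used), and the helper excludes the internal .glusterfs directory:

```shell
#!/bin/sh
# List entries present on the arbiter brick but missing from a data
# brick -- the extra dentries that later make rmdir fail with
# ENOTEMPTY. Brick paths here are simulated stand-ins.

entries_only_on_arbiter() {
    data=$1; arb=$2
    t1=$(mktemp); t2=$(mktemp)
    (cd "$data" && find . ! -path './.glusterfs*' | sort) > "$t1"
    (cd "$arb"  && find . ! -path './.glusterfs*' | sort) > "$t2"
    comm -13 "$t1" "$t2"        # lines unique to the arbiter listing
    rm -f "$t1" "$t2"
}

# Simulated bricks: the arbiter has directory82/file82 that the data
# brick never got because of ENOSPC.
work=$(mktemp -d)
mkdir -p "$work/data_brick/directory1" \
         "$work/arbiter_brick/directory1/directory82"
touch "$work/arbiter_brick/directory1/file82"

entries_only_on_arbiter "$work/data_brick" "$work/arbiter_brick"
# prints:
#   ./directory1/directory82
#   ./directory1/file82
rm -rf "$work"
```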

Karan, I think there is a typo in the BZ description. It should read 'Files not deleted from the arbiter brick..'. Could you confirm and change it?

Comment 8 Atin Mukherjee 2016-11-16 11:22:37 UTC
Based on the discussion http://post-office.corp.redhat.com/archives/gluster-storage-release-team/2016-November/msg00084.html resetting the flags and taking this BZ out of 3.2.0.

Comment 15 Ravishankar N 2017-05-24 12:53:52 UTC
*** Bug 1455034 has been marked as a duplicate of this bug. ***

Comment 18 Ravishankar N 2018-11-19 05:33:10 UTC
Given that this situation can be hit only when the data disks are full, and there is no data loss of any kind or false reporting of success to the application for these entry operations, this bug is not a priority right now and is being closed. Entry FOP consistency will still be undertaken as part of bug 1593242 and upstream GitHub issue https://github.com/gluster/glusterfs/issues/502.

