Bug 1394244 - directory deletion failing with directory not empty
Summary: directory deletion failing with directory not empty
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Nithya Balachandran
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-11 13:13 UTC by Nag Pavan Chilakam
Modified: 2017-01-03 08:05 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-14 09:38:17 UTC
Target Upstream Version:


Attachments

Description Nag Pavan Chilakam 2016-11-11 13:13:42 UTC
Description of problem:
======================
In my systemic setup, which I started freshly, I have a 4x2 volume spanning 4 nodes.
I have enabled the features below; see the volume status and info output:
[root@dhcp35-191 ~]# gluster v status salvol
Status of volume: salvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/salvol       49153     0          Y       15470
Brick 10.70.37.108:/rhs/brick1/salvol       49152     0          Y       25158
Brick 10.70.35.3:/rhs/brick1/salvol         49152     0          Y       8975 
Brick 10.70.37.66:/rhs/brick1/salvol        49152     0          Y       26096
Brick 10.70.35.191:/rhs/brick2/salvol       49154     0          Y       15489
Brick 10.70.37.108:/rhs/brick2/salvol       49153     0          Y       25177
Brick 10.70.35.3:/rhs/brick2/salvol         49153     0          Y       8994 
Brick 10.70.37.66:/rhs/brick2/salvol        49153     0          Y       26115
Snapshot Daemon on localhost                49155     0          Y       15598
Self-heal Daemon on localhost               N/A       N/A        Y       15509
Quota Daemon on localhost                   N/A       N/A        Y       15545
Snapshot Daemon on 10.70.35.3               49154     0          Y       9091 
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       9014 
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       9045 
Snapshot Daemon on 10.70.37.66              49154     0          Y       26214
Self-heal Daemon on 10.70.37.66             N/A       N/A        Y       26135
Quota Daemon on 10.70.37.66                 N/A       N/A        Y       26167
Snapshot Daemon on 10.70.37.108             49154     0          Y       25276
Self-heal Daemon on 10.70.37.108            N/A       N/A        Y       25201
Quota Daemon on 10.70.37.108                N/A       N/A        Y       25228
 
Task Status of Volume salvol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp35-191 ~]# gluster v statedump
Usage: volume statedump <VOLNAME> [nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]...
[root@dhcp35-191 ~]# 
[root@dhcp35-191 ~]# 
[root@dhcp35-191 ~]# 
[root@dhcp35-191 ~]# gluster v info
 
Volume Name: salvol
Type: Distributed-Replicate
Volume ID: cca6a599-ec09-4409-89d5-7cb00c20856b
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.35.191:/rhs/brick1/salvol
Brick2: 10.70.37.108:/rhs/brick1/salvol
Brick3: 10.70.35.3:/rhs/brick1/salvol
Brick4: 10.70.37.66:/rhs/brick1/salvol
Brick5: 10.70.35.191:/rhs/brick2/salvol
Brick6: 10.70.37.108:/rhs/brick2/salvol
Brick7: 10.70.35.3:/rhs/brick2/salvol
Brick8: 10.70.37.66:/rhs/brick2/salvol
Options Reconfigured:
features.cache-invalidation: on
features.cache-invalidation-timeout: 400
performance.cache-invalidation: on
performance.md-cache-timeout: 300
cluster.shd-max-threads: 10
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
[root@dhcp35-191 ~]# gluster v status
Status of volume: salvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/salvol       49153     0          Y       15470
Brick 10.70.37.108:/rhs/brick1/salvol       49152     0          Y       25158
Brick 10.70.35.3:/rhs/brick1/salvol         49152     0          Y       8975 
Brick 10.70.37.66:/rhs/brick1/salvol        49152     0          Y       26096
Brick 10.70.35.191:/rhs/brick2/salvol       49154     0          Y       15489
Brick 10.70.37.108:/rhs/brick2/salvol       49153     0          Y       25177
Brick 10.70.35.3:/rhs/brick2/salvol         49153     0          Y       8994 
Brick 10.70.37.66:/rhs/brick2/salvol        49153     0          Y       26115
Snapshot Daemon on localhost                49155     0          Y       15598
Self-heal Daemon on localhost               N/A       N/A        Y       15509
Quota Daemon on localhost                   N/A       N/A        Y       15545
Snapshot Daemon on 10.70.35.3               49154     0          Y       9091 
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       9014 
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       9045 
Snapshot Daemon on 10.70.37.108             49154     0          Y       25276
Self-heal Daemon on 10.70.37.108            N/A       N/A        Y       25201
Quota Daemon on 10.70.37.108                N/A       N/A        Y       25228
Snapshot Daemon on 10.70.37.66              49154     0          Y       26214
Self-heal Daemon on 10.70.37.66             N/A       N/A        Y       26135
Quota Daemon on 10.70.37.66                 N/A       N/A        Y       26167
 
Task Status of Volume salvol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp35-191 ~]# 

I then mounted the volume on 5 different clients and ran the following I/O:
From all clients: ===> took a statedump of the fuse mount process every 5 minutes and moved the dumps to a dedicated directory for each host on the mount point (i.e. into the gluster volume)
From all clients: ===> collected top and CPU usage every 2 minutes and appended the output to a per-host file on the mount point (i.e. into the gluster volume); a rough sketch of these loops follows
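
The exact client scripts are not part of this report; roughly, each client ran loops along the lines of the following sketch (the /mnt/salvol mount point, the /var/run/gluster dump directory, the process-matching pattern and the per-host directory layout are assumptions, not the original scripts):

# Sketch of the per-client monitoring loops described above.
HOST=$(hostname)
MNT=/mnt/salvol
mkdir -p "$MNT/statedumps/$HOST"

# Every 5 minutes: trigger a statedump of the fuse mount process (glusterfs
# dumps its state on SIGUSR1) and move the dump files onto the gluster volume.
while true; do
    kill -USR1 "$(pgrep -f 'glusterfs.*salvol' | head -1)"
    sleep 5
    mv /var/run/gluster/glusterdump.* "$MNT/statedumps/$HOST/" 2>/dev/null
    sleep 295
done &

# Every 2 minutes: append a batch-mode top snapshot (CPU usage) to a
# per-host file on the gluster volume.
while true; do
    top -b -n 1 >> "$MNT/top.$HOST.log"
    sleep 120
done &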

Now, from two of the clients, I started creating a deep directory structure in parallel (a sketch of the kind of loop used is shown after the client list below):

Client1: rhs-client11, mounted from 10.70.35.191:salvol
Client2: rhs-client32, mounted from 10.70.37.66:/salvol
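
The creation workload on both clients was along these lines (a sketch only, run from /mnt/salvol/test-arena/same-dir-create; the nesting depth and per-level counts are assumptions inferred from the level1.x/level2.x/... names in the listings below):

# Sketch of the parallel deep-directory creation; depth and counts are
# assumptions based on the directory names visible in the ls output below.
for l1 in $(seq 1 2); do
  for l2 in $(seq 1 2); do
    for l3 in $(seq 1 2); do
      for l4 in $(seq 1 30); do
        for l5 in $(seq 1 100); do
          mkdir -p "level1.$l1/level2.$l2/level3.$l3/level4.$l4/level5.$l5"
        done
      done
    done
  done
done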


However, after only about 3 minutes I stopped the directory creation.
I then ran a parallel rm -rf * from both of these clients.

The rm -rf failed with 'Directory not empty' on both clients:


client1:
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1/level4.4': Directory not empty
[root@rhs-client11 same-dir-create]# 
[root@rhs-client11 same-dir-create]# 
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1/level4.6': Directory not empty
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1': Directory not empty
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# 
[root@rhs-client11 same-dir-create]# 
[root@rhs-client11 same-dir-create]# 
[root@rhs-client11 same-dir-create]# ls level1.1/level2.1/level3.1/level4.26/
level5.100  level5.74  level5.78  level5.82  level5.86  level5.90  level5.94  level5.98
level5.71   level5.75  level5.79  level5.83  level5.87  level5.91  level5.95  level5.99
level5.72   level5.76  level5.80  level5.84  level5.88  level5.92  level5.96
level5.73   level5.77  level5.81  level5.85  level5.89  level5.93  level5.97
[root@rhs-client11 same-dir-create]# ls level1.1/level2.1/level3.1/level4.26/*



client2:
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# owd
-bash: owd: command not found
[root@rhs-client32 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# cd sam
-bash: cd: sam: No such file or directory
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.4’: Directory not empty
[root@rhs-client32 same-dir-create]# 
[root@rhs-client32 same-dir-create]# 
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.6’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.14’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/ level4.15/ 
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/ level4.15/ 
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/ level4.15/ 
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/ level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/  
level5.50/  level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/  
level5.51/  level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/  
level5.52/  level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/  
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/ level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/  
level5.50/  level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/  
level5.51/  level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/  
level5.52/  level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/  
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/ level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/  
level5.50/  level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/  
level5.51/  level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/  
level5.52/  level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/  
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/ level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/  
level5.50/  level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/  
level5.51/  level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/  
level5.52/  level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/  
[root@rhs-client32 same-dir-create]# #ls level1.1/level2.1/level3.1/level4.14/level5.100/
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.16’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1





I then tried to delete only from client2, but it still failed with 'Directory not empty'.

I checked the mount logs but found no new log entries on the retry.



I checked the brick logs while doing the same and found only one log entry, on the last brick, i.e. brick2 of node4:

[2016-11-11 12:40:28.128298] E [MSGID: 113039] [posix.c:3018:posix_open] 0-salvol-posix: open on /rhs/brick2/salvol/.glusterfs/e4/df/e4df858e-c6c6-4fdb-bdbb-e3c07a3187ba, flags: 1025 [No such file or directory]
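
For reference, a quick way to see which bricks still held entries under the problem directory would have been something like the following (a sketch; node and brick paths are taken from the volume info above, the directory path from the rm output, and passwordless ssh to the nodes is assumed):

# Sketch: list the problem directory directly on every brick to see which
# bricks still contain entries under it.
DIR=test-arena/same-dir-create/level1.1/level2.1/level3.1
for node in 10.70.35.191 10.70.37.108 10.70.35.3 10.70.37.66; do
    for brick in /rhs/brick1/salvol /rhs/brick2/salvol; do
        echo "== $node:$brick =="
        ssh "$node" "ls -a $brick/$DIR" 2>/dev/null
    done
done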

Comment 2 Nithya Balachandran 2016-11-11 15:21:55 UTC
Nag,

Can you please leave the system in the same state until Monday? I will take a look at it then.

Thanks,
Nithya

Comment 3 Nag Pavan Chilakam 2016-11-14 09:38:17 UTC
I noticed later that directories were still being created from one of the clients while the delete was being attempted, which would explain why the directory deletion failed with 'Directory not empty'.
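
This is ordinary POSIX behaviour rather than anything specific to DHT: if new entries are created inside a directory while rm -rf is walking it, the final rmdir fails with 'Directory not empty'. An illustrative sketch (not taken from the original setup) that shows the same failure on any local filesystem:

# Illustrative only: concurrent creates make rm -rf fail with ENOTEMPTY.
mkdir -p demo/level1.1
( while true; do mkdir -p "demo/level1.1/new.$RANDOM"; done ) &   # simulates the "creator" client
CREATOR=$!
rm -rf demo     # typically fails: cannot remove 'demo/level1.1': Directory not empty
kill "$CREATOR"; rm -rf demo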
Closing this as NOTABUG.
I will reopen it or raise a new bug if I see this in a healthy setup.
Sorry for the inconvenience.

