Description of problem:
======================

In my systemic setup, which I started freshly, I have a 4x2 volume spanning 4 nodes. I have enabled the features below; see the volume status and info:

[root@dhcp35-191 ~]# gluster v status salvol
Status of volume: salvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/salvol       49153     0          Y       15470
Brick 10.70.37.108:/rhs/brick1/salvol       49152     0          Y       25158
Brick 10.70.35.3:/rhs/brick1/salvol         49152     0          Y       8975
Brick 10.70.37.66:/rhs/brick1/salvol        49152     0          Y       26096
Brick 10.70.35.191:/rhs/brick2/salvol       49154     0          Y       15489
Brick 10.70.37.108:/rhs/brick2/salvol       49153     0          Y       25177
Brick 10.70.35.3:/rhs/brick2/salvol         49153     0          Y       8994
Brick 10.70.37.66:/rhs/brick2/salvol        49153     0          Y       26115
Snapshot Daemon on localhost                49155     0          Y       15598
Self-heal Daemon on localhost               N/A       N/A        Y       15509
Quota Daemon on localhost                   N/A       N/A        Y       15545
Snapshot Daemon on 10.70.35.3               49154     0          Y       9091
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       9014
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       9045
Snapshot Daemon on 10.70.37.66              49154     0          Y       26214
Self-heal Daemon on 10.70.37.66             N/A       N/A        Y       26135
Quota Daemon on 10.70.37.66                 N/A       N/A        Y       26167
Snapshot Daemon on 10.70.37.108             49154     0          Y       25276
Self-heal Daemon on 10.70.37.108            N/A       N/A        Y       25201
Quota Daemon on 10.70.37.108                N/A       N/A        Y       25228

Task Status of Volume salvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-191 ~]# gluster v statedump
Usage: volume statedump <VOLNAME> [nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]...
[root@dhcp35-191 ~]#
[root@dhcp35-191 ~]#
[root@dhcp35-191 ~]#
[root@dhcp35-191 ~]# gluster v info

Volume Name: salvol
Type: Distributed-Replicate
Volume ID: cca6a599-ec09-4409-89d5-7cb00c20856b
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.35.191:/rhs/brick1/salvol
Brick2: 10.70.37.108:/rhs/brick1/salvol
Brick3: 10.70.35.3:/rhs/brick1/salvol
Brick4: 10.70.37.66:/rhs/brick1/salvol
Brick5: 10.70.35.191:/rhs/brick2/salvol
Brick6: 10.70.37.108:/rhs/brick2/salvol
Brick7: 10.70.35.3:/rhs/brick2/salvol
Brick8: 10.70.37.66:/rhs/brick2/salvol
Options Reconfigured:
features.cache-invalidation: on
features.cache-invalidation-timeout: 400
performance.cache-invalidation: on
performance.md-cache-timeout: 300
cluster.shd-max-threads: 10
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

[root@dhcp35-191 ~]# gluster v status
Status of volume: salvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/salvol       49153     0          Y       15470
Brick 10.70.37.108:/rhs/brick1/salvol       49152     0          Y       25158
Brick 10.70.35.3:/rhs/brick1/salvol         49152     0          Y       8975
Brick 10.70.37.66:/rhs/brick1/salvol        49152     0          Y       26096
Brick 10.70.35.191:/rhs/brick2/salvol       49154     0          Y       15489
Brick 10.70.37.108:/rhs/brick2/salvol       49153     0          Y       25177
Brick 10.70.35.3:/rhs/brick2/salvol         49153     0          Y       8994
Brick 10.70.37.66:/rhs/brick2/salvol        49153     0          Y       26115
Snapshot Daemon on localhost                49155     0          Y       15598
Self-heal Daemon on localhost               N/A       N/A        Y       15509
Quota Daemon on localhost                   N/A       N/A        Y       15545
Snapshot Daemon on 10.70.35.3               49154     0          Y       9091
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       9014
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       9045
Snapshot Daemon on 10.70.37.108             49154     0          Y       25276
Self-heal Daemon on 10.70.37.108            N/A       N/A        Y       25201
Quota Daemon on 10.70.37.108                N/A       N/A        Y       25228
Snapshot Daemon on 10.70.37.66              49154     0          Y       26214
Self-heal Daemon on 10.70.37.66             N/A       N/A        Y       26135
Quota Daemon on 10.70.37.66                 N/A       N/A        Y       26167

Task Status of Volume salvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-191 ~]#

I then mounted the volume on 5 different clients and ran the following I/O (rough sketches of the collection and directory-creation loops are given after the client transcripts below):

From all clients ===> started taking a statedump of the FUSE mount process every 5 minutes and moving the dumps into a dedicated directory for each host on the mount point (so into the Gluster volume).
From all clients ===> collected top and CPU usage every 2 minutes and appended the contents to a file for each host on the mount point (so into the Gluster volume).

Then, from two of the clients, I started creating a deep directory structure in parallel:
Client1: rhs-client11, mounted from 10.70.35.191:salvol
Client2: rhs-client32, mounted from 10.70.37.66:/salvol

However, after only about 3 minutes I stopped the directory creation and then ran a parallel rm -rf * from both of these clients. The rm -rf failed with "Directory not empty" on both clients.

client1:
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1/level4.4': Directory not empty
[root@rhs-client11 same-dir-create]#
[root@rhs-client11 same-dir-create]#
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1/level4.6': Directory not empty
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]# rm -rf *
rm: cannot remove `level1.1/level2.1/level3.1': Directory not empty
[root@rhs-client11 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client11 same-dir-create]#
[root@rhs-client11 same-dir-create]#
[root@rhs-client11 same-dir-create]#
[root@rhs-client11 same-dir-create]# ls level1.1/level2.1/level3.1/level4.26/
level5.100  level5.74  level5.78  level5.82  level5.86  level5.90  level5.94  level5.98
level5.71   level5.75  level5.79  level5.83  level5.87  level5.91  level5.95  level5.99
level5.72   level5.76  level5.80  level5.84  level5.88  level5.92  level5.96
level5.73   level5.77  level5.81  level5.85  level5.89  level5.93  level5.97
[root@rhs-client11 same-dir-create]# ls level1.1/level2.1/level3.1/level4.26/*

client2:
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# owd
-bash: owd: command not found
[root@rhs-client32 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# cd sam
-bash: cd: sam: No such file or directory
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  dir.rhs-client32.lab.eng.blr.redhat.com.log
level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.4’: Directory not empty
[root@rhs-client32 same-dir-create]#
[root@rhs-client32 same-dir-create]#
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.6’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# pwd
/mnt/salvol/test-arena/same-dir-create
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.14’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/  level4.15/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/  level4.15/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.1
level4.14/  level4.15/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/  level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/
level5.50/   level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/
level5.51/   level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/
level5.52/   level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/  level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/
level5.50/   level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/
level5.51/   level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/
level5.52/   level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/  level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/
level5.50/   level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/
level5.51/   level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/
level5.52/   level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/
[root@rhs-client32 same-dir-create]# ls level1.1/level2.1/level3.1/level4.14/level5.
level5.100/  level5.53/  level5.57/  level5.61/  level5.65/  level5.69/  level5.73/  level5.77/  level5.81/  level5.85/  level5.89/  level5.93/  level5.97/
level5.50/   level5.54/  level5.58/  level5.62/  level5.66/  level5.70/  level5.74/  level5.78/  level5.82/  level5.86/  level5.90/  level5.94/  level5.98/
level5.51/   level5.55/  level5.59/  level5.63/  level5.67/  level5.71/  level5.75/  level5.79/  level5.83/  level5.87/  level5.91/  level5.95/  level5.99/
level5.52/   level5.56/  level5.60/  level5.64/  level5.68/  level5.72/  level5.76/  level5.80/  level5.84/  level5.88/  level5.92/  level5.96/
[root@rhs-client32 same-dir-create]# #ls level1.1/level2.1/level3.1/level4.14/level5.100/
[root@rhs-client32 same-dir-create]# rm -rf *
rm: cannot remove ‘level1.1/level2.1/level3.1/level4.16’: Directory not empty
[root@rhs-client32 same-dir-create]# ls
dir.rhs-client11.lab.eng.blr.redhat.com.log  level1.1

I then tried to delete only from client2, but it still failed with "Directory not empty".

I checked the mount logs but found no new log entries on retry.

I also checked the brick logs while retrying the delete and found only one log entry, on the last brick, i.e. brick2 of node4:

[2016-11-11 12:40:28.128298] E [MSGID: 113039] [posix.c:3018:posix_open] 0-salvol-posix: open on /rhs/brick2/salvol/.glusterfs/e4/df/e4df858e-c6c6-4fdb-bdbb-e3c07a3187ba, flags: 1025 [No such file or directory]
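For reference, the statedump/top collection scripts themselves are not attached to this report. A minimal sketch of what such collection loops could look like is below; the directory names under /mnt/salvol/test-arena, the pgrep pattern, and the assumption that a SIGUSR1 to the glusterfs client process writes its dump under /var/run/gluster are illustrative guesses, not details taken from the actual setup.

    #!/bin/bash
    # Hypothetical monitoring loops (NOT the exact scripts used for this report).
    # Assumes the volume is FUSE-mounted at /mnt/salvol and that SIGUSR1 sent to
    # the glusterfs client process produces a statedump under /var/run/gluster.

    MOUNT=/mnt/salvol
    HOST=$(hostname)
    DUMPDIR="$MOUNT/test-arena/statedumps/$HOST"   # per-host dir on the volume (assumed name)
    mkdir -p "$DUMPDIR"

    # Statedump of the FUSE mount process every 5 minutes, moved onto the volume.
    (
      while true; do
          fuse_pid=$(pgrep -f "glusterfs.*$MOUNT" | head -n 1)
          [ -n "$fuse_pid" ] && kill -USR1 "$fuse_pid"
          sleep 5                                    # give the dump time to be written
          mv /var/run/gluster/glusterdump.* "$DUMPDIR"/ 2>/dev/null
          sleep 295
      done
    ) &

    # top and CPU usage every 2 minutes, appended to a per-host file on the volume.
    (
      while true; do
          { date; top -b -n 1 | head -n 20; } >> "$MOUNT/test-arena/top.$HOST.log"
          sleep 120
      done
    ) &

    wait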
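The exact directory-creation script is likewise not attached. Based on the level<N>.<M> names and the dir.<hostname>.log files seen in the listings above, a loop roughly like the following, run from both clients at once in the same same-dir-create directory, could produce a similar layout; the depth (5 levels) comes from the listings, while the widths and the log format are guesses.

    #!/bin/bash
    # Hypothetical deep-directory creation loop matching the level<N>.<M> names
    # seen above; the per-level widths here are assumptions.

    cd /mnt/salvol/test-arena/same-dir-create || exit 1
    LOG="dir.$(hostname).log"    # matches the dir.<hostname>.log files seen above

    for l1 in $(seq 1 5); do
      for l2 in $(seq 1 5); do
        for l3 in $(seq 1 5); do
          for l4 in $(seq 1 30); do
            for l5 in $(seq 1 100); do
              d="level1.$l1/level2.$l2/level3.$l3/level4.$l4/level5.$l5"
              mkdir -p "$d" && echo "$(date +%T) created $d" >> "$LOG"
            done
          done
        done
      done
    done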
Nag, Can you please leave the system in the same state until Monday? I will take a look at it then. Thanks, Nithya
I noticed later that directories were still being created from one of the clients while the delete was being attempted, which could be the reason the directory deletion failed. Closing this as Not A Bug. I will reopen or raise a new bug if I see this on a healthy setup. Sorry for the inconvenience.