Description of problem:
=======================
Sometimes rm -rf does not remove stale link files from all bricks. As a result, the user is unable to create a file with the same name as a stale link file.

Version-Release number:
=========================
3.6.0.24-1.el6rhs.x86_64

How reproducible:
=================
Intermittent

Steps to Reproduce:
====================
1. Create and mount a distributed volume (mount it on multiple clients).
2. Create a few files on the mount point.
3. Rename the files multiple times, including once from multiple mount points.
4. From the mount, execute rm -rf *
5. Try to create a file after that.

[root@OVM1 snap]# ls -l
total 0
[root@OVM1 snap]# touch e6
[root@OVM1 snap]# `ls` >> e6
[root@OVM1 snap]# cat e6
[root@OVM1 snap]# echo "abc" > e6
[root@OVM1 snap]# cat e6
cat: e6: No such file or directory
[root@OVM1 snap]# cat e6
cat: e6: Input/output error
[root@OVM1 snap]# ls -l e6
ls: cannot access e6: No such file or directory
[root@OVM1 snap]# ls -l
total 0

Verified on the bricks:

[root@OVM3 snap]# ls -l /brick2/*
/brick2/b1:
total 0
---------T 2 root root 0 Jul 7 20:33 e6

/brick2/b2:
total 0
---------T 2 root root 0 Jul 7 20:34 e6

/brick2/b3:
total 0

Actual results:
===============
Due to the old stale link file, it is not possible to create a file with the same name.

Expected results:
================
rm -rf should remove the old stale link file, and new file creation should not fail due to the presence of an old link file.
Tried on glusterfs 3.6.0.27 to verify the issue. Here are the steps:

1) Created a 2x2 volume.

Volume Name: test1
Type: Distributed-Replicate
Volume ID: 5e206611-f6a3-4f88-8a4b-e4854264e805
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.122.11:/brick/1
Brick2: 192.168.122.11:/brick/2
Brick3: 192.168.122.11:/brick/3
Brick4: 192.168.122.11:/brick/4
Options Reconfigured:
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
[root@vm11 brick]#

2) Created a few files and renamed them so that linkto files would be created, then explicitly unlinked the data files from the bricks.

[root@vm11 brick]# ll *
1:
total 4
---------T. 2 root root 0 Aug 27 03:04 zile2

2:
total 4
---------T. 2 root root 0 Aug 27 03:04 zile2

3:
total 8
---------T. 2 root root 0 Aug 27 03:04 zile3
---------T. 2 root root 0 Aug 27 03:04 zile7

4:
total 8
---------T. 2 root root 0 Aug 27 03:04 zile3
---------T. 2 root root 0 Aug 27 03:04 zile7

3) From the mount point, issued "touch zile{1..10}".

[root@vm11 mnt1]# ll
total 0
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile1
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile10
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile2
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile3
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile4
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile5
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile6
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile7
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile8
-rw-r--r--. 1 root root 0 Aug 27 03:07 zile9
[root@vm11 mnt1]#
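
For anyone checking bricks for leftover link files like the ones in the listings above, here is a minimal sketch (not part of the original report or of glusterfs). It walks a brick directory and reports zero-byte regular files whose permission bits are exactly the sticky bit, i.e. the "---------T" entries. The function name find_linkto_candidates and the choice to skip the brick's internal .glusterfs directory are assumptions made for illustration. A match is only a candidate: confirming a real DHT link file would also require checking the trusted.glusterfs.dht.linkto extended attribute as root, which this sketch does not do.

```python
import os
import stat


def find_linkto_candidates(brick_root):
    """Return sorted paths under brick_root that look like DHT link files.

    Heuristic only (an assumption for illustration): a regular file that is
    empty and whose mode bits are exactly the sticky bit (shown as ---------T).
    """
    candidates = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Skip the brick's internal .glusterfs directory, which holds
        # hard links to the same files and would produce duplicates.
        dirnames[:] = [d for d in dirnames if d != ".glusterfs"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if not stat.S_ISREG(st.st_mode):
                continue
            only_sticky = stat.S_IMODE(st.st_mode) == stat.S_ISVTX
            if only_sticky and st.st_size == 0:
                candidates.append(path)
    return sorted(candidates)


if __name__ == "__main__":
    import sys

    # Usage: python find_linkto.py /brick2/b1 /brick2/b2 ...
    for root in sys.argv[1:]:
        for path in find_linkto_candidates(root):
            print(path)
```

Running it against each brick directory after the rm -rf would list any names that are still occupied by a link file on that brick.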
Verified with 3.6.0.28-1.el6rhs.x86_64. Working as expected, hence moving to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html