Description of problem:
=======================
I have enabled md-cache with the private build available at http://etherpad.corp.redhat.com/md-cache-3-2

I created a 2x2 volume and mounted it on two different clients.

From one client (client1) I created a 3GB txt file and triggered the creation of about 10 million hardlinks to it as below (the file mnt-distrep.log is the original txt file):

for i in {1..10000000};do ln mnt-distrep.log hardlink.$i;done

I had the volume mounted on another client, say client2. From client2 I was accessing files using ll or ls -l, and everything was fine.

I then changed the file permissions of one of the hardlink files, say hardlink.90, with chmod 0777 hardlink.90 (all this while the hardlink creation was still going on on client1).

Now when I did an ls -l, I saw that some of the hardlink files (which had probably been created recently) were shown with only the sticky bit: ---------T

I saw that in the 2x2 volume, all files were created on one replica pair, while the ---------T file for each file was created on the other replica pair. This is due to the cached and hashed subvolume concept of DHT, i.e. data files and link-to files.

I then chose one file which was displaying ---------T on client2 and kept doing ll or ls -l on it. The output was inconsistently toggling between the data file and the link-to file as below:

[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21860 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21866 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21869 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43729 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21880 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43748 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21888 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43758 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43764 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43768 root root 3156080712 Oct 12 17:29 hardlink.43363

My understanding is that, in this case, instead of serving the file info from the cache, the client is fetching it from the bricks, and it is inconsistently fetching it from the hashed brick (the link-to file), which is not right.

Also, if I stopped the hardlink creation on the first client (by pressing ctrl+c on the first client's command line), I now see only the right file permission displayed consistently, as below:

[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21860 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21866 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21869 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43729 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21880 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43748 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
---------T. 21888 root root 0 Oct 12 17:33 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43758 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43764 root root 3156080712 Oct 12 17:29 hardlink.43363
[root@dhcp35-180 dhcp35-179.lab.eng.blr.redhat.com]# ll hardlink.43363
-rwxrwxrwx. 43768 root root 3156080712 Oct 12 17:29 hardlink.43363

Version-Release number of selected component (if applicable):
====
as in etherpad

How reproducible:

Steps to Reproduce:
1. Create an md-cache setup on a 2x2 volume.
2. Mount the volume on two clients.
3. From client1: create a file and start creating hardlinks to the file in a loop of, say, 1 lakh (100,000) hardlinks.
4. From another client, say client2, change the file permissions of a hardlink that has already been created, say hardlink.10.
5. As long as the creation of the 1 lakh hardlinks is in progress, keep issuing ls -l on hardlink.20 (some hardlink which has already been created).

You will see that the file permissions are sometimes shown as expected and sometimes with just the sticky bit.

Actual results:

Expected results:

Additional info:
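The hardlink-creation step from the description can be sketched as a small reusable function. This is only an illustration of the loop quoted in the report: the function name make_hardlinks, the default prefix, and the mount path in the usage comments are assumptions, not part of the original report.

```shell
#!/bin/sh
# Sketch only: make_hardlinks is a hypothetical helper wrapping the
# "for i in ...; do ln ...; done" loop quoted in the report.
#
# make_hardlinks SRC COUNT [PREFIX]
#   Creates COUNT hardlinks to SRC in the current directory,
#   named PREFIX.1 .. PREFIX.COUNT (PREFIX defaults to "hardlink").
make_hardlinks() {
    src=$1
    count=$2
    prefix=${3:-hardlink}
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first failure so a broken mount is noticed early.
        ln "$src" "$prefix.$i" || return 1
        i=$((i + 1))
    done
}

# Usage on client1 (the mount point /mnt/distrep is an assumed path):
#   cd /mnt/distrep && make_hardlinks mnt-distrep.log 100000
# Meanwhile on client2, in the same directory of its own mount:
#   chmod 0777 hardlink.10
#   ls -l hardlink.20
```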
It's a nice finding. Fix posted upstream: http://review.gluster.org/#/c/15789/
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/91957/
Followed the steps to reproduce:

Steps to Reproduce:
1. Create an md-cache setup on a 2x2 volume.
2. Mount the volume on two clients.
3. From client1: create a file and start creating hardlinks to the file in a loop of, say, 1 lakh (100,000) hardlinks.
4. From another client, say client2, change the file permissions of a hardlink that has already been created, say hardlink.10.
5. As long as the creation of the 1 lakh hardlinks is in progress, keep issuing ls -l on hardlink.20 (some hardlink which has already been created).

You will see that the file permissions are sometimes shown as expected and sometimes with just the sticky bit.

I created a 2GB file and then created around 2 lakh (200,000) hardlinks to the file in a loop, first over a cifs mount and then over a fuse mount, using two clients. I changed the file permissions of a hardlink file and kept running ls -l over the mount point. I did not see any sticky bits in place of the file permissions.

Version
---------
samba-client-4.4.6-4.el7rhgs.x86_64
glusterfs-cli-3.8.4-11.el7rhgs.x86_64

Marking it as verified.
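The repeated ls -l check used for verification can be automated roughly as follows. This is a sketch, not the procedure actually used in the verification: the helper names is_linkto_mode and poll_mode are hypothetical, and it relies on GNU stat (-c '%A'), which prints the mode string without the trailing SELinux dot that ls -l shows.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names introduced here.

# is_linkto_mode MODE
#   Succeeds if MODE is the sticky-bit-only pattern seen in the report
#   (---------T, optionally followed by the "." that ls -l appends).
is_linkto_mode() {
    case "$1" in
        ---------T*) return 0 ;;
        *) return 1 ;;
    esac
}

# poll_mode FILE COUNT
#   Stats FILE COUNT times and prints how many times the mode string
#   looked like a link-to file instead of the real data file.
poll_mode() {
    file=$1
    count=$2
    bad=0
    i=0
    while [ "$i" -lt "$count" ]; do
        mode=$(stat -c '%A' "$file") || return 2
        if is_linkto_mode "$mode"; then
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$bad"
}

# Usage on client2 while hardlinks are still being created on client1:
#   poll_mode hardlink.20 1000
# A result of 0 means the sticky-bit-only mode was never observed.
```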
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html