Description of problem:
Files are being created which are 0 bytes on a replica.

3a: ---------T 2 root root 0 Aug 2 07:30 /export/brick2/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
3b: ---------T 2 root root 0 Aug 2 07:30 /export/brick2/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
4a: -rwxr--r-- 2 root root 23382729 Nov 8 2012 /export/brick1/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
4b: -rwxr--r-- 2 root root 23382729 Nov 8 2012 /export/brick1/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC

When this occurs, only the 0 byte file can be accessed via samba and connecting via fuse gives the same result.

Additional info:
Oracle enterprise Linux - Kernel version: 3.8.13-118.2.2.el6uek.x86_64
glusterfs-cli-3.5.3-1.el6.x86_64
glusterfs-libs-3.5.3-1.el6.x86_64
glusterfs-fuse-3.5.3-1.el6.x86_64
glusterfs-geo-replication-3.5.3-1.el6.x86_64
glusterfs-3.5.3-1.el6.x86_64
glusterfs-api-3.5.3-1.el6.x86_64
glusterfs-server-3.5.3-1.el6.x86_64
glusterfs-fuse-3.5.3-1.el6.x86_64
samba-vfs-glusterfs-4.1.11-2.el6.x86_64
Created attachment 1119491 [details] Example of 0 byte file Attachment of samba showing the 0 byte file alongside the original.
Hi Sandeep, In a distribute-replicate volume, sometimes these 0 byte files are created. This is normal, but they shouldn't be accessible from mount. You mention that only the 0 byte file can be accessed, do you see an error on trying to access the original file? Thanks, Anuradha.
Hi Anuradha, From a samba mount, only the 0 byte file is copied even if the original file is selected to copy from. Thanks Sandeep
As per discussion with Sakshi Bansal, this is a bug in distribute component. Moving the component and assigning to Sakshi.
Hi Sandeep,

1) Can you access the original data file from the fuse mount, i.e., are you able to open/read/write the file?
2) Do you also see the zero byte files from the mount? If so, are both the original and the zero byte files displayed on the mount?
3) If you can see the 0 byte file from the mount, can you get the xattrs of the files (both original and 0 byte file) from the brick? You can do that as follows and post the output here:

getfattr -d -m . -e hex <path to the file on the backend>

From the output you can get the gfid of the file (say the gfid is 0xd7833cee338343359f9adea7aa246227). The original data file and the linkto file will have the same gfid. Once you have it, can you check for the gfid on the backend as follows and post the output here:

stat <path to the brick>/.glusterfs/d7/83/d7833cee-3383-4335-9f9a-dea7aa246227
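The step from the trusted.gfid value to the .glusterfs path in the stat command above can be sketched in a few lines of Python. This is only an illustration, not a gluster tool: the function name gfid_backend_path is hypothetical, and the layout (the first two hex byte pairs as directories, followed by the gfid in dashed UUID form) is inferred from the example gfid and path given above.

```python
import uuid


def gfid_backend_path(gfid_hex):
    """Return the .glusterfs path (relative to the brick root) for a gfid.

    Accepts the value as printed by getfattr -e hex, with or without "0x".
    """
    text = gfid_hex.lower()
    if text.startswith("0x"):
        text = text[2:]
    # uuid.UUID raises ValueError unless it is given exactly 32 hex digits.
    gfid = str(uuid.UUID(hex=text))
    # Assumed layout: .glusterfs/<first byte>/<second byte>/<dashed gfid>
    return ".glusterfs/{}/{}/{}".format(gfid[0:2], gfid[2:4], gfid)


print(gfid_backend_path("0xd7833cee338343359f9adea7aa246227"))
# .glusterfs/d7/83/d7833cee-3383-4335-9f9a-dea7aa246227
```

Prefix the result with the brick path (for example /export/brick1/archive) to get the argument for stat.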
Hi,

Output from trying to copy the file from the fuse mount

[root@gchead001 ~]# cp /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC /media/
cp: reading `/archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC': No data available

Output from ls

[root@gchead001 ~]# ls -al /archive-new/AUDIO/9283/1fba3*
---------T 1 root root 0 Jun 13 2015 /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
---------T 1 root root 0 Jun 13 2015 /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC

From the fuse mount, only the 0 byte file is displayed.

Stat of original file

[root@gc002b ~]# stat /export/brick1/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615
File: `/export/brick1/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615'
Size: 17758681 Blocks: 34688 IO Block: 4096 regular file
Device: 800h/2048d Inode: 10738449972 Links: 2
Access: (0744/-rwxr--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-03-29 22:30:03.000000000 +0000
Modify: 2011-08-30 17:01:30.000000000 +0000
Change: 2015-04-27 22:23:43.227636476 +0000

Stat of 0-byte file

[root@gc003a ~]# stat /export/brick2/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615
File: `/export/brick2/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 810h/2064d Inode: 45097217517 Links: 2
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-06-13 23:19:16.351062000 +0000
Modify: 2015-06-13 23:19:16.351062000 +0000
Change: 2016-02-26 16:13:51.675356487 +0000

I have also attached an image of how the file appears browsing from samba.
Created attachment 1134855 [details] 0 byte and original file shown when browsing using samba
Can you also provide the details of : 1. the xattrs on the file (both 0 and non-zero byte size) from the backend bricks? 2. the xattrs from the bricks of the parent directory of the file in question 3. gluster volume info <volname> output
Original file:

[root@gc002b ~]# getfattr -m . -d -e hex /export/brick1/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.archive-client-4=0x000000000000000000000000
trusted.afr.archive-client-5=0x000000000000000000000000
trusted.gfid=0x33162f642b274ba8a18500e283100615

Directory

[root@gc002b ~]# getfattr -m . -d -e hex /export/brick1/archive/AUDIO/9283
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/archive/AUDIO/9283
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.archive-client-4=0x000000000000000000000000
trusted.afr.archive-client-5=0x000000000000000000000000
trusted.gfid=0xf30c9cf228544e9d8ba0cd2f8d820e6b
trusted.glusterfs.dht=0x0000000100000000c71c71c4d5555551

0 byte file

[root@gc003a ~]# getfattr -m . -d -e hex /export/brick2/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
getfattr: Removing leading '/' from absolute path names
# file: export/brick2/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
trusted.afr.archive-client-10=0x000000000000000000000000
trusted.afr.archive-client-11=0x000000000000000000000000
trusted.gfid=0x33162f642b274ba8a18500e283100615

Directory:

[root@gc003a ~]# getfattr -m . -d -e hex /export/brick2/archive/AUDIO/9283
getfattr: Removing leading '/' from absolute path names
# file: export/brick2/archive/AUDIO/9283
trusted.afr.archive-client-10=0x000000000000000000000000
trusted.afr.archive-client-11=0x000000000000000000000000
trusted.gfid=0xf30c9cf228544e9d8ba0cd2f8d820e6b
trusted.glusterfs.dht=0x0000000100000000d5555552e38e38df

gluster volume info

[root@gc002b ~]# gluster volume info archive
Volume Name: archive
Type: Distributed-Replicate
Volume ID: 3fe35b02-5acd-4608-b856-a3191a5a577f
Status: Started
Number of Bricks: 18 x 2 = 36
Transport-type: tcp
Bricks:
Brick1: gc001a:/export/brick1/archive
Brick2: gc001b:/export/brick1/archive
Brick3: gc001a:/export/brick2/archive
Brick4: gc001b:/export/brick2/archive
Brick5: gc002a:/export/brick1/archive
Brick6: gc002b:/export/brick1/archive
Brick7: gc002a:/export/brick2/archive
Brick8: gc002b:/export/brick2/archive
Brick9: gc003a:/export/brick1/archive
Brick10: gc003b:/export/brick1/archive
Brick11: gc003a:/export/brick2/archive
Brick12: gc003b:/export/brick2/archive
Brick13: gc004a:/export/brick1/archive
Brick14: gc004b:/export/brick1/archive
Brick15: gc004a:/export/brick2/archive
Brick16: gc004b:/export/brick2/archive
Brick17: gc005a:/export/brick1/archive
Brick18: gc005b:/export/brick1/archive
Brick19: gc005a:/export/brick2/archive
Brick20: gc005b:/export/brick2/archive
Brick21: gc008a:/export/brick1/archive
Brick22: gc008b:/export/brick1/archive
Brick23: gc008a:/export/brick2/archive
Brick24: gc008b:/export/brick2/archive
Brick25: gc009a:/export/brick1/archive
Brick26: gc009b:/export/brick1/archive
Brick27: gc009a:/export/brick2/archive
Brick28: gc009b:/export/brick2/archive
Brick29: gc006a:/export/brick1/archive
Brick30: gc006b:/export/brick1/archive
Brick31: gc006a:/export/brick2/archive
Brick32: gc006b:/export/brick2/archive
Brick33: gc007a:/export/brick1/archive
Brick34: gc007b:/export/brick1/archive
Brick35: gc007a:/export/brick2/archive
Brick36: gc007b:/export/brick2/archive
Options Reconfigured:
cluster.min-free-disk: 2%
server.allow-insecure: on
nfs.disable: off
cluster.eager-lock: on
The 0 byte file is missing an xattr. DHT uses these 0 byte "linkto" files as an internal mechanism to help it find the brick on which the actual data file exists. It identifies these files using the sticky bit and the trusted.glusterfs.dht.linkto xattr. If the xattr is missing, DHT treats it as a data file and shows it on the mount. As the non-zero byte file is also not a linkto file as per DHT, it will also be listed so you will see 2 entries, however the properties will be those of the one read first. If you are certain that the non-zero byte file, henceforth referred to as the data file, contains the right information, please delete the 0 byte file from the backend bricks. Do _not_ try to delete it from the mountpoint - it could end up deleting the data file. Once you have deleted the linkto file from both bricks in the replica set, you should be able to see and access the correct file from the mountpoint. Doing an ls on the file after that should recreate the linkto file with the xattr set. Are there any other files with this problem? Are there any error messages in the glusterfs client logs to indicate any problems with this file?
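The rule described in this comment (sticky bit plus the trusted.glusterfs.dht.linkto xattr) can be modelled roughly as below. This is a simplified Python sketch for illustration, not gluster's actual code: classify_brick_entry is a hypothetical name, and treating "sticky bit" as "permission bits are exactly 1000" is an assumption based on the stat output posted earlier. The mode and xattr names fed in are the ones reported for the two files in this bug.

```python
import stat

LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def classify_brick_entry(mode, xattr_names):
    """Return "linkto" or "data" for a file on a brick (simplified model)."""
    # Assumption: a linkto file carries the sticky bit and no other mode bits.
    sticky_only = stat.S_IMODE(mode) == stat.S_ISVTX
    if sticky_only and LINKTO_XATTR in xattr_names:
        return "linkto"
    # Anything else, including a sticky-bit file without the xattr, is data.
    return "data"


# The 0 byte file on brick2: mode 1000, no linkto xattr in the getfattr output.
zero_byte = classify_brick_entry(
    0o101000,
    ["trusted.afr.archive-client-10", "trusted.afr.archive-client-11", "trusted.gfid"],
)
# The original file on brick1: mode 0744.
original = classify_brick_entry(
    0o100744,
    ["security.selinux", "trusted.afr.archive-client-4",
     "trusted.afr.archive-client-5", "trusted.gfid"],
)
print(zero_byte, original)  # data data
```

Both entries come out as data files under this model, which is consistent with two entries being listed on the mount.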
Just to reiterate - do not try to delete the 0 byte file from the Fuse or Samba clients. Please delete it directly from the backend bricks.
Thank you for your help. Is there a method of detecting these? We have a 1 PB system which is 90% full with an average file size of 25MB. Is there any indication of what could have caused this issue?
This can happen if the xattr could not be set while creating the linkto file. The brick logs for the bricks on which the linkto file was created might have logged an error. If they have, you could probably search for the same string to see if it shows up for other files. You seem to be running a rather old version of gluster so I'm not sure if this will work but it might be worth a try. Can you search your client logs to see if messages like the following are logged? "multiple subvolumes (%s and %s) have file %s (preferably rename the file in the backend,and do a fresh lookup" If yes, those are the files you would need to check. Otherwise I am afraid the only way is to crawl the filesystem and check if the xattr exists for any files which have only the sticky bit set.
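A crawl of the kind described above could look roughly like the following Python sketch. It is untested against a real brick and makes several assumptions: the names looks_like_linkto, has_linkto_xattr and find_broken_linkto are hypothetical; a candidate is taken to be a zero byte regular file whose mode is exactly 1000 (shown by ls as ---------T); and the .glusterfs directory is skipped so that the gfid hard links are not reported a second time. It must run as root directly on each brick server, since trusted.* xattrs are not readable otherwise, and it only reports paths without deleting anything.

```python
import errno
import os
import stat

LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def looks_like_linkto(mode, size):
    """True for a zero byte regular file with only the sticky bit set."""
    return stat.S_ISREG(mode) and stat.S_IMODE(mode) == stat.S_ISVTX and size == 0


def has_linkto_xattr(path):
    """True if the linkto xattr is present on path (Linux only, needs root)."""
    try:
        os.getxattr(path, LINKTO_XATTR, follow_symlinks=False)
        return True
    except OSError as exc:
        if exc.errno == errno.ENODATA:  # attribute does not exist
            return False
        raise


def find_broken_linkto(brick_root, has_xattr=has_linkto_xattr):
    """Return paths under brick_root that look like linkto files but lack the xattr."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Assumption: skip gluster's internal gfid store to avoid duplicates.
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if looks_like_linkto(st.st_mode, st.st_size) and not has_xattr(path):
                broken.append(path)
    return broken
```

On a brick server this would be called as, for example, find_broken_linkto("/export/brick2/archive"), printing each returned path. On a volume of this size the walk will take a long time, so running it per brick and during a quiet period is probably sensible.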
Hi Sandeep, Did the steps listed above help you recover from the problem? How would you like to proceed with this BZ? Regards, Nithya
Hi Nithya, Thank you for your help with this. Can you confirm whether upgrading our gluster would mitigate these problems? Also, you point out files which don't have the sticky bit set. Can you give an example of what I would be looking for please? Regards Sandeep
Hi Sandeep, Without an RCA it is difficult to know whether an upgrade will solve this issue. Do you see anything in the logs about this file? Regards, Nithya
Hi Nithya, No, the client logs do not have the messages for the files. Can you give an example of what we would be looking for on the sticky bit set? Regards Sandeep
Hi Nithya, Any news on this? Kind Regards Sandeep
Sorry Sandeep. I have not had a chance to look at this. Do you see any references to this file in the brick logs?
Hi Nithya, No, I don't. Do you have an example of the sticky bit set? Kind Regards Sandeep
This bug is getting closed because version 3.5 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.