Bug 1303153 - Gluster creating 0 byte files
Summary: Gluster creating 0 byte files
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.5.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Sakshi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-01-29 17:14 UTC by Sandeep Patel
Modified: 2016-08-01 01:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-17 16:24:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Example of 0 byte file (8.46 KB, image/png)
2016-01-29 17:58 UTC, Sandeep Patel
0 byte and original file shows when browsing using samba (40.07 KB, image/png)
2016-03-10 12:04 UTC, Sandeep Patel

Description Sandeep Patel 2016-01-29 17:14:40 UTC
Description of problem:
Files are being created which are 0 bytes on a replica.

3a:
---------T 2 root root 0 Aug  2 07:30 /export/brick2/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
3b:
---------T 2 root root 0 Aug  2 07:30 /export/brick2/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
4a:
-rwxr--r-- 2 root root 23382729 Nov  8  2012 /export/brick1/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC
4b:
-rwxr--r-- 2 root root 23382729 Nov  8  2012 /export/brick1/archive/AUDIO/21946/3f1ec3fe281c428ca9b4664f36b1ae04.FLAC

When this occurs, only the 0 byte file can be accessed via samba and connecting via fuse gives the same result. 

Additional info:
Oracle enterprise Linux - Kernel version: 3.8.13-118.2.2.el6uek.x86_64

glusterfs-cli-3.5.3-1.el6.x86_64
glusterfs-libs-3.5.3-1.el6.x86_64
glusterfs-fuse-3.5.3-1.el6.x86_64
glusterfs-geo-replication-3.5.3-1.el6.x86_64
glusterfs-3.5.3-1.el6.x86_64
glusterfs-api-3.5.3-1.el6.x86_64
glusterfs-server-3.5.3-1.el6.x86_64

glusterfs-fuse-3.5.3-1.el6.x86_64
samba-vfs-glusterfs-4.1.11-2.el6.x86_64

Comment 1 Sandeep Patel 2016-01-29 17:58:39 UTC
Created attachment 1119491
Example of 0 byte file

Attachment of samba showing the 0 byte file alongside the original.

Comment 2 Anuradha 2016-02-09 12:48:06 UTC
Hi Sandeep,

In a distribute-replicate volume, sometimes these 0 byte files are created. This is normal, but they shouldn't be accessible from mount.

You mention that only the 0 byte file can be accessed. Do you see an error when trying to access the original file?

Thanks,
Anuradha.

Comment 3 Sandeep Patel 2016-02-24 16:37:02 UTC
Hi Anuradha,

From a samba mount, only the 0 byte file is copied even if the original file is selected to copy from.

Thanks
Sandeep

Comment 4 Anuradha 2016-03-03 09:02:38 UTC
As per discussion with Sakshi Bansal, this is a bug in the distribute component. Moving the component and assigning to Sakshi.

Comment 5 Sakshi 2016-03-03 09:33:38 UTC
Hi Sandeep,

1) Can you access the original data file from the fuse mount, i.e., are you able to open/read/write the file?
2) Do you also see the zero byte files from the mount? If so, are both the original and the zero byte files displayed on the mount?

3) If you can see the 0 byte file from the mount, can you get the xattrs of the files (both the original and the 0 byte file) from the brick? You can do that as follows and post the output here:
      getfattr -d -m . -e hex <path to the file on the backend>

From the output you can get the gfid of the file (say the gfid is 0xd7833cee338343359f9adea7aa246227). The original data file and the linkto file will have the same gfid. Once you have it, can you check for the gfid on the backend as follows and post the output here:
 stat <path to the brick>/.glusterfs/d7/83/d7833cee-3383-4335-9f9a-dea7aa246227
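
The gfid-to-path mapping described above can be sketched as a small shell function. This is only an illustration, not a GlusterFS tool: the function name gfid_to_backend_path is made up here, and it assumes the gfid is passed as the 32 hex digits that getfattr -e hex prints for trusted.gfid, with or without the 0x prefix.

```shell
# Hypothetical helper (not part of GlusterFS): turn the hex value of
# trusted.gfid into the matching path under a brick's .glusterfs directory.
gfid_to_backend_path() {
    brick=${1%/}
    # Strip the optional 0x prefix that `getfattr -e hex` prints.
    hex=${2#0x}
    # A gfid is 16 bytes, i.e. exactly 32 hex digits.
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got: $2" >&2
        return 1
    fi
    # Re-insert the UUID dashes in the 8-4-4-4-12 layout.
    uuid=$(printf '%s\n' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    # The first two bytes of the gfid name the two directory levels.
    d1=$(printf '%s' "$hex" | cut -c1-2)
    d2=$(printf '%s' "$hex" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$uuid"
}
```

The printed path can be handed straight to stat, for example: stat "$(gfid_to_backend_path <path to the brick> <gfid from getfattr>)".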

Comment 6 Sandeep Patel 2016-03-10 12:04:01 UTC
Hi,

Output from trying to copy the file from the fuse mount

[root@gchead001 ~]# cp /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC /media/
cp: reading `/archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC': No data available

Output from ls
[root@gchead001 ~]# ls -al /archive-new/AUDIO/9283/1fba3*
---------T 1 root root 0 Jun 13  2015 /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
---------T 1 root root 0 Jun 13  2015 /archive-new/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC

From the fuse mount, only the 0 byte file is displayed.

Stat of original file

[root@gc002b ~]# stat /export/brick1/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615
  File: `/export/brick1/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615'
  Size: 17758681        Blocks: 34688      IO Block: 4096   regular file
Device: 800h/2048d      Inode: 10738449972  Links: 2
Access: (0744/-rwxr--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-03-29 22:30:03.000000000 +0000
Modify: 2011-08-30 17:01:30.000000000 +0000
Change: 2015-04-27 22:23:43.227636476 +0000

Stat of 0-byte file

[root@gc003a ~]# stat /export/brick2/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615
  File: `/export/brick2/archive/.glusterfs/33/16/33162f64-2b27-4ba8-a185-00e283100615'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 810h/2064d      Inode: 45097217517  Links: 2
Access: (1000/---------T)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-06-13 23:19:16.351062000 +0000
Modify: 2015-06-13 23:19:16.351062000 +0000
Change: 2016-02-26 16:13:51.675356487 +0000

I have also attached an image of how the file appears browsing from samba.

Comment 7 Sandeep Patel 2016-03-10 12:04:48 UTC
Created attachment 1134855
0 byte and original file shows when browsing using samba

Comment 8 Nithya Balachandran 2016-03-10 14:17:22 UTC
Can you also provide the following:
1. The xattrs on the file (both the 0 byte and the non-zero byte copies) from the backend bricks
2. The xattrs of the file's parent directory from the bricks
3. The output of gluster volume info <volname>

Comment 9 Sandeep Patel 2016-03-10 14:33:41 UTC
Original file:

[root@gc002b ~]# getfattr -m . -d -e hex /export/brick1/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.archive-client-4=0x000000000000000000000000
trusted.afr.archive-client-5=0x000000000000000000000000
trusted.gfid=0x33162f642b274ba8a18500e283100615

Directory

[root@gc002b ~]# getfattr -m . -d -e hex /export/brick1/archive/AUDIO/9283
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/archive/AUDIO/9283
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.archive-client-4=0x000000000000000000000000
trusted.afr.archive-client-5=0x000000000000000000000000
trusted.gfid=0xf30c9cf228544e9d8ba0cd2f8d820e6b
trusted.glusterfs.dht=0x0000000100000000c71c71c4d5555551

0 byte file

[root@gc003a ~]# getfattr -m . -d -e hex /export/brick2/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
getfattr: Removing leading '/' from absolute path names
# file: export/brick2/archive/AUDIO/9283/1fba3d8628054bdeb33420ebddb6a52f.FLAC
trusted.afr.archive-client-10=0x000000000000000000000000
trusted.afr.archive-client-11=0x000000000000000000000000
trusted.gfid=0x33162f642b274ba8a18500e283100615

Directory:

[root@gc003a ~]# getfattr -m . -d -e hex /export/brick2/archive/AUDIO/9283
getfattr: Removing leading '/' from absolute path names
# file: export/brick2/archive/AUDIO/9283
trusted.afr.archive-client-10=0x000000000000000000000000
trusted.afr.archive-client-11=0x000000000000000000000000
trusted.gfid=0xf30c9cf228544e9d8ba0cd2f8d820e6b
trusted.glusterfs.dht=0x0000000100000000d5555552e38e38df

gluster volume info

[root@gc002b ~]# gluster volume info archive
Volume Name: archive
Type: Distributed-Replicate
Volume ID: 3fe35b02-5acd-4608-b856-a3191a5a577f
Status: Started
Number of Bricks: 18 x 2 = 36
Transport-type: tcp
Bricks:
Brick1: gc001a:/export/brick1/archive
Brick2: gc001b:/export/brick1/archive
Brick3: gc001a:/export/brick2/archive
Brick4: gc001b:/export/brick2/archive
Brick5: gc002a:/export/brick1/archive
Brick6: gc002b:/export/brick1/archive
Brick7: gc002a:/export/brick2/archive
Brick8: gc002b:/export/brick2/archive
Brick9: gc003a:/export/brick1/archive
Brick10: gc003b:/export/brick1/archive
Brick11: gc003a:/export/brick2/archive
Brick12: gc003b:/export/brick2/archive
Brick13: gc004a:/export/brick1/archive
Brick14: gc004b:/export/brick1/archive
Brick15: gc004a:/export/brick2/archive
Brick16: gc004b:/export/brick2/archive
Brick17: gc005a:/export/brick1/archive
Brick18: gc005b:/export/brick1/archive
Brick19: gc005a:/export/brick2/archive
Brick20: gc005b:/export/brick2/archive
Brick21: gc008a:/export/brick1/archive
Brick22: gc008b:/export/brick1/archive
Brick23: gc008a:/export/brick2/archive
Brick24: gc008b:/export/brick2/archive
Brick25: gc009a:/export/brick1/archive
Brick26: gc009b:/export/brick1/archive
Brick27: gc009a:/export/brick2/archive
Brick28: gc009b:/export/brick2/archive
Brick29: gc006a:/export/brick1/archive
Brick30: gc006b:/export/brick1/archive
Brick31: gc006a:/export/brick2/archive
Brick32: gc006b:/export/brick2/archive
Brick33: gc007a:/export/brick1/archive
Brick34: gc007b:/export/brick1/archive
Brick35: gc007a:/export/brick2/archive
Brick36: gc007b:/export/brick2/archive
Options Reconfigured:
cluster.min-free-disk: 2%
server.allow-insecure: on
nfs.disable: off
cluster.eager-lock: on

Comment 10 Nithya Balachandran 2016-03-10 15:17:44 UTC
The 0 byte file is missing an xattr.
DHT uses these 0 byte "linkto" files as an internal mechanism to help it find the brick on which the actual data file exists. It identifies these files using the sticky bit and the trusted.glusterfs.dht.linkto xattr. If the xattr is missing, DHT treats the file as a data file and shows it on the mount. As the non-zero byte file is also not a linkto file as far as DHT is concerned, it will be listed as well, so you will see 2 entries. However, the properties shown will be those of whichever one is read first.

If you are certain that the non-zero byte file, henceforth referred to as the data file, contains the right information, please delete the 0 byte file from the backend bricks. Do _not_ try to delete it from the mountpoint - it could end up deleting the data file. Once you have deleted the linkto file from both bricks in the replica set, you should be able to see and access the correct file from the mountpoint. 

Doing an ls on the file after that should recreate the linkto file with the xattr set.

Are there any other files with this problem? Are there any error messages in the glusterfs client logs to indicate any problems with this file?
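
The two markers described above (a mode with only the sticky bit set, plus the trusted.glusterfs.dht.linkto xattr) can be checked on a brick before anything is deleted. The following is a minimal, read-only sketch under some assumptions: the function name and its output labels are invented for illustration, it relies on GNU stat and on getfattr from the attr package, and it has to run as root on the brick host because trusted.* xattrs are not visible to other users.

```shell
# Hypothetical helper: classify one brick-side file. Prints one of
#   not-a-linkto-candidate  (has data or ordinary permission bits)
#   linkto-ok               (sticky-only, empty, linkto xattr present)
#   linkto-missing-xattr    (sticky-only, empty, linkto xattr absent)
classify_linkto_candidate() {
    f=$1
    # GNU stat: %a = octal mode, %s = size in bytes.
    mode=$(stat -c '%a' -- "$f") || return 2
    size=$(stat -c '%s' -- "$f") || return 2
    if [ "$mode" != "1000" ] || [ "$size" -ne 0 ]; then
        echo "not-a-linkto-candidate"
        return 0
    fi
    if ! command -v getfattr >/dev/null 2>&1; then
        echo "getfattr not found (attr package)" >&2
        return 2
    fi
    # getfattr exits non-zero when the named attribute is absent.
    if getfattr -n trusted.glusterfs.dht.linkto --absolute-names -- "$f" \
            >/dev/null 2>&1; then
        echo "linkto-ok"
    else
        echo "linkto-missing-xattr"
    fi
}
```

Only a file reported as linkto-missing-xattr matches the situation in this report. The function never modifies or removes anything.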

Comment 11 Nithya Balachandran 2016-03-10 15:18:43 UTC
Just to reiterate - do not try to delete the 0 byte file from the Fuse or Samba clients. Please delete it directly from the backend bricks.

Comment 12 Sandeep Patel 2016-03-10 15:24:43 UTC
Thank you for your help. Is there a method of detecting these? We have a 1 PB system which is 90% full with an average file size of 25MB.

Is there any indication of what could have caused this issue?

Comment 13 Nithya Balachandran 2016-03-10 15:35:31 UTC
This can happen if the xattr could not be set while creating the linkto file. The brick logs for the bricks on which the linkto file was created might have logged an error. If they have, you could probably search for the same string to see if it shows up for other files.

You seem to be running a rather old version of gluster so I'm not sure if this will work but it might be worth a try. Can you search your client logs to see if messages like the following are logged?

"multiple subvolumes (%s and %s) have file %s (preferably rename the file in the backend,and do a fresh lookup"

If yes, those are the files you would need to check.
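
A log search along these lines could be sketched as follows. The function name is hypothetical, and the log directory is an assumption: /var/log/glusterfs is the usual default location, so pass a different directory if your client logs live elsewhere. The match uses only the fixed parts of the message, since the %s placeholders are filled in with subvolume and file names.

```shell
# Hypothetical helper: print client log lines that carry the
# "multiple subvolumes (...) have file ..." warning quoted above.
find_duplicate_file_warnings() {
    logdir=${1:-/var/log/glusterfs}
    # -r: search the directory recursively, -F: match fixed strings.
    # Each hit is prefixed with the name of the log file it came from.
    grep -rF 'multiple subvolumes (' "$logdir" | grep -F ') have file '
}
```

Each matching line names the affected file after "have file", which gives the list of paths to inspect on the bricks.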


Otherwise I am afraid the only way is to crawl the filesystem and check if the xattr exists for any files which have only the sticky bit set.
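
Such a crawl could look roughly like the sketch below. It is an illustration only: both function names are made up, it assumes a find that supports -path and -perm (GNU findutils does) and getfattr from the attr package, and it must be run as root on each brick host, directly against the brick directory and not against the mount.

```shell
# Hypothetical helper: print zero-byte files whose mode is exactly 1000
# (sticky bit and nothing else), skipping the internal .glusterfs tree.
# These are linkto *candidates*.
list_linkto_candidates() {
    brick=${1%/}
    find "$brick" -path "$brick/.glusterfs" -prune -o \
        -type f -perm 1000 -size 0 -print
}

# Of those candidates, print the ones that lack the linkto xattr.
# Needs root: trusted.* xattrs are hidden from other users.
list_broken_linkto_files() {
    list_linkto_candidates "$1" | while IFS= read -r f; do
        getfattr -n trusted.glusterfs.dht.linkto --absolute-names -- "$f" \
            >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

The candidates are the files that ls -l shows as ---------T with size 0, as in the listings earlier in this report. Running list_broken_linkto_files against a brick path such as /export/brick2/archive prints only those candidates for which the xattr lookup fails.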

Comment 14 Nithya Balachandran 2016-03-11 16:24:14 UTC
Hi Sandeep,

Did the steps listed above help you recover from the problem?
How would you like to proceed with this BZ?

Regards,
Nithya

Comment 15 Sandeep Patel 2016-03-16 13:12:42 UTC
Hi Nithya,

Thank you for your help with this. Can you confirm whether upgrading our gluster would mitigate these problems? Also, you point out files which don't have the sticky bit set. Can you give an example of what I would be looking for, please?

Regards
Sandeep

Comment 16 Nithya Balachandran 2016-03-21 05:57:50 UTC
Hi Sandeep,

Without an RCA it is difficult to know whether an upgrade will solve this issue.
Do you see anything in the logs about this file?

Regards,
Nithya

Comment 17 Sandeep Patel 2016-03-21 12:32:15 UTC
Hi Nithya,

No, the client logs do not have the messages for the files. Can you give an example of what we would be looking for on the sticky bit set?

Regards
Sandeep

Comment 18 Sandeep Patel 2016-04-13 09:41:43 UTC
Hi Nithya,

Any news on this?

Kind Regards
Sandeep

Comment 19 Nithya Balachandran 2016-04-13 10:35:51 UTC
Sorry Sandeep. I have not had a chance to look at this.

Do you see any references to this file in the brick logs?

Comment 20 Sandeep Patel 2016-04-20 09:37:42 UTC
Hi Nithya,

No, I don't. Do you have an example of the sticky bit set?

Kind Regards
Sandeep

Comment 21 Niels de Vos 2016-06-17 16:24:39 UTC
This bug is getting closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.

