Bug 1065653

Summary: Rebalance operation is stuck in "In progress" status on a distributed volume and link files are created.
Product: [Community] GlusterFS
Reporter: Jiffin Tony Thottan <thottanjiffin>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED INSUFFICIENT_DATA
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: mainline
CC: bugs, jthottan, kramdoss, nbalacha, smohan, spalai, thottanjiffin
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: dht-rebalance-file
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-15 06:59:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description  Flags
sosreport    none
sosreport    none

Description Jiffin Tony Thottan 2014-02-15 12:30:06 UTC
Description of problem:

In a distributed volume, when rebalance was run on sparse files (files with holes), link files appeared on the mount point and the rebalance operation never completed.



Volume Name: dishol1
Type: Distribute
Volume ID: ac82e12e-a047-410a-bde0-bd919dc9aca6
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.0.5:/brick1/dishol1-brick1
Brick2: 192.168.0.76:/brick1/dishol1-brick2
Brick3: 192.168.0.5:/brick2/dishol1-brick3
Brick4: 192.168.0.76:/brick2/dishol1-brick4


Before rebalance:

[root@localhost dis]# du -a
216	./file3
1944	./file4
1012	./file1
300	./file2
3472	.

[root@localhost dis]# ls -l
total 3472
-rw-r--r-- 1 root root 109951163811579 Feb 15 17:22 file1
-rw-r--r-- 1 root root 109951163084713 Feb 15 17:23 file2
-rw-r--r-- 1 root root 109951162995861 Feb 15 17:23 file3
-rw-r--r-- 1 root root 109951164767288 Feb 15 17:23 file4
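
The two listings above differ by many orders of magnitude: `ls -l` reports an apparent size of roughly 100 TiB per file, while `du` reports only a few hundred to a couple of thousand blocks actually allocated, which is what one expects from sparse files. A minimal sketch of that comparison, assuming Python 3 on Linux; `is_sparse` and `sparse_report` are hypothetical helper names and not part of GlusterFS:

```python
import os

def is_sparse(apparent_bytes, allocated_bytes):
    """Return True when fewer bytes are allocated than the apparent size."""
    return allocated_bytes < apparent_bytes

def sparse_report(path):
    """Compare st_size with the allocated size (st_blocks counts 512-byte units)."""
    st = os.stat(path)
    allocated = st.st_blocks * 512
    return st.st_size, allocated, is_sparse(st.st_size, allocated)

# file1 from the listing above: ls -l reports 109951163811579 bytes and
# du reports 1012 blocks (assumed here to be 1 KiB units, the GNU du default).
print(is_sparse(109951163811579, 1012 * 1024))
```

Running `sparse_report` on each file from the mount point before and after rebalance would show whether the holes were preserved or filled in during migration.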

After rebalance:

[root@localhost dis]# ls -l
total 3472
-rw-r-Sr-T 1 root root 109951163811579 Feb 15 17:22 file1
-rw-r--r-- 1 root root 109951163084713 Feb 15 17:23 file2
-rw-r--r-- 1 root root 109951162995861 Feb 15 17:23 file3
-rw-r--r-- 1 root root 109951164767288 Feb 15 17:23 file4
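
The only visible change after rebalance is the mode of file1, `-rw-r-Sr-T`: the setgid and sticky bits are set without the matching execute bits. As far as I understand DHT, this combination is used to mark a file whose migration is in progress, but that reading is an assumption and is not confirmed anywhere in this report. A small sketch for spotting such files by mode, assuming Python 3; `has_sgid_and_sticky` is a hypothetical helper name:

```python
import stat

def has_sgid_and_sticky(mode):
    """Return True for a regular file with both the setgid and sticky bits set."""
    return (
        stat.S_ISREG(mode)
        and bool(mode & stat.S_ISGID)
        and bool(mode & stat.S_ISVTX)
    )

# Rebuild the mode shown for file1: a regular file, 0644, plus setgid and sticky.
mode = stat.S_IFREG | 0o644 | stat.S_ISGID | stat.S_ISVTX
print(stat.filemode(mode))        # -rw-r-Sr-T
print(has_sgid_and_sticky(mode))
```

Passing `os.stat(path).st_mode` for each file on the mount point or on the bricks would list the files still carrying these bits.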



Also, the file's extended attributes are visible from the mount point:

[root@localhost dis]# mount |grep mnt
192.168.0.5:/dishol1 on /mnt/dis type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@localhost dis]# 


[root@localhost dis]# getfattr -d -m. file1
# file: file1
trusted.glusterfs.dht.linkto="dishol1-client-2"



[root@localhost dis]# attr -l file1
Attribute "glusterfs.dht.linkto" has a 17 byte value for file1
[root@localhost dis]# 


Version-Release number of selected component (if applicable):

glusterfs-libs-3.5.0-0.5.beta3.fc20.x86_64
glusterfs-fuse-3.5.0-0.5.beta3.fc20.x86_64
glusterfs-cli-3.5.0-0.5.beta3.fc20.x86_64
glusterfs-3.5.0-0.5.beta3.fc20.x86_64
glusterfs-devel-3.5.0-0.5.beta3.fc20.x86_64
glusterfs-server-3.5.0-0.5.beta3.fc20.x86_64


How reproducible:

N/A

Steps to Reproduce:

As mentioned above

Actual results:

The rebalance operation never completed and remained stuck in 'In progress' status.

Expected results:

The rebalance should complete.

Comment 1 Jiffin Tony Thottan 2014-02-15 12:37:01 UTC
Created attachment 863577 [details]
sosreport

Sosreport attached.

Comment 2 Jiffin Tony Thottan 2014-02-15 12:37:46 UTC
Created attachment 863578 [details]
sosreport

Comment 3 Manikandan 2016-01-05 12:23:31 UTC
Hi Jiffin,

Could you check if it is reproducible on the latest build?

Thanks,
Manikandan Selvaganesh.

Comment 4 Sakshi 2016-02-04 08:45:08 UTC
This happens because rebalance first sets the xattrs on the destination file and only then removes the linkto xattr. Hence there is a small window in which you will see the linkto xattr from the mount point. Also, what is the size of the file with holes? Are you sure it had migrated completely?
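
One way to tell the short window described above apart from a migration that is genuinely stuck would be to re-check the xattr several times. A rough sketch, assuming Python 3 on Linux and root privileges for the `trusted.` namespace; `linkto_present` and `linkto_persists` are hypothetical names, and the attempt count and delay are arbitrary choices:

```python
import os
import time

LINKTO = "trusted.glusterfs.dht.linkto"

def linkto_present(path):
    """Return True if the linkto xattr is currently listed for path."""
    return LINKTO in os.listxattr(path)

def linkto_persists(check, attempts=5, delay=1.0):
    """Return True if check() holds on every attempt, False as soon as it clears."""
    for i in range(attempts):
        if not check():
            return False
        if i < attempts - 1:
            time.sleep(delay)
    return True

# Usage on the mount point from this report (not executed here):
# linkto_persists(lambda: linkto_present("/mnt/dis/file1"))
```

If the xattr is still listed after every attempt, the transient-window explanation becomes less likely and the rebalance logs are the next place to look.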

Comment 6 Nithya Balachandran 2017-09-04 14:28:39 UTC
Jiffin,

By link file do you mean this file?

-rw-r-Sr-T 1 root root 109951163811579 Feb 15 17:22 file1

Thanks,
Nithya

Comment 7 Jiffin 2018-03-15 06:59:51 UTC
Actually, I don't recollect much about this issue. Hence, closing the bug as insufficient data.