1053537 – BVT: Rebalance failed for symbolic link files

Keywords:
Status:	CLOSED DEFERRED
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	distribute
Sub Component:
Version:	2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Nithya Balachandran
QA Contact:	shylesh
Docs Contact:	Lalatendu Mohanty
URL:
Whiteboard:
Depends On:
Blocks:	1286142

Reported:	2014-01-15 11:44 UTC by Lalatendu Mohanty
Modified:	2015-11-27 11:53 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1286142
Environment:
Last Closed:	2015-11-27 11:53:16 UTC
Embargoed:
Dependent Products:

Attachments
Rebalance logs (13.25 KB, application/x-gzip) 2014-01-15 11:51 UTC, Lalatendu Mohanty

Description Lalatendu Mohanty 2014-01-15 11:44:49 UTC

Description of problem:

The rebalance status command output shows failures. It should not report any: a file that cannot be rebalanced because of a space issue should be counted in the skipped list, not as a failure.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.56rhs-1389602694.el6.x86_64.rpm

How reproducible:
Intermittent

Steps to Reproduce:

1. Create a distribute volume.
2. Create the data, i.e. symlinks, on the mount point.
   Refer to "Additional info:" for the code that creates the symlinks.
3. Add a brick.
4. Start rebalance.
5. Check rebalance status.
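
The steps above map onto the gluster CLI roughly as follows. This is only a sketch: the volume name, host names and brick paths are hypothetical placeholders, not values from the BVT run, and the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch of steps 1, 3, 4 and 5. VOL, host1..host3 and the brick paths are
# hypothetical placeholders; substitute the values for your own setup.
VOL=${VOL:-testvol}
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

rebalance_steps() {
    run gluster volume create "$VOL" host1:/bricks/b1 host2:/bricks/b2
    run gluster volume start "$VOL"
    # Step 2 (creating the symlinks on the mount point) goes here.
    run gluster volume add-brick "$VOL" host3:/bricks/b3
    run gluster volume rebalance "$VOL" start
    run gluster volume rebalance "$VOL" status
}
```

Calling rebalance_steps with the default DRY_RUN=1 lists the five commands, so the sequence can be reviewed before it is run against a real cluster.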

Actual results:

                             Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                        ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                        localhost               46         1.2KB           366             0             1            completed               3.00
rhsauto057.lab.eng.blr.redhat.com               43         1.1KB           382             1             0            completed               3.00
rhsauto022.lab.eng.blr.redhat.com                0        0Bytes           330             0             0            completed               1.00
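
A test run could check the failures column automatically instead of by eye. The helper below is a sketch that assumes the eight-column layout of the status output above (node, rebalanced-files, size, scanned, failures, skipped, status, run time); the function name is made up for illustration, and other glusterfs versions may print a different layout.

```shell
#!/bin/bash
# Sum the "failures" column of rebalance status output read from stdin.
# Assumes the 8-column layout shown in this report. Rows whose status is
# more than one word (e.g. "in progress") have 9 fields and are skipped,
# so this is meant for runs that have already completed.
count_rebalance_failures() {
    awk 'NF == 8 && $5 ~ /^[0-9]+$/ { total += $5 } END { print total + 0 }'
}
```

Possible usage: `gluster volume rebalance "$VOL" status | count_rebalance_failures`, treating any value other than 0 as a test failure.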

Expected results:


Additional info:

Code to create symbolic links:

# Symlinks on the volume that point at scripts on the local filesystem
mkdir -p "$MOUNT_POINT/symlinks"
mkdir -p /symlinks
# Symlinks on the volume that point at scripts on the same volume
mkdir -p "$MOUNT_POINT/symlinks-gluster"
mkdir -p "$MOUNT_POINT/symlinks-gluster-dest"
for i in $(seq 1 100); do
    # Single quotes keep "!" literal in interactive shells
    echo '#!/bin/bash' > "/symlinks/$i.sh"
    echo 'echo Hello World' >> "/symlinks/$i.sh"
    ln -s "/symlinks/$i.sh" "$MOUNT_POINT/symlinks/$i"
    echo '#!/bin/bash' > "$MOUNT_POINT/symlinks-gluster/$i.sh"
    echo 'echo Hello World' >> "$MOUNT_POINT/symlinks-gluster/$i.sh"
    ln -s "$MOUNT_POINT/symlinks-gluster/$i.sh" "$MOUNT_POINT/symlinks-gluster-dest/$i"
done

Comment 1 Lalatendu Mohanty 2014-01-15 11:47:36 UTC

Marked this bug as intermittent because it has occurred twice in the last few days of BVT runs.

Today's Run: https://beaker.engineering.redhat.com/jobs/575364

Previous Run: https://beaker.engineering.redhat.com/jobs/572378

Comment 2 Lalatendu Mohanty 2014-01-15 11:51:17 UTC

Created attachment 850449
Rebalance logs

Comment 3 Lalatendu Mohanty 2014-01-15 11:53:43 UTC

The following error message is seen in the logs:

hosdu-rebalance.log.1:[2014-01-15 18:14:31.263392] E [dht-linkfile.c:287:dht_linkfile_setattr_cbk] 0-hosdu-dht: setattr of uid/gid on /93 :<gfid:00000000-0000-0000-0000-000000000000> failed (No such file or directory)
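
To see how often this error occurs across rebalance logs, the message can be counted with grep. This is a sketch; the pattern is derived only from the single log line above, and the function name is made up for illustration.

```shell
#!/bin/bash
# Count error-level ("E") messages logged by dht_linkfile_setattr_cbk.
# Reads the files given as arguments, or stdin when called without any.
# The pattern is based on the one log line quoted in this report.
count_setattr_errors() {
    cat -- "$@" | grep -c '] E \[dht-linkfile\.c:[0-9]*:dht_linkfile_setattr_cbk]'
}
```

Possible usage, run from the directory holding the attached logs: `count_setattr_errors hosdu-rebalance.log*`.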

Comment 5 Lalatendu Mohanty 2014-02-07 09:41:45 UTC

I haven't seen this issue in the last couple of weeks of BVT runs, hence lowering the severity.

Comment 6 Vivek Agarwal 2014-02-20 08:36:56 UTC

Adding the 3.0 flag and removing 2.1.z.

Comment 7 vsomyaju 2014-03-26 12:58:06 UTC

From the logs, it seems that one of the rebalance processes was unable to perform setattr because the file was not present on the backend.

If it is a replicated volume, a lock should ideally be taken, so I am not sure how it got into that situation.

However, I was able to reproduce it on a non-replicated volume:

1. Initially there are two bricks.
2. Created a file, org_file.
3. Created a symbolic link to org_file, sym_file.
4. Added a new brick.
5. Ran rebalance.

NOTE: All bricks are on different nodes.



Now a rebalance process will run on each of the three nodes.
Let's assume the file needs to be migrated from node-2 to node-3.


            Rebalance-1              Rebalance-2              Rebalance-3

t1:         Lookup(sym_file)
t2:         Create dht-link (T)
            at node-3

t3:                                  Lookup(sym_file)

t4:                                  After some more
                                     operations, delete
                                     the dht-link at
                                     node-3

t5:         Do setattr on the
            dht-link created at
            t2 above.
            NOTE: It will fail
            because at t4
            Rebalance-2 deleted
            the dht-link file.

t6:                                  Create symbolic link
                                     at node-3
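
The ordering in the timeline can be replayed on a local temporary directory to show why the operation at t5 returns "No such file or directory". This is only an illustration of the interleaving and does not involve GlusterFS: a plain file stands in for the dht-link, chmod stands in for the uid/gid setattr, and the function and variable names are made up.

```shell
#!/bin/bash
# Replays the t2 -> t4 -> t5 -> t6 ordering on a local temporary directory.
# Illustration only: no GlusterFS is involved, a plain file stands in for
# the dht-link and chmod stands in for the uid/gid setattr.
replay_race() {
    local node3
    node3=$(mktemp -d) || return 2

    : > "$node3/sym_file"          # t2: Rebalance-1 creates the dht-link
    rm -f "$node3/sym_file"        # t4: Rebalance-2 deletes the dht-link

    # t5: Rebalance-1 sets attributes on the link it created at t2.
    if chmod 0644 "$node3/sym_file" 2>/dev/null; then
        echo "setattr succeeded"
    else
        echo "setattr failed"
    fi

    ln -s org_file "$node3/sym_file"   # t6: Rebalance-2 creates the symlink
    rm -rf "$node3"
}
```

Because the file removed at t4 no longer exists when t5 runs, the attribute change fails, which matches the error reported by dht_linkfile_setattr_cbk in Comment 3.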

Comment 11 Susant Kumar Palai 2015-11-27 11:53:16 UTC

Cloning this to 3.1. To be fixed in a future release.
