Bug 1476676

Summary: Rebalance skips files when a brick goes down, in spite of AFR passing both node IDs of the replica to rebalance
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED WORKSFORME
QA Contact: Prasad Desala <tdesala>
Severity: medium
Docs Contact:
Priority: low
Version: rhgs-3.3
CC: nchilaka, rhs-bugs, saraut, storage-qa-internal
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-09 03:12:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Nag Pavan Chilakam 2017-07-31 07:30:51 UTC
Description of problem:
=======================
As part of BZ#1315781, AFR now passes the node UUIDs of both replicas to the DHT layer. This means both nodes can participate in rebalance, which was not the case previously (only one node used to participate).
However, even with this fix, when a brick goes down the other node should be able to migrate the files. This does not happen: although the other node scans the filesystem, it skips those files.

Note: each node appears to pick up its own subset of files and assumes the other node will migrate the rest. Hence, when a source brick on one node goes down, the other node does not check whether the remaining nodes are actually able to migrate the files assigned to them.
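
For reference, the node-uuid information that DHT gets from AFR (the BZ#1315781 change) can be checked from a client mount. This is only an illustrative sketch: the virtual xattr key name and the mount path are assumptions and may differ between versions.

# run on a client with the volume mounted at /mnt/v2 (hypothetical path)
getfattr -n trusted.glusterfs.node-uuid /mnt/v2/somefile
# with the BZ#1315781 fix this is expected to list the UUIDs of both replica
# nodes, which is what allows both n1 and n2 to take part in rebalance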



Version-Release number of selected component (if applicable):
=======
3.8.4-36

How reproducible:
========
always

Steps to Reproduce:
1. Create a 1x2 volume (b1 on n1; b2 on n2).
2. Create about 100,000 (1 lakh) files.
3. Now add another replica pair (b3 on n1; b4 on n2) and trigger rebalance.
4. It can be seen in the rebalance status that both nodes participate in the rebalance.
5. Now bring down b1. (A hedged command sketch for these steps follows below.)
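
The following is only a sketch of the CLI for the steps above; the host names, brick paths and mount point are placeholders, not the exact ones from this setup.

# 1x2 volume, b1 on n1 and b2 on n2
gluster volume create v2 replica 2 n1:/bricks/b1/brick n2:/bricks/b2/brick
gluster volume start v2
mount -t glusterfs n1:/v2 /mnt/v2
# create ~100,000 (1 lakh) small files
for i in $(seq 1 100000); do touch /mnt/v2/f_$i; done
# expand to 2x2 and start rebalance
gluster volume add-brick v2 n1:/bricks/b3/brick n2:/bricks/b4/brick
gluster volume rebalance v2 start
gluster volume rebalance v2 status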


Actual results:
===========
It can be seen that n2 does not migrate the files that n1 would have picked up. Although n2 scans the whole filesystem, it migrates only its own subset of the files and skips the rest.

Expected results:
==========
n2 must be able to rebalance all the files when b1 is down; it should not skip any.





Killed b1:
n1, which hosts b1, shows completed (as b1 was killed).
n2, which hosts b2, continues:
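
One way to bring down only the b1 brick process (a sketch; the brick path is a placeholder) is to look up its PID from volume status and kill it on n1:

gluster volume status v2
# note the PID shown for n1:/bricks/b1/brick, then on n1:
kill <brick-pid>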
[root@dhcp35-192 ~]# gluster v rebal v2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1116        0Bytes          4518             1             0            completed        0:00:30
                            10.70.35.214             1999        0Bytes          8109             0             0          in progress        0:00:40
Estimated time left for rebalance to complete :        0:42:50
volume rebalance: v2: success


Now that the rebalance is over, it can be seen that n2 scans all 100,000 files but migrates only about 25k; the remaining ~25k are skipped. The skipped column does not even show the count.
(I confirmed that they were skipped because, on this same setup with the same files, about 49,500 files are rebalanced if no brick is brought down.)
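
To cross-check how many files actually moved, the newly added bricks can be inspected directly on each node. This is only a sketch: the brick paths are placeholders and the sticky-bit test used to skip DHT link-to placeholder files is an assumption.

# on n1 and n2 respectively, count files that landed on the new bricks,
# pruning .glusterfs and skipping zero-byte sticky-bit link-to files
find /bricks/b3/brick -path '*/.glusterfs' -prune -o -type f ! -perm -1000 -print | wc -l
find /bricks/b4/brick -path '*/.glusterfs' -prune -o -type f ! -perm -1000 -print | wc -l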

[root@dhcp35-192 ~]# gluster v rebal v2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1116        0Bytes          4518             1             0            completed        0:00:30
                            10.70.35.214            25033        0Bytes        100000             0             0            completed        0:05:17
volume rebalance: v2: success






I even tried using the force command, but there was still no change (without force, rebalance does not even start, as a brick is down).
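
For completeness, this is the standard syntax for the force variant mentioned above:

gluster volume rebalance v2 start force
gluster volume rebalance v2 status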

[root@dhcp35-192 ~]# gluster v rebal v2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes         26149             0             0            completed        0:00:02
                            10.70.35.214                0        0Bytes        100000             0             0            completed        0:00:18
volume rebalance: v2: success

Comment 3 Nithya Balachandran 2017-08-21 09:46:55 UTC
Please attach the sos reports when filing a BZ. Developers cannot always look at a bug as soon as it is filed, so we need sos reports to debug issues.