
Bug 1730175

Summary: Seeing failure due to "getxattr err for dir [No data available]" in rebalance
Product: [Community] GlusterFS
Component: distribute
Version: mainline
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: Susant Kumar Palai <spalai>
Assignee: Susant Kumar Palai <spalai>
CC: bugs, nbalacha
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2019-07-18 10:19:57 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Susant Kumar Palai 2019-07-16 05:08:33 UTC
Description of problem:
While running rebalance on a volume with bricks of different sizes, rebalance failed with the following error:
[2019-07-05 09:36:55.653538] E [MSGID: 109039] [dht-common.c:4245:dht_find_local_subvol_cbk] 0-vol6-dht: getxattr err for dir [No data available]
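To make the fields of the quoted log entry easier to read, here is a minimal Python sketch that splits it into timestamp, severity, message ID, source location, translator name and message. The regular expression is an assumption derived from this single example line, not an official grammar for GlusterFS logs, and parse_log_line is a hypothetical helper name.

```python
import re

# Layout inferred from the one log line quoted in this bug (an assumption,
# not an official specification of the GlusterFS log format).
LOG_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+"          # [2019-07-05 09:36:55.653538]
    r"(?P<level>[A-Z])\s+"                   # single-letter severity, e.g. E
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"        # [MSGID: 109039]
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+"                   # translator name, e.g. 0-vol6-dht
    r"(?P<message>.*)"
)


def parse_log_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "[2019-07-05 09:36:55.653538] E [MSGID: 109039] "
    "[dht-common.c:4245:dht_find_local_subvol_cbk] 0-vol6-dht: "
    "getxattr err for dir [No data available]"
)
entry = parse_log_line(sample)
print(entry["level"], entry["function"], entry["message"])
```

The severity field is the part that matters for this bug: the entry is logged at E, which is why it reads like a real failure.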

How reproducible:
1/1

Steps to Reproduce:
1. Create a 2-brick volume where brick1 is 20G and brick2 is 5G.
2. Fuse mount the volume on a client node.
3. Check the hash layout on the bricks.
4. Start running I/O from the mount point.
5. While the I/O is still in progress, disable the cluster.weighted-rebalance volume option.
6. Let the I/O continue to run and add a 10G brick to the volume (add-brick).
7. Trigger rebalance on the volume.
8. Check hash layout on the back-end bricks.
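
The steps above can be sketched as a command sequence. The Python snippet below only builds and prints the commands; it does not run them. The host name server1, the brick paths under /bricks and the mount point are hypothetical placeholders, and the volume name vol6 is inferred from the "0-vol6-dht" prefix in the log line. Brick sizes (20G, 5G, 10G) come from the underlying filesystems and are not set by these commands, and the I/O workload is left out because the report does not say what was run.

```python
import shlex


def repro_commands(volume="vol6", host="server1", mount="/mnt/vol6"):
    """Build the CLI sequence for the reproduction steps (placeholders only)."""
    # Hypothetical brick paths; each is assumed to sit on its own filesystem
    # of the size given in the report (b1=20G, b2=5G, b3=10G).
    bricks = [f"{host}:/bricks/b{i}" for i in (1, 2, 3)]
    # Reading the DHT layout xattr directly on a brick directory.
    check_layout = ["getfattr", "-n", "trusted.glusterfs.dht", "-e", "hex", "/bricks/b1"]
    steps = [
        ["gluster", "volume", "create", volume, bricks[0], bricks[1]],   # step 1
        ["gluster", "volume", "start", volume],
        ["mount", "-t", "glusterfs", f"{host}:/{volume}", mount],        # step 2
        check_layout,                                                    # step 3
        # Step 4 (start I/O on the mount point) is workload specific.
        ["gluster", "volume", "set", volume, "cluster.weighted-rebalance", "off"],  # step 5
        ["gluster", "volume", "add-brick", volume, bricks[2]],           # step 6
        ["gluster", "volume", "rebalance", volume, "start"],             # step 7
        ["gluster", "volume", "rebalance", volume, "status"],
        check_layout,                                                    # step 8
    ]
    return [shlex.join(step) for step in steps]


commands = repro_commands()
for command in commands:
    print(command)
```

The getfattr call would need to be repeated on each brick to compare layouts.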

Actual results:
Seeing failure in rebalance 

Expected results:
Rebalance should complete successfully.

Comment 1 Worker Ant 2019-07-16 05:14:57 UTC
REVIEW: https://review.gluster.org/23053 (dht: log getxattr failure for node-uuid at "DEBUG") posted (#1) for review on master by Susant Palai

Comment 2 Nithya Balachandran 2019-07-16 05:16:15 UTC
This is because of the mismatch in the xattr when it is a pure dist versus a dist-rep or dist-disperse. This should not prevent the rebalance from proceeding.

Comment 3 Susant Kumar Palai 2019-07-16 05:18:47 UTC
(In reply to Nithya Balachandran from comment #2)
> This is because of the mismatch in the xattr when it is a pure dist versus a
> dist-rep or dist-disperse. This should not prevent the rebalance from
> proceeding.

Correct. But since an error was logged in dht_find_local_subvol_cbk, it creates confusion about whether it is a real error. I just moved the log to DEBUG, since the parent function logs if both attempts fail.
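
The logging arrangement described here can be illustrated with a small Python sketch. It is not the GlusterFS code from the patch. It only shows the pattern: each individual getxattr attempt logs its failure at DEBUG, and the caller logs once if every attempt failed. The key names, the function names and the ERROR level used by the caller are all assumptions made for illustration.

```python
import errno
import logging

log = logging.getLogger("rebalance-sketch")


def fetch_xattr(getxattr, path, key):
    """One attempt. A failure may be expected, so it is logged at DEBUG only."""
    try:
        return getxattr(path, key)
    except OSError as exc:
        log.debug("getxattr err for %s on %s: %s", key, path, exc)
        return None


def find_local_subvols(getxattr, path, keys=("primary-key", "fallback-key")):
    """Try each (hypothetical) key in turn; log loudly only if all fail."""
    for key in keys:
        value = fetch_xattr(getxattr, path, key)
        if value is not None:
            return value
    # Assumption: the caller reports at ERROR when no attempt succeeded.
    log.error("all getxattr attempts failed for %s", path)
    return None


def fake_getxattr(path, key):
    """Stand-in for a volume where only the fallback key is present."""
    if key == "primary-key":
        code = getattr(errno, "ENODATA", 61)
        raise OSError(code, "No data available")
    return b"node-uuid"


print(find_local_subvols(fake_getxattr, "/dir"))
```

With this split, a miss on the first key followed by a hit on the second leaves nothing at error level in the log, which is the behaviour the comment describes.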

Comment 4 Nithya Balachandran 2019-07-16 05:23:59 UTC
(In reply to Susant Kumar Palai from comment #3)
> (In reply to Nithya Balachandran from comment #2)
> > This is because of the mismatch in the xattr when it is a pure dist versus a
> > dist-rep or dist-disperse. This should not prevent the rebalance from
> > proceeding.
> 
> Correct. But since an error was logged in dht_find_local_subvol_cbk, it
> creates confusion about whether it is a real error. I just moved the log to
> DEBUG, since the parent function logs if both attempts fail.

Then the description is incorrect. 

"Actual results:
Seeing failure in rebalance 

Expected results:
Rebalance should complete successfully."



Rebalance will complete successfully. This message does not cause it to stop.

Comment 5 Worker Ant 2019-07-18 10:19:57 UTC
REVIEW: https://review.gluster.org/23053 (dht: log getxattr failure for node-uuid at "DEBUG") merged (#3) on master by N Balachandran