Bug 1730175 - Seeing failure due to "getxattr err for dir [No data available]" in rebalance
Summary: Seeing failure due to "getxattr err for dir [No data available]" in rebalance
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Susant Kumar Palai
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-16 05:08 UTC by Susant Kumar Palai
Modified: 2019-07-18 10:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-18 10:19:57 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Links
System: Gluster.org Gerrit
ID: 23053
Private: 0
Priority: None
Status: Merged
Summary: dht: log getxattr failure for node-uuid at "DEBUG"
Last Updated: 2019-07-18 10:19:56 UTC

Description Susant Kumar Palai 2019-07-16 05:08:33 UTC
Description of problem:
While running rebalance on a volume with bricks of different sizes, saw a failure in rebalance due to the following error:
[2019-07-05 09:36:55.653538] E [MSGID: 109039] [dht-common.c:4245:dht_find_local_subvol_cbk] 0-vol6-dht: getxattr err for dir [No data available]
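To make the pieces of the log line quoted above easier to pick out, here is a minimal Python sketch that splits it into its fields. The regular expression is inferred only from this single sample line, and the names `LOG_LINE` and `parse_log_line` are hypothetical helpers, not part of GlusterFS.

```python
import re

# Pattern inferred from the one sample line in this report (an assumption):
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+"
    r"(?P<message>.*)"
)


def parse_log_line(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "[2019-07-05 09:36:55.653538] E [MSGID: 109039] "
    "[dht-common.c:4245:dht_find_local_subvol_cbk] 0-vol6-dht: "
    "getxattr err for dir [No data available]"
)
fields = parse_log_line(sample)
print(fields["level"], fields["function"], fields["message"])
```

The level field is the part relevant to this bug: the line is logged at "E" (error) from dht_find_local_subvol_cbk.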

How reproducible:
1/1

Steps to Reproduce:
1. Create a two-brick volume in which brick1 is 20G and brick2 is 5G.
2. Fuse mount the volume on a client node.
3. Check the hash layout on the bricks.
4. Start running I/O from the mount point.
5. While the I/O is still in progress, disable the cluster.weighted-rebalance volume option.
6. Let the I/O continue to run and add a 10G brick to the volume.
7. Trigger rebalance on the volume.
8. Check hash layout on the back-end bricks.
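
The steps above can be sketched as a list of commands. This is only an illustration under stated assumptions: the host name, brick paths and mount point are hypothetical, the I/O workload is not specified in this report, and the brick sizes (20G, 5G, 10G) come from the underlying filesystems rather than from any command. The sketch builds the commands without running them.

```python
import shlex


def reproduction_commands(volume, host, brick_paths, new_brick_path, mount_point):
    """Build (but do not run) the commands matching the numbered steps."""

    def layout_check(paths):
        # Steps 3 and 8: read the DHT layout xattr on each back-end brick directory.
        return [
            ["getfattr", "-n", "trusted.glusterfs.dht", "-e", "hex", path]
            for path in paths
        ]

    bricks = [f"{host}:{path}" for path in brick_paths]
    commands = [
        # Step 1: create and start the two-brick distribute volume.
        ["gluster", "volume", "create", volume, *bricks],
        ["gluster", "volume", "start", volume],
        # Step 2: FUSE-mount the volume on the client.
        ["mount", "-t", "glusterfs", f"{host}:/{volume}", mount_point],
    ]
    commands += layout_check(brick_paths)
    # Step 4 (start I/O from the mount point) is omitted: workload unspecified.
    commands += [
        # Step 5: disable weighted rebalance while I/O is running.
        ["gluster", "volume", "set", volume, "cluster.weighted-rebalance", "off"],
        # Step 6: add the third brick.
        ["gluster", "volume", "add-brick", volume, f"{host}:{new_brick_path}"],
        # Step 7: trigger rebalance and check its status.
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "status"],
    ]
    commands += layout_check(brick_paths + [new_brick_path])
    return commands


# Hypothetical names, for illustration only.
commands = reproduction_commands(
    volume="vol6",
    host="server1",
    brick_paths=["/bricks/b1", "/bricks/b2"],
    new_brick_path="/bricks/b3",
    mount_point="/mnt/vol6",
)
for argv in commands:
    print(shlex.join(argv))
```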

Actual results:
Seeing failure in rebalance 

Expected results:
Rebalance should complete successfully.

Comment 1 Worker Ant 2019-07-16 05:14:57 UTC
REVIEW: https://review.gluster.org/23053 (dht: log getxattr failure for node-uuid at "DEBUG") posted (#1) for review on master by Susant Palai

Comment 2 Nithya Balachandran 2019-07-16 05:16:15 UTC
This is because of the mismatch in the xattr when it is a pure dist versus a dist-rep or dist-disperse. This should not prevent the rebalance from proceeding.

Comment 3 Susant Kumar Palai 2019-07-16 05:18:47 UTC
(In reply to Nithya Balachandran from comment #2)
> This is because of the mismatch in the xattr when it is a pure dist versus a
> dist-rep or dist-disperse. This should not prevent the rebalance from
> proceeding.

Correct. But because an error was logged in dht_find_local_subvol_cbk, it creates confusion about whether it is a real error. I just moved the log to DEBUG, since the parent function logs an error if both attempts fail.
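
The logging pattern described here can be sketched in Python (not the actual C change in dht-common.c). Each individual attempt logs its failure at DEBUG, and only the caller logs at ERROR once every attempt has failed. All names below (`query_xattr`, `find_local_subvol`, `KEY_A`, `KEY_B`, `fake_getxattr`) are hypothetical and stand in for the real DHT functions and xattr keys, which are not spelled out in this report.

```python
import errno
import logging

log = logging.getLogger("dht-sketch")


def query_xattr(getxattr, key):
    """Make one attempt. A failure is logged at DEBUG only, since the caller may retry."""
    try:
        return getxattr(key)
    except OSError as exc:
        log.debug("getxattr err for dir [%s] (key=%s)", exc.strerror, key)
        return None


def find_local_subvol(getxattr, keys):
    """Try each key in turn; log at ERROR only if all attempts failed."""
    for key in keys:
        value = query_xattr(getxattr, key)
        if value is not None:
            return value
    log.error("getxattr failed for all keys: %s", ", ".join(keys))
    return None


def fake_getxattr(key):
    # Stand-in for the brick: the first key is absent, the second is present.
    if key == "KEY_A":
        raise OSError(errno.ENODATA, "No data available")
    return "uuid-1234"


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(find_local_subvol(fake_getxattr, ["KEY_A", "KEY_B"]))
```

With this structure, a first attempt that fails while the second succeeds leaves nothing at error level in the log, which is the behaviour the patch aims for.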

Comment 4 Nithya Balachandran 2019-07-16 05:23:59 UTC
(In reply to Susant Kumar Palai from comment #3)
> (In reply to Nithya Balachandran from comment #2)
> > This is because of the mismatch in the xattr when it is a pure dist versus a
> > dist-rep or dist-disperse. This should not prevent the rebalance from
> > proceeding.
> 
> Correct. But because an error was logged in dht_find_local_subvol_cbk, it
> creates confusion about whether it is a real error. I just moved the log to
> DEBUG, since the parent function logs an error if both attempts fail.

Then the description is incorrect. 

"Actual results:
Seeing failure in rebalance 

Expected results:
Rebalance should complete successfully."



Rebalance will complete successfully. This message does not cause it to stop.

Comment 5 Worker Ant 2019-07-18 10:19:57 UTC
REVIEW: https://review.gluster.org/23053 (dht: log getxattr failure for node-uuid at "DEBUG") merged (#3) on master by N Balachandran

