Bug 1123038

Summary:	SMB:Add-brick followed by rebalance while dd is executed on cifs mount leads to errors in rebalance logs
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	surabhi <sbhaloth>
Component:	distribute	Assignee:	ankit <anraj>
Status:	CLOSED INSUFFICIENT_DATA	QA Contact:	Vivek Das <vdas>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	rhgs-3.0	CC:	anraj, kramdoss, nbalacha, nchilaka, nlevinki, rgowdapp, rhs-bugs, sanandpa, sbhaloth, tdesala, vdas
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:	dht-retest, dht-samba
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-06-06 07:11:03 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description surabhi 2014-07-24 17:07:50 UTC

Description of problem:
Running dd on a CIFS mount, then performing add-brick on a node followed by rebalance, produces errors in the volume and rebalance logs. The logs are as follows:

*******************************
[2014-07-24 13:47:27.228985] I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2014-07-24 13:47:27.229065] I [dht-layout.c:782:dht_layout_dir_mismatch] 4-newafr-dht: subvol: newafr-replicate-2; inode layout - 1431655765 - 2863311529; disk layout - 2147483646 - 3221225468
[2014-07-24 13:47:27.229307] I [dht-layout.c:782:dht_layout_dir_mismatch] 4-newafr-dht: subvol: newafr-replicate-1; inode layout - 2863311530 - 4294967295; disk layout - 3221225469 - 4294967295
[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
[2014-07-24 13:48:44.805231] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
[2014-07-24 13:47:21.136314] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:47:21.136995] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:47:26.160380] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:46:34.595255] I [MSGID: 106006] [glusterd-handler.c:4280:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2014-07-24 13:46:35.613685] I [MSGID: 106006] [glusterd-handler.c:4280:__glusterd_nodesvc_rpc_notify] 0-management: glustershd has disconnected from glusterd.
[2014-07-24 13:47:29.136045] I [MSGID: 106007] [glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance process for volume newafr has disconnected.
[2014-07-24 13:48:27.808564] E [glusterd-op-sm.c:207:glusterd_get_txn_opinfo] 0-: Unable to get transaction opinfo for transaction ID : 2e711331-c9bd-4fd4-9952-e99b685743a3
The message "I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001" repeated 2 times between [2014-07-24 13:47:27.228985] and [2014-07-24 13:47:27.229337]
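
The layout ranges in the dht_layout_dir_mismatch lines above are consistent with an even split of the 32-bit hash space: the inode ranges match a three-way split and the disk ranges match a four-way split. The sketch below only reproduces that arithmetic. The even_split helper is hypothetical and is not DHT's actual code, which may assign ranges differently.

```python
HASH_MAX = 0xFFFFFFFF  # top of the 32-bit hash space


def even_split(n):
    """Return n contiguous (start, stop) ranges covering 0..HASH_MAX.

    Hypothetical helper for illustration: each range is HASH_MAX // n
    wide and the last range absorbs the remainder.
    """
    chunk = HASH_MAX // n
    ranges = []
    for i in range(n):
        start = i * chunk
        stop = HASH_MAX if i == n - 1 else start + chunk - 1
        ranges.append((start, stop))
    return ranges


three_way = even_split(3)  # compare with the "inode layout" values
four_way = even_split(4)   # compare with the "disk layout" values

for label, ranges in (("3-way", three_way), ("4-way", four_way)):
    for start, stop in ranges:
        print(label, start, "-", stop)
```

Under this assumption, the second and third ranges of the three-way split equal the logged inode layouts, and the third and fourth ranges of the four-way split equal the logged disk layouts.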

Version-Release number of selected component (if applicable):
glusterfs-geo-replication-3.6.0.25-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.25-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.25-1.el6rhs.x86_64
glusterfs-cli-3.6.0.25-1.el6rhs.x86_64
glusterfs-libs-3.6.0.25-1.el6rhs.x86_64
glusterfs-3.6.0.25-1.el6rhs.x86_64
glusterfs-devel-3.6.0.25-1.el6rhs.x86_64
glusterfs-server-3.6.0.25-1.el6rhs.x86_64
glusterfs-debuginfo-3.6.0.25-1.el6rhs.x86_64
samba-glusterfs-3.6.9-168.4.el6rhs.x86_64
glusterfs-api-3.6.0.25-1.el6rhs.x86_64
glusterfs-api-devel-3.6.0.25-1.el6rhs.x86_64


How reproducible:
tried twice

Steps to Reproduce:
1. Create a 2x2 volume and mount it via CIFS.
2. Execute the following on the client, with the attached script:
dd if=/dev/urandom of=/dev/input_file bs=1M count=1024
3. Run add-brick and then rebalance on the server.
4. Execute find | xargs stat on the mount point.
5. Check the rebalance logs.
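
The add-brick, rebalance, and find steps above can be sketched as a command list. This is a dry run that only builds and prints the commands; nothing is executed. The volume name newafr is taken from the logs, while the brick paths, the mount point, and the repro_commands helper are hypothetical placeholders, not values from this report.

```python
import shlex


def repro_commands(volume, new_bricks, mount_point):
    """Build argument lists for the add-brick, rebalance, and find steps.

    Hypothetical helper: the gluster commands belong on a server node,
    the find command on the CIFS client.
    """
    return [
        ["gluster", "volume", "add-brick", volume, *new_bricks],
        ["gluster", "volume", "rebalance", volume, "start"],
        ["sh", "-c", "find {} | xargs stat".format(shlex.quote(mount_point))],
    ]


# Assumed values: a replica-2 volume needs bricks added in pairs.
commands = repro_commands(
    "newafr",
    ["server1:/bricks/b5", "server2:/bricks/b6"],
    "/mnt/cifs",
)

for cmd in commands:
    print(shlex.join(cmd))
```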

Actual results:
Error messages in gluster vol and rebalance logs:
*****
[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
[2014-07-24 13:47:21.136314] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
The message "I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001" repeated 2 times between [2014-07-24 13:47:27.228985] and [2014-07-24 13:47:27.229337]
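
To separate the E-level entries from the informational ones when checking such logs, a small filter like the following may help. It is a minimal sketch: the regular expression is inferred only from the log lines quoted in this report, and the LOG_LINE pattern and errors_only helper are hypothetical names, not part of any gluster tooling.

```python
import re

# Pattern inferred from the quoted lines:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<source>[^\]]+)\] "
    r"(?P<domain>[^:]*): (?P<message>.*)$"
)


def errors_only(lines):
    """Return parsed fields for lines whose level is E; skip unparsable lines."""
    parsed = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "E":
            parsed.append(match.groupdict())
    return parsed


sample = [
    "[2014-07-24 13:47:27.228985] I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001",
    "[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present",
    "[2014-07-24 13:47:21.136314] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index",
]

errors = errors_only(sample)
for entry in errors:
    print(entry["timestamp"], entry["source"], entry["message"])
```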

Expected results:
There should be no errors when running rebalance after add-brick.

Additional info:

Comment 2 Raghavendra G 2016-06-23 05:34:51 UTC

The logs don't indicate any issues with DHT. On 3.1.3, afr_lookup_done no longer has this log message. Can we retest this issue on 3.1.3?

Comment 5 Nithya Balachandran 2017-06-06 07:11:03 UTC

I am closing this with resolution Insufficient_data. Please file a new BZ if you see this again.