1123038 – SMB:Add-brick followed by rebalance while dd is executed on cifs mount leads to errors in rebalance logs

Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	distribute
Sub Component:
Version:	rhgs-3.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	ankit
QA Contact:	Vivek Das
Docs Contact:
URL:
Whiteboard:	dht-retest, dht-samba
Depends On:
Blocks:

Reported:	2014-07-24 17:07 UTC by surabhi
Modified:	2020-04-06 12:03 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-06-06 07:11:03 UTC
Embargoed:
Dependent Products:

Description surabhi 2014-07-24 17:07:50 UTC

Description of problem:
Executing dd on a CIFS mount and then performing add-brick on a node, followed by rebalance, produces errors in the volume and rebalance logs. The logs are as follows:

*******************************
[2014-07-24 13:47:27.228985] I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001
[2014-07-24 13:47:27.229065] I [dht-layout.c:782:dht_layout_dir_mismatch] 4-newafr-dht: subvol: newafr-replicate-2; inode layout - 1431655765 - 2863311529; disk layout - 2147483646 - 3221225468
[2014-07-24 13:47:27.229307] I [dht-layout.c:782:dht_layout_dir_mismatch] 4-newafr-dht: subvol: newafr-replicate-1; inode layout - 2863311530 - 4294967295; disk layout - 3221225469 - 4294967295
[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
[2014-07-24 13:48:44.80523[2014-07-24 13:47:21.136314] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:47:21.136995] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:47:26.160380] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-07-24 13:46:34.595255] I [MSGID: 106006] [glusterd-handler.c:4280:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2014-07-24 13:46:35.613685] I [MSGID: 106006] [glusterd-handler.c:4280:__glusterd_nodesvc_rpc_notify] 0-management: glustershd has disconnected from glusterd.
[2014-07-24 13:47:29.136045] I [MSGID: 106007] [glusterd-rebalance.c:173:__glusterd_defrag_notify] 0-management: Rebalance process for volume newafr has disconnected.
[2014-07-24 13:48:27.808564] E [glusterd-op-sm.c:207:glusterd_get_txn_opinfo] 0-: Unable to get transaction opinfo for transaction ID : 2e711331-c9bd-4fd4-9952-e99b685743a3
1] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
The message "I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001" repeated 2 times between [2014-07-24 13:47:27.228985] and [2014-07-24 13:47:27.229337]
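
The layout ranges in the mismatch messages above can be checked with simple arithmetic. The sketch below is not taken from the DHT source; it only assumes that the 32-bit hash space (0 to 0xFFFFFFFF) is divided into equal chunks, with the last chunk absorbing the remainder. The function name `equal_split` is hypothetical. Under that assumption, the "inode layout" values match the second and third chunks of a 3-way split, and the "disk layout" values match the third and fourth chunks of a 4-way split.

```python
HASH_MAX = 0xFFFFFFFF  # top of the 32-bit hash space


def equal_split(n):
    """Split [0, HASH_MAX] into n contiguous ranges of equal size.

    Assumption for illustration only: chunk size is HASH_MAX // n and
    the last range is stretched to end at HASH_MAX.
    """
    chunk = HASH_MAX // n
    ranges = []
    for i in range(n):
        start = i * chunk
        stop = HASH_MAX if i == n - 1 else start + chunk - 1
        ranges.append((start, stop))
    return ranges


if __name__ == "__main__":
    for n in (3, 4):
        print(n, equal_split(n))
```

This is an arithmetic observation about the logged numbers only; it does not by itself explain why the cached and on-disk layouts differed.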

Version-Release number of selected component (if applicable):
glusterfs-geo-replication-3.6.0.25-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.25-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.25-1.el6rhs.x86_64
glusterfs-cli-3.6.0.25-1.el6rhs.x86_64
glusterfs-libs-3.6.0.25-1.el6rhs.x86_64
glusterfs-3.6.0.25-1.el6rhs.x86_64
glusterfs-devel-3.6.0.25-1.el6rhs.x86_64
glusterfs-server-3.6.0.25-1.el6rhs.x86_64
glusterfs-debuginfo-3.6.0.25-1.el6rhs.x86_64
samba-glusterfs-3.6.9-168.4.el6rhs.x86_64
glusterfs-api-3.6.0.25-1.el6rhs.x86_64
glusterfs-api-devel-3.6.0.25-1.el6rhs.x86_64


How reproducible:
tried twice

Steps to Reproduce:
1. Create a 2x2 volume and mount it via CIFS.
2. Execute the following on the client, with the attached script:
dd if=/dev/urandom of=/dev/input_file bs=1M count=1024
3. Do add-brick and run rebalance on the server.
4. Execute find | xargs stat on the mount point.
5. Check the rebalance logs.

Actual results:
Error messages in gluster vol and rebalance logs:
*****
[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present
[2014-07-24 13:48:44.80523[2014-07-24 13:47:21.136314] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
The message "I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001" repeated 2 times between [2014-07-24 13:47:27.228985] and [2014-07-24 13:47:27.229337]
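
For triage, the error-level entries in excerpts like the one above can be separated from informational ones with a small parser. This is a minimal sketch that assumes only the line shape visible in this report: `[timestamp] LEVEL [optional MSGID] [file:line:function] domain: message`. The names `LOG_RE`, `parse_line` and `errors_only` are hypothetical, and lines that do not fit the pattern (such as the truncated entry with two fused timestamps) are skipped rather than repaired.

```python
import re

# Pattern inferred from the log lines quoted in this report.
LOG_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def errors_only(lines):
    """Keep only the entries whose level is 'E'."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "E"]


SAMPLE = [
    "[2014-07-24 13:47:27.228985] I [MSGID: 109018] [dht-common.c:696:dht_revalidate_cbk] 4-newafr-dht: Mismatching layouts for /, gfid = 00000000-0000-0000-0000-000000000001",
    "[2014-07-24 13:48:03.212791] E [afr-common.c:2300:afr_lookup_done] 4-newafr-replicate-1: /..: No gfid present",
    "[2014-07-24 13:47:21.136995] E [glusterd-utils.c:10329:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index",
]

if __name__ == "__main__":
    for entry in errors_only(SAMPLE):
        print(entry["ts"], entry["func"], entry["msg"])
```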

Expected results:
There should be no errors while doing add-brick followed by rebalance.

Additional info:

Comment 2 Raghavendra G 2016-06-23 05:34:51 UTC

The logs don't indicate any issues with DHT. On 3.1.3, afr_lookup_done no longer has this log message. Can we retest this issue on 3.1.3?

Comment 5 Nithya Balachandran 2017-06-06 07:11:03 UTC

I am closing this with resolution Insufficient_data. Please file a new BZ if you see this again.