Bug 1387510

Summary: Continuous warning and error messages in the brick log when one brick of a replica pair goes down during rebalance
Product: Red Hat Gluster Storage
Reporter: Byreddy <bsrirama>
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high
Priority: unspecified
Version: rhgs-3.1
CC: bsrirama, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2016-11-17 06:06:16 UTC
Type: Bug

Description Byreddy 2016-10-21 07:06:03 UTC
Description of problem:
=======================
The following warning and error messages are logged continuously when one brick of a replica pair goes down during a rebalance operation.


[2016-10-21 06:21:24.506578] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.506650] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169755: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
[2016-10-21 06:21:24.529369] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.529450] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169776: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
[2016-10-21 06:21:24.593487] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.593557] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169807: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
[2016-10-21 06:21:24.614531] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.614596] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169822: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
[2016-10-21 06:21:24.650385] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.650500] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169833: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
[2016-10-21 06:21:24.666088] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)
[2016-10-21 06:21:24.666153] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169849: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]
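
To quantify how often the W/E pair above repeats, the brick log lines can be grouped by level and MSGID. The following is a minimal sketch, assuming every line follows the layout shown above (timestamp, level, MSGID, file:line:function, domain, message); the `summarize` helper and the regular expression are illustrative and are not part of glusterfs.

```python
import re
from collections import Counter

# Layout assumed from the log excerpt above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines per (level, MSGID, function); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            counts[(m["level"], m["msgid"], m["func"])] += 1
    return counts


# Two W/E pairs copied from the excerpt above.
sample = [
    "[2016-10-21 06:21:24.506578] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)",
    "[2016-10-21 06:21:24.506650] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169755: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]",
    "[2016-10-21 06:21:24.529369] W [MSGID: 115009] [server-resolve.c:543:server_resolve] 0-Dis-Rep-server: no resolution type for (null) (LOOKUP)",
    "[2016-10-21 06:21:24.529450] E [MSGID: 115050] [server-rpc-fops.c:159:server_lookup_cbk] 0-Dis-Rep-server: 169776: LOOKUP (null) (00000000-0000-0000-0000-000000000000) ==> (Invalid argument) [Invalid argument]",
]

for key, n in sorted(summarize(sample).items()):
    print(key, n)
```

Running the same function over the full brick log before and after the rebalance completes would show whether the counts for MSGID 115009 and 115050 stop growing.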




Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.1-11.el7rhgs.5.sfdc01636297 ( RHGS 3.1 )


How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Set up a 3-node cluster running the build mentioned above
2. Create a 2 x 2 volume  // Distributed-replicate volume
3. FUSE mount the volume on two different clients
4. Write enough data from both clients that the rebalance operation in the steps below takes a long time
5. Add two new bricks
6. Trigger the rebalance  // gluster volume rebalance <vol-name> start
7. While the rebalance is in progress, bring down one brick of a replica pair ( kill -15 <pidof brick> )  // Do it on one of the old replica pairs
8. Check the brick log of the other brick in that replica pair
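
Steps 5 to 7 can be sketched as a short command list. This is only an illustration under stated assumptions: the volume name `Dis-Rep` is inferred from the `0-Dis-Rep-server` log prefix, the host names and brick paths are hypothetical, and the brick PID has to be read from the `gluster volume status` output by hand. The helpers only build the commands and do not run them.

```python
import shlex


def repro_commands(volume, new_bricks):
    """Build the gluster CLI calls for steps 5 and 6, plus status checks."""
    return [
        # Step 5: add two new bricks (one new replica pair) to the volume.
        ["gluster", "volume", "add-brick", volume, *new_bricks],
        # Step 6: trigger the rebalance.
        ["gluster", "volume", "rebalance", volume, "start"],
        # Confirm the rebalance is still in progress before killing a brick.
        ["gluster", "volume", "rebalance", volume, "status"],
        # Lists the brick processes and their PIDs.
        ["gluster", "volume", "status", volume],
    ]


def kill_brick_command(pid):
    """Step 7: send SIGTERM (signal 15) to the chosen brick process."""
    return ["kill", "-15", str(pid)]


# Hypothetical volume name, hosts, and brick paths.
for argv in repro_commands("Dis-Rep", ["node3:/bricks/b5", "node1:/bricks/b6"]):
    print(shlex.join(argv))
```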


Actual results:
===============
Warning and error messages are logged continuously in the brick log of the surviving replica when the other brick of the replica pair goes down during rebalance.


Expected results:
=================
No warning or error messages should be logged while the rebalance is in progress.

Additional info:
=================
Once the rebalance completed, these messages stopped appearing in the brick log.

Comment 3 Byreddy 2016-10-21 07:12:56 UTC
Based on discussion with the dev team, I think we get these warning and error messages in the current 3.8.4-2 build as well, so I am setting the Internal Whiteboard to 3.2.0.

Comment 6 Byreddy 2016-11-17 06:06:16 UTC
This issue is consistently reproducible only in the hotfix build and does not reproduce in the current RHGS 3.2 build.

Closing this bug, as it works in the current release.