Bug 1247515

Summary: [upgrade] Error messages seen in glusterd logs, while upgrading from RHGS 2.1.6 to RHGS 3.1
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: tier Assignee: hari gowtham <hgowtham>
Status: CLOSED ERRATA QA Contact: Byreddy <bsrirama>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: asrivast, bsrirama, dlambrig, pprakash, rcyriac, rhs-bugs, rkavunga, sankarshan, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.7.5-9 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1282461 (view as bug list) Environment:
Last Closed: 2016-03-01 05:32:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1216951, 1260783, 1260923, 1282461, 1287597    

Description SATHEESARAN 2015-07-28 08:13:26 UTC
Description of problem:
-----------------------
After performing an "In-service Software Upgrade" from RHGS 2.1 Update6 to RHGS 3.1,
error messages related to "hot-tier" are seen in the glusterd logs.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS 2.1 Update6
RHGS 3.1 (RC4)

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Install RHGS 2.1 Update6 on 2 nodes
2. Create a 'Trusted Storage Pool' with these 2 nodes
3. Create a distributed-replicate volume
4. Start the volume
5. Fuse mount the volume and add some files (commands for steps 2-5 are sketched below)
6. Perform 'In-service Software Upgrade' to RHGS 3.1 (refer to the Installation Guide)
7. After the upgrade, reboot the node
8. When the machines come up, look at the glusterd log file and the glustershd log file
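
For reference, a minimal command sketch for steps 2-5. The host names node1/node2, the brick paths under /bricks, the volume name distrepvol, and the mount point are placeholders, not values taken from this report:

# on node1, form the trusted storage pool and create a 2x2 distributed-replicate volume
gluster peer probe node2
gluster volume create distrepvol replica 2 \
    node1:/bricks/brick1 node2:/bricks/brick1 \
    node1:/bricks/brick2 node2:/bricks/brick2
gluster volume start distrepvol

# on a client, FUSE mount the volume and copy in some files
mount -t glusterfs node1:/distrepvol /mnt/distrepvol
cp -a /usr/share/doc/* /mnt/distrepvol/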

Actual results:
---------------
There are error messages related to 'hot-tier' in glusterd logs

Expected results:
-----------------
There should not be any error messages in the glusterd logs.


Additional info:

Comment 1 SATHEESARAN 2015-07-28 08:53:28 UTC
The following is a snip of the error messages from glusterd.log

[2015-07-27 22:35:42.668983] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.669078] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick

[2015-07-27 22:35:42.717495] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:35:42.717753] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.717785] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:35:42.717943] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.717973] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:35:42.724864] I [MSGID: 106499] [glusterd-handler.c:4258:__glusterd_handle_status_volume] 0-management: Received status volume req for volume repvol
[2015-07-27 22:35:42.732448] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.732492] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:35:42.732736] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.732781] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:35:42.733083] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:35:42.733151] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
[2015-07-27 22:37:17.816121] I [MSGID: 106488] [glusterd-handler.c:1463:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-07-27 22:38:08.277948] I [MSGID: 106499] [glusterd-handler.c:4258:__glusterd_handle_status_volume] 0-management: Received status volume req for volume distvol3
[2015-07-27 22:38:08.286581] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-27 22:38:08.286615] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick

Comment 2 SATHEESARAN 2015-07-31 02:21:01 UTC
To be more specific, when 'gluster volume status <vol-name>' is issued while one or more RHGS 2.1 (glusterd) nodes are still in the cluster, these error messages can be observed in the glusterd logs, presumably because the upgraded glusterd expects a hot brick count in the status response that the older 2.1 glusterd never sends.

<snip>
[2015-07-31 07:41:29.170570] I [MSGID: 106499] [glusterd-handler.c:4258:__glusterd_handle_status_volume] 0-management: Received status volume req for volume repvol
[2015-07-31 07:41:29.177977] E [MSGID: 106062] [glusterd-utils.c:7910:glusterd_volume_status_copy_to_op_ctx_dict] 0-management: Failed to get hot brick count from rsp_dict
[2015-07-31 07:41:29.178072] E [MSGID: 106108] [glusterd-syncop.c:1069:_gd_syncop_commit_op_cbk] 0-management: Failed to aggregate response from  node/brick
</snip>
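
For anyone hitting this mid-upgrade, a quick way to confirm the behaviour. The volume name repvol comes from the log above, but the glusterd log path shown is the assumed default location, not taken from this report:

# run on an already-upgraded RHGS 3.1 node while a 2.1 node is still in the cluster
gluster volume status repvol
grep "Failed to get hot brick count" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log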

Comment 4 Byreddy 2015-12-07 05:11:33 UTC
Verified this bug with RHGS version glusterfs-3.7.5-9 using the steps below.
1. Created a two-node cluster with the 2.1.6 build.
2. Created a distributed-replicate volume, mounted it, and added some files.
3. Performed an in-service upgrade to 3.1.2.
4. Rebooted the upgraded node.
5. Checked the glusterd logs for the issue described in the Description section (a sketch of this check follows).

No error messages were found in the glusterd logs, so moving to verified state.
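
A hedged sketch of the check in step 5; the volume name distrepvol and the glusterd log path are placeholders/assumed defaults, not values from this verification run:

rpm -q glusterfs                       # should show the 3.7.5-9 build on the upgraded node
gluster volume status distrepvol       # the command that previously produced the hot-tier errors
grep "Failed to get hot brick count" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log \
    || echo "no hot-tier errors logged"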

Comment 6 errata-xmlrpc 2016-03-01 05:32:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html