Bug 1286294 - Upgrading a subset of cluster to 3.7.6 leads to issues with glusterd commands
Summary: Upgrading a subset of cluster to 3.7.6 leads to issues with glusterd commands
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.7.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-11-27 20:25 UTC by Denis Lambolez
Modified: 2015-12-01 04:38 UTC
7 users

Fixed In Version:
Clone Of: 1276029
Environment:
Last Closed: 2015-12-01 04:38:49 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments

Description Denis Lambolez 2015-11-27 20:25:35 UTC
Same case and configuration as the previous bug, but upgrading one node (Brick2) from 3.7.5 to 3.7.6.
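The exact upgrade procedure on catsserver-node2 is not recorded here; a typical single-node package upgrade would look roughly like the following (hypothetical commands; the cloned report below used rpm -Uvh *.rpm plus a reboot):

# on catsserver-node2 (Brick2) only, while catsserver-node1 stays on 3.7.5
systemctl stop glusterd
rpm -Uvh glusterfs*-3.7.6*.rpm     # or the distribution's package manager
systemctl start glusterd
gluster peer status                # confirm the node rejoined the cluster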

Volume Name: smbshare
Type: Replicate
Volume ID: 40bfc10d-6f7a-45cf-81ba-0e4d531da890
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: catsserver-node1:/srv/gluster/bricks/smbshare
Brick2: catsserver-node2:/srv/gluster/bricks/smbshare
Options Reconfigured:
nfs.disable: on
server.allow-insecure: on

On Brick1:
----------
tail /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-11-27 20:13:35.247775] E [MSGID: 106524] [glusterd-op-sm.c:1794:glusterd_op_stage_stats_volume] 0-glusterd: Volume name get failed
[2015-11-27 20:13:35.269306] E [MSGID: 106301] [glusterd-op-sm.c:5197:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Profile', Status : -2

On Brick2:
----------
tail /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-11-27 20:13:35.273939] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on catsserver-node1. Error: Volume name get failed
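The command that produced the messages above is not captured in the logs. Because 3.7.5 and 3.7.6 number glusterd's internal operation enum differently (see the patch referenced in the cloned bug below), a staged volume operation sent from the upgraded node can be decoded as a different operation on the 3.7.5 peer, here 'Volume Profile'. A minimal way to observe it in this mixed-version state, assuming the smbshare volume above:

# run from catsserver-node2, the node already on 3.7.6; the exact command is
# an assumption - any volume operation that needs staging shows the same symptom
gluster volume status smbshare
# expected failure while catsserver-node1 is still on 3.7.5:
#   Staging failed on catsserver-node1. Please check log file for details.
tail -n 5 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log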


+++ This bug was initially created as a clone of Bug #1276029 +++

This is an extract from a mail that a user (David Robinson) sent on gluster-users.

Description of problem:

I have a replica pair setup that I was trying to upgrade from 3.7.4 to 3.7.5. 
After upgrading the rpm packages (rpm -Uvh *.rpm) and rebooting one of the nodes, I am now receiving the following:
 
[root@frick01 log]# gluster volume status
Staging failed on frackib01.corvidtec.com. Please check log file for details.
 

[root@frick01 log]# gluster volume info
 
Volume Name: gfs
Type: Replicate
Volume ID: abc63b5c-bed7-4e3d-9057-00930a2d85d3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp,rdma
Bricks:
Brick1: frickib01.corvidtec.com:/data/brick01/gfs
Brick2: frackib01.corvidtec.com:/data/brick01/gfs
Options Reconfigured:
storage.owner-gid: 100
server.allow-insecure: on
performance.readdir-ahead: on
server.event-threads: 4
client.event-threads: 4

How reproducible:
Reported by multiple users.

Logs have been attached.

--- Additional comment from Raghavendra Talur on 2015-10-28 08:47 EDT ---



--- Additional comment from Anand Nekkunti on 2015-10-30 10:27:08 EDT ---

master branch patch link: http://review.gluster.org/#/c/12473/

--- Additional comment from Vijay Bellur on 2015-11-02 04:39:27 EST ---

REVIEW: http://review.gluster.org/12486 (glusterd: move new feature (tiering) enum op to the last of the array) posted (#1) for review on release-3.7 by Gaurav Kumar Garg (ggarg)

--- Additional comment from Raghavendra Talur on 2015-11-17 01:01:43 EST ---

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.6, please open a new bug report.

glusterfs-3.7.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-November/024359.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

Comment 1 Denis Lambolez 2015-11-27 21:57:25 UTC
Works fine after both servers are upgraded to 3.7.6. The problem is linked to the nodes not being on the same version.

Comment 2 Atin Mukherjee 2015-11-30 04:23:07 UTC
Please refer to http://www.gluster.org/pipermail/gluster-users/2015-November/024178.html. As mentioned there, the fix does not help the immediate upgrade path either; the issue persists until the whole cluster is upgraded. However, once you are on 3.7.6, subsequent upgrades will not experience it. Let me know whether we can close this bug on that basis.
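For anyone hitting this during a rolling upgrade: a quick way to confirm every node is on the same release before expecting glusterd commands to work again (standard commands, default glusterd paths assumed):

# run on every node in the cluster
gluster --version | head -n 1
rpm -q glusterfs-server                                   # on RPM-based systems
grep operating-version /var/lib/glusterd/glusterd.info    # op-version glusterd has persisted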

Comment 3 Denis Lambolez 2015-11-30 19:24:21 UTC
OK, so I think we can close it. Let's revisit it when we upgrade to 3.7.7. Thanks for the support.

Comment 4 Atin Mukherjee 2015-12-01 04:38:49 UTC
Closing this bug. The fix was made with the understanding that the same issue will still be hit on the next upgrade path unless and until the whole cluster is upgraded.

