Bug 1544461 - 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14
Summary: 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.10
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: hari gowtham
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1544600 1544637 1544638
 
Reported: 2018-02-12 14:46 UTC by Marc
Modified: 2018-03-10 12:37 UTC (History)
CC List: 4 users

Fixed In Version: glusterfs-3.10.11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1544600 (view as bug list)
Environment:
Last Closed: 2018-03-10 12:37:56 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
glusterd.log file of the upgraded server and 2 x /var/lib/glusterd/vols/gluster_volume/info files. (3.56 KB, application/x-7z-compressed)
2018-02-12 14:46 UTC, Marc

Description Marc 2018-02-12 14:46:16 UTC
Created attachment 1394937 [details]
glusterd.log file of the upgraded server and 2 x /var/lib/glusterd/vols/gluster_volume/info files.

Description of problem: Unable to upgrade the Gluster cluster from 3.8.15 to 3.10.10 (same for 3.12 and 3.13). I think it is related to https://bugzilla.redhat.com/show_bug.cgi?id=1511903


Version-Release number of selected component (if applicable): old version 3.8.15, new version 3.10.10


How reproducible: Always (also tried with 3.12 and 3.13)


Steps to Reproduce:
1. Install 3.10.10 on Ubuntu 14 from the PPA.
2. Upgrade one of those nodes to the latest 3.10 (now 3.10.10) - see the sketch after this list.
3. The newly upgraded node will be rejected from the gluster cluster.
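For reference, a minimal sketch of step 2 on a single node is below. The PPA, package and upstart service names are assumptions based on the community Ubuntu packaging, not taken from this report; adjust them to your setup.

# add the 3.10 PPA and upgrade only this node's packages
sudo add-apt-repository -y ppa:gluster/glusterfs-3.10
sudo apt-get update
sudo service glusterfs-server stop        # stop glusterd on this node only
sudo apt-get install -y glusterfs-server glusterfs-client
sudo service glusterfs-server start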

Actual results: Node is rejected from cluster


Expected results:  Node must be accepted


Additional info:
I have a 5 x replicated volume on Ubuntu 14.
I am trying to update GlusterFS. I was initially on 3.7, from which I tried multiple scenarios, and all of them failed when going directly to the newer GlusterFS versions (3.10, 3.12, 3.13). I then noticed that 3.8 works fine, so I updated from 3.7.20 to 3.8.15 as an intermediate version. While updating to the next LTM release, 3.10 (I only updated 1 of the 5 servers to 3.10.10 while the rest are still at 3.8.15), the updated node is throwing the following error:

"Version of Cksums gluster_volume differ. local cksum = 3272345312, remote cksum = 469010668 on peer 1-gls-dus21-ci-efood-real-de.openstacklocal" 


Also, all peers are now in the "Peer Rejected (Connected)" state after the update.

Volume Name: gluster_volume
Type: Replicate
Volume ID: 2e6bd6ba-37c8-4808-9156-08545cea3e3e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 5 = 5
Transport-type: tcp
Bricks:
Brick1: 2-gls-dus10-ci-efood-real-de.openstack.local:/export_vdb
Brick2: 1-gls-dus10-ci-efood-real-de.openstack.local:/export_vdb
Brick3: 1-gls-dus21-ci-efood-real-de:/export_vdb
Brick4: 3-gls-dus10-ci-efood-real-de.openstack.local:/export_vdb
Brick5: 2-gls-dus21-ci-efood-real-de.openstacklocal:/export_vdb
Options Reconfigured:
features.barrier: off
performance.readdir-ahead: on
auth.allow: 10.96.213.245,10.96.214.101,10.97.177.132,10.97.177.127,10.96.214.93,10.97.177.139,10.96.214.119,10.97.177.106,10.96.210.69,10.96.214.94,10.97.177.118,10.97.177.128,10.96.214.98
nfs.disable: on
performance.cache-size: 2GB
performance.cache-max-file-size: 1MB
cluster.self-heal-window-size: 64
performance.io-thread-count: 32


root@1-gls-dus21-ci-efood-real-de:/home/ubuntu# gluster peer status
Number of Peers: 4

Hostname: 3-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 3d141235-9b93-4798-8e03-82a758216b0b
State: Peer in Cluster (Connected)

Hostname: 1-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 00839049-2ade-48f8-b5f3-66db0e2b9377
State: Peer in Cluster (Connected)

Hostname: 2-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 1617cd54-9b2a-439e-9aa6-30d4ecf303f8
State: Peer in Cluster (Connected)

Hostname: 2-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 0c698b11-9078-441a-9e7f-442befeef7a9
State: Peer Rejected (Connected)



Volume status from one of which was not updated:

root@1-gls-dus21-ci-efood-real-de:/home/ubuntu# gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49153     0          Y       30521
Brick 1-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49152     0          Y       23166
Brick 1-gls-dus21-ci-efood-real-de:/export_
vdb                                         49153     0          Y       2322
Brick 3-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49153     0          Y       10854
Self-heal Daemon on localhost               N/A       N/A        Y       4931
Self-heal Daemon on 3-gls-dus10-ci-efood-re
al-de.openstack.local                       N/A       N/A        Y       16591
Self-heal Daemon on 2-gls-dus10-ci-efood-re
al-de.openstack.local                       N/A       N/A        Y       4621
Self-heal Daemon on 1-gls-dus10-ci-efood-re
al-de.openstack.local                       N/A       N/A        Y       3487

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks

And from the updated one:

root@2-gls-dus21-ci-efood-real-de:/var/log/glusterfs# gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus21-ci-efood-real-de.openstac
klocal:/export_vdb                          N/A       N/A        N       N/A
NFS Server on localhost                     N/A       N/A        N       N/A

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks




[2018-02-12 13:35:53.400122] E [MSGID: 106010] [glusterd-utils.c:3043:glusterd_compare_friend_volume] 0-management: Version of Cksums gluster_volume differ. local cksum = 3272345312, remote cksum = 469010668 on peer 1-gls-dus10-ci-efood-real-de.openstack.local
[2018-02-12 13:35:53.400211] I [MSGID: 106493] [glusterd-handler.c:3866:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 1-gls-dus10-ci-efood-real-de.openstack.local (0), ret: 0, op_ret: -1
[2018-02-12 13:35:53.417588] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2018-02-12 13:35:53.430748] I [MSGID: 106490] [glusterd-handler.c:2606:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 3d141235-9b93-4798-8e03-82a758216b0b
[2018-02-12 13:35:53.431024] E [MSGID: 106010] [glusterd-utils.c:3043:glusterd_compare_friend_volume] 0-management: Version of Cksums gluster_volume differ. local cksum = 3272345312, remote cksum = 469010668 on peer 3-gls-dus10-ci-efood-real-de.openstack.local
[2018-02-12 13:35:53.431121] I [MSGID: 106493] [glusterd-handler.c:3866:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 3-gls-dus10-ci-efood-real-de.openstack.local (0), ret: 0, op_ret: -1
[2018-02-12 13:35:53.473344] I [MSGID: 106493] [glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 7488286f-6bfa-46f8-bc50-9ee815e96c66, host: 1-gls-dus21-ci-efood-real-de.openstacklocal, port: 0


I do not have the file `/var/lib/glusterd/vols/remote/info` on any of the servers, but I attached the `/var/lib/glusterd/vols/gluster_volume/info` files from the upgraded server and from a server which was not upgraded.
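One way to locate the offending difference is to compare the rejected node's stored volume metadata against a healthy peer's copy. A minimal sketch, assuming the default /var/lib/glusterd layout and using one of the peer host names from this report for illustration:

# on the upgraded (rejected) node: fetch a healthy peer's info file and compare
scp 1-gls-dus10-ci-efood-real-de.openstack.local:/var/lib/glusterd/vols/gluster_volume/info /tmp/info.peer
diff /var/lib/glusterd/vols/gluster_volume/info /tmp/info.peer
# the checksum glusterd compares during the friend handshake is stored here
cat /var/lib/glusterd/vols/gluster_volume/cksum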


The 3.7 version was running fine for quite some time, so we can exclude network issues, SELinux, etc.

Comment 1 Marc 2018-02-12 14:51:24 UTC
I see that on the new node I have the new "tier-enabled=0" line. Could it also be related to this: https://www.spinics.net/lists/gluster-users/msg33329.html ?

Comment 2 Atin Mukherjee 2018-02-12 15:07:17 UTC
This is indeed a bug, and we managed to root-cause it a couple of days back. I am assigning it to one of my colleagues, Hari, who is aware of this issue and of the fix required. For the time being, please remove the tier-enabled=0 line from all the info files on the node which has been upgraded, and then, once all nodes are upgraded, bump up the cluster.op-version.
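A minimal sketch of that workaround, assuming the volume name gluster_volume from this report and the default /var/lib/glusterd path; the op-version value below is an assumption, so check the release notes of the version every node ends up on:

# on the upgraded node only: drop the stray line from the stored volume info
sed -i '/^tier-enabled=/d' /var/lib/glusterd/vols/gluster_volume/info
# later, once every node runs 3.10.x, raise the cluster op-version
gluster volume set all cluster.op-version 31000   # value assumed for 3.10.0; use your target release's op-version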

@Hari - we need to send this fix to the 3.10, 3.12 and 4.0 branches by changing the op-version check to 3.11 instead of 3.7.6.

Comment 3 hari gowtham 2018-02-13 02:45:58 UTC
Hi Atin,

I can see that you have posted the patch on master. 
https://review.gluster.org/19552

If you are fine with it, I'll backport it to 3.10, 3.12 and 4.0.

Comment 4 Worker Ant 2018-02-13 05:29:33 UTC
REVIEW: https://review.gluster.org/19553 (glusterd: fix tier-enabled flag op-version check) posted (#1) for review on release-3.10 by hari gowtham

Comment 5 Marc 2018-02-13 06:31:32 UTC
Hi Atin,Hari,

I have deleted the "tier-enabled=0" line from the upgraded server, but it still does not work. If I restart the Gluster service, the "info" file is regenerated and the "tier-enabled=0" line is added back.

If I delete the line and do not restart, I get the same output:

root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume# gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus21-ci-efood-real-de.openstac
klocal:/export_vdb                          N/A       N/A        N       N/A
NFS Server on localhost                     N/A       N/A        N       N/A

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks


If I delete the line and restart, the file gets the line back and the output is:

root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume# gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus21-ci-efood-real-de.openstac
klocal:/export_vdb                          49152     0          Y       26586
Self-heal Daemon on localhost               N/A       N/A        Y       26568

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks

Comment 6 Atin Mukherjee 2018-02-13 08:19:21 UTC
(In reply to Marc from comment #5)
> Hi Atin,Hari,
> 
> I have deleted the "tier-enabled=0" line from the upgraded server but it
> still does not work. If i restart the Gluster service the "info" files it is
> regenerated and the "tier-enabled=0" line is added again.
> 
> If i delete and not restart i have the same :
> 
> root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume#
> gluster volume status
> Status of volume: gluster_volume
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> -----------------------------------------------------------------------------
> -
> Brick 2-gls-dus21-ci-efood-real-de.openstac
> klocal:/export_vdb                          N/A       N/A        N       N/A
> NFS Server on localhost                     N/A       N/A        N       N/A
> 
> Task Status of Volume gluster_volume
> -----------------------------------------------------------------------------
> -
> There are no active volume tasks
> 
> 
> If i delete and restart, the file gets the line back and the output is:
> 
> root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume#
> gluster volume status
> Status of volume: gluster_volume
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> -----------------------------------------------------------------------------
> -
> Brick 2-gls-dus21-ci-efood-real-de.openstac
> klocal:/export_vdb                          49152     0          Y      
> 26586
> Self-heal Daemon on localhost               N/A       N/A        Y      
> 26568
> 
> Task Status of Volume gluster_volume
> -----------------------------------------------------------------------------
> -
> There are no active volume tasks

I see that your brick is up here. What's the output of peer status? If all peers are in the befriended and connected state, we should be good. What's the difference between the last and the first step that you mentioned?

Comment 7 Marc 2018-02-13 08:37:48 UTC
(In reply to Atin Mukherjee from comment #6)
> (In reply to Marc from comment #5)
> > Hi Atin,Hari,
> > 
> > I have deleted the "tier-enabled=0" line from the upgraded server but it
> > still does not work. If i restart the Gluster service the "info" files it is
> > regenerated and the "tier-enabled=0" line is added again.
> > 
> > If i delete and not restart i have the same :
> > 
> > root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume#
> > gluster volume status
> > Status of volume: gluster_volume
> > Gluster process                             TCP Port  RDMA Port  Online  Pid
> > -----------------------------------------------------------------------------
> > -
> > Brick 2-gls-dus21-ci-efood-real-de.openstac
> > klocal:/export_vdb                          N/A       N/A        N       N/A
> > NFS Server on localhost                     N/A       N/A        N       N/A
> > 
> > Task Status of Volume gluster_volume
> > -----------------------------------------------------------------------------
> > -
> > There are no active volume tasks
> > 
> > 
> > If i delete and restart, the file gets the line back and the output is:
> > 
> > root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume#
> > gluster volume status
> > Status of volume: gluster_volume
> > Gluster process                             TCP Port  RDMA Port  Online  Pid
> > -----------------------------------------------------------------------------
> > -
> > Brick 2-gls-dus21-ci-efood-real-de.openstac
> > klocal:/export_vdb                          49152     0          Y      
> > 26586
> > Self-heal Daemon on localhost               N/A       N/A        Y      
> > 26568
> > 
> > Task Status of Volume gluster_volume
> > -----------------------------------------------------------------------------
> > -
> > There are no active volume tasks
> 
> I see that your brick is up here. What's the output of peer status? If all
> peers are in befriended and connected state, we should be good. What's the
> difference between the last and the first step what you mentioned?

After restart from a 3.8.15 server:
root@1-gls-dus10-ci-efood-real-de:/home/ubuntu# gluster peer status
Number of Peers: 4

Hostname: 3-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 3d141235-9b93-4798-8e03-82a758216b0b
State: Peer in Cluster (Connected)

Hostname: 1-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 7488286f-6bfa-46f8-bc50-9ee815e96c66
State: Peer in Cluster (Connected)

Hostname: 2-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 1617cd54-9b2a-439e-9aa6-30d4ecf303f8
State: Peer in Cluster (Connected)

Hostname: 2-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 0c698b11-9078-441a-9e7f-442befeef7a9
State: Peer Rejected (Connected)

After restart from the 3.10.10 server:

root@2-gls-dus21-ci-efood-real-de:/home/ubuntu# gluster peer status
Number of Peers: 4

Hostname: 3-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 3d141235-9b93-4798-8e03-82a758216b0b
State: Peer Rejected (Connected)

Hostname: 1-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 7488286f-6bfa-46f8-bc50-9ee815e96c66
State: Peer Rejected (Connected)

Hostname: 1-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 00839049-2ade-48f8-b5f3-66db0e2b9377
State: Peer Rejected (Connected)

Hostname: 2-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 1617cd54-9b2a-439e-9aa6-30d4ecf303f8
State: Peer Rejected (Connected)

I have deleted the "tier-enabled=0" line, and without a restart "gluster peer status" is the same as above. I also tried to re-peer the upgraded server but got:

root@1-gls-dus10-ci-efood-real-de:/home/ubuntu# gluster peer detach 2-gls-dus21-ci-efood-real-de.openstacklocal
peer detach: failed: Brick(s) with the peer 2-gls-dus21-ci-efood-real-de.openstacklocal exist in cluster
root@1-gls-dus10-ci-efood-real-de:/home/ubuntu# gluster peer probe 2-gls-dus21-ci-efood-real-de.openstacklocal
peer probe: success. Host 2-gls-dus21-ci-efood-real-de.openstacklocal port 24007 already in peer list

Comment 8 Atin Mukherjee 2018-02-13 15:52:18 UTC
Sorry to hear that the suggested work around didn't work :-/

Please try this out:

1. bring down glusterd instance on 3.8.15 server

2. from 3.10.10 server remove tier-enabled=0 entry but don't restart glusterd service.

3. bring up glusterd instance on 3.10.10 server

4. check 'gluster peer status' and 'gluster volume status' output on both the nodes

Comment 9 Marc 2018-02-14 08:27:21 UTC
(In reply to Atin Mukherjee from comment #8)
> Sorry to hear that the suggested work around didn't work :-/
> 
> Please try this out:
> 
> 1. bring down glusterd instance on 3.8.15 server
> 
> 2. from 3.10.10 server remove tier-enabled=0 entry but don't restart
> glusterd service.
> 
> 3. bring up glusterd instance on 3.10.10 server
> 
> 4. check 'gluster peer status' and 'gluster volume status' output on both
> the nodes

Hi Atin,

I didn't exactly understand the above steps. At step (1), do you mean stopping the 3.10.10 server instead of the 3.8.15 one? Otherwise, step (2) is already done (the tier-enabled=0 entry is removed), and as for step (3), I never stopped the 3.10.10 instance, so there is nothing to bring up. 'gluster peer status' will show 'Peer Rejected (Disconnected)' if I stop the 3.8.15 server and run the command from the 3.10.10 one.

Comment 10 Atin Mukherjee 2018-02-14 09:13:58 UTC
(In reply to Marc from comment #9)
> (In reply to Atin Mukherjee from comment #8)
> > Sorry to hear that the suggested work around didn't work :-/
> > 
> > Please try this out:
> > 
> > 1. bring down glusterd instance on 3.8.15 server
> > 
> > 2. from 3.10.10 server remove tier-enabled=0 entry but don't restart
> > glusterd service.
> > 
> > 3. bring up glusterd instance on 3.10.10 server
> > 
> > 4. check 'gluster peer status' and 'gluster volume status' output on both
> > the nodes
> 
> Hi Atin,
> 
> I didn't exactly understand the above steps, are you referring at step (1)
> to stop the 3.10.10 instead of 3.8.15 server? Otherwise (2) step is done
> already (tier-enabled=0 entry removed) and at step (3) ... i never stopped
> the 3.10.10 instance so that i could bring up. The 'gluster peer status'
> will show 'peer rejected( disconnected) if i stop the 3.8.15 server and i
> run the command from 3.10.10.

I am requesting you not to stop the glusterd instance running 3.10.10. Just take out the tier-enabled=0 entry from all the volume info files and then restart the glusterd instance on 3.8.15. Please let me know if anything is still unclear.
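In other words, something along these lines (the upstart service name is assumed from the Ubuntu packaging; the wildcard covers every volume's info file on the 3.10.10 node):

# on the 3.10.10 node: edit only, do NOT restart glusterd here
sed -i '/^tier-enabled=/d' /var/lib/glusterd/vols/*/info
# on each 3.8.15 node
sudo service glusterfs-server restart
# then verify from both sides
gluster peer status
gluster volume status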

Comment 11 Marc 2018-02-14 09:43:48 UTC
(In reply to Atin Mukherjee from comment #10)
> (In reply to Marc from comment #9)
> > (In reply to Atin Mukherjee from comment #8)
> > > Sorry to hear that the suggested work around didn't work :-/
> > > 
> > > Please try this out:
> > > 
> > > 1. bring down glusterd instance on 3.8.15 server
> > > 
> > > 2. from 3.10.10 server remove tier-enabled=0 entry but don't restart
> > > glusterd service.
> > > 
> > > 3. bring up glusterd instance on 3.10.10 server
> > > 
> > > 4. check 'gluster peer status' and 'gluster volume status' output on both
> > > the nodes
> > 
> > Hi Atin,
> > 
> > I didn't exactly understand the above steps, are you referring at step (1)
> > to stop the 3.10.10 instead of 3.8.15 server? Otherwise (2) step is done
> > already (tier-enabled=0 entry removed) and at step (3) ... i never stopped
> > the 3.10.10 instance so that i could bring up. The 'gluster peer status'
> > will show 'peer rejected( disconnected) if i stop the 3.8.15 server and i
> > run the command from 3.10.10.
> 
> I am requesting not to stop the glusterd instance running with 3.10.10. Just
> take out tier-enabled=0 entry from all the volume info files and then
> restart glusterd instance on 3.8.15. Please do let me know if you have any
> other confusions.

Ok, now I think I understood what you mean, but I believe I already tried that, with no improvement:

1. Restarted the only 3.10.10 server so that the "tier-enabled=0" line was regenerated.
2. Removed "tier-enabled=0" without restarting the GlusterFS service.
3. Restarted the GlusterFS service on all of the other 4 x 3.8.15 GlusterFS servers.

On the only 3.10.10 server:

root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume#  gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus21-ci-efood-real-de.openstac
klocal:/export_vdb                          49152     0          Y       26586
Self-heal Daemon on localhost               N/A       N/A        Y       5776

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks


root@2-gls-dus21-ci-efood-real-de:/var/lib/glusterd/vols/gluster_volume# gluster peer status
Number of Peers: 4

Hostname: 3-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 3d141235-9b93-4798-8e03-82a758216b0b
State: Peer Rejected (Connected)

Hostname: 1-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 7488286f-6bfa-46f8-bc50-9ee815e96c66
State: Peer Rejected (Connected)

Hostname: 1-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 00839049-2ade-48f8-b5f3-66db0e2b9377
State: Peer Rejected (Connected)

Hostname: 2-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 1617cd54-9b2a-439e-9aa6-30d4ecf303f8
State: Peer Rejected (Connected)

On the other 4 x 3.8.15 servers:


root@2-gls-dus10-ci-efood-real-de:/home/ubuntu# gluster volume status
Status of volume: gluster_volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49153     0          Y       30521
Brick 1-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49152     0          Y       1663
Brick 1-gls-dus21-ci-efood-real-de:/export_
vdb                                         49152     0          Y       2322
Brick 3-gls-dus10-ci-efood-real-de.openstac
k.local:/export_vdb                         49153     0          Y       10854
Self-heal Daemon on localhost               N/A       N/A        Y       31225
Self-heal Daemon on 3-gls-dus10-ci-efood-re
al-de.openstack.local                       N/A       N/A        Y       10567
Self-heal Daemon on 1-gls-dus21-ci-efood-re
al-de.openstacklocal                        N/A       N/A        Y       31437
Self-heal Daemon on 1-gls-dus10-ci-efood-re
al-de.openstack.local                       N/A       N/A        Y       15453

Task Status of Volume gluster_volume
------------------------------------------------------------------------------
There are no active volume tasks

root@2-gls-dus10-ci-efood-real-de:/home/ubuntu# gluster peer status
Number of Peers: 4

Hostname: 3-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 3d141235-9b93-4798-8e03-82a758216b0b
State: Peer in Cluster (Connected)

Hostname: 1-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 7488286f-6bfa-46f8-bc50-9ee815e96c66
State: Peer in Cluster (Connected)

Hostname: 1-gls-dus10-ci-efood-real-de.openstack.local
Uuid: 00839049-2ade-48f8-b5f3-66db0e2b9377
State: Peer in Cluster (Connected)

Hostname: 2-gls-dus21-ci-efood-real-de.openstacklocal
Uuid: 0c698b11-9078-441a-9e7f-442befeef7a9
State: Peer Rejected (Connected)


PS: I only removed the "tier-enabled=0" line from /var/lib/glusterd/vols/gluster_volume/info. If there is any other place I need to edit, please let me know.
PPS: This is not urgent for me; this is a test environment and I can live with a 4-node gluster until an official fix is released (unless you need me to test different scenarios). Can you tell me what the usual timeframe is before a fix like this gets released?

Thank you

Comment 12 Atin Mukherjee 2018-02-14 10:18:15 UTC
Bug-fix updates are released on a monthly basis. Since I have already posted a patch to the release-3.10 branch, you should be able to get the fix in the next update. For specific dates, Jiffin or Shyam can help you with that; needinfo set on one of them.

Comment 13 Jiffin 2018-02-14 10:21:37 UTC
3.12 was tagged yesterday with the fix. I will send the announcement by EOD.

Comment 14 Atin Mukherjee 2018-02-14 10:22:12 UTC
Jiffin - we're talking about release-3.10 here.

Comment 15 Jiffin 2018-02-14 10:27:31 UTC
(In reply to Atin Mukherjee from comment #14)
> Jiffin - we're talking about release-3.10 here.

3.10.x releases usually happen at the end of every month.

Comment 16 Worker Ant 2018-02-21 15:36:33 UTC
COMMIT: https://review.gluster.org/19553 committed in release-3.10 by "Atin Mukherjee" <amukherj> with a commit message- glusterd: fix tier-enabled flag op-version check

The tier-enabled flag in the volinfo structure was introduced in 3.10; however, writing this value to the glusterd store was guarded by a wrong op-version check, which results in a volume checksum failure during upgrades.

>Change-Id: I4330d0c4594eee19cba42e2cdf49a63f106627d4
>BUG: 1544600
>Signed-off-by: Atin Mukherjee <amukherj>

Change-Id: I4330d0c4594eee19cba42e2cdf49a63f106627d4
BUG: 1544461
Signed-off-by: hari gowtham <hgowtham>
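For anyone hitting this before the backported packages are out, the mismatch described in the commit message can be confirmed per node. A minimal sketch, assuming the default /var/lib/glusterd layout and the volume name from this report:

grep operating-version /var/lib/glusterd/glusterd.info        # e.g. operating-version=30800 while 3.8 peers remain
grep tier-enabled /var/lib/glusterd/vols/gluster_volume/info  # written by the unpatched 3.10 node despite the low op-version
cat /var/lib/glusterd/vols/gluster_volume/cksum               # the info= value that differs across peers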

Comment 17 Marc 2018-03-01 10:47:59 UTC
(In reply to Jiffin from comment #15)
> (In reply to Atin Mukherjee from comment #14)
> > Jiffin - we're talking about release-3.10 here.
> 
> 3.10.x release usually happens at end of every month

Hi Jiffin,

Any news about the release? According to your message, the release was supposed to happen by yesterday.

Comment 18 Jiffin 2018-03-02 04:17:13 UTC
(In reply to Marc from comment #17)
> (In reply to Jiffin from comment #15)
> > (In reply to Atin Mukherjee from comment #14)
> > > Jiffin - we're talking about release-3.10 here.
> > 
> > 3.10.x release usually happens at end of every month
> 
> Hi Jiffin,
> 
> Any news about the release, from your message yesterday the release supposed
> to happen.

Don't worry, the change is available in 3.10.11; it may just take a while to close this bug.

Comment 19 Shyamsundar 2018-03-10 12:37:56 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.11, please open a new bug report.

glusterfs-3.10.11 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000091.html
[2] https://www.gluster.org/pipermail/gluster-users/

