Bug 1420993 - Modified volume options not synced once offline nodes come up.
Summary: Modified volume options not synced once offline nodes come up.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1420637
Blocks: 1420635 1420991
 
Reported: 2017-02-10 05:05 UTC by Atin Mukherjee
Modified: 2017-03-18 10:52 UTC (History)
CC List: 5 users

Fixed In Version: glusterfs-3.8.10
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1420637
Environment:
Last Closed: 2017-03-18 10:52:09 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Atin Mukherjee 2017-02-10 05:05:22 UTC
+++ This bug was initially created as a clone of Bug #1420637 +++

+++ This bug was initially created as a clone of Bug #1420635 +++

Description of problem:
=======================
Modifications made to the volume while some cluster nodes are down are not synced once the offline nodes come up.



Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.8.4-14

How reproducible:
=================
Always


Steps to Reproduce:
====================
1. Have a 3-node cluster.
2. Create and start a distributed volume using 3 bricks (pick one from each node).
3. Stop glusterd on two nodes (say n2 and n3).
4. Change these volume options from their defaults:
performance.readdir-ahead from on to off
cluster.server-quorum-ratio from the default value to 30
5. Now start glusterd on nodes n2 and n3.
6. Check the volume info on both nodes and verify that the modified volume options are synced.

Actual results:
===============
Modified volume options are not synced once the offline nodes come up.


Expected results:
=================
The sync should happen once the nodes come up.


Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-02-09 01:49:18 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Byreddy on 2017-02-09 01:52:24 EST ---

errors in glusterd log:
=======================
[2017-02-09 06:40:29.199737] E [MSGID: 106422] [glusterd-utils.c:4357:glusterd_compare_friend_data] 0-management: Importing global options failed
[2017-02-09 06:40:29.199775] E [MSGID: 106376] [glusterd-sm.c:1397:glusterd_friend_sm] 0-glusterd: handler returned: 2
[2017-02-09 06:40:29.199926] I [MSGID: 106493] [glusterd-rpc-ops.c:478:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18, host: dhcp41-198.lab.eng.blr.redhat.com, port: 0
[2017-02-09 06:40:29.238089] I [MSGID: 106492] [glusterd-handler.c:2788:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18
[2017-02-09 06:40:29.238127] I [MSGID: 106502] [glusterd-handler.c:2833:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-02-09 06:40:29.270561] I [MSGID: 106493] [glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18
[2017-02-09 06:40:29.270981] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-02-09 06:40:29.271042] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-02-09 06:40:29.271475] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2017-02-09 06:40:29.271522] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-02-09 06:40:29.271591] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2017-02-09 06:40:29.271715] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: quotad service is stopped
[2017-02-09 06:40:29.271807] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-02-09 06:40:29.271841] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-02-09 06:40:29.271901] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-02-09 06:40:29.271947] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-02-09 06:40:29.272106] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-02-09 06:40:30.976089] I [MSGID: 106488] [glusterd-handler.c:1539:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2017-02-09 06:40:30.977864] I [MSGID: 106488] [glusterd-handler.c:1539:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2017-02-09 06:40:44.641763] I [MSGID: 106163] [glusterd-handshake.c:1274:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30901
[2017-02-09 06:40:44.723849] I [MSGID: 106490] [glusterd-handler.c:2610:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: c744d8ef-71ba-4429-9243-0456d2654824
[2017-02-09 06:40:44.764377] I [MSGID: 106493] [glusterd-handler.c:3865:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.43.71 (0), ret: 0, op_ret: 0
[2017-02-09 06:40:44.916543] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-02-09 06:40:44.916586] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-02-09 06:40:44.916926] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2017-02-09 06:40:44.916951] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-02-09 06:40:44.916985] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2017-02-09 06:40:44.917006] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: quotad service is stopped
[2017-02-09 06:40:44.917041] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-02-09 06:40:44.917067] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-02-09 06:40:44.917133] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-02-09 06:40:44.917161] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-02-09 06:40:44.924636] I [MSGID: 106492] [glusterd-handler.c:2788:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: c744d8ef-71ba-4429-9243-0456d2654824
[2017-02-09 06:40:44.941841] I [MSGID: 106502] [glusterd-handler.c:2833:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-02-09 06:50:54.497245] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x4 sent = 2017-02-09 06:40:44.860661. timeout = 600 for 10.70.43.71:24007

--- Additional comment from Worker Ant on 2017-02-09 02:33:00 EST ---

REVIEW: https://review.gluster.org/16574 (glusterd: ignore return code of glusterd_restart_bricks) posted (#1) for review on master by Atin Mukherjee (amukherj@redhat.com)

--- Additional comment from Worker Ant on 2017-02-09 11:46:03 EST ---

COMMIT: https://review.gluster.org/16574 committed in master by Atin Mukherjee (amukherj@redhat.com) 
------
commit 55625293093d485623f3f3d98687cd1e2c594460
Author: Atin Mukherjee <amukherj@redhat.com>
Date:   Thu Feb 9 12:56:38 2017 +0530

    glusterd: ignore return code of glusterd_restart_bricks
    
    When GlusterD is restarted on a multi node cluster, while syncing the
    global options from other GlusterD, it checks for quorum and based on
    which it decides whether to stop/start a brick. However we handle the
    return code of this function in which case if we don't want to start any
    bricks the ret will be non zero and we will end up failing the import
    which is incorrect.
    
    Fix is just to ignore the ret code of glusterd_restart_bricks ()
    
    Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553
    BUG: 1420637
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-on: https://review.gluster.org/16574
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

Comment 1 Worker Ant 2017-02-10 05:06:04 UTC
REVIEW: https://review.gluster.org/16594 (glusterd: ignore return code of glusterd_restart_bricks) posted (#1) for review on release-3.8 by Atin Mukherjee (amukherj@redhat.com)

Comment 2 Worker Ant 2017-02-21 03:33:58 UTC
COMMIT: https://review.gluster.org/16594 committed in release-3.8 by Atin Mukherjee (amukherj@redhat.com) 
------
commit dae553d2538cf465edfff567909b846025762a3e
Author: Atin Mukherjee <amukherj@redhat.com>
Date:   Thu Feb 9 12:56:38 2017 +0530

    glusterd: ignore return code of glusterd_restart_bricks
    
    When GlusterD is restarted on a multi node cluster, while syncing the
    global options from other GlusterD, it checks for quorum and based on
    which it decides whether to stop/start a brick. However we handle the
    return code of this function in which case if we don't want to start any
    bricks the ret will be non zero and we will end up failing the import
    which is incorrect.
    
    Fix is just to ignore the ret code of glusterd_restart_bricks ()
    
    >Reviewed-on: https://review.gluster.org/16574
    >Smoke: Gluster Build System <jenkins@build.gluster.org>
    >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    >Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    >Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    >(cherry picked from commit 55625293093d485623f3f3d98687cd1e2c594460)
    
    Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553
    BUG: 1420993
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-on: https://review.gluster.org/16594
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

Comment 3 Niels de Vos 2017-03-18 10:52:09 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.10, please open a new bug report.

glusterfs-3.8.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-March/000068.html
[2] https://www.gluster.org/pipermail/gluster-users/

