Bug 1365265 - Glusterd not operational due to snapshot conflicting with nfs-ganesha export file in "/var/lib/glusterd/snaps"
Summary: Glusterd not operational due to snapshot conflicting with nfs-ganesha export ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: ganesha-nfs
Version: 3.8.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1318000 1318591
Blocks: 1365797
 
Reported: 2016-08-08 18:06 UTC by Jiffin
Modified: 2016-08-12 09:48 UTC
CC: 8 users

Fixed In Version: glusterfs-3.8.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1318591
Clones: 1365797
Environment:
Last Closed: 2016-08-12 09:48:48 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Jiffin 2016-08-08 18:06:34 UTC
+++ This bug was initially created as a clone of Bug #1318591 +++

+++ This bug was initially created as a clone of Bug #1318000 +++

Description of problem:

Glusterd is not operational on one node in the cluster (lvp87): the nfs-ganesha export file "export.Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01.conf" ended up in the snapshot directory under "/var/lib/glusterd/snaps", where glusterd expects only volume directories, and the conflict prevents glusterd from starting.

Glusterd logs:

[2016-03-15 07:21:49.063861] E [MSGID: 101032] [store.c:435:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/snaps/Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01/export.Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01.conf/info. [Not a directory]
[2016-03-15 07:21:49.063869] D [MSGID: 0] [store.c:440:gf_store_handle_retrieve] 0-: Returning -1
[2016-03-15 07:21:49.063873] E [MSGID: 106200] [glusterd-store.c:2550:glusterd_store_update_volinfo] 0-management: volinfo handle is NULL
[2016-03-15 07:21:49.063878] E [MSGID: 106207] [glusterd-store.c:2848:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for export.Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01.conf volume
[2016-03-15 07:21:49.063883] D [MSGID: 0] [glusterd-utils.c:893:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2016-03-15 07:21:49.063888] D [MSGID: 0] [store.c:461:gf_store_handle_destroy] 0-: Returning 0
[2016-03-15 07:21:49.063897] D [MSGID: 0] [glusterd-utils.c:937:glusterd_volinfo_delete] 0-management: Returning 0
[2016-03-15 07:21:49.063902] E [MSGID: 106201] [glusterd-store.c:3046:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: export.Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01.conf
[2016-03-15 07:21:49.063927] D [MSGID: 0] [glusterd-store.c:3071:glusterd_store_retrieve_volumes] 0-management: Returning with -1
[2016-03-15 07:21:49.063937] E [MSGID: 106195] [glusterd-store.c:3439:glusterd_store_retrieve_snap] 0-management: Failed to retrieve snap volumes for snap Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01
[2016-03-15 07:21:49.063942] E [MSGID: 106043] [glusterd-store.c:3593:glusterd_store_retrieve_snaps] 0-management: Unable to restore snapshot: Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01
[2016-03-15 07:21:49.063948] D [MSGID: 0] [glusterd-store.c:3611:glusterd_store_retrieve_snaps] 0-management: Returning with -1
[2016-03-15 07:21:49.063953] D [MSGID: 0] [glusterd-store.c:4343:glusterd_restore] 0-management: Returning -1
[2016-03-15 07:21:49.063967] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2016-03-15 07:21:49.063973] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2016-03-15 07:21:49.063977] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2016-03-15 07:21:49.064238] D [logging.c:1764:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages
[2016-03-15 07:21:49.064249] D [logging.c:1767:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages
[2016-03-15 07:21:49.064417] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f0a7d57e2fd] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x126) [0x7f0a7d57e1a6] -->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7f0a7d57d789] ) 0-: received signum (0), shutting down
[2016-03-15 07:21:49.064435] D [glusterfsd-mgmt.c:2355:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given
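
The first log line shows the actual failure: glusterd treats every entry under the snap's directory as a volume directory and tries to read "<entry>/info" beneath it. The stray export .conf is a regular file, so the lookup fails with ENOTDIR, and the error propagates up through glusterd_store_retrieve_volumes and glusterd_restore until xlator_init fails and glusterd shuts down. A minimal sketch of that failure mode (the path below is a hypothetical stand-in, not from this cluster):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical stand-in: create /tmp/example.conf as a regular
         * file first, then try to open a path "through" it, as glusterd
         * does when it appends "/info" to a non-directory entry. */
        const char *path = "/tmp/example.conf/info";
        int fd = open(path, O_RDONLY);

        if (fd < 0) {
            /* errno is ENOTDIR, matching "[Not a directory]" in the log */
            fprintf(stderr, "Path corresponding to %s. [%s]\n",
                    path, strerror(errno));
            return 1;
        }
        close(fd);
        return 0;
    }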

Volume status:

Status of volume: certsd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_certsd/brick_certsd_lvp88             49160     0          Y       2005 
NFS Server on localhost                     2049      0          Y       3584 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume certsd
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/var/lib/glus
terd/ss_brick                               49161     0          Y       3564 
NFS Server on localhost                     2049      0          Y       3584 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv01-sknd3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv01-sknd3/brick_pv01-sknd3_lvp88     49154     0          Y       2006 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv01-sknd3
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv02-ddfe5
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv02-ddfe5/brick_pv02-ddfe5_lvp88     49155     0          Y       2016 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv02-ddfe5
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv03-ed6fc
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv03-ed6fc/brick_pv03-ed6fc_lvp88     49156     0          Y       2022 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv03-ed6fc
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv04-1fr6e
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv04-1fr6e/brick_pv04-1fr6e_lvp88     49157     0          Y       2028 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv04-1fr6e
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv05-ku56u
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv05-ku56u/brick_pv05-ku56u_lvp88     49158     0          Y       2030 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv05-ku56u
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: pv06-m6o8i
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_pv06-m6o8i/brick_pv06-m6o8i_lvp88     49159     0          Y       2045 
Self-heal Daemon on localhost               N/A       N/A        Y       3592 
 
Task Status of Volume pv06-m6o8i
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: registry
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sv-2000lvp88.paas.local:/bricks/thin_
brick_registry/brick_registry_lvp88         49153     0          Y       2052 
Self-heal Daemon on localhost               N/A       N/A        Y       3592

Volume info:

Volume Name: pv06-m6o8i
Type: Replicate
Volume ID: b9ecd956-a10d-427a-ad95-81f735c58050
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sv-2000lvp87.paas.local:/bricks/thin_brick_pv06-m6o8i/brick_pv06-m6o8i_lvp87
Brick2: sv-2000lvp88.paas.local:/bricks/thin_brick_pv06-m6o8i/brick_pv06-m6o8i_lvp88
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: true
snap-activate-on-create: enable
auto-delete: enable
cluster.enable-shared-storage: enable

Volume Name: certsd
Type: Replicate
Volume ID: 17839849-381f-4299-8088-a1e62765e09c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sv-2000lvp87.paas.local:/bricks/thin_brick_certsd/brick_certsd_lvp87
Brick2: sv-2000lvp88.paas.local:/bricks/thin_brick_certsd/brick_certsd_lvp88
Options Reconfigured:
performance.readdir-ahead: on
snap-activate-on-create: enable
auto-delete: enable
cluster.enable-shared-storage: enable
[root@sv-2000lvp88 ~]# gluster v info registry

Volume Name: registry
Type: Replicate
Volume ID: 1619f2f4-892d-48d2-9c7a-431a8b57a67e
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sv-2000lvp87.paas.local:/bricks/thin_brick_registry/brick_registry_lvp87
Brick2: sv-2000lvp88.paas.local:/bricks/thin_brick_registry/brick_registry_lvp88
Options Reconfigured:
features.barrier: disable
performance.readdir-ahead: on
nfs.disable: true
snap-activate-on-create: enable
auto-delete: enable
cluster.enable-shared-storage: enable
[root@sv-2000lvp88 ~]# gluster v info gluster_shared_storage

Volume Name: gluster_shared_storage
Type: Replicate
Volume ID: c160df0e-4472-47e0-80dd-e118b5dddc3f
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sv-2000lvp88.paas.local:/var/lib/glusterd/ss_brick
Brick2: sv-2000lvp87:/var/lib/glusterd/ss_brick
Options Reconfigured:
performance.readdir-ahead: on
snap-activate-on-create: enable
auto-delete: enable
cluster.enable-shared-storage: enable

Version-Release number of selected component (if applicable):

RHGS 3.1.2
RHEL 7.2

How reproducible:

Always

Steps to Reproduce:

* Stop glusterd on both nodes in the cluster

* Remove the nfs-ganesha package on both nodes, then remove the exported conf file "export.Scheduled-Job-registry-1-registry_GMT-2016.03.14-11.00.01.conf" from "/var/lib/glusterd/snaps"

* Start glusterd on both nodes

Actual results:

Due to the conflicting export file in the snapshot directory, glusterd is not operational.

Expected results:

There should be no conflict, and glusterd should be operational.

Additional info:

--- Additional comment from Mukul Malhotra on 2016-03-16 10:35:49 EDT ---

Hello,

Also, please improve the nfs-ganesha-related logging so that this failure shows up clearly in the logs and is easier to troubleshoot.

Thanks
Mukul

--- Additional comment from Vijay Bellur on 2016-03-17 10:02:15 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#1) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-17 10:02:19 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : Skip invalid entries in "vols" directory) posted (#1) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-18 05:36:22 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#2) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-18 06:37:47 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : Skip invalid entries in "vols" directory) posted (#2) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-24 06:45:01 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#3) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-24 06:45:05 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#3) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Mike McCune on 2016-03-28 18:50:27 EDT ---

This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

--- Additional comment from Vijay Bellur on 2016-03-29 07:11:14 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#4) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-29 07:11:18 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#4) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-30 02:59:09 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#5) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-03-30 02:59:12 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#5) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-06-23 05:00:28 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#6) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-06-23 13:17:20 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#6) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-07-08 07:54:24 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#7) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-07-08 08:11:16 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#7) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-07-08 08:18:21 EDT ---

REVIEW: http://review.gluster.org/13763 (snapshot : Copy the export configuration properly) posted (#8) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-08-05 05:05:27 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : read only directories inside vols) posted (#8) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-08-05 08:07:14 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : skip non directories inside /var/lib/glusterd/vols) posted (#9) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-08-08 02:48:55 EDT ---

REVIEW: http://review.gluster.org/13764 (glusterd : skip non directories inside /var/lib/glusterd/vols) posted (#10) for review on master by jiffin tony Thottan (jthottan)

--- Additional comment from Vijay Bellur on 2016-08-08 10:23:54 EDT ---

COMMIT: http://review.gluster.org/13764 committed in master by Atin Mukherjee (amukherj) 
------
commit 720b63c24b07ee64e1338db28de602b9abbef0a1
Author: Jiffin Tony Thottan <jthottan>
Date:   Thu Mar 17 18:53:13 2016 +0530

    glusterd : skip non directories inside /var/lib/glusterd/vols
    
    Right now glusterd won't come up if vols directory contains an invalid entry.
    Instead of doing that with this change a message will be logged and then skip
    that entry
    
    Change-Id: I665b5c35291b059cf054622da0eec4db44ec5f68
    BUG: 1318591
    Signed-off-by: Jiffin Tony Thottan <jthottan>
    Reviewed-on: http://review.gluster.org/13764
    Reviewed-by: Prashanth Pai <ppai>
    Reviewed-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
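
For illustration, a minimal sketch of the skip behavior the commit describes: lstat() each entry under the vols directory and log-and-skip anything that is not a directory, instead of aborting the restore. The function names (restore_all_volumes, restore_volume) are illustrative stand-ins, not glusterd's actual code:

    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>

    static int restore_all_volumes(const char *vols_dir)
    {
        DIR *dir = opendir(vols_dir);
        struct dirent *entry;
        struct stat st;
        char path[4096];

        if (!dir)
            return -1;

        while ((entry = readdir(dir)) != NULL) {
            if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, ".."))
                continue;

            snprintf(path, sizeof(path), "%s/%s", vols_dir, entry->d_name);
            if (lstat(path, &st) != 0 || !S_ISDIR(st.st_mode)) {
                /* Before the fix, a stray file here (e.g. an nfs-ganesha
                 * export .conf) made the whole restore fail and glusterd
                 * exit; now it is logged and skipped. */
                fprintf(stderr, "skipping invalid entry: %s\n", path);
                continue;
            }
            /* restore_volume(path): stand-in for the real per-volume
             * restore logic. */
        }

        closedir(dir);
        return 0;
    }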

Comment 1 Vijay Bellur 2016-08-08 18:08:02 UTC
REVIEW: http://review.gluster.org/15113 (glusterd : skip non directories inside /var/lib/glusterd/vols) posted (#1) for review on release-3.8 by jiffin tony Thottan (jthottan)

Comment 2 Vijay Bellur 2016-08-09 15:03:18 UTC
COMMIT: http://review.gluster.org/15113 committed in release-3.8 by Niels de Vos (ndevos) 
------
commit 32b5667f017e3206bd9d3b65074eecb7ddb84244
Author: Jiffin Tony Thottan <jthottan>
Date:   Thu Mar 17 18:53:13 2016 +0530

    glusterd : skip non directories inside /var/lib/glusterd/vols
    
    Right now glusterd won't come up if vols directory contains an invalid entry.
    Instead of doing that with this change a message will be logged and then skip
    that entry
    
    Backport details:
    >Change-Id: I665b5c35291b059cf054622da0eec4db44ec5f68
    >BUG: 1318591
    >Signed-off-by: Jiffin Tony Thottan <jthottan>
    >Reviewed-on: http://review.gluster.org/13764
    >Reviewed-by: Prashanth Pai <ppai>
    >Reviewed-by: Atin Mukherjee <amukherj>
    >Smoke: Gluster Build System <jenkins.org>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >NetBSD-regression: NetBSD Build System <jenkins.org>
    (cherry picked from commit 720b63c24b07ee64e1338db28de602b9abbef0a1)
    
    Change-Id: I665b5c35291b059cf054622da0eec4db44ec5f68
    BUG: 1365265
    Signed-off-by: Jiffin Tony Thottan <jthottan>
    Reviewed-on: http://review.gluster.org/15113
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>

Comment 3 Niels de Vos 2016-08-12 09:48:48 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.2, please open a new bug report.

glusterfs-3.8.2 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/announce/2016-August/000058.html
[2] https://www.gluster.org/pipermail/gluster-users/

