Bug 1119582

Summary: glusterd does not start if older volume exists
Product: [Community] GlusterFS
Component: core
Version: mainline
Hardware: Unspecified
OS: Unspecified
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Reporter: Nithya Balachandran <nbalacha>
Assignee: Raghavendra Bhat <rabhat>
CC: bugs, gluster-bugs, hamiller
Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Type: Bug
Clones: 1157975 (view as bug list)
Bug Blocks: 1157975
Last Closed: 2014-11-11 08:37:05 UTC

Description Nithya Balachandran 2014-07-15 06:13:37 UTC
Description of problem:
glusterd built from the latest master code base does not start if the node holds a volume that was created with an older gluster build.


Version-Release number of selected component (if applicable):


How reproducible:
Consistently

Steps to Reproduce:
1. Check out the code from git (I was using commit d3f0de90d0c5166e63f5764d2f21703fd29ce976), build and install gluster (make install).
2. Create a volume and perform some file operations.
3. Update to the latest glusterfs master code as of July 14 2014, build and install gluster on the nodes.
4. Start glusterd.

Actual results:
Glusterd refuses to start

Expected results:
Glusterd should start


Additional info:

The following log messages are seen:
Jul 14 10:41:29 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [glusterd.c:1215:init] 0-management: Using /var/lib/glusterd as working directory
Jul 14 10:41:29 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [rdma.c:4194:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
Jul 14 10:41:29 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [rdma.c:4482:init] 0-rdma.management: Failed to initialize IB Device
Jul 14 10:41:29 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
Jul 14 10:41:29 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [rpcsvc.c:1524:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [glusterd-store.c:2020:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600 
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/holmes/snapd.info, returned error: (No such file or directory)
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [glusterd-store.c:2124:glusterd_store_retrieve_snapd] 0-management: volinfo handle is NULL
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [glusterd-store.c:3009:glusterd_store_retrieve_volumes] 0-: Unable to restore volume: holmes
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [xlator.c:425:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed 
Jul 14 10:41:34 gluster-srv2 usr-local-etc-glusterfs-glusterd.vol[30355]: [graph.c:525:glusterfs_graph_activate] 0-graph: init failed

I reverted to the earlier gluster build, deleted the volume, and then installed the build from the latest gluster code; glusterd then started without issues.

Comment 1 Anand Avati 2014-07-15 10:38:57 UTC
REVIEW: http://review.gluster.org/8310 (mgmt/glusterd: do not check for snapd handle in restore if uss is disabled) posted (#1) for review on master by Raghavendra Bhat (raghavendra)

Comment 2 Anand Avati 2014-07-16 06:38:47 UTC
COMMIT: http://review.gluster.org/8310 committed in master by Kaushal M (kaushal) 
------
commit dcc1696045f12127ff37e6312a04c0024c8a4e24
Author: Raghavendra Bhat <raghavendra>
Date:   Tue Jul 15 15:55:34 2014 +0530

    mgmt/glusterd: do not check for snapd handle in restore if uss is disabled
    
    Change-Id: I01afe64685a5794cce9265580c6c5de57a045201
    BUG: 1119582
    Signed-off-by: Raghavendra Bhat <raghavendra>
    Reviewed-on: http://review.gluster.org/8310
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>
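
The change itself is not quoted in the comment, but its effect can be sketched in simplified form (the struct and function below are illustrative stand-ins, not the actual glusterd_store_retrieve_snapd code): when user-serviceable snapshots (features.uss) are disabled, a missing snapd.info is treated as normal for a pre-3.6 volume rather than as a fatal restore error.

```c
/* Hypothetical, simplified stand-ins for glusterd's volinfo and store
 * handle; the real structures live in glusterd's C sources. */
struct volinfo {
    int uss_enabled;        /* state of the features.uss volume option */
    int snapd_info_present; /* whether vols/<name>/snapd.info exists */
};

/* Sketch of the fixed restore logic: only insist on snapd.info when
 * uss is actually enabled for the volume. Returns 0 on success, -1 on
 * failure (which, in glusterd, aborts the volume restore and init). */
int restore_snapd(const struct volinfo *vol)
{
    if (!vol->uss_enabled)
        return 0; /* pre-3.6 volume, nothing to restore: not an error */

    if (!vol->snapd_info_present)
        return -1; /* uss is on, so a missing handle is a real problem */

    /* ... read the snapd fields from snapd.info ... */
    return 0;
}
```

Before the fix, the restore path behaved as if the first check were absent, so every volume created before snapd.info was introduced caused glusterd_store_retrieve_volumes, and therefore glusterd's init, to fail.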

Comment 3 Niels de Vos 2014-09-22 12:44:53 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 4 Niels de Vos 2014-11-11 08:37:05 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

Comment 5 Harold Miller 2015-02-04 16:39:24 UTC
I see this BZ was closed, but I am seeing the following in a current customer case.
All nodes upgraded from 3.0u1 to 3.0u3 are facing the same symptoms.

service glusterd start FAILS.

Here is the interesting part from the debug-mode log:

[2015-01-27 13:55:13.119036] D [store.c:608:gf_store_iter_get_next] 0-: Returning with 0
[2015-01-27 13:55:13.119043] D [store.c:608:gf_store_iter_get_next] 0-: Returning with 0
[2015-01-27 13:55:13.119051] D [store.c:608:gf_store_iter_get_next] 0-: Returning with -1
[2015-01-27 13:55:13.119066] D [glusterd-store.c:2414:glusterd_store_retrieve_bricks] 0-: Returning with 0
[2015-01-27 13:55:13.119086] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/user_home/snapd.info, returned error: (No such file or directory)
[2015-01-27 13:55:13.119093] D [store.c:437:gf_store_handle_retrieve] 0-: Returning -1
[2015-01-27 13:55:13.119099] E [glusterd-store.c:2198:glusterd_store_retrieve_snapd] 0-management: volinfo handle is NULL
[2015-01-27 13:55:13.119109] D [glusterd-utils.c:957:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2015-01-27 13:55:13.119117] D [store.c:458:gf_store_handle_destroy] 0-: Returning 0
[2015-01-27 13:55:13.119123] D [glusterd-utils.c:1001:glusterd_volinfo_delete] 0-management: Returning 0
[2015-01-27 13:55:13.119128] E [glusterd-store.c:3082:glusterd_store_retrieve_volumes] 0-: Unable to restore volume: user_home
[2015-01-27 13:55:13.119136] D [glusterd-store.c:3114:glusterd_store_retrieve_volumes] 0-: Returning with -1
[2015-01-27 13:55:13.119142] D [glusterd-store.c:4308:glusterd_restore] 0-: Returning -1
[2015-01-27 13:55:13.119152] E [xlator.c:406:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2015-01-27 13:55:13.119159] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2015-01-27 13:55:13.119165] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2015-01-27 13:55:13.119644] D [logging.c:1805:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 3 extra log messages
[2015-01-27 13:55:13.119662] D [logging.c:1808:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 3 extra log messages
[2015-01-27 13:55:13.119809] W [glusterfsd.c:1182:cleanup_and_exit] (--> 0-: received signum (0), shutting down
[2015-01-27 13:55:13.119827] D [glusterfsd-mgmt.c:2244:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given

The snapd.info file is indeed missing.
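
The missing file matches the failure path in the log: gf_store_handle_retrieve looks up vols/<volname>/snapd.info under the glusterd working directory and fails the restore when it is absent. A minimal stand-alone check in the same spirit (the helper name and the fixed path layout are illustrative, not glusterd code):

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: report whether a volume's snapd.info handle
 * exists, mirroring the stat that the store handle retrieval performs.
 * Returns 1 if the file is present, 0 otherwise. */
int snapd_info_exists(const char *workdir, const char *volname)
{
    char path[4096];
    struct stat st;

    snprintf(path, sizeof(path), "%s/vols/%s/snapd.info", workdir, volname);
    return stat(path, &st) == 0;
}
```

On the node from this case, the equivalent check against /var/lib/glusterd/vols/user_home would return 0, which is exactly what the E-level gf_store_handle_retrieve message reports.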