Bug 763652 - (GLUSTER-1920) Start and stop glusterd fails to start a previously created volume
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.1.0
Hardware: All Linux
Priority: urgent
Severity: medium
Assigned To: Pranith Kumar K
Reported: 2010-10-11 19:50 EDT by Harshavardhana
Modified: 2015-03-22 21:03 EDT
Doc Type: Bug Fix

Attachments: None
Description Harshavardhana 2010-10-11 19:50:55 EDT
[2010-10-11 16:45:27.311338] I [glusterd.c:274:init] management: Using /etc/glusterd as working directory
[2010-10-11 16:45:27.312687] C [rdma.c:3817:rdma_init] rpc-transport/rdma: No IB devices found
[2010-10-11 16:45:27.312714] E [rdma.c:4744:init] rdma.management: Failed to initialize IB Device
[2010-10-11 16:45:27.312730] E [rpc-transport.c:965:rpc_transport_load] rpc-transport: 'rdma' initialization failed
[2010-10-11 16:45:27.312824] I [glusterd.c:86:glusterd_uuid_init] glusterd: retrieved UUID: 1bc70cba-23a1-4565-9ef0-309360442b44
[2010-10-11 16:45:27.312927] E [glusterd-store.c:1092:glusterd_store_retrieve_volume] : Unknown key: brick-0
[2010-10-11 16:45:27.317269] E [glusterd-utils.c:2137:glusterd_friend_find_by_hostname] : error in getaddrinfo: Name or service not known

[2010-10-11 16:45:27.317630] E [glusterd-utils.c:113:glusterd_is_local_addr] : error in getaddrinfo: Name or service not known

[2010-10-11 16:45:27.317650] E [glusterd-store.c:1516:glusterd_resolve_all_bricks] glusterd: resolve brick failed in restore
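
For triage, the log lines above can be split into fields and filtered down to the E (error) and C (critical) entries. The following is only a rough sketch: the pattern is inferred from the lines quoted in this report, not from any GlusterFS documentation, and the names `LOG_LINE`, `parse_log_line` and `errors_only` are made up for illustration.

```python
import re

# Layout inferred from the log lines quoted in this report (an assumption):
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>.*?):\s(?P<message>.*)$"
)


def parse_log_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    record["domain"] = record["domain"].strip()
    return record


def errors_only(lines):
    """Keep only the records logged at level E or C; skip unparseable lines."""
    records = (parse_log_line(line) for line in lines)
    return [r for r in records if r is not None and r["level"] in ("E", "C")]
```

Blank lines, such as the ones between the getaddrinfo errors above, simply fail to match and are skipped.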


[root@platform test]# gluster volume start test
Starting volume test has been unsuccessful

[root@platform test]# ps -ef | grep glusterfs
root      7414  9724  0 16:46 pts/2    00:00:00 grep glusterfs
[root@platform test]

[root@platform test]# gluster volume info

Volume Name: test
Type: Distribute
Status: Stopped
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: :

Brick1 shows up empty (NULL) in the output above. I am not sure why, since the volume directory on disk does contain the brick entries:

[root@platform test]# ls -l /etc/glusterd/vols/test/
total 24
drwxr-xr-x 2 root root 4096 2010-10-11 16:43 bricks
-rw-r--r-- 1 root root   16 2010-10-11 16:45 cksum
-rw-r--r-- 1 root root  139 2010-10-11 16:43 info
drwxr-xr-x 2 root root 4096 2010-10-11 16:39 run
-rw-r--r-- 1 root root  620 2010-10-11 16:33 test.10.1.10.202.storage.vol
-rw-r--r-- 1 root root  628 2010-10-11 16:33 test-fuse.vol
[root@platform test]# less /etc/glusterd/vols/test/info 
type=0
count=1
status=2
sub_count=0
version=1
transport-type=0
volume-id=0d30ab75-0500-4468-800c-c95b99a8b25c
brick-0=10.1.10.202:-storage
[root@platform test]# less /etc/glusterd/vols/test/cksum 
info=2332094601
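
As a quick sanity check, the `info` file shown above can be read as plain key=value pairs to confirm that the brick entry really is present on disk and that `count` agrees with the number of `brick-N` keys. This is a minimal sketch with hypothetical helper names (`parse_store`, `brick_entries`, `check_brick_count`, `split_brick`); it is not how glusterd itself reads its store, and it makes no assumption about what the part after the colon in the brick value means.

```python
import re

# Contents of /etc/glusterd/vols/test/info, copied from this report.
INFO = """\
type=0
count=1
status=2
sub_count=0
version=1
transport-type=0
volume-id=0d30ab75-0500-4468-800c-c95b99a8b25c
brick-0=10.1.10.202:-storage
"""


def parse_store(text):
    """Parse key=value lines into a dict, skipping blank lines."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        entries[key] = value
    return entries


def brick_entries(entries):
    """Return the values of all brick-N keys, ordered by N."""
    bricks = []
    for key, value in entries.items():
        match = re.fullmatch(r"brick-(\d+)", key)
        if match:
            bricks.append((int(match.group(1)), value))
    return [value for _, value in sorted(bricks)]


def check_brick_count(entries):
    """True when 'count' matches the number of brick-N keys found."""
    return int(entries["count"]) == len(brick_entries(entries))


def split_brick(value):
    """Split a brick value at the first colon into (host, remainder)."""
    host, _, rest = value.partition(":")
    return host, rest


if __name__ == "__main__":
    entries = parse_store(INFO)
    print(brick_entries(entries))
    print(check_brick_count(entries))
```

If this check passes while `gluster volume info` still prints an empty brick, the entry is being lost somewhere between the store and the in-memory volume rather than being absent from the `info` file.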

PS: this happened after the segfault.
Comment 1 Pranith Kumar K 2011-03-09 23:02:20 EST
This bug is the result of glusterd not raising an error when the glusterd-store data it retrieves for a brick is corrupted. Ideally, if there are errors in retrieving the glusterd-store, glusterd should exit and report the error; that was done as part of 2066. The corruption itself was caused by bugs in glusterd-store, which were fixed as part of 1754. Closing this bug.
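
To illustrate the difference between ignoring and raising store errors, here is a small sketch. Everything in it is hypothetical: `StoreError`, `restore_bricks` and the `hostname`/`path` record keys are invented for the example and do not correspond to glusterd's actual C code. In lenient mode a brick record with missing fields is kept as an empty brick, which is one way a listing like `Brick1: :` could come about; in strict mode the same record aborts the restore with an error.

```python
class StoreError(Exception):
    """Raised when a brick record read from the store is unusable."""


def restore_bricks(brick_records, strict=True):
    """Turn brick records (dicts) into (hostname, path) tuples.

    strict=True  -> raise StoreError if any record lacks hostname or path.
    strict=False -> keep going and carry the incomplete brick along.
    """
    restored = []
    problems = []
    for index, record in enumerate(brick_records):
        host = record.get("hostname", "")
        path = record.get("path", "")
        if not host or not path:
            problems.append(f"brick {index}: missing hostname or path")
            if strict:
                continue
        restored.append((host, path))
    if strict and problems:
        raise StoreError("; ".join(problems))
    return restored


def format_bricks(bricks):
    """Render bricks as 'BrickN: host:path' lines."""
    return [f"Brick{i}: {host}:{path}" for i, (host, path) in enumerate(bricks, 1)]


if __name__ == "__main__":
    corrupted = [{}]  # a record whose fields could not be read
    print(format_bricks(restore_bricks(corrupted, strict=False)))
    try:
        restore_bricks(corrupted, strict=True)
    except StoreError as err:
        print(f"refusing to start: {err}")
```

Failing early in this way surfaces the corruption at startup instead of leaving a volume that exists but cannot be started.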
