| Summary: | glusterd crashed while trying to restore volumes | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Raghavendra G <raghavendra> | ||||
| Component: | glusterd | Assignee: | Pranith Kumar K <pkarampu> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |||||
| Severity: | high | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | mainline | CC: | gluster-bugs, rabhat, vijay | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | --- | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
Description
Raghavendra G
2010-11-09 03:53:44 UTC
It crashed while trying to restore a brick named ":". Below are the contents of the file:
raghu@booradley:/etc/glusterd/vols/local/bricks$ cat /etc/glusterd/vols/local/bricks/:
hostname=
path=
listen-port=0
hostname=
path=
listen-port=0
I've attached .cmd_log_history.
Below is the backtrace:
(gdb) bt
#0 0xb7d65490 in strncpy () from /lib/libc.so.6
#1 0xb6945fc6 in glusterd_store_retrieve_bricks (volinfo=0x8084c68)
at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:961
#2 0xb6946760 in glusterd_store_retrieve_volume (volname=0x807aebb "local")
at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1108
#3 0xb6946a13 in glusterd_store_retrieve_volumes (this=0x8076808)
at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1153
#4 0xb6947dfd in glusterd_restore () at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1536
#5 0xb690f705 in init (this=0x8076808) at ../../../../../xlators/mgmt/glusterd/src/glusterd.c:404
#6 0xb7e994fa in __xlator_init (xl=0x8076808) at ../../../libglusterfs/src/xlator.c:875
#7 0xb7e9960a in xlator_init (xl=0x8076808) at ../../../libglusterfs/src/xlator.c:903
#8 0xb7ec67b9 in glusterfs_graph_init (graph=0x80725e0) at ../../../libglusterfs/src/graph.c:328
#9 0xb7ec6cb3 in glusterfs_graph_activate (graph=0x80725e0, ctx=0x8071008)
at ../../../libglusterfs/src/graph.c:491
#10 0x0804d07f in glusterfs_process_volfp (ctx=0x8071008, fp=0x80723c8)
at ../../../glusterfsd/src/glusterfsd.c:1316
#11 0x0804d1ab in glusterfs_volumes_init (ctx=0x8071008) at ../../../glusterfsd/src/glusterfsd.c:1362
#12 0x0804d2ad in main (argc=2, argv=0xbfab3464) at ../../../glusterfsd/src/glusterfsd.c:1407
(gdb) f 1
#1 0xb6945fc6 in glusterd_store_retrieve_bricks (volinfo=0x8084c68)
at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:961
961 strncpy (brickinfo->hostname, value, 1024);
(gdb) p value
$16 = 0x0
(gdb) p key
$17 = 0x8086718 "hostname"
raghu@booradley:~/work/gluster.org/git/current/glusterfs.git/build$ cat /etc/hosts
#
# hosts This file describes a number of hostname-to-address
# mappings for the TCP/IP subsystem. It is mostly
# used at boot time, when no name servers are running.
# On small systems, this file can be used instead of a
# "named" name server. Just add the names, addresses
# and any aliases to this file...
#
# By the way, Arnt Gulbrandsen <agulbra.no> says that 127.0.0.1
# should NEVER be named with the name of the machine. It causes problems
# for some (stupid) programs, irc and reputedly talk. :^)
#
# For loopbacking.
127.0.0.1 localhost
127.0.0.1 booradley
#192.168.1.13 #booradley.zillionresearch.com booradley
192.168.1.201 n1
192.168.1.202 n2
192.168.1.203 n3
192.168.1.204 n4
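For illustration, here is a minimal C sketch of the kind of guard the backtrace points to: check the parsed value before handing it to strncpy, and report an error instead of crashing. It assumes the store lines are split with strtok-style tokenizing, which returns NULL for an entry such as "hostname=" with nothing after the "="; the actual parsing in glusterd-store.c may differ. The names copy_store_value and HOSTNAME_MAX are hypothetical, not glusterd identifiers, and this is not the code from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical constant; mirrors the 1024 passed to strncpy in frame #1. */
#define HOSTNAME_MAX 1024

/*
 * Hypothetical helper: split a "key=value" line in place and copy the value
 * into dst. Returns 0 on success, -1 if the key does not match or the value
 * is missing. The line buffer is modified by strtok.
 */
static int
copy_store_value (char *line, const char *want_key, char *dst, size_t dst_len)
{
        char *key = NULL;
        char *value = NULL;

        if (!line || !want_key || !dst || dst_len == 0)
                return -1;

        key = strtok (line, "=");
        /* For "hostname=" nothing follows the '=', so strtok returns NULL. */
        value = strtok (NULL, "\n");

        if (!key || strcmp (key, want_key) != 0)
                return -1;

        if (!value) {
                /* Report the bad entry instead of passing NULL to strncpy. */
                fprintf (stderr, "store entry '%s' has no value\n", key);
                return -1;
        }

        /* strncpy does not terminate on truncation, so terminate explicitly. */
        strncpy (dst, value, dst_len - 1);
        dst[dst_len - 1] = '\0';
        return 0;
}
```

Called on a writable copy of "hostname=booradley" the helper returns 0 and fills the buffer; called on "hostname=" it returns -1, and the caller can log the failure and stop restoring the volume.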

PATCH: http://patches.gluster.com/patch/6224 in master (mgmt/glusterd: In store-retrieve exit with error message instead of crashing.)

(In reply to comment #2)
> PATCH: http://patches.gluster.com/patch/6224 in master (mgmt/glusterd: In
> store-retrieve exit with error message instead of crashing.)
An intermediate fix that handled this crash already went in with the fix for 2271. I made it a little more robust: if any of the entries in any of the stores, or the files themselves, are missing, glusterd should print the error and exit.

Probed a peer, stopped glusterd, and then removed the other peer's entry from the /etc/glusterd/peers directory. Then started glusterd again. It logs the error message.
[2011-03-11 16:35:13.165866] D [glusterd-store.c:1610:glusterd_store_retrieve_peers] 0-: Returning with 0
[2011-03-11 16:35:13.165882] D [glusterd-store.c:1640:glusterd_resolve_all_bricks] 0-: Returning with 0
[2011-03-11 16:35:13.165896] D [glusterd-store.c:1667:glusterd_restore] 0-: Returning 0
Given volfile:
+------------------------------------------------------------------------------+
1: volume management
2: type mgmt/glusterd
3: option working-directory /etc/glusterd
4: option transport-type socket,rdma
5: option transport.socket.keepalive-time 10
6: option transport.socket.keepalive-interval 2
7: end-volume
8:
+------------------------------------------------------------------------------+
[2011-03-11 16:35:13.214601] I [glusterd-handler.c:2611:glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: eaae880d-fa3d-4ba9-a53d-417323598df0
[2011-03-11 16:35:13.214682] I [glusterd-handler.c:379:glusterd_friend_find] 0-glusterd: Unable to find peer by uuid
[2011-03-11 16:35:13.248038] I [glusterd-handler.c:391:glusterd_friend_find] 0-glusterd: Unable to find hostname: 192.168.1.104
[2011-03-11 16:35:13.248298] I [glusterd-handler.c:3267:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 192.168.1.104 (24007), ret: 0