Bug 763798 - (GLUSTER-2066) glusterd crashed while trying to restore volumes
Product: GlusterFS
Classification: Community
Component: glusterd
Hardware: All Linux
Priority: low Severity: high
Assigned To: Pranith Kumar K
Reported: 2010-11-09 01:52 EST by Raghavendra G
Modified: 2015-12-01 11:45 EST
Doc Type: Bug Fix

Attachments
cmd_log_history (4.05 KB, application/octet-stream)
2010-11-08 22:53 EST, Raghavendra G

Description Raghavendra G 2010-11-08 22:53:44 EST
Created attachment 373
Comment 1 Raghavendra G 2010-11-09 01:52:42 EST
glusterd crashed while trying to restore a brick named ":". Below are the contents of the file:

raghu@booradley:/etc/glusterd/vols/local/bricks$ cat /etc/glusterd/vols/local/bricks/:

I've attached .cmd_log_history.

Below is the backtrace:
(gdb) bt
#0  0xb7d65490 in strncpy () from /lib/libc.so.6
#1  0xb6945fc6 in glusterd_store_retrieve_bricks (volinfo=0x8084c68)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:961
#2  0xb6946760 in glusterd_store_retrieve_volume (volname=0x807aebb "local")
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1108
#3  0xb6946a13 in glusterd_store_retrieve_volumes (this=0x8076808)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1153
#4  0xb6947dfd in glusterd_restore () at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:1536
#5  0xb690f705 in init (this=0x8076808) at ../../../../../xlators/mgmt/glusterd/src/glusterd.c:404
#6  0xb7e994fa in __xlator_init (xl=0x8076808) at ../../../libglusterfs/src/xlator.c:875
#7  0xb7e9960a in xlator_init (xl=0x8076808) at ../../../libglusterfs/src/xlator.c:903
#8  0xb7ec67b9 in glusterfs_graph_init (graph=0x80725e0) at ../../../libglusterfs/src/graph.c:328
#9  0xb7ec6cb3 in glusterfs_graph_activate (graph=0x80725e0, ctx=0x8071008)
    at ../../../libglusterfs/src/graph.c:491
#10 0x0804d07f in glusterfs_process_volfp (ctx=0x8071008, fp=0x80723c8)
    at ../../../glusterfsd/src/glusterfsd.c:1316
#11 0x0804d1ab in glusterfs_volumes_init (ctx=0x8071008) at ../../../glusterfsd/src/glusterfsd.c:1362
#12 0x0804d2ad in main (argc=2, argv=0xbfab3464) at ../../../glusterfsd/src/glusterfsd.c:1407
(gdb) f 1
#1  0xb6945fc6 in glusterd_store_retrieve_bricks (volinfo=0x8084c68)
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-store.c:961
961                                     strncpy (brickinfo->hostname, value, 1024);
(gdb) p value
$16 = 0x0
(gdb) p key
$17 = 0x8086718 "hostname"
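
Frame 1 shows strncpy() being called with value == NULL for the key "hostname". Below is a minimal sketch of the kind of guard that would avoid the crash. It is not the actual patch: struct brick_sketch and brick_set_hostname() are hypothetical names used only for illustration, and only the 1024-byte size is taken from the strncpy call above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HOSTNAME_MAX 1024  /* same size as passed to strncpy in frame 1 */

/* Hypothetical stand-in for the real brickinfo structure. */
struct brick_sketch {
        char hostname[HOSTNAME_MAX];
};

/*
 * Copy a value read from the store into the hostname field.
 * Returns 0 on success, -1 if the store yielded no value for the key.
 */
static int
brick_set_hostname (struct brick_sketch *brick, const char *key,
                    const char *value)
{
        assert (brick != NULL);

        if (value == NULL) {
                /* Report the problem instead of dereferencing NULL. */
                fprintf (stderr, "store: key '%s' has no value\n",
                         key ? key : "(null)");
                return -1;
        }

        /* strncpy does not terminate on truncation, so do it explicitly. */
        strncpy (brick->hostname, value, HOSTNAME_MAX - 1);
        brick->hostname[HOSTNAME_MAX - 1] = '\0';
        return 0;
}
```

With a guard like this, the caller can propagate the -1 up through the restore path and exit with an error message rather than crashing.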

raghu@booradley:~/work/gluster@sv.gnu.org/git/current/glusterfs.git/build$ cat /etc/hosts
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.  Just add the names, addresses
#               and any aliases to this file...
# By the way, Arnt Gulbrandsen <agulbra@nvg.unit.no> says that
# should NEVER be named with the name of the machine.  It causes problems
# for some (stupid) programs, irc and reputedly talk. :^)

# For loopbacking.               localhost               booradley
#           #booradley.zillionresearch.com booradley           n1           n2           n3           n4
Comment 2 Anand Avati 2011-02-22 02:11:51 EST
PATCH: http://patches.gluster.com/patch/6224 in master (mgmt/glusterd: In store-retrieve exit with error message instead of crashing.)
Comment 3 Pranith Kumar K 2011-03-11 02:48:16 EST
(In reply to comment #2)
> PATCH: http://patches.gluster.com/patch/6224 in master (mgmt/glusterd: In
> store-retrieve exit with error message instead of crashing.)

An intermediate fix that handled this crash already went in with the fix for 2271. I made it a little more robust: if any entry in any of the stores, or a store file itself, is missing, glusterd should print the error and exit.
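
The "print the error and exit" behaviour can be sketched as a validation pass over a store file's contents before anything is restored from it. This is only an illustration under stated assumptions, not the code from the patch: it assumes the store holds newline-separated "key=value" lines, and store_lookup() and store_validate() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a buffer of newline-separated "key=value" lines
 * (assumed format). Returns a pointer to the value and sets *len,
 * or returns NULL if the key is absent or its value is empty.
 */
static const char *
store_lookup (const char *buf, const char *key, size_t *len)
{
        size_t klen;

        assert (buf != NULL && key != NULL && len != NULL);
        klen = strlen (key);

        while (*buf != '\0') {
                const char *eol = strchr (buf, '\n');
                size_t linelen = eol ? (size_t)(eol - buf) : strlen (buf);

                if (linelen > klen && strncmp (buf, key, klen) == 0 &&
                    buf[klen] == '=') {
                        *len = linelen - klen - 1;
                        return *len > 0 ? buf + klen + 1 : NULL;
                }
                buf += linelen;
                if (*buf == '\n')
                        buf++;
        }
        return NULL;
}

/* Returns 0 only if every required key has a non-empty value. */
static int
store_validate (const char *buf, const char *const *required, size_t count)
{
        size_t i, len;

        for (i = 0; i < count; i++) {
                if (store_lookup (buf, required[i], &len) == NULL)
                        return -1;  /* caller logs the error and exits */
        }
        return 0;
}
```

An empty store file, like the ":" brick file in this report, fails validation for every required key, so the caller never reaches the copy that crashed.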
Comment 4 Raghavendra Bhat 2011-03-11 03:07:43 EST
Probed a peer, stopped glusterd, and then removed the other peer's entry from the /etc/glusterd/peers directory. After glusterd was started again, it logged the error message.

[2011-03-11 16:35:13.165866] D [glusterd-store.c:1610:glusterd_store_retrieve_peers] 0-: Returning with 0
[2011-03-11 16:35:13.165882] D [glusterd-store.c:1640:glusterd_resolve_all_bricks] 0-: Returning with 0
[2011-03-11 16:35:13.165896] D [glusterd-store.c:1667:glusterd_restore] 0-: Returning 0
Given volfile:
  1: volume management
  2:     type mgmt/glusterd
  3:     option working-directory /etc/glusterd
  4:     option transport-type socket,rdma
  5:     option transport.socket.keepalive-time 10
  6:     option transport.socket.keepalive-interval 2
  7: end-volume

[2011-03-11 16:35:13.214601] I [glusterd-handler.c:2611:glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: eaae880d-fa3d-4ba9-a53d-417323598df0
[2011-03-11 16:35:13.214682] I [glusterd-handler.c:379:glusterd_friend_find] 0-glusterd: Unable to find peer by uuid
[2011-03-11 16:35:13.248038] I [glusterd-handler.c:391:glusterd_friend_find] 0-glusterd: Unable to find hostname:
[2011-03-11 16:35:13.248298] I [glusterd-handler.c:3267:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to (24007), ret: 0
