It seems that for gluster 3.2.2 it's enough to add the '-g2' flag to prevent the crash. That is, gluster is now built with optimization enabled and full debugging information (I think that should be enough for initial performance testing).

I think you should still check why it crashes. I'd assume you have a bug where you are using an uninitialized variable or making some illegal assumption about the packing of struct elements in your code.

FYI, I am building it as follows (without the CFLAGS line it crashes on 'peer probe'):

  ./configure -C \
      --prefix=/usr \
      --libdir=/usr/lib64 \
      --localstatedir=/var \
      --sysconfdir=/etc \
      --disable-dependency-tracking \
      CFLAGS='-g2' \
  && make -j 8
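For anyone who wants to rule out the struct-packing theory above, a minimal sketch of how layout assumptions can be made explicit follows. The struct `demo_peer` and the helper names are hypothetical and are not taken from the gluster sources; the point is only that member offsets and padding can be queried rather than assumed.

```c
#include <stddef.h>

/* Hypothetical struct for illustration only; it is NOT a gluster type. */
struct demo_peer {
    char  flag;    /* 1 byte, usually followed by padding */
    void *handle;  /* pointer-aligned member */
    int   port;
};

/* Sum of the member sizes, i.e. the size the struct would have if the
 * compiler inserted no padding at all. */
size_t demo_peer_packed_size(void)
{
    return sizeof(char) + sizeof(void *) + sizeof(int);
}

/* Number of padding bytes the compiler actually added. Code that copies,
 * hashes or compares the struct byte by byte has to account for these. */
size_t demo_peer_padding(void)
{
    return sizeof(struct demo_peer) - demo_peer_packed_size();
}

/* Offset of 'handle' as the compiler laid it out; code that assumes it
 * sits directly after 'flag' would be making a packing assumption. */
size_t demo_peer_handle_offset(void)
{
    return offsetof(struct demo_peer, handle);
}
```

Comparing these values between the -O0 and -O2 builds is one way to check whether the layout itself changes; normally it should not, since optimization level does not alter struct layout.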
Are there any two machines I can use for testing this? On a single machine things seem to work fine.
In our build environment we are using gcc v4.3.4 (gcc-4_3-branch revision 152973). With this gcc I have to compile gluster with optimizations disabled (-O0). When compiled with -O2 (the default), glusterd crashes when I try to 'peer probe'; see the stack trace and several parameters I've printed in gdb below. I've tried both gluster v3.2.1 and 3.2.3.

Obviously, I can't get far with performance testing using a gluster build with -O0 (switching to another compiler is a big issue for us). Can you please check this?

>>>>>>>>>>>
Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/usr/sbin/glusterd'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f644be00bcc in rpc_transport_connect (this=0x639750, port=0) at rpc-transport.c:810
810 rpc-transport.c: No such file or directory.
        in rpc-transport.c
(gdb) bt
#0 0x00007f644be00bcc in rpc_transport_connect (this=0x639750, port=0) at rpc-transport.c:810
#1 0x00007f644be062ab in rpc_clnt_submit (rpc=0x639590, prog=0x7f644a583140, procnum=1, cbkfn=0x7f644a355030 <glusterd3_1_probe_cbk>, proghdr=0x7fffedc8b1e0, proghdrcount=1, progpayload=0x0, progpayloadcount=0, iobref=0x63a310, frame=0x62e6dc, rsphdr=0x0, rsphdr_count=0, rsp_payload=0x0, rsp_payload_count=0, rsp_iobref=0x0) at rpc-clnt.c:1362
#2 0x00007f644a34fa9b in glusterd_submit_request (rpc=0x639590, req=0x7fffedc8b250, frame=0x62e6dc, prog=0x7f644a583140, procnum=1, iobref=0x63a310, sfunc=0x7f644bbf0fc0 <gd_xdr_from_mgmt_probe_req>, this=0x6330b0, cbkfn=0x7f644a355030 <glusterd3_1_probe_cbk>) at glusterd-utils.c:351
#3 0x00007f644a3549d5 in glusterd3_1_probe (frame=0x62e6dc, this=0x6330b0, data=0x63a3d0) at glusterd-rpc-ops.c:1340
#4 0x00007f644a330861 in glusterd_ac_friend_probe (event=<value optimized out>, ctx=0x63a390) at glusterd-sm.c:364
#5 0x00007f644a3309e5 in glusterd_friend_sm () at glusterd-sm.c:958
#6 0x00007f644a3610c1 in glusterd_peer_dump_version_cbk (req=0x0, iov=0x7f644c26b304, count=<value optimized out>, myframe=0x7f644aa4ff98) at glusterd-handshake.c:378
#7 0x00007f644be05b34 in rpc_clnt_handle_reply (clnt=0x639590, pollin=0x63a090) at rpc-clnt.c:736
#8 0x00007f644be05d78 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x6395c0, event=<value optimized out>, data=0x1) at rpc-clnt.c:849
#9 0x00007f644be00a57 in rpc_transport_notify (this=0x639750, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:918
#10 0x00007f644a0f9eef in socket_event_poll_in (this=0x639750) at socket.c:1647
#11 0x00007f644a0fa058 in socket_event_handler (fd=<value optimized out>, idx=2, data=0x639750, poll_in=1, poll_out=0, poll_err=0) at socket.c:1762
#12 0x00007f644c049f67 in event_dispatch_epoll_handler (i=<value optimized out>, events=<value optimized out>, event_pool=<value optimized out>) at event.c:794
#13 event_dispatch_epoll (i=<value optimized out>, events=<value optimized out>, event_pool=<value optimized out>) at event.c:856
#14 0x000000000040622a in main (argc=1, argv=0x7fffedc8b678) at glusterfsd.c:1488
(gdb) print this
$1 = (rpc_transport_t *) 0x639750
(gdb) print *this
$2 = {ops = 0x0, listener = 0x0, private = 0x0, xl_private = 0x0, xl = 0x0, mydata = 0x0,
  lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0},
  refcount = 0, ctx = 0x0, options = 0x0, name = 0x0, dnscache = 0x0, buf = 0x0, init = 0, fini = 0, validate_options = 0, reconfigure = 0, notify = 0, notify_data = 0x0,
  peerinfo = {sockaddr = {ss_family = 0, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 0, identifier = '\000' <repeats 72 times>, "@1XJd\177\000\000\001\000\000\000\000\000\000\000\060P5Jd\177", '\000' <repeats 13 times>},
  myinfo = {sockaddr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 16, identifier = "14.10.12.12:1021", '\000' <repeats 91 times>},
  total_bytes_read = 288, total_bytes_write = 140, list = {next = 0x0, prev = 0x0}, client_bind_insecure = 0}
(gdb) print this->ops
$3 = (struct rpc_transport_ops *) 0x0
(gdb) frame 3
#3 0x00007f644a3549d5 in glusterd3_1_probe (frame=0x62e6dc, this=0x6330b0, data=0x63a3d0) at glusterd-rpc-ops.c:1340
1340 glusterd-rpc-ops.c: No such file or directory.
        in glusterd-rpc-ops.c
(gdb) print hostname
$4 = 0x63a3b0 "module-2"
(gdb) print port
$5 = 0
(gdb) print peerinfo
$6 = (glusterd_peerinfo_t *) 0x638e60
(gdb) print *peerinfo
$7 = {uuid = '\000' <repeats 15 times>, uuid_str = '\000' <repeats 49 times>,
  state = {state = GD_FRIEND_STATE_DEFAULT, transition_time = {tv_sec = 0, tv_usec = 0}},
  hostname = 0x638340 "module-2", port = 0, uuid_list = {next = 0x635980, prev = 0x635980}, op_peers_list = {next = 0x0, prev = 0x0},
  rpc = 0x639590, mgmt = 0x7f644a583140, connected = 1, shandle = 0x639d30,
  sm_log = {transitions = 0x638f50, current = 0, size = 50, count = 0, state_name_get = 0x7f644a32f630 <glusterd_friend_sm_state_name_get>, event_name_get = 0x7f644a32f650 <glusterd_friend_sm_event_name_get>}}
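The dump above shows `this->ops` as 0x0 in frame #0, so the segfault is consistent with a call through a NULL ops table. As a diagnostic aid only, here is a minimal sketch of a guarded dispatch. All types and names (`demo_transport`, `demo_transport_ops`, `demo_transport_connect`, `demo_connect_ok`) are hypothetical, simplified stand-ins and not the real `rpc_transport_t` code; a guard like this would turn the crash into an error return but would not explain why the struct appears zeroed.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real gluster definitions. */
struct demo_transport;

struct demo_transport_ops {
    int (*connect)(struct demo_transport *t, int port);
};

struct demo_transport {
    struct demo_transport_ops *ops;
};

/* Dispatch through the ops table only if it is populated.
 * Returns -1 instead of dereferencing a NULL pointer. */
int demo_transport_connect(struct demo_transport *t, int port)
{
    if (t == NULL || t->ops == NULL || t->ops->connect == NULL)
        return -1;
    return t->ops->connect(t, port);
}

/* Trivial backend used only to exercise the non-NULL path;
 * it simply echoes the port it was given. */
int demo_connect_ok(struct demo_transport *t, int port)
{
    (void)t;
    return port;
}
```

With a zero-filled `struct demo_transport`, like the one gdb printed, `demo_transport_connect` returns -1; once `ops` points at a table whose `connect` is `demo_connect_ok`, the call is forwarded.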
This is a compiler bug. Standalone test case which misbehaves on SLES SP1: https://github.com/avati/gcc-bug
The bug needs to be raised with Novell.
CHANGE: http://review.gluster.com/549 (now returns 'true(1)' is gfid is root, 'false(0)' if not.) merged in master by Vijay Bellur (vijay)
(In reply to comment #4)
> CHANGE: http://review.gluster.com/549 (now returns 'true(1)' is gfid is root,
> 'false(0)' if not.) merged in master by Vijay Bellur (vijay)

The above commit is for bug-3158 :p
CHANGE: http://review.gluster.com/522 (Change-Id: I0f078d1753db65d2f2e0380d1b0450c114cf40dd) merged in master by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/523 (Change-Id: I53b007fbdb42313d207d5d63fbfaaa6aaf033f95) merged in master by Vijay Bellur (vijay)