Description of problem:
Executed the gluster peer detach command to detach a RHSS node from the cluster. This resulted in a crash of glusterd. Bug 1108505, filed just before this BZ, covers a related gluster peer probe failure.

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.15-1.el6rhs.x86_64

How reproducible:
Seen once so far.

Steps to Reproduce:
1. gluster peer detach <rhss-nodename>

Actual results:
glusterd crashed with the following backtrace:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-11 22:18:03
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.15
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7fa407b7ae56]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7fa407b9528f]
/lib64/libc.so.6(+0x329a0)[0x7fa406bc39a0]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7fa407311380]
/usr/lib64/libglusterfs.so.0(__gf_free+0x14a)[0x7fa407ba81fa]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_peer_destroy+0x3f)[0x7fa3fca79c3f]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_friend_cleanup+0xb8)[0x7fa3fca880b8]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(+0x4a49f)[0x7fa3fca5f49f]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x1c6)[0x7fa3fca5ff36]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(__glusterd_handle_cli_deprobe+0x1b9)[0x7fa3fca5db39]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fa3fca45e2f]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7fa407bb6742]
/lib64/libc.so.6(+0x43bf0)[0x7fa406bd4bf0]
---------

bt from gdb:
(gdb) bt
#0  0x00007fa407311380 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007fa407ba81fa in __gf_free () from /usr/lib64/libglusterfs.so.0
#2  0x00007fa3fca79c3f in glusterd_peer_destroy () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#3  0x00007fa3fca880b8 in glusterd_friend_cleanup () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#4  0x00007fa3fca5f49f in ?? () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#5  0x00007fa3fca5ff36 in glusterd_friend_sm () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#6  0x00007fa3fca5db39 in __glusterd_handle_cli_deprobe () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#7  0x00007fa3fca45e2f in glusterd_big_locked_handler () from /usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so
#8  0x00007fa407bb6742 in synctask_wrap () from /usr/lib64/libglusterfs.so.0
#9  0x00007fa406bd4bf0 in ?? () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()

Memory on the node on which the gluster command was executed:
[root@nfs1 ~]# free -tg
             total       used       free     shared    buffers     cached
Mem:             7          1          6          0          0          0
-/+ buffers/cache:          0          7
Swap:            7          0          7
Total:          15          1         14

Expected results:
The detach should not result in a crash.

Additional info:
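A SIGSEGV inside pthread_spin_lock reached from __gf_free is usually the signature of handing GF_FREE a pointer whose memory-accounting header is no longer valid, i.e. a peerinfo field that was already freed on another cleanup path (note that glusterd_friend_cleanup and glusterd_peer_destroy both sit on the deprobe path above). As a hypothetical illustration only (invented struct and function names, not the actual glusterd source), here is a minimal C sketch of the free-and-reset pattern that keeps overlapping cleanup paths idempotent:

/* Hypothetical sketch, not the glusterd source: shows why a second
 * free of the same peer field can crash under memory accounting, and
 * the free-and-reset pattern that avoids it. */
#include <stdlib.h>
#include <string.h>

struct peer {
        char *hostname;
        char *uuid_str;
};

/* Free a field and reset it, so a repeated call is free(NULL), a no-op. */
static void peer_field_free(char **field)
{
        free(*field);
        *field = NULL;
}

static void peer_destroy(struct peer *p)
{
        if (!p)
                return;
        peer_field_free(&p->hostname);
        peer_field_free(&p->uuid_str);
        free(p);
}

int main(void)
{
        struct peer *p = calloc(1, sizeof(*p));
        if (!p)
                return 1;
        p->hostname = strdup("rhsauto005.lab.eng.blr.redhat.com");
        p->uuid_str = strdup("b9eded1c-fbae-4e9b-aa31-26a06e747d83");

        peer_field_free(&p->hostname);  /* first cleanup path, e.g. on detach */
        peer_destroy(p);                /* second path stays harmless */
        return 0;
}

Without the reset in peer_field_free(), the second release would be a double free; in glusterfs, where __gf_free takes a lock via the accounting header stored just before the user pointer, that would be consistent with the SIGSEGV in pthread_spin_lock seen here.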
Created attachment 908018 [details] sosreport of existing rhs node
Created attachment 908022 [details] coredump
[root@nfs1 ~]# gluster peer detach rhsauto005.lab.eng.blr.redhat.com
peer detach: success
[root@nfs1 ~]#
[root@nfs1 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.37.215
Uuid: b9eded1c-fbae-4e9b-aa31-26a06e747d83
State: Peer in Cluster (Connected)

Hostname: 10.70.37.44
Uuid: e3a7651a-2d8d-4cd3-8dc0-e607dc019754
State: Peer in Cluster (Connected)

Hostname: 10.70.37.201
Uuid: 542bf4aa-b6b5-40c3-82bf-f344fb637a99
State: Peer in Cluster (Connected)

I didn't see any core dump after these commands, hence moving this BZ to verified.
Hi KP, please review the edited doc text for technical accuracy and sign off.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.