Description of problem:
I was running geo-rep with code coverage enabled, having compiled gluster from source, and had started glusterd with log level DEBUG. After starting the gluster volume and the geo-rep session, I ran "gluster v geo <master> <slave> status detail" and glusterd crashed.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.32rhs-1.el6rhs.x86_64

How reproducible:
Haven't tried reproducing again, but it happened on 2 nodes.

Steps to Reproduce:
1. Compile and install gluster from source with code coverage compiler flags.
2. Create two volumes and start a geo-rep session between them.
3. After the session is started, run geo-rep status detail.

Actual results:
glusterd crashed with the following backtrace:

(gdb) bt
#0  0x00000039f48328a5 in raise () from /lib64/libc.so.6
#1  0x00000039f4834085 in abort () from /lib64/libc.so.6
#2  0x00000039f482ba1e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000039f482bae0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f808ca0e67d in glusterd_mountbroker_check (slave_ip=0x18480e8, op_errstr=0x0) at glusterd-geo-rep.c:1745
#5  0x00007f808ca1829f in glusterd_get_slave_info (slave=0x7f808800eac6 "euclid", slave_ip=0x184a168, slave_vol=0x184a160, op_errstr=0x0) at glusterd-geo-rep.c:3674
#6  0x00007f808ca0a985 in _get_status_mst_slv (this=0x7f808ec95394, key=0x7f8088019270 "slave1", value=0x7f808eab3594, data=0x184a240) at glusterd-geo-rep.c:953
#7  0x00007f80904a8210 in dict_foreach (dict=0x7f808ec95394, fn=0x7f808ca0a746 <_get_status_mst_slv>, data=0x184a240) at dict.c:1123
#8  0x00007f808ca1412a in glusterd_get_gsync_status_mst (volinfo=0x7f807c0014d0, rsp_dict=0x7f808ec95880, node=0x184a2f0 "ramanujan.blr.redhat.com") at glusterd-geo-rep.c:2927
#9  0x00007f808ca14273 in glusterd_get_gsync_status_all (rsp_dict=0x7f808ec95880, node=0x184a2f0 "ramanujan.blr.redhat.com") at glusterd-geo-rep.c:2946
#10 0x00007f808ca14468 in glusterd_get_gsync_status (dict=0x7f808ec95ce0, op_errstr=0x184bdc8, rsp_dict=0x7f808ec95880) at glusterd-geo-rep.c:2977
#11 0x00007f808ca17052 in glusterd_op_gsync_set (dict=0x7f808ec95ce0, op_errstr=0x184bdc8, rsp_dict=0x7f808ec95880) at glusterd-geo-rep.c:3456
#12 0x00007f808c9a609d in glusterd_op_commit_perform (op=GD_OP_GSYNC_SET, dict=0x7f808ec95ce0, op_errstr=0x184bdc8, rsp_dict=0x7f808ec95880) at glusterd-op-sm.c:3920
#13 0x00007f808ca3fb74 in gd_commit_op_phase (peers=0x12b9dd0, op=GD_OP_GSYNC_SET, op_ctx=0x7f808ec95d6c, req_dict=0x7f808ec95ce0, op_errstr=0x184bdc8, npeers=0) at glusterd-syncop.c:958
#14 0x00007f808ca40f52 in gd_sync_task_begin (op_ctx=0x7f808ec95d6c, req=0x7f808c8e404c) at glusterd-syncop.c:1230
#15 0x00007f808ca4112d in glusterd_op_begin_synctask (req=0x7f808c8e404c, op=GD_OP_GSYNC_SET, dict=0x7f808ec95d6c) at glusterd-syncop.c:1264
#16 0x00007f808ca07cc3 in __glusterd_handle_gsync_set (req=0x7f808c8e404c) at glusterd-geo-rep.c:318
#17 0x00007f808c982a06 in glusterd_big_locked_handler (req=0x7f808c8e404c, actor_fn=0x7f808ca07594 <__glusterd_handle_gsync_set>) at glusterd-handler.c:77
#18 0x00007f808ca07e46 in glusterd_handle_gsync_set (req=0x7f808c8e404c) at glusterd-geo-rep.c:346
#19 0x00007f8090513808 in synctask_wrap (old_task=0x12bbff0) at syncop.c:131
#20 0x00000039f4843b70 in ?? () from /lib64/libc.so.6
#21 0x0000000000000000 in ?? ()
(gdb) f 4
#4  0x00007f808ca0e67d in glusterd_mountbroker_check (slave_ip=0x18480e8, op_errstr=0x0) at glusterd-geo-rep.c:1745
1745        GF_ASSERT (op_errstr);

Expected results:
glusterd should not crash.
Additional info:
Part of the glusterd log before the crash:

[2013-09-08 03:45:19.046211] I [glusterd-geo-rep.c:283:__glusterd_handle_gsync_set] 0-management: slave not found, whilehandling geo-replication options
[2013-09-08 03:45:19.046237] D [glusterd-utils.c:157:glusterd_lock] 0-management: Cluster lock held by d6694e36-99b7-49bc-8bee-687203be714d
[2013-09-08 03:45:19.046255] W [glusterd-geo-rep.c:1404:glusterd_op_gsync_args_get] 0-: master not found
[2013-09-08 03:45:19.046264] D [glusterd-geo-rep.c:1430:glusterd_op_gsync_args_get] 0-: Returning -2
[2013-09-08 03:45:19.046283] D [glusterd-geo-rep.c:1387:glusterd_verify_gsync_status_opts] 0-: Returning 0
[2013-09-08 03:45:19.046293] D [glusterd-geo-rep.c:2248:glusterd_op_stage_gsync_set] 0-: Returning 0
[2013-09-08 03:45:19.046300] D [glusterd-op-sm.c:3856:glusterd_op_stage_validate] 0-management: OP = 15. Returning 0
[2013-09-08 03:45:19.046310] D [glusterd-op-sm.c:4937:glusterd_op_bricks_select] 0-management: Returning 0
[2013-09-08 03:45:19.046317] D [glusterd-syncop.c:1164:gd_brick_op_phase] 0-management: Sent op req to 0 bricks

pending frames:
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2013-09-08 03:45:19
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.32rhs
glusterd(glusterfsd_print_trace+0x31)[0x40aac8]
/lib64/libc.so.6[0x39f4832920]
/lib64/libc.so.6(gsignal+0x35)[0x39f48328a5]
/lib64/libc.so.6(abort+0x175)[0x39f4834085]
/lib64/libc.so.6[0x39f482ba1e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x39f482bae0]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_mountbroker_check+0x11c)[0x7f808ca0e67d]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(+0xd029f)[0x7f808ca1829f]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(+0xc2985)[0x7f808ca0a985]
/usr/local/lib/libglusterfs.so.0(dict_foreach+0xe3)[0x7f80904a8210]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(+0xcc12a)[0x7f808ca1412a]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(+0xcc273)[0x7f808ca14273]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(+0xcc468)[0x7f808ca14468]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_op_gsync_set+0x342)[0x7f808ca17052]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x3a1)[0x7f808c9a609d]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(gd_commit_op_phase+0x138)[0x7f808ca3fb74]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x4e0)[0x7f808ca40f52]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0xe5)[0x7f808ca4112d]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(__glusterd_handle_gsync_set+0x72f)[0x7f808ca07cc3]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x84)[0x7f808c982a06]
/usr/local/lib/glusterfs/3.4.0.32rhs/xlator/mgmt/glusterd.so(glusterd_handle_gsync_set+0x34)[0x7f808ca07e46]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x5c)[0x7f8090513808]
/lib64/libc.so.6[0x39f4843b70]
https://code.engineering.redhat.com/gerrit/#/c/12663/
Please provide the fixed-in version.
This time around, there were no crashes.

[root@spitfire ]# gluster v geo master falcon::slave status
NODE                       MASTER    SLAVE            HEALTH    UPTIME
---------------------------------------------------------------------------
spitfire.blr.redhat.com    master    falcon::slave    Stable    00:09:04
typhoon.blr.redhat.com     master    falcon::slave    Stable    00:09:00
mustang.blr.redhat.com     master    falcon::slave    Stable    00:09:00
harrier.blr.redhat.com     master    falcon::slave    Stable    00:09:00

[root@spitfire ]# gluster v geo master falcon::slave status detail
MASTER: master  SLAVE: falcon::slave

NODE                       HEALTH    UPTIME      FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING
--------------------------------------------------------------------------------------------------------------------
spitfire.blr.redhat.com    Stable    00:09:10    1116           0                0Bytes           0
mustang.blr.redhat.com     Stable    00:09:06    0              0                0Bytes           0
harrier.blr.redhat.com     Stable    00:09:07    1081           0                0Bytes           0
typhoon.blr.redhat.com     Stable    00:09:07    0              0                0Bytes           0

Moving the bug to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1769.html