Bug 1354621 - bug-1293414-import-brickinfo-uuid.t crashes in glusterd_friend_sm/cds_list_del_init
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-11 18:19 UTC by Jeff Darcy
Modified: 2018-08-29 03:35 UTC (History)

Fixed In Version: glusterfs-4.1.3 (or later)
Clone Of:
Environment:
Last Closed: 2018-08-29 03:35:54 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:

Description Jeff Darcy 2016-07-11 18:19:35 UTC
The crash doesn't happen on 100% of runs, but close.  Here are some symptoms.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb3eb7fe700 (LWP 680)]
0x00007fb4074ceefb in __cds_list_del (prev=0x21e45b000, next=0xdeadc0de00)
    at /usr/include/urcu/list.h:73
73		next->prev = prev;

(gdb) bt
#0  0x00007fb4074ceefb in __cds_list_del (prev=0x21e45b000, next=0xdeadc0de00)
    at /usr/include/urcu/list.h:73
#1  0x00007fb4074cef33 in cds_list_del (elem=0x7fb3d80120d0)
    at /usr/include/urcu/list.h:81
#2  0x00007fb4074cef4e in cds_list_del_init (elem=0x7fb3d80120d0)
    at /usr/include/urcu/list.h:88
#3  0x00007fb4074d1b4d in glusterd_friend_sm () at glusterd-sm.c:1348
#4  0x00007fb4074c732b in __glusterd_handle_incoming_unfriend_req (
    req=0x7fb3d800f4ac) at glusterd-handler.c:2670

(gdb) p event
$1 = (glusterd_friend_sm_event_t *) 0x7fb3d80120d0
(gdb) p event->list
$2 = {next = 0xdeadc0de00, prev = 0x21e45b000}
(gdb) p tmp
$3 = (glusterd_friend_sm_event_t *) 0xdeadc0de00
(gdb) p *event
$4 = {list = {next = 0xdeadc0de00, prev = 0x21e45b000}, 
  peerid = "\000\241\000\000\000\264\177\000\000\070\000\000\000\000\000", 
  peername = 0x7fb3d80120d000 <error: Cannot access memory at address 0x7fb3d80120d000>, ctx = 0xffffffffffffff00, event = 4294967295}

It looks like list/memory corruption, probably due to improper usage of RCU functions.  I have a patch that makes the problem go away, which I'll post as soon as I get this bug number.
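For readers unfamiliar with the failing call, here is a minimal sketch of the pattern involved. It is not the patch under review and it does not use liburcu; `struct list_head`, `struct sm_event`, `enqueue_event` and `drain_queue` are hypothetical stand-ins written for illustration. The only part taken from the trace is the shape of the unlink: the first store in `list_del_init` is the same `next->prev = prev` that faults at list.h:73, so it can only succeed if the node's `next` and `prev` still point at live list nodes. The sketch drains a queue the safe way, saving the successor while the current node is still linked and freeing the node only after it has been unlinked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the list head in urcu/list.h (assumption: same layout idea). */
struct list_head {
    struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink a node and leave it self-linked. The first store mirrors the
 * faulting statement in the trace: it dereferences e->next, so a node
 * whose pointers were overwritten or freed crashes right here. */
void list_del_init(struct list_head *e)
{
    assert(e->next != NULL && e->prev != NULL);
    e->next->prev = e->prev;
    e->prev->next = e->next;
    list_init(e);
}

/* Hypothetical event type; only the embedded list node matters. */
struct sm_event {
    struct list_head list; /* first member, so a node pointer is also the event pointer */
    int type;
};

int enqueue_event(struct list_head *queue, int type)
{
    struct sm_event *ev = malloc(sizeof(*ev));
    if (ev == NULL)
        return -1;
    ev->type = type;
    list_add_tail(&ev->list, queue);
    return 0;
}

/* Drain the queue and return the number of events handled. */
int drain_queue(struct list_head *queue)
{
    int handled = 0;
    struct list_head *pos = queue->next;

    while (pos != queue) {
        struct list_head *next = pos->next; /* saved while pos is still linked */
        struct sm_event *ev = (struct sm_event *)pos;

        list_del_init(pos); /* unlink first ... */
        free(ev);           /* ... then release the memory */
        handled++;
        pos = next;
    }
    return handled;
}
```

In this single-threaded sketch the saved `next` pointer stays valid because nothing else touches the queue. If another thread, or a nested call, can unlink and free entries between the save and the use, the saved pointer goes stale. That would be consistent with the garbage in `event->list` and `tmp` above, although the dump alone does not prove it.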

Comment 1 Vijay Bellur 2016-07-11 18:27:19 UTC
REVIEW: http://review.gluster.org/14893 (glusterd: fix glusterd_friend_sm usage of SM functions) posted (#1) for review on master by Jeff Darcy (jdarcy)

Comment 2 Vijay Bellur 2016-07-11 18:29:40 UTC
REVIEW: http://review.gluster.org/14893 (glusterd: fix glusterd_friend_sm usage of RCU functions) posted (#2) for review on master by Jeff Darcy (jdarcy)

Comment 3 Vijay Bellur 2016-07-12 12:18:02 UTC
REVIEW: http://review.gluster.org/14893 (glusterd: fix glusterd_friend_sm usage of RCU functions) posted (#3) for review on master by Jeff Darcy (jdarcy)

Comment 4 Amar Tumballi 2018-08-29 03:35:54 UTC
This update is done in bulk based on the state of the patch and the time since last activity. If the issue is still seen, please reopen the bug.

