
Bug 797085

Summary:	Crash in event_unregister while adding bricks in a loop
Product:	[Community] GlusterFS	Reporter:	Anush Shetty <ashetty>
Component:	rpc	Assignee:	shishir gowda <sgowda>
Status:	CLOSED UPSTREAM	QA Contact:
Severity:	unspecified	Docs Contact:
Priority:	high
Version:	mainline	CC:	gluster-bugs, nsathyan, vbellur
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-04-23 07:05:44 UTC	Type:	---
Regression:	---	Mount Type:	cifs

Description Anush Shetty 2012-02-24 08:42:03 UTC

Description of problem: Adding a large number of bricks (1000) in a loop while I/O was running on a mount point crashed one of the servers.

Version-Release number of selected component (if applicable): 3.3.0qa23


How reproducible: The crash recurs, but not always in the same function


Steps to Reproduce:
1. Start I/O on the mount point: for i in `seq 1 10000`; do echo 'sdsd' > /mnt/gluster/dot_$i; cat /mnt/gluster/dot_$i > /dev/null; done
2. While step 1 is still running, add bricks in a loop: for i in `seq 1 1000`; do sleep 1; gluster volume add-brick test 10.1.11.134:/mnt/s_$i; done

  
Actual results:

(gdb) bt full
#0  0x00007fb5275cc4c3 in event_unregister (event_pool=0x1afe170, fd=20, idx=5) at event.c:927
        ret = -1
        __FUNCTION__ = "event_unregister"
#1  0x00007fb52407e7b1 in __socket_reset (this=0x1afcdd0) at socket.c:485
        priv = 0x1afd380
        __FUNCTION__ = "__socket_reset"
#2  0x00007fb52407f1a7 in socket_event_poll_err (this=0x1afcdd0) at socket.c:690
        priv = 0x1afd380
        ret = -1
        __FUNCTION__ = "socket_event_poll_err"
#3  0x00007fb52408388c in socket_event_handler (fd=20, idx=5, data=0x1afcdd0, poll_in=1, poll_out=0, poll_err=0) at socket.c:1808
        this = 0x1afcdd0
        priv = 0x1afd380
        ret = -1
        __FUNCTION__ = "socket_event_handler"
#4  0x00007fb5275cc074 in event_dispatch_epoll_handler (event_pool=0x1aa4d90, events=0x1abd7f0, i=0) at event.c:794
        event_data = 0x1abd7f4
        handler = 0x7fb5240835e3 <socket_event_handler>
        data = 0x1afcdd0
        idx = 5
        ret = -1
        __FUNCTION__ = "event_dispatch_epoll_handler"
#5  0x00007fb5275cc297 in event_dispatch_epoll (event_pool=0x1aa4d90) at event.c:856
        events = 0x1abd7f0
        size = 1
        i = 0
        ret = 1
        __FUNCTION__ = "event_dispatch_epoll"
#6  0x00007fb5275cc622 in event_dispatch (event_pool=0x1aa4d90) at event.c:956
        ret = -1
        __FUNCTION__ = "event_dispatch"
#7  0x0000000000407d7c in main (argc=19, argv=0x7fff7bedfd68) at glusterfsd.c:1612
        ctx = 0x1a8d010
        ret = 0
        __FUNCTION__ = "main"
(gdb) f 0
#0  0x00007fb5275cc4c3 in event_unregister (event_pool=0x1afe170, fd=20, idx=5) at event.c:927
927             ret = event_pool->ops->event_unregister (event_pool, fd, idx);
(gdb) p *event_pool->ops
Cannot access memory at address 0x8b00000a
(gdb) p *event_pool
$1 = {ops = 0x8b00000a, fd = 0, breaker = {0, -65536}, count = -2046099190, reg = 0x0, used = 0, idx_cache = 0, changed = 0, mutex = {
    __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, 
      __woken_seq = 0, __mutex = 0x0, __nwaiters = 28, __broadcast_seq = 825110577}, 
    __size = '\000' <repeats 40 times>, "\034\000\000\000\061\060.1", __align = 0}, evcache = 0x3a3433312e31312e, evcache_size = 3748657}
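
The trailing fields of the dumped event_pool look like ASCII text rather than valid values. The sketch below packs the three values printed by gdb back into bytes and decodes them. It assumes a little-endian x86_64 host and that __broadcast_seq, evcache and evcache_size sit next to each other in memory with no padding. The function name decode_overwritten_fields is hypothetical and exists only for this illustration.

```python
import struct

# Values copied verbatim from the `p *event_pool` output above.
BROADCAST_SEQ = 825110577        # cond.__data.__broadcast_seq
EVCACHE = 0x3A3433312E31312E     # evcache (pointer-sized field)
EVCACHE_SIZE = 3748657           # evcache_size


def decode_overwritten_fields():
    """Reinterpret the three dumped fields as a NUL-terminated ASCII string."""
    # Assumption: little-endian layout, fields adjacent, no padding between them.
    raw = struct.pack("<IQi", BROADCAST_SEQ, EVCACHE, EVCACHE_SIZE)
    # Keep only the bytes before the first NUL terminator.
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(decode_overwritten_fields())  # prints 10.1.11.134:139
```

The decoded text begins with 10.1.11.134, the brick host used in step 2. Note also that frame #0 receives event_pool=0x1afe170, while the dispatch frames (#4 to #6) use event_pool=0x1aa4d90. Together these suggest that the pointer passed to event_unregister refers to memory holding string data rather than a live event pool. This does not by itself identify where the bad pointer came from.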

Comment 1 Vijay Bellur 2012-03-25 05:44:44 UTC

Can you please reproduce the problem and attach logs?

Comment 2 Anush Shetty 2012-04-23 07:05:44 UTC

Don't see this crash now. Will reopen if I see it again.