Bug 1778175 - glusterfsd crashed with "'MemoryError' Cannot access memory at address"
Summary: glusterfsd crashed with "'MemoryError' Cannot access memory at address"
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rpc
Version: 7
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On: 1763036 1778182 1804523 1806595
Blocks: 1741193
 
Reported: 2019-11-29 12:38 UTC by Mohit Agrawal
Modified: 2020-02-24 15:24 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1763036
Environment:
Last Closed: 2019-12-05 07:22:57 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments


Links:
Gluster.org Gerrit 23774 (Merged): rpc: Synchronize slot allocation code (last updated 2019-12-05 07:22:56 UTC)

Description Mohit Agrawal 2019-11-29 12:38:30 UTC
+++ This bug was initially created as a clone of Bug #1763036 +++

+++ This bug was initially created as a clone of Bug #1741193 +++

Description of problem:
glusterfsd crashed on all the gluster nodes and created a huge corefile causing the mounts to go down

Primary analysis of the corefile: (/cases/02441282/_additional_files/x-coredump/inf03-core.6572.) 

[file seems incomplete]

Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.48 --volfile-id vol_0005ed9e52ceb05f16b1bb48cd'.
Program terminated with signal 6, Aborted.
#0  0x00007f35de36e207 in ?? ()
(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7f2bdef8da58: 

(gdb) info proc glusterfsd
exe = '/usr/sbin/glusterfsd -s 10.7.199.48 --volfile-id vol_0005ed9e52ceb05f16b1bb48cd'


(gdb) backtrace
#0  0x00007f35de36e207 in ?? ()
Cannot access memory at address 0x7f2bdef8da58
(gdb) down 0x7f2bdef8da58
Cannot access memory at address 0x7f2bdef8da58

(gdb) info frame
Stack level 0, frame at 0x7f2bdef8da60:
 rip = 0x7f35de36e207; saved rip Cannot access memory at address 0x7f2bdef8da58

(gdb) up 0x7f2bdef8da58
#0  0x00007f35de36e207 in ?? ()
(gdb) i r
rax            0x0	0
rbx            0x7f35e01ff000	139869370183680
rcx            0xffffffffffffffff	-1
rdx            0x6	6
rsi            0x2da6	11686
rdi            0x19ac	6572
rbp            0x7f35de4c2520	0x7f35de4c2520
rsp            0x7f2bdef8da58	0x7f2bdef8da58
r8             0x0	0
r9             0x7f17b4f04770	139739796621168
r10            0x8	8
r11            0x206	518
r12            0x7f35dfdb875a	139869365700442
r13            0x7f35dfdb8b00	139869365701376
r14            0x0	0
r15            0x7f255fc70ac0	139798497397440
rip            0x7f35de36e207	0x7f35de36e207
eflags         0x206	[ PF IF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
(gdb) 

Please analyze the corefile further to see whether the root cause can be isolated.

10.7.199.48:-var-lib-heketi-mounts-vg_77cff90104e3566f7856e523efd82152-brick_3ac56467ac42041325057fdfd61bf7cc-brick

uuid=e0b70592-6ab6-4d66-b1b6-d54bd4797d70
hostname=10.7.199.48
path=/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick
real_path=/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick
listen-port=49153
rdma.listen-port=0
decommissioned=0
brick-id=vol_0005ed9e52ceb05f16b1bb48cd526ea9-client-2
mount_dir=/brick
snap-status=0
brick-fsid=16842170


[2019-08-10 06:34:18.342656] I [MSGID: 106005] [glusterd-handler.c:6179:__glusterd_brick_rpc_notify] 0-management: Brick 10.7.199.48:/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick has disconnected from glusterd.

(Please feel free to change the "Component" if it is found inappropriate.)

Many thanks,
Jay

Version-Release number of selected component (if applicable):
rhgs3.4

How reproducible:
The issue occurred twice, around the same time (morning, latest one was on Aug 10)

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Jay Samson on 2019-09-23 10:19:15 UTC ---


Here is the analysis of the issue so far (will post more when I get time):

[New LWP 21371]
[New LWP 23168]
[New LWP 117329]
bt[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id vol_b763bb6afb974e6f80384673ba'.
Program terminated with signal 6, Aborted.
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
55	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail (assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 7
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
759			ret = event_dispatch_epoll_handler (event_pool, &event);
(gdb) p *event_pool
$1 = {ops = 0x7f9efcb6b4e0 <event_ops_epoll>, fd = 3, breaker = {0, 0}, count = 16384, reg = 0x0, ereg = {0x563fac5b6e50, 0x7f9e247f81b0, 0x0 <repeats 1022 times>}, slots_used = {1021, 10, 0 <repeats 1022 times>}, poller_death = {next = 0x563fac5b6fe0, 
    prev = 0x563fac5b6fe0}, poller_death_sliced = 0, poller_gen = 54, used = 0, changed = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}, evcache = 0x0, 
  evcache_size = 0, eventthreadcount = 197, pollers = {140320709637888, 140320701245184, 140319441008384, 140318519322368, 140318091491072, 140317311362816, 140317135181568, 140316095010560, 140315306489600, 140314870265600, 140314023028480, 140313586804480, 
    140312873785088, 140312370452224, 140311590323968, 140311028274944, 140310508189440, 140310340400896, 140310247036672, 140310153672448, 140310060308224, 140309966944000, 140309803562752, 140309710198528, 140309616834304, 140309451233024, 140309357868800, 
    140309130221312, 140309028464384, 140308935100160, 140308841735936, 140308748371712, 140308655007488, 140308561643264, 140308468279040, 140308374914816, 140308281550592, 140308188186368, 140308086429440, 140307993065216, 140307899700992, 140307806336768, 
    140307704579840, 140307611215616, 140307509458688, 140307416094464, 140307322730240, 140307229366016, 140307136001792, 140307042637568, 140306949273344, 140306855909120, 140306762544896, 140306660787968, 140306559031040, 140306457274112, 140306363909888, 
    140306270545664, 140306177181440, 140306083817216, 140305990452992, 140305818961664, 140305725597440, 140305632233216, 140305538868992, 140305445504768, 140305352140544, 140305258776320, 140305022048000, 140304928683776, 140304762005248, 140304668641024, 
    140304566884096, 140304465127168, 140304363370240, 140304261613312, 140304168249088, 140304074884864, 140303981520640, 140303879763712, 140303786399488, 140303693035264, 140303599671040, 140303506306816, 140303412942592, 140303317473024, 140303222003456, 
    140303126533888, 140303031064320, 140302935594752, 140302840125184, 140302744655616, 140302640527104, 140302536398592, 140302432270080, 140302336534272, 140302240798464, 140302145062656, 140302049326848, 140301953591040, 140301857855232, 140301711763200, 
    140301607634688, 140301503506176, 140301407770368, 140301230380800, 140301128623872, 140301026866944, 140300924057344, 140300819928832, 140300715800320, 140300550899456, 140300383143680, 140300281386752, 140300106315520, 140300004558592, 140299902801664, 
    140299801044736, 140299697182464, 140299601712896, 140299506243328, 140299410507520, 140299306379008, 140299202250496, 140299098121984, 140298993993472, 140298898257664, 140298802521856, 140298698393344, 140298602657536, 140298498529024, 140298402793216, 
    140298307057408, 140298211321600, 140298107193088, 140298003064576, 140297907328768, 140297811592960, 140297715857152, 140297603335936, 140297499207424, 140297403471616, 140297307735808, 140297212000000, 140297116264192, 140297020528384, 140296924792576, 
    140296820664064, 140296650192640, 140296556828416, 140296463464192, 140296370099968, 140296274630400, 140296178894592, 140296012674816, 140295836526336, 140295743162112, 140295576483584, 140295483119360, 140295389755136, 140295296390912, 140295201974016, 
    140295106504448, 140295002642176, 140294907172608, 140294811436800, 140294715700992, 140294619965184, 140294524229376, 140294428493568, 140294332757760, 140294237021952, 140294141286144, 140294045550336, 140293949814528, 140293854078720, 140293758342912, 
    140293662607104, 140293558478592, 140293454350080, 140293350221568, 140293254485760, 140293158749952, 140293063014144, 140292967278336, 140292863149824, 140292767414016, 140292598515456, 140292496758528, 140292403394304, 140292308711168, 140292212975360, 
    140292117239552, 140292021503744, 140291925767936, 140291751245568, 140291717674752, 0 <repeats 827 times>}, destroy = 0, activethreadcount = 198, auto_thread_count = 195}
(gdb) f 6
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
643	                ret = handler (fd, idx, gen, data,
(gdb) p fd
$2 = 12
(gdb) p idx
$3 = 3
(gdb) l    
638	
639	        if (!handler)
640			goto out;
641	
642	        if (!handled_error_previously) {
643	                ret = handler (fd, idx, gen, data,
644	                               (event->events & (EPOLLIN|EPOLLPRI)),
645	                               (event->events & (EPOLLOUT)),
646	                               (event->events & (EPOLLERR|EPOLLHUP)), 0);
647	        }
(gdb) p data
$4 = (void *) 0x7f9ee80403f0
(gdb) f 5socket_server_event_handler
Invalid number "5socket".
(gdb) f 5
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2950
2950	                                        event_register (ctx->event_pool,
(gdb) l
2945	                         */
2946	                        ret = rpc_transport_notify (this, RPC_TRANSPORT_ACCEPT, new_trans);
2947	
2948	                        if (ret != -1) {
2949	                                new_priv->idx =
2950	                                        event_register (ctx->event_pool,
2951	                                                        new_sock,
2952	                                                        socket_event_handler,
2953	                                                        new_trans,
2954	                                                        1, 0,
(gdb) p nds_sock
No symbol "nds_sock" in current context.
(gdb) p new_sock
$5 = 1340
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000') at event-epoll.c:417
417		assert (slot->fd == fd);
(gdb) l
412			return -1;
413		}
414	
415		slot = event_slot_get (event_pool, idx);
416	
417		assert (slot->fd == fd);
418	
419		LOCK (&slot->lock);
420		{
421			/* make epoll 'singleshot', which
(gdb) p *event_pool
$6 = {ops = 0x7f9efcb6b4e0 <event_ops_epoll>, fd = 3, breaker = {0, 0}, count = 16384, reg = 0x0, ereg = {0x563fac5b6e50, 0x7f9e247f81b0, 0x0 <repeats 1022 times>}, slots_used = {1021, 10, 0 <repeats 1022 times>}, poller_death = {next = 0x563fac5b6fe0, 
    prev = 0x563fac5b6fe0}, poller_death_sliced = 0, poller_gen = 54, used = 0, changed = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}, evcache = 0x0, 
  evcache_size = 0, eventthreadcount = 197, pollers = {140320709637888, 140320701245184, 140319441008384, 140318519322368, 140318091491072, 140317311362816, 140317135181568, 140316095010560, 140315306489600, 140314870265600, 140314023028480, 140313586804480, 
    140312873785088, 140312370452224, 140311590323968, 140311028274944, 140310508189440, 140310340400896, 140310247036672, 140310153672448, 140310060308224, 140309966944000, 140309803562752, 140309710198528, 140309616834304, 140309451233024, 140309357868800, 
    140309130221312, 140309028464384, 140308935100160, 140308841735936, 140308748371712, 140308655007488, 140308561643264, 140308468279040, 140308374914816, 140308281550592, 140308188186368, 140308086429440, 140307993065216, 140307899700992, 140307806336768, 
    140307704579840, 140307611215616, 140307509458688, 140307416094464, 140307322730240, 140307229366016, 140307136001792, 140307042637568, 140306949273344, 140306855909120, 140306762544896, 140306660787968, 140306559031040, 140306457274112, 140306363909888, 
    140306270545664, 140306177181440, 140306083817216, 140305990452992, 140305818961664, 140305725597440, 140305632233216, 140305538868992, 140305445504768, 140305352140544, 140305258776320, 140305022048000, 140304928683776, 140304762005248, 140304668641024, 
    140304566884096, 140304465127168, 140304363370240, 140304261613312, 140304168249088, 140304074884864, 140303981520640, 140303879763712, 140303786399488, 140303693035264, 140303599671040, 140303506306816, 140303412942592, 140303317473024, 140303222003456, 
    140303126533888, 140303031064320, 140302935594752, 140302840125184, 140302744655616, 140302640527104, 140302536398592, 140302432270080, 140302336534272, 140302240798464, 140302145062656, 140302049326848, 140301953591040, 140301857855232, 140301711763200, 
    140301607634688, 140301503506176, 140301407770368, 140301230380800, 140301128623872, 140301026866944, 140300924057344, 140300819928832, 140300715800320, 140300550899456, 140300383143680, 140300281386752, 140300106315520, 140300004558592, 140299902801664, 
    140299801044736, 140299697182464, 140299601712896, 140299506243328, 140299410507520, 140299306379008, 140299202250496, 140299098121984, 140298993993472, 140298898257664, 140298802521856, 140298698393344, 140298602657536, 140298498529024, 140298402793216, 
    140298307057408, 140298211321600, 140298107193088, 140298003064576, 140297907328768, 140297811592960, 140297715857152, 140297603335936, 140297499207424, 140297403471616, 140297307735808, 140297212000000, 140297116264192, 140297020528384, 140296924792576, 
    140296820664064, 140296650192640, 140296556828416, 140296463464192, 140296370099968, 140296274630400, 140296178894592, 140296012674816, 140295836526336, 140295743162112, 140295576483584, 140295483119360, 140295389755136, 140295296390912, 140295201974016, 
    140295106504448, 140295002642176, 140294907172608, 140294811436800, 140294715700992, 140294619965184, 140294524229376, 140294428493568, 140294332757760, 140294237021952, 140294141286144, 140294045550336, 140293949814528, 140293854078720, 140293758342912, 
    140293662607104, 140293558478592, 140293454350080, 140293350221568, 140293254485760, 140293158749952, 140293063014144, 140292967278336, 140292863149824, 140292767414016, 140292598515456, 140292496758528, 140292403394304, 140292308711168, 140292212975360, 
    140292117239552, 140292021503744, 140291925767936, 140291751245568, 140291717674752, 0 <repeats 827 times>}, destroy = 0, activethreadcount = 198, auto_thread_count = 195}
(gdb) p idx
$7 = <optimized out>
(gdb) ptype event_pool
type = struct event_pool {
    struct event_ops *ops;
    int fd;
    int breaker[2];
    int count;
    struct event_slot_poll *reg;
    struct event_slot_epoll *ereg[1024];
    int slots_used[1024];
    struct list_head poller_death;
    int poller_death_sliced;
    int poller_gen;
    int used;
    int changed;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    void *evcache;
    int evcache_size;
    int eventthreadcount;
    pthread_t pollers[1024];
    int destroy;
    int activethreadcount;
    int auto_thread_count;
} *
(gdb) p *event_pool->event_slot_epoll
There is no member named event_slot_epoll.
(gdb) p *event_pool->ereg
$8 = (struct event_slot_epoll *) 0x563fac5b6e50
(gdb) p (struct event_slot_epoll *)event_pool->ereg
$9 = (struct event_slot_epoll *) 0x563fac588170
(gdb) p *(struct event_slot_epoll *)event_pool->ereg
$10 = {fd = -1403294128, events = 22079, gen = 612336048, idx = 32670, ref = 0, do_close = 0, in_handler = 0, handled_error = 0, data = 0x0, handler = 0x0, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, 
        __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x0, prev = 0x0}}
(gdb) p *event_pool->slots_used
$11 = 1021
(gdb) p *event_pool->slots_used[1]
Cannot access memory at address 0xa
(gdb) p *event_pool->slots_used[0]
Cannot access memory at address 0x3fd
(gdb) p event_pool->slots_used[0]
$12 = 1021
(gdb) p event_pool->slots_used[1]
$13 = 10
(gdb) p event_pool->slots_used[2]
$14 = 0
(gdb) p event_pool->slots_used[3]
$15 = 0
(gdb) p event_pool->slots_used[1021]
$16 = 0
(gdb) p event_pool->slots_used[1024]
$17 = -1403293728
(gdb) p event_pool->slots_used[1023]
$18 = 0
(gdb) p event_pool->slots_used[10]
$19 = 0
(gdb) p *event_pool->ereg[0]
$20 = {fd = 7, events = 1073741851, gen = 1, idx = 0, ref = 1, do_close = 0, in_handler = 0, handled_error = 0, data = 0x563fac595a20, handler = 0x7f9ef798c780 <socket_server_event_handler>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x563fac5b6ea8, prev = 0x563fac5b6ea8}}
(gdb) p *event_pool->ereg[1]
$21 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1, in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail (assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) l
422			   means we need to re-add the fd with
423			   epoll_ctl(EPOLL_CTL_MOD) after delivery of every
424			   single event. This assures us that while a poller
425			   thread has picked up and is processing an event,
426			   another poller will not try to pick this at the same
427			   time as well.
428			*/
429	
430			slot->events = EPOLLPRI | EPOLLHUP | EPOLLERR | EPOLLONESHOT;
431			slot->handler = handler;
(gdb) p *slot
$22 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1, in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) p event_pool->slots_used[11]
$23 = 0
(gdb) p event_pool->slots_used[12]
$24 = 0
(gdb) p *slot
$25 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1, in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) p slot->fd
$26 = -1
(gdb) p fd      
$27 = <optimized out>
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000') at event-epoll.c:417
417		assert (slot->fd == fd);
(gdb) p slot->fd
$28 = -1

--- Additional comment from Prashant Dhange on 2019-09-25 03:43:12 UTC ---

(In reply to Prashant Dhange from comment #33)
> (In reply to Prashant Dhange from comment #32)
> > (In reply to nravinas from comment #30)
> > > Hi, Nithya.
> > > 
> > > * I've been requested to try to help here. Unfortunately, setting a digicore
> > > environment to analyze this core is not working as expected:
> > > 
> > > IP: 10.72.93.42
> > > Username / password: root / redhat
> > > 
> > > cd /root/digicore
> > > 
> > > sh digicore.sh attach 02441282
> > > 
> > > Once attached to the container, I have downloaded a core file at /core
> > > 
> > > gdb /usr/sbin/glusterfsd core
> > > 
> > > .
> > > .
> > > .
> > > 
> > > Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id
> > > vol_b763bb6afb974e6f80384673ba'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0x00007f9efaed5207 in sigorset (dest=0x4193, left=0x5377, right=0x6) at
> > > sigorset.c:27
> > > 27	  if (dest == NULL || left == NULL || right == NULL)
> > > 
> > The above digicore environment is having different gluster/glibc package
> > version and are not matching with customer's coredump environment that is
> > why we are getting incomplete backtrace.
> > 
> > We had passed wrong installed-rpms (rpm -qa) to digicore while creating
> > digicore containter, whereas coredump was generated on environment having
> > glusterfs 3.12.2-47.el7rhgs.x86_64 and glibc 2.17-260.el7_6.4.x86_64
> > 
> > The above digicore container is having glusterfs 3.12.2-47.4.el7rhgs.x86_64
> > and glibc 2.17-292.el7.x86_64 :
> > [root@7c3000b2e002 case]# rpm -qa |egrep 'gluster|glibc'
> > glibc-2.17-292.el7.x86_64                                   Tue Sep  3
> > 09:12:15 2019
> > glibc-common-2.17-292.el7.x86_64                            Tue Sep  3
> > 09:12:12 2019
> > glusterfs-3.12.2-47.4.el7rhgs.x86_64                        Sun Sep 15
> > 10:20:34 2019
> > glusterfs-api-3.12.2-47.4.el7rhgs.x86_64                    Sun Sep 15
> > 10:20:35 2019
> > glusterfs-client-xlators-3.12.2-47.4.el7rhgs.x86_64         Sun Sep 15
> > 10:20:34 2019
> > glusterfs-cli-3.12.2-47.4.el7rhgs.x86_64                    Sun Sep 15
> > 10:20:34 2019
> > glusterfs-events-3.12.2-47.4.el7rhgs.x86_64                 Sun Sep 15
> > 10:20:36 2019
> > glusterfs-fuse-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:35 2019
> > glusterfs-geo-replication-3.12.2-47.4.el7rhgs.x86_64        Sun Sep 15
> > 10:20:36 2019
> > glusterfs-libs-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:34 2019
> > glusterfs-rdma-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:38 2019
> > glusterfs-server-3.12.2-47.4.el7rhgs.x86_64                 Sun Sep 15
> > 10:20:35 2019
> > gluster-block-0.2.1-32.el7rhgs.x86_64                       Sun Sep 15
> > 10:25:48 2019
> > gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64               Wed Sep 11
> > 13:24:39 2019
> > gluster-nagios-common-0.2.4-1.el7rhgs.noarch                Wed Sep 11
> > 13:24:12 2019
> > libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.1.x86_64 Sun Sep 15
> > 10:20:36 2019
> > python2-gluster-3.12.2-47.4.el7rhgs.x86_64                  Sun Sep 15
> > 10:20:36 2019
> > vdsm-gluster-4.30.18-1.0.el7rhgs.x86_64                     Sun Sep 15
> > 10:20:38 2019
> > 
> > Correcting glusterfs/glibc version to
> > 3.12.2-47.4.el7rhgs.x86_64/2.17-292.el7.x86_64 respectively on above
> > container should fix the issue and we should get complete backtrace here.
> Correction: it should be "Correcting glusterfs/glibc version to
> 3.12.2-47.el7rhgs.x86_64/2.17-260.el7_6.4.x86_64 respectively on above
> container should fix the issue and we should get complete backtrace here."

I have fixed the package mismatch on the above container and we are able to get a proper backtrace.
[root@7c3000b2e002 case]# rpm -qa|egrep 'gluster|glibc'
glusterfs-libs-3.12.2-47.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-47.el7rhgs.x86_64
glusterfs-cli-3.12.2-47.el7rhgs.x86_64
glusterfs-server-3.12.2-47.el7rhgs.x86_64
gluster-block-0.2.1-31.el7rhgs.x86_64
glibc-2.17-260.el7_6.4.x86_64
glibc-debuginfo-2.17-260.el7_6.4.x86_64
glusterfs-3.12.2-47.el7rhgs.x86_64
glusterfs-api-3.12.2-47.el7rhgs.x86_64
glusterfs-fuse-3.12.2-47.el7rhgs.x86_64
python2-gluster-3.12.2-47.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-47.el7rhgs.x86_64
glibc-common-2.17-260.el7_6.4.x86_64
glibc-debuginfo-common-2.17-260.el7_6.4.x86_64
glusterfs-debuginfo-3.12.2-47.el7rhgs.x86_64

[root@7c3000b2e002 case]# gdb /usr/sbin/glusterfsd core 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.
....
....
[New LWP 21301]
[New LWP 21371]
[New LWP 23168]
[New LWP 117329]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id vol_b763bb6afb974e6f80384673ba'.
Program terminated with signal 6, Aborted.
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
55	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail (assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)


So we have 2 coredump debug environments:
1. Natalia's ocs environment

ocs master:
ip-address : 10.74.251.54 
username : root
password : RedHat1! 

[root@vm251-54 ~]# oc get pods -o wide
NAME                                          READY     STATUS    RESTARTS   AGE       IP              NODE                               NOMINATED NODE
glusterblock-storage-provisioner-dc-1-49bmc   1/1       Running   0          11h       10.131.0.119    vm252-19.gsslab.pnq2.redhat.com    <none>
glusterfs-storage-mqkck                       1/1       Running   0          11h       10.74.252.19    vm252-19.gsslab.pnq2.redhat.com    <none>
glusterfs-storage-w9xfz                       1/1       Running   0          11h       10.74.254.205   vm254-205.gsslab.pnq2.redhat.com   <none>
glusterfs-storage-zzwd9                       1/1       Running   0          11h       10.74.252.69    vm252-69.gsslab.pnq2.redhat.com    <none>
heketi-storage-1-8vwsc                        1/1       Running   0          11h       10.131.0.117    vm252-19.gsslab.pnq2.redhat.com    <none>

[root@vm251-54 ~]# oc rsh glusterfs-storage-mqkck 
sh-4.2# cd /cores/
sh-4.2# gdb /usr/sbin/glusterfsd inf02-core.16787

2. digicore container mentioned in comment#30

--- Additional comment from Prashant Dhange on 2019-09-25 05:07:11 UTC ---

Echoing what Mohit found earlier in comment #8:

(In reply to Mohit Agrawal from comment #8)
> Hi,
> 
> I have checked gluster processes logs specific to brick crash, I have not
> found any reason of brick crash in logs.
> There are no error messages available in logs just before a brick crash.
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> [2019-08-10 06:32:32.415690] I [addr.c:55:compare_addr_and_update]
> 0-/var/lib/heketi/mounts/vg_dac7467f7481ef3dc72981494e55c93c/
> brick_705d09c3f50d87b740ab828b8c819fee/brick: allowed = "*", received addr =
> "10.7.199.96"
> [2019-08-10 06:32:32.415729] I [MSGID: 115029]
> [server-handshake.c:817:server_setvolume]
> 0-vol_0005ed9e52ceb05f16b1bb48cd526ea9-server: accepted client from
> vlmazocpapp32.fisdev.local-57970-2019/08/10-06:32:32:332389-
> vol_ec6393163cc25b9fed3c1e5e2ff8375c-client-2-0-0 (version: 3.12.2)
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f8e5b3a6cc0]
There was not enough memory available to hold the backtrace message.

> /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f8e5b3b0c04]
> /lib64/libc.so.6(+0x36280)[0x7f8e59a06280]
> /lib64/libc.so.6(gsignal+0x37)[0x7f8e59a06207]
> /lib64/libc.so.6(abort+0x148)[0x7f8e59a078f8]
> /lib64/libc.so.6(+0x2f026)[0x7f8e599ff026]
> /lib64/libc.so.6(+0x2f0d2)[0x7f8e599ff0d2]
> /lib64/libglusterfs.so.0(+0x88c4b)[0x7f8e5b403c4b]
> /usr/lib64/glusterfs/3.12.2/rpc-transport/socket.so(+0xcdaf)[0x7f8e564bedaf]
> /lib64/libglusterfs.so.0(+0x894c4)[0x7f8e5b4044c4]
> /lib64/libpthread.so.0(+0x7dd5)[0x7f8e5a205dd5]
> /lib64/libc.so.6(clone+0x6d)[0x7f8e59aceb3d]
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> On the system side, I have found some of the processes got OOM killed (like
> heapster) multiple times due to cgroup memory limit configured but
> on the basis of current logs, I can not say the same is true for gluster
> also.
> We don't have complete backtrace and available backtrace pattern in logs is
> not
> sufficient to find out the reason for a crash.
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> [11799234.175888] SELinux: mount invalid.  Same superblock, different
> security settings for (dev mqueue, type mqueue)
> [11799235.185554] XFS (dm-1222): Unmounting Filesystem
> [11799532.330862] heapster invoked oom-killer: gfp_mask=0x50, order=0,
> oom_score_adj=985
> [11799532.341972] heapster
> cpuset=docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> mems_allowed=0
> [11799532.349607] CPU: 5 PID: 95365 Comm: heapster Not tainted
> 3.10.0-957.10.1.el7.x86_64 #1
> [11799532.355169] Hardware name: Microsoft Corporation Virtual
> Machine/Virtual Machine, BIOS 090007  06/02/2017
> [11799532.361995] Call Trace:
> [11799532.364181]  [<ffffffffae162e41>] dump_stack+0x19/0x1b
> [11799532.367727]  [<ffffffffae15d86a>] dump_header+0x90/0x229
> [11799532.371755]  [<ffffffffadbba156>] ? find_lock_task_mm+0x56/0xc0
> [11799532.375694]  [<ffffffffadc31628>] ?
> try_get_mem_cgroup_from_mm+0x28/0x60
> [11799532.381063]  [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> [11799532.385394]  [<ffffffffadc35406>]
> mem_cgroup_oom_synchronize+0x546/0x570
> [11799532.389867]  [<ffffffffadc34880>] ? mem_cgroup_charge_common+0xc0/0xc0
> [11799532.393860]  [<ffffffffadbbae94>] pagefault_out_of_memory+0x14/0x90
> [11799532.397908]  [<ffffffffae15bd72>] mm_fault_error+0x6a/0x157
> [11799532.401558]  [<ffffffffae1707a8>] __do_page_fault+0x3c8/0x500
> [11799532.406047]  [<ffffffffae170915>] do_page_fault+0x35/0x90
> [11799532.410130]  [<ffffffffae16c758>] page_fault+0x28/0x30
> [11799532.419650] Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> [11799532.446763] memory: usage 3662108kB, limit 3662108kB, failcnt 5883951
> [11799532.450874] memory+swap: usage 3662108kB, limit 9007199254740988kB,
> failcnt 0
> [11799532.455828] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
> [11799532.459804] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice: cache:0KB rss:0KB
> rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB
> inactive_file:0KB active_file:0KB unevictable:0KB
> [11799532.476295] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 56f2a10610e0295e6716b9de5e32334b96be7201b37f8160f728af4c3e366b08.scope:
> cache:0KB rss:428KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB
> active_anon:428KB inactive_file:0KB active_file:0KB unevictable:0KB
> [11799532.500710] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope:
> cache:452KB rss:3661228KB rss_huge:0KB mapped_file:452KB swap:0KB
> inactive_anon:0KB active_anon:3661224KB inactive_file:0KB active_file:368KB
> unevictable:0KB
> [11799532.521834] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents
> oom_score_adj name
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The cgroup memory limit is not allowing glusterfsd to allocate the additional pages requested here.

> 
> I have not found any logs specific to consuming high memory by gluster
> process
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.330862] heapster
> invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=985
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.381063] 
> [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.419650] Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.545283] Memory
> cgroup out of memory: Kill process 107416 (heapster) score 1985 or sacrifice
> child
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.551985] Killed
> process 101585 (heapster) total-vm:5095012kB, anon-rss:3659164kB,
> file-rss:0kB, shmem-rss:0kB
> grep: messages-20190728: No such file or directory
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361383.951387]
> heapster invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=985
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.041571] 
> [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.112290]
> Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 3d1520b587fa1705dd72e9aacea58b31ae8b3d1b90592e1d00453f2292af1d20.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.353431]
> Memory cgroup out of memory: Kill process 40432 (heapster) score 1985 or
> sacrifice child
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.368790]
> Killed process 9732 (heapster) total-vm:4980396kB, anon-rss:3659704kB,
> file-rss:0kB, shmem-rss:0kB
> 
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> Can we ask if the customer has observed any odd behavior at the time of
> brick crash?
> 
> Thanks,
> Mohit Agrawal

The crash seems to be because event_slot_alloc could not find a slot for the new socket fd=1340; it looks like __event_newtable is failing to allocate memory for a new table.

--- Additional comment from Mohit Agrawal on 2019-09-25 05:18:40 UTC ---

Hi,

It seems the brick process is crashing because the function event_slot_alloc is not able to return a valid slot.

bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", 
    line=line@entry=417, function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail (assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", 
    file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, 
    data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000')
    at event-epoll.c:417
417		assert (slot->fd == fd);
(gdb) p slot
$3184 = (struct event_slot_epoll *) 0x7f9e247f81b0
(gdb) p *slot
$3185 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1, in_handler = 0, handled_error = 0, 
  data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data = {
        __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, 
          __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x7f9e247f8208, 
    prev = 0x7f9e247f8208}}

p event_pool->slots_used
$3188 = {1021, 10, 0 <repeats 1022 times>}
f 4
After printing the complete fd table I am not able to find any entry for socket 1340:
set $tmp = 1
while ($tmp != 1023)
print ((struct event_slot_epoll *)event_pool->ereg[0])[$tmp].fd
set $tmp = $tmp + 1
f 5
p new_sock
1340


As per the current code of event_slot_alloc, it first checks the value of slots_used to find a table with a free entry. The current bt shows the value of slots_used is 1021 (less than 1024), which means the table appears to still have free slots, so the table is set to this registry index (0).

>>>>>>>>>>>>>>>>>>>>>

for (i = 0; i < EVENT_EPOLL_TABLES; i++) {
        switch (event_pool->slots_used[i]) {
            case EVENT_EPOLL_SLOTS:
                continue;
            case 0:
                if (!event_pool->ereg[i]) {
                    table = __event_newtable(event_pool, i);
                    if (!table)
                        return -1;
                } else {
                    table = event_pool->ereg[i];
                }
                break;
            default:
                table = event_pool->ereg[i];
                break;
        }

        if (table)
            /* break out of the loop */
            break;
    }

    if (!table)
        return -1;

    table_idx = i;

>>>>>>>>>>>>>>>>>>>>>>>>>

In the code below it tries to find a free entry in the table. As per the current slots_used value, ideally 3 entries should be free in the table, but somehow no entry is free here.

The code does not validate the fd assignment in the table; it just returns idx.

>>>>>>>>>>>>>>>>>>>>>>>>>>

    for (i = 0; i < EVENT_EPOLL_SLOTS; i++) {
        if (table[i].fd == -1) {
            /* wipe everything except bump the generation */
            gen = table[i].gen;
            memset(&table[i], 0, sizeof(table[i]));
            table[i].gen = gen + 1;

            LOCK_INIT(&table[i].lock);
            INIT_LIST_HEAD(&table[i].poller_death);

            table[i].fd = fd;
            if (notify_poller_death) {
                table[i].idx = table_idx * EVENT_EPOLL_SLOTS + i;
                list_add_tail(&table[i].poller_death,
                              &event_pool->poller_death);
            }

            event_pool->slots_used[table_idx]++;

            break;
        }
    }

>>>>>>>>>>>>>>>>>
    return table_idx * EVENT_EPOLL_SLOTS + i;


I think we need to update the code.
I have checked the event code, but I am not able to figure out why slots_used is not showing the correct value.
Ideally the slots_used value should be 1024, because no index in the registry table is free, and before returning the index the code should validate whether the fd was successfully assigned.
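
A minimal sketch of that missing validation, reusing the names from the snippet quoted above (i, table, table_idx, EVENT_EPOLL_SLOTS); this illustrates the idea only and is not the fix that was eventually merged:

    /* If the loop above never found table[i].fd == -1, it falls out with
     * i == EVENT_EPOLL_SLOTS, and the index computed below would point past
     * this table, at a slot whose fd cannot match the fd being registered.
     * Failing here instead would keep that index from ever reaching the
     * assert in event_register_epoll. */
    if (i == EVENT_EPOLL_SLOTS)
        return -1;

    return table_idx * EVENT_EPOLL_SLOTS + i;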

I am still checking the code and need some more time to understand the issue.

Thanks,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-09-27 13:26:36 UTC ---

Hello,
May I know if we have an update that can be passed to the customer?

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-09-30 10:23:38 UTC ---

Hi Jay,

 We are still working on this; I don't have any update yet.
 We suspect there is some race condition in the socket code path.
 Do you have any idea how frequently this crash occurs?
 Is the issue easily reproducible, or can we ask the customer something about a reproducer?

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-01 11:34:26 UTC ---

Hi Jay,

Can you confirm with the customer whether it is possible to test a patch?
On the basis of the core we are not able to find the root cause of the race.
Does the customer have a test environment where the issue is easily reproducible?


Thanks,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-01 11:48:10 UTC ---

Hello Mohit,
The customer has updated this today:

" Acutally, in taking a look, it does look like one of the gluster pods core dumped saturday morning. I'm grabbing the core file now to upload. The problem is, even though we do tend to see a similar trend here with WHEN the files are generated (usually saturday mornings 3am-7am or so) I have a really hard time trying to figure out just what is going on in the cluster at that time that may be causing this, or trying to map load back to specific pvcs, etc.. 

Last week we updated these nodes to RHEL 7.7 and they were rebooted as well, just FYI."

He has also attached the core file. I have copied it to supportshell. (Please let me know if you have a system IP or something where I can copy this file)

I will verify the feasibility of installing a test patch with them and let you know their inputs. (I'll keep the needinfo on me until this is available)

Thanks,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-01 13:33:33 UTC ---

Hi,

 I don't have any extra system; kindly upload the new corefile to the same system.

Regards,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-01 14:26:16 UTC ---

Hello,
The latest corefile is copied to 10.72.93.42 under /

filename: core.9838.gz

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-01 15:04:05 UTC ---

Hi Jay,

  It seems there is sufficient space available on the system; can you please untar the core and set up a
  gdb environment to debug it? Kindly ask for the logs as well if they are not already available.

Regards,
Mohit Agrawal

--- Additional comment from Raghavendra G on 2019-10-01 17:21:16 UTC ---

(In reply to Mohit Agrawal from comment #38)
> 
> I think we need to update the code.
> I have checked the event code,  I am not able to figure out why slots_used
> is not showing correct value.
> Ideally slots_used value should be 1024 because in registry table no index
> is free and before returning
> the index it should validate fd is successfully assigned or not.
> 
> I am still checking the code, need some more time to understand the issue.

I found the reason why slots_used for table 0 was wrong. From one of the logs of a crashed brick process:

var-lib-heketi-mounts-vg_862823908fb7cc063683414098b0ab22-brick_99bcb9083a46f31076bfc008c88791ef-brick.log-20190817:[2019-08-17 04:14:03.801645] E [MSGID: 101063] [event-epoll.c:611:event_dispatch_epoll_handler] 0-epoll: stale fd found on idx=500, gen=130, events=1, slot->gen=132

The msg "stale fd found..." indicates that the slot is already deallocated and hence slot->ref would be 0. However:
1. event_dispatch_epoll_handler calls slot_ref on this slot (ref becomes 0+1 = 1 at this point).
2. event_dispatch_epoll_handler identifies it is a stale fd, hence the slot is not valid, and exits.
3. event_dispatch_epoll_handler calls slot_unref on this slot (ref becomes 1-1 = 0 again, resulting in another dealloc without any matching slot_alloc).

Note that only event_slot_alloc increments event_pool->slots_used[table_idx] (event_slot_get doesn't). This means for this slot there was one call to event_slot_alloc but two calls to event_slot_dealloc and hence the value of slots_used[0] would actually be 1 less than the actual number of slots allocated in table 0.

Coming back to the log message printed above, idx=500 means this slot belongs to table 0, and hence there is wrong accounting (1 less than the actual value) in table 0. So if event_pool->slots_used[0] is 1023, the table is actually full, but event_slot_alloc would wrongly believe there is one slot available and will loop through all slots of table[0] without success. At the end of __event_slot_alloc we don't actually test whether we got an empty slot; instead we assume that the iterator 'i' is pointing to an empty slot in table 0 and construct an index out of it. As Mohit has already pointed out above, since the index is not pointing to a valid slot it doesn't contain the expected fd, resulting in the assertion failure.

The root cause here is that slot->ref is not incremented atomically when the slot is allocated; instead it is done later, as part of event_slot_get. If in this window we happen to run into ref/unref cycles (as explained above), it results in more calls to event_slot_dealloc than are actually needed, and hence in wrong accounting of slots_used in the slot table.
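
To make the accounting drift concrete, here is a small self-contained model of one slot table and its slots_used counter. The names (slot_alloc, slot_dealloc, slot_ref, slot_unref) and the simplified logic are assumptions for illustration only; this is not the libglusterfs code:

    /* toy_slots.c: model of one slot table with a slots_used counter.
     * Shows how a ref/unref cycle on an already deallocated slot triggers
     * a second dealloc and leaves slots_used below the real occupancy. */
    #include <stdio.h>

    #define SLOTS 4                    /* gluster uses 1024 per table */

    struct slot { int fd; int ref; };

    static struct slot table[SLOTS];
    static int slots_used;             /* what __event_slot_alloc consults */

    static void slot_dealloc(int i) { table[i].fd = -1; slots_used--; }

    static int slot_alloc(int fd)
    {
        for (int i = 0; i < SLOTS; i++)
            if (table[i].fd == -1) { table[i].fd = fd; slots_used++; return i; }
        return -1;
    }

    static void slot_ref(int i) { table[i].ref++; }

    static void slot_unref(int i)
    {
        if (--table[i].ref == 0)
            slot_dealloc(i);           /* runs even if the slot is already free */
    }

    int main(void)
    {
        for (int i = 0; i < SLOTS; i++) table[i].fd = -1;
        for (int fd = 10; fd < 10 + SLOTS; fd++) slot_alloc(fd); /* table full */

        slot_dealloc(2);               /* a connection goes away: 3 occupied, slots_used=3 */

        /* A poller still holding idx=2 processes a stale event: */
        slot_ref(2);                   /* step 1: ref 0 -> 1 */
                                       /* step 2: "stale fd found", handler bails out */
        slot_unref(2);                 /* step 3: ref 1 -> 0, dealloc AGAIN, slots_used=2 */

        slot_alloc(100);               /* slot 2 reused: 4 occupied again */
        printf("occupied=%d slots_used=%d\n", SLOTS, slots_used); /* prints 4 vs 3 */
        return 0;
    }

With the counter now one below the true occupancy, a later allocation against a full table scans every slot, finds none with fd == -1, falls off the end of the loop, and returns an index that does not point at a freshly assigned slot, which is exactly the state the assert at event-epoll.c:417 catches.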

The fix would be:
1. Increment slot->ref atomically in __event_slot_alloc.
2. Add checks in __event_slot_alloc to verify that it actually returns a valid slot instead of assuming it does.
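
Point 2 is essentially the bound check sketched after comment #38 above. For point 1, here is a rough sketch against the allocation loop quoted there, reusing its field names; it illustrates the proposal only and is not the change that was eventually merged as Gerrit 23774:

    table[i].fd = fd;
    table[i].ref = 1;   /* take the reference while allocating, instead of
                         * leaving ref at 0 until a later event_slot_get;
                         * this closes the window in which a ref/unref cycle
                         * can drop the count to zero and dealloc the slot
                         * behind the allocator's back */
    if (notify_poller_death) {
        table[i].idx = table_idx * EVENT_EPOLL_SLOTS + i;
        list_add_tail(&table[i].poller_death,
                      &event_pool->poller_death);
    }

    event_pool->slots_used[table_idx]++;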

> 
> Thanks,
> Mohit Agrawal

--- Additional comment from Raghavendra G on 2019-10-02 02:46:39 UTC ---

(In reply to Raghavendra G from comment #46)
>
> The root cause here is that the slot->ref is not incremented atomically when
> slot is allocated. Instead it is done later as part of event_slot_get. If in
> this window if we happen to run into ref/unref cycles (as explained above)
> it would result in more number of calls to event_slot_deallocation than
> actually needed resulting in wrong accounting of slots_used in slot table.
> 
> The fix would be:
> 1. increment slot->ref atomically in __event_slot_alloc
> 2. Add checks to __event_slot_alloc whether it actually returns a valid slot
> instead of assuming it does.


While the problem I mentioned above, the non-atomic increment of slot->ref while allocating a slot, is a valid one, the other problem is ref/unref cycles on a deallocated slot, resulting in deallocation of an already deallocated slot. This problem won't be fixed by atomically incrementing slot->ref in __event_slot_alloc. The fix for this problem would require preventing ref/unref cycles and/or deallocation of an unused socket. One trivial fix that comes to my mind is to not return a slot at all (return NULL instead) if the index is not pointing to a valid slot (a valid slot will have a non-negative fd and a generation number matching the value passed as an argument to event_slot_get). We can continue discussions on this over the patch on gerrit.
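
A small sketch of that validity test; the helper name and the extra gen parameter are assumptions based on this comment, since the event_slot_get seen earlier in the gdb listing takes only the event_pool and an index:

    /* Hypothetical helper: a slot is handed back to the caller only if it
     * still looks live, i.e. it has a non-negative fd and its generation
     * matches the one the caller saw when the event was queued. A stale or
     * recycled slot yields NULL, so no ref/unref cycle is ever run on it. */
    static int
    event_slot_is_valid(struct event_slot_epoll *slot, int gen)
    {
        return slot != NULL && slot->fd >= 0 && slot->gen == gen;
    }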

--- Additional comment from Mohit Agrawal on 2019-10-03 03:29:35 UTC ---

Hi Jay,

  It seems there is not sufficient space available on the system (10.72.93.42); can you please unzip the core and
  set up a gdb environment to debug it? Kindly ask for the logs as well if they are not available.

Regards,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-03 13:28:34 UTC ---

Hi Mohit,

The core debug environment is set up (access instructions are the same as in comment #30).

# gdb /usr/sbin/glusterfsd core.9838
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.


Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id vol_ad9ae40b319276e6e36ff24f81'.
Program terminated with signal 6, Aborted.
#0  0x00007fb134b52207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
55	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);

(gdb) t a a bt

Thread 3594 (Thread 0x7fae1f7fe700 (LWP 12072)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fae9beba838) at changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fae1f7fe700) at pthread_create.c:307
#3  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3593 (Thread 0x7fa90143a700 (LWP 46542)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fb12c5f4f50 in iot_worker (data=0x7fac1ec29890) at io-threads.c:194
#2  0x00007fb135352dd5 in start_thread (arg=0x7fa90143a700) at pthread_create.c:307
#3  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3592 (Thread 0x7faa7f7fe700 (LWP 14802)):
#0  0x00007fb134be0e2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb134be0cc4 in __sleep (seconds=0, seconds@entry=5) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007fb12ea3f5c1 in posix_disk_space_check_thread_proc (data=0x7fab07c5d7a0) at posix-helpers.c:1920
#3  0x00007fb135352dd5 in start_thread (arg=0x7faa7f7fe700) at pthread_create.c:307
#4  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3591 (Thread 0x7fabb6158700 (LWP 14062)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fac0685e0b8) at changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fabb6158700) at pthread_create.c:307
#3  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3590 (Thread 0x7fad5ad98700 (LWP 12601)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fad95d8c0a8) at changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fad5ad98700) at pthread_create.c:307
#3  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

<snipped>

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-14 04:57:03 UTC ---

Hi,

Thanks for setting up the environment to debug the core.
The new core (./case/core.9838) is similar to the older one.

>>>>>>>>>>>>>>>>>>>

#0  0x00007fb134b52207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007fb134b538f8 in __GI_abort () at abort.c:90
#2  0x00007fb134b4b026 in __assert_fail_base (fmt=0x7fb134ca5ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7fb1365c207a "slot->fd == fd", file=file@entry=0x7fb1365c2054 "event-epoll.c", 
    line=line@entry=417, function=function@entry=0x7fb1365c2420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007fb134b4b0d2 in __GI___assert_fail (assertion=assertion@entry=0x7fb1365c207a "slot->fd == fd", 
    file=file@entry=0x7fb1365c2054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7fb1365c2420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:101
#4  0x00007fb136574d04 in event_register_epoll (event_pool=0x561792cb1150, fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007fb131509eb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, 
    data=0x7fb12003eee0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007fb136575870 in event_dispatch_epoll_handler (event=0x7faceb1ece30, event_pool=0x561792cb1150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7fad1d7569a0) at event-epoll.c:759
#8  0x00007fb135352dd5 in start_thread (arg=0x7faceb1ed700) at pthread_create.c:307
#9  0x00007fb134c19ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

>>>>>>>>>>>>>>>>>>>.

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-18 06:01:22 UTC ---

Hi,

It seems the brick process is crashing because the function event_slot_alloc is not able to
return a valid slot.

bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", file=file@entry=0x7f9efc945054 "event-epoll.c", 
    line=line@entry=417, function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail (assertion=assertion@entry=0x7f9efc94507a "slot->fd == fd", 
    file=file@entry=0x7f9efc945054 "event-epoll.c", line=line@entry=417, 
    function=function@entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118> "event_register_epoll") at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, 
    data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70, event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150, fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0 '\000')
    at event-epoll.c:417
417		assert (slot->fd == fd);
(gdb) p slot
$3184 = (struct event_slot_epoll *) 0x7f9e247f81b0
(gdb) p *slot
$3185 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1, in_handler = 0, handled_error = 0, 
  data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data = {
        __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, 
          __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x7f9e247f8208, 
    prev = 0x7f9e247f8208}}

(gdb) p event_pool->slots_used
$3188 = {1021, 10, 0 <repeats 1022 times>}

After printing the complete fd table in frame 4, I am not able to find any entry for socket 1340, the fd being registered in frame 5:

(gdb) f 4
(gdb) set $tmp = 1
(gdb) while ($tmp != 1023)
 >print ((struct event_slot_epoll *)event_pool->ereg[0])[$tmp].fd
 >set $tmp = $tmp + 1
 >end

(gdb) f 5
(gdb) p new_sock
1340


As per the current code of event_slot_alloc, it first checks the value of slots_used to find a table with a free entry. The current backtrace shows that slots_used for registry index 0 is 1021 (less than 1024), which means the table should still have free slots, so the allocator selects the table at registry index 0.

>>>>>>>>>>>>>>>>>>>>>

for (i = 0; i < EVENT_EPOLL_TABLES; i++) {
        switch (event_pool->slots_used[i]) {
            case EVENT_EPOLL_SLOTS:
                continue;
            case 0:
                if (!event_pool->ereg[i]) {
                    table = __event_newtable(event_pool, i);
                    if (!table)
                        return -1;
                } else {
                    table = event_pool->ereg[i];
                }
                break;
            default:
                table = event_pool->ereg[i];
                break;
        }

        if (table)
            /* break out of the loop */
            break;
    }

    if (!table)
        return -1;

    table_idx = i;

>>>>>>>>>>>>>>>>>>>>>>>>>

In the code below it then scans the table for a free entry. As per the current slots_used value, ideally 3 entries should be free in the table, but here no entry is actually free.

The code does not validate that the fd was assigned to a slot in the table; it just returns the computed idx.

>>>>>>>>>>>>>>>>>>>>>>>>>>

    for (i = 0; i < EVENT_EPOLL_SLOTS; i++) {
        if (table[i].fd == -1) {
            /* wipe everything except bump the generation */
            gen = table[i].gen;
            memset(&table[i], 0, sizeof(table[i]));
            table[i].gen = gen + 1;

            LOCK_INIT(&table[i].lock);
            INIT_LIST_HEAD(&table[i].poller_death);

            table[i].fd = fd;
            if (notify_poller_death) {
                table[i].idx = table_idx * EVENT_EPOLL_SLOTS + i;
                list_add_tail(&table[i].poller_death,
                              &event_pool->poller_death);
            }

            event_pool->slots_used[table_idx]++;

            break;
        }
    }

>>>>>>>>>>>>>>>>>
    return table_idx * EVENT_EPOLL_SLOTS + i;


I think we need to update the code.
I have checked the event code, but I am not able to figure out why slots_used is not showing the correct value.
Ideally slots_used should be 1024, because no index in the registry table is free, and before returning
the index the code should validate whether the fd was actually assigned to a slot.
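
For illustration, here is a minimal sketch of the missing check, using the names from the __event_slot_alloc snippet quoted above (this is only a sketch; the posted patch may structure the check differently):

    /* Hypothetical sketch: if the scan loop above fell through without
     * claiming a slot, i has reached EVENT_EPOLL_SLOTS and the computed
     * index would point past the table, so report failure instead. */
    if (i == EVENT_EPOLL_SLOTS)
        return -1;

    return table_idx * EVENT_EPOLL_SLOTS + i;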

RCA: slot->ref is not incremented atomically when the slot is allocated; instead it is incremented later as part of event_slot_get. If we happen to run into ref/unref cycles in this window (as explained above), it results in more slot deallocations than actually needed, which leads to wrong accounting of slots_used in the slot table.

The fix would be to:
1. Increment slot->ref atomically in __event_slot_alloc, at the point the slot is claimed (see the sketch below).
2. Add checks on whether __event_slot_alloc actually returns a valid slot, instead of assuming that it does.
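
As a minimal sketch of point 1, here is an excerpt of the allocation loop from above with the reference taken while the slot is being claimed; the field names follow the snippets above, and the merged patch may implement this differently (for example through the existing ref helper):

            table[i].fd = fd;
            /* Sketch: take the initial reference at the moment the slot is
             * claimed, instead of deferring it to a later event_slot_get(),
             * so a ref/unref cycle in that window cannot deallocate the slot
             * and skew the slots_used accounting. */
            table[i].ref++;

            event_pool->slots_used[table_idx]++;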

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-18 06:05:29 UTC ---

A patch has been posted to resolve this:
https://review.gluster.org/#/c/glusterfs/+/23508/

--- Additional comment from Worker Ant on 2019-10-22 13:47:20 UTC ---

REVIEW: https://review.gluster.org/23508 (rpc: Synchronize slot allocation code) merged (#7) on master by Raghavendra G

Comment 1 Worker Ant 2019-11-29 12:57:46 UTC
REVIEW: https://review.gluster.org/23774 (rpc: Synchronize slot allocation code) posted (#2) for review on release-7 by MOHIT AGRAWAL

Comment 2 Worker Ant 2019-12-05 07:22:57 UTC
REVIEW: https://review.gluster.org/23774 (rpc: Synchronize slot allocation code) merged (#3) on release-7 by Rinku Kothiya

