Bug 1597768
Summary: | br-state-check.t crashed while brick multiplex is enabled | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Mohit Agrawal <moagrawa> | |
Component: | glusterfs | Assignee: | Mohit Agrawal <moagrawa> | |
Status: | CLOSED ERRATA | QA Contact: | Bala Konda Reddy M <bmekala> | |
Severity: | unspecified | Docs Contact: | ||
Priority: | unspecified | |||
Version: | rhgs-3.4 | CC: | amukherj, nchilaka, rhs-bugs, sankarshan, sheggodu, vbellur | |
Target Milestone: | --- | |||
Target Release: | RHGS 3.4.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | glusterfs-3.12.2-14 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1597776 | Environment: | ||
Last Closed: | 2018-09-04 06:50:20 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1503137, 1597776 |
Description
Mohit Agrawal
2018-07-03 15:08:01 UTC
Hi,

Below is the backtrace pattern for the brick process:

(gdb) bt
#0 0x00007fe461b5e34d in memset (__len=2792, __ch=0, __dest=0x0) at /usr/include/bits/string3.h:84
#1 rpcsvc_request_create (svc=svc@entry=0x7fe420058be0, trans=trans@entry=0x7fe4501e68b0, msg=msg@entry=0x7fe4501e9650) at rpcsvc.c:459
#2 0x00007fe461b5e7c5 in rpcsvc_handle_rpc_call (svc=0x7fe420058be0, trans=trans@entry=0x7fe4501e68b0, msg=0x7fe4501e9650) at rpcsvc.c:615
#3 0x00007fe461b5ebeb in rpcsvc_notify (trans=0x7fe4501e68b0, mydata=<optimized out>, event=<optimized out>, data=<optimized out>) at rpcsvc.c:789
#4 0x00007fe461b60b23 in rpc_transport_notify (this=this@entry=0x7fe4501e68b0, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fe4501e9650) at rpc-transport.c:538
#5 0x00007fe45698f5d6 in socket_event_poll_in (this=this@entry=0x7fe4501e68b0, notify_handled=<optimized out>) at socket.c:2315
#6 0x00007fe456991b7c in socket_event_handler (fd=23, idx=10, gen=4, data=0x7fe4501e68b0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2467
#7 0x00007fe461dfa524 in event_dispatch_epoll_handler (event=0x7fe454edae80, event_pool=0x55e74e601200) at event-epoll.c:583
#8 event_dispatch_epoll_worker (data=0x55e74e64a9e0) at event-epoll.c:659
#9 0x00007fe460bfbe25 in start_thread () from /usr/lib64/libpthread.so.0
#10 0x00007fe4604c834d in clone () from /usr/lib64/libc.so.6

$3 = (xlator_t *) 0x7fe4200062a0

(gdb) p *(xlator_t*)this->xl
$4 = {name = 0x7fe420006e60 "patchy-changelog", type = 0x7fe420006fe0 "features/changelog", instance_name = 0x0, next = 0x7fe420003960, prev = 0x7fe420007720, parents = 0x7fe420008460, children = 0x7fe4200076c0, options = 0x0, dlhandle = 0x7fe450011250, fops = 0x7fe44f4f8780 <fops>, cbks = 0x7fe44f4f8720 <cbks>, dumpops = 0x0, volume_options = {next = 0x7fe420006300, prev = 0x7fe420006300}, fini = 0x7fe44f2e9560 <fini>, init = 0x7fe44f2e8a60 <init>, reconfigure = 0x7fe44f2e8370 <reconfigure>, mem_acct_init = 0x7fe44f2e82f0 <mem_acct_init>, notify = 0x7fe44f2e7990 <notify>, loglevel = GF_LOG_NONE, client_latency = 0, latencies = {{min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>}, history = 0x0, ctx = 0x55e74e5ca010, graph = 0x7fe420000990, itable = 0x0, init_succeeded = 1 '\001', private = 0x0, mem_acct = 0x7fe420053720, winds = 0, switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false, volfile_id = 0x0, xl_id = 4, cleanup_starting = 1, call_cleanup = 1}

(gdb) p *svc->xl
$7 = {name = 0x7fe420006e60 "patchy-changelog", type = 0x7fe420006fe0 "features/changelog", instance_name = 0x0, next = 0x7fe420003960, prev = 0x7fe420007720, parents = 0x7fe420008460, children = 0x7fe4200076c0, options = 0x0, dlhandle = 0x7fe450011250, fops = 0x7fe44f4f8780 <fops>, cbks = 0x7fe44f4f8720 <cbks>, dumpops = 0x0, volume_options = {next = 0x7fe420006300, prev = 0x7fe420006300}, fini = 0x7fe44f2e9560 <fini>, init = 0x7fe44f2e8a60 <init>, reconfigure = 0x7fe44f2e8370 <reconfigure>, mem_acct_init = 0x7fe44f2e82f0 <mem_acct_init>, notify = 0x7fe44f2e7990 <notify>, loglevel = GF_LOG_NONE, client_latency = 0, latencies = {{min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>}, history = 0x0, ctx = 0x55e74e5ca010, graph = 0x7fe420000990, itable = 0x0, init_succeeded = 1 '\001', private = 0x0, mem_acct = 0x7fe420053720, winds = 0, switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false, volfile_id = 0x0, xl_id = 4, cleanup_starting = 1, call_cleanup = 1}

(gdb) p svc
$8 = (rpcsvc_t *) 0x7fe420058be0

(gdb) p *svc
$9 = {rpclock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, memfactor = 8, authschemes = {next = 0x7fe420058ea0, prev = 0x7fe420059080}, options = 0x7fe420058470, allow_insecure = _gf_true, register_portmap = _gf_true, root_squash = _gf_false, anonuid = 65534, anongid = 65534, ctx = 0x55e74e5ca010, listeners = {next = 0x7fe420058c48, prev = 0x7fe420058c48}, programs = {next = 0x7fe420059638, prev = 0x7fe420059638}, notify = {next = 0x7fe420058c68, prev = 0x7fe420058c68}, notify_count = 1, xl = 0x7fe4200062a0, mydata = 0x7fe4200062a0, notifyfn = 0x0, rxpool = 0x0, drc = 0x0, outstanding_rpc_limit = 0, addr_namelookup = _gf_false, throttle = _gf_false}

The dump clearly shows that the transport (xprt) is receiving a request for the changelog RPC service after that service's rxpool has already been destroyed (rxpool = 0x0 in the svc dump, and cleanup_starting = 1 on the changelog xlator). As a result, the brick process crashes while allocating memory for the RPC request: memset is called with __dest=0x0 in frame #0. To resolve this, the RPC cleanup code in the changelog xlator needs to be updated.

Regards
Mohit Agrawal

Build: 3.12.2-14

On a 3-node brick-mux setup, followed the steps from br-state-check.t; no glusterd crashes were seen.

Ran the prove tests after installing from source on RHEL 7.5: "prove -vf tests/bitrot/br-state-check.t"

[root@dhcp37-188 rhs-glusterfs]# prove -vf tests/bitrot/br-state-check.t
tests/bitrot/br-state-check.t ..
1..35
ok
All tests successful.
Files=1, Tests=35, 41 wallclock secs ( 0.05 usr 0.01 sys + 2.23 cusr 2.52 csys = 4.81 CPU)
Result: PASS

Hence marking it as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
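
For readers unfamiliar with the failure mode in the backtrace above, here is a minimal C sketch of it: a request object is drawn from a receive pool that cleanup has already torn down, so the allocator returns NULL and an unchecked memset would dereference it. All names here (sketch_pool, sketch_svc, sketch_req, sketch_pool_get, sketch_request_create) are hypothetical stand-ins, not the glusterfs types or functions. This is not the fix shipped in glusterfs-3.12.2-14, which according to the report lies in the changelog xlator's RPC cleanup code; the sketch only shows why a NULL pool leads to the crash and what a defensive check at the allocation site might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a memory pool; not the glusterfs mem_pool. */
struct sketch_pool {
    size_t obj_size;
};

/* Hypothetical stand-in for an RPC service; rxpool is NULL once
 * cleanup has destroyed the pool (compare rxpool = 0x0 in the dump). */
struct sketch_svc {
    struct sketch_pool *rxpool;
};

/* Hypothetical stand-in for an RPC request object. */
struct sketch_req {
    struct sketch_svc *svc;
    char payload[64];
};

/* Returns NULL when the pool is gone, which is what produces
 * a NULL destination like __dest=0x0 in frame #0. */
void *sketch_pool_get(struct sketch_pool *pool)
{
    if (pool == NULL)
        return NULL;
    return malloc(pool->obj_size);
}

/* Creates a zeroed request, or returns NULL if the pool was destroyed.
 * Without the NULL check, memset would run on a NULL pointer. */
struct sketch_req *sketch_request_create(struct sketch_svc *svc)
{
    assert(svc != NULL);

    struct sketch_req *req = sketch_pool_get(svc->rxpool);
    if (req == NULL)
        return NULL; /* caller drops the request instead of crashing */

    memset(req, 0, sizeof(*req));
    req->svc = svc;
    return req;
}
```

A check like this only hides the symptom. The race described in the report, where a transport still delivers requests to a service whose cleanup has started, would still need to be closed in the cleanup path itself.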