Bug 1597768 - br-state-check.t crashed while brick multiplex is enabled
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.4.0
Assigned To: Mohit Agrawal
QA Contact: Bala Konda Reddy M
Depends On:
Blocks: 1503137 1597776
 
Reported: 2018-07-03 11:08 EDT by Mohit Agrawal
Modified: 2018-09-04 02:51 EDT
CC List: 6 users

See Also:
Fixed In Version: glusterfs-3.12.2-14
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1597776
Environment:
Last Closed: 2018-09-04 02:50:20 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 None None None 2018-09-04 02:51 EDT

Description Mohit Agrawal 2018-07-03 11:08:01 EDT
Description of problem:

Test case ./tests/bitrot/br-state-check.t crashed while brick multiplex
is enabled.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
Test case crashed.

Expected results:
Test case should not crash.

Additional info:
Comment 2 Mohit Agrawal 2018-07-03 11:17:37 EDT
Hi,

Below is the backtrace (bt) pattern for the brick process:
(gdb) bt
#0  0x00007fe461b5e34d in memset (__len=2792, __ch=0, __dest=0x0) at /usr/include/bits/string3.h:84
#1  rpcsvc_request_create (svc=svc@entry=0x7fe420058be0, trans=trans@entry=0x7fe4501e68b0, 
    msg=msg@entry=0x7fe4501e9650) at rpcsvc.c:459
#2  0x00007fe461b5e7c5 in rpcsvc_handle_rpc_call (svc=0x7fe420058be0, trans=trans@entry=0x7fe4501e68b0, 
    msg=0x7fe4501e9650) at rpcsvc.c:615
#3  0x00007fe461b5ebeb in rpcsvc_notify (trans=0x7fe4501e68b0, mydata=<optimized out>, 
    event=<optimized out>, data=<optimized out>) at rpcsvc.c:789
#4  0x00007fe461b60b23 in rpc_transport_notify (this=this@entry=0x7fe4501e68b0, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fe4501e9650) at rpc-transport.c:538
#5  0x00007fe45698f5d6 in socket_event_poll_in (this=this@entry=0x7fe4501e68b0, 
    notify_handled=<optimized out>) at socket.c:2315
#6  0x00007fe456991b7c in socket_event_handler (fd=23, idx=10, gen=4, data=0x7fe4501e68b0, poll_in=1, 
    poll_out=0, poll_err=0) at socket.c:2467
#7  0x00007fe461dfa524 in event_dispatch_epoll_handler (event=0x7fe454edae80, event_pool=0x55e74e601200)
    at event-epoll.c:583
#8  event_dispatch_epoll_worker (data=0x55e74e64a9e0) at event-epoll.c:659
#9  0x00007fe460bfbe25 in start_thread () from /usr/lib64/libpthread.so.0
#10 0x00007fe4604c834d in clone () from /usr/lib64/libc.so.6

$3 = (xlator_t *) 0x7fe4200062a0
(gdb) p *(xlator_t*)this->xl
$4 = {name = 0x7fe420006e60 "patchy-changelog", type = 0x7fe420006fe0 "features/changelog", 
  instance_name = 0x0, next = 0x7fe420003960, prev = 0x7fe420007720, parents = 0x7fe420008460, 
  children = 0x7fe4200076c0, options = 0x0, dlhandle = 0x7fe450011250, fops = 0x7fe44f4f8780 <fops>, 
  cbks = 0x7fe44f4f8720 <cbks>, dumpops = 0x0, volume_options = {next = 0x7fe420006300, 
    prev = 0x7fe420006300}, fini = 0x7fe44f2e9560 <fini>, init = 0x7fe44f2e8a60 <init>, 
  reconfigure = 0x7fe44f2e8370 <reconfigure>, mem_acct_init = 0x7fe44f2e82f0 <mem_acct_init>, 
  notify = 0x7fe44f2e7990 <notify>, loglevel = GF_LOG_NONE, client_latency = 0, latencies = {{min = 0, 
      max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>}, history = 0x0, 
  ctx = 0x55e74e5ca010, graph = 0x7fe420000990, itable = 0x0, init_succeeded = 1 '\001', private = 0x0, 
  mem_acct = 0x7fe420053720, winds = 0, switched = 0 '\000', local_pool = 0x0, 
  is_autoloaded = _gf_false, volfile_id = 0x0, xl_id = 4, cleanup_starting = 1, call_cleanup = 1}


(gdb) p *svc->xl
$7 = {name = 0x7fe420006e60 "patchy-changelog", type = 0x7fe420006fe0 "features/changelog", 
  instance_name = 0x0, next = 0x7fe420003960, prev = 0x7fe420007720, parents = 0x7fe420008460, 
  children = 0x7fe4200076c0, options = 0x0, dlhandle = 0x7fe450011250, fops = 0x7fe44f4f8780 <fops>, 
  cbks = 0x7fe44f4f8720 <cbks>, dumpops = 0x0, volume_options = {next = 0x7fe420006300, 
    prev = 0x7fe420006300}, fini = 0x7fe44f2e9560 <fini>, init = 0x7fe44f2e8a60 <init>, 
  reconfigure = 0x7fe44f2e8370 <reconfigure>, mem_acct_init = 0x7fe44f2e82f0 <mem_acct_init>, 
  notify = 0x7fe44f2e7990 <notify>, loglevel = GF_LOG_NONE, client_latency = 0, latencies = {{min = 0, 
      max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 55 times>}, history = 0x0, 
  ctx = 0x55e74e5ca010, graph = 0x7fe420000990, itable = 0x0, init_succeeded = 1 '\001', private = 0x0, 
  mem_acct = 0x7fe420053720, winds = 0, switched = 0 '\000', local_pool = 0x0, 
  is_autoloaded = _gf_false, volfile_id = 0x0, xl_id = 4, cleanup_starting = 1, call_cleanup = 1}
(gdb) p svc
$8 = (rpcsvc_t *) 0x7fe420058be0
(gdb) p *svc
$9 = {rpclock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
      __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, 
    __align = 0}, memfactor = 8, authschemes = {next = 0x7fe420058ea0, prev = 0x7fe420059080}, 
  options = 0x7fe420058470, allow_insecure = _gf_true, register_portmap = _gf_true, 
  root_squash = _gf_false, anonuid = 65534, anongid = 65534, ctx = 0x55e74e5ca010, listeners = {
    next = 0x7fe420058c48, prev = 0x7fe420058c48}, programs = {next = 0x7fe420059638, 
    prev = 0x7fe420059638}, notify = {next = 0x7fe420058c68, prev = 0x7fe420058c68}, notify_count = 1, 
  xl = 0x7fe4200062a0, mydata = 0x7fe4200062a0, notifyfn = 0x0, rxpool = 0x0, drc = 0x0, 
  outstanding_rpc_limit = 0, addr_namelookup = _gf_false, throttle = _gf_false}


The trace shows clearly that the transport (xprt) is receiving a request for the changelog RPC after its rxpool has already been destroyed, so the brick process crashes while allocating memory for the RPC request. To resolve this, the RPC cleanup code in the changelog xlator needs to be updated.

Regards
Mohit Agrawal
Comment 11 Bala Konda Reddy M 2018-07-25 07:18:41 EDT
Build: 3.12.2-14
On a three-node brick-multiplex setup, followed the steps from br-state-check.t; no crashes were observed.

Ran the test on RHEL 7.5 against a source install: "prove -vf tests/bitrot/br-state-check.t"
[root@dhcp37-188 rhs-glusterfs]# prove -vf tests/bitrot/br-state-check.t 
tests/bitrot/br-state-check.t .. 
1..35
ok
All tests successful.
Files=1, Tests=35, 41 wallclock secs ( 0.05 usr  0.01 sys +  2.23 cusr  2.52 csys =  4.81 CPU)
Result: PASS

Hence, marking this bug as verified.
Comment 12 errata-xmlrpc 2018-09-04 02:50:20 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
