Bug 1690769 - GlusterFS 5.5 crashes in 1x4 replicate setup.
Summary: GlusterFS 5.5 crashes in 1x4 replicate setup.
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 5
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-03-20 08:24 UTC by Amar Tumballi
Modified: 2019-05-10 13:23 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-10 13:23:27 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments (Terms of Use)

Description Amar Tumballi 2019-03-20 08:24:10 UTC
Description of problem:

Looks like an issue with AFR in 1x4 setup for me looking at the backtraces:

(gdb) bt
#0 0x00007f95a054f0e0 in raise () from /lib64/libc.so.6
#1 0x00007f95a05506c1 in abort () from /lib64/libc.so.6
#2 0x00007f95a05476fa in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f95a0547772 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f95a08dd0b8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#5 0x00007f95994f0c9d in afr_frame_return () from /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
#6 0x00007f9599503ba1 in afr_lookup_cbk () from /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
#7 0x00007f9599788f3f in client4_0_lookup_cbk () from /usr/lib64/glusterfs/5.3/xlator/protocol/client.so
#8 0x00007f95a1153820 in rpc_clnt_handle_reply () from /usr/lib64/libgfrpc.so.0
#9 0x00007f95a1153b6f in rpc_clnt_notify () from /usr/lib64/libgfrpc.so.0
#10 0x00007f95a1150063 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
#11 0x00007f959aea00b2 in socket_event_handler () from /usr/lib64/glusterfs/5.3/rpc-transport/socket.so
#12 0x00007f95a13e64c3 in event_dispatch_epoll_worker () from /usr/lib64/libglusterfs.so.0
#13 0x00007f95a08da559 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f95a061181f in clone () from /lib64/libc.so.6
(gdb) thr 14

Thread 14 (Thread 0x7f9592ec7700 (LWP 6572)):
#0 0x00007f95a08e3c4d in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x00007f95a08e68b7 in __lll_lock_elision () from /lib64/libpthread.so.0
No symbol table info available.
#2 0x00007f95994f0c9d in afr_frame_return () from /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
No symbol table info available.
#3 0x00007f9599503ba1 in afr_lookup_cbk () from /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
No symbol table info available.
#4 0x00007f9599788f3f in client4_0_lookup_cbk () from /usr/lib64/glusterfs/5.3/xlator/protocol/client.so
No symbol table info available.
#5 0x00007f95a1153820 in rpc_clnt_handle_reply () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#6 0x00007f95a1153b6f in rpc_clnt_notify () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#7 0x00007f95a1150063 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#8 0x00007f959aea00b2 in socket_event_handler () from /usr/lib64/glusterfs/5.3/rpc-transport/socket.so
No symbol table info available.
#9 0x00007f95a13e64c3 in event_dispatch_epoll_worker () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#10 0x00007f95a08da559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#11 0x00007f95a061181f in clone () from /lib64/libc.so.6
No symbol table info available.


Version-Release number of selected component (if applicable):
5.5 (and also 5.3, not seen in 3.x)

How reproducible:
100% 


Additional info:
Please refer to https://lists.gluster.org/pipermail/gluster-users/2019-March/036048.html & https://lists.gluster.org/pipermail/gluster-users/2019-February/035871.html

Comment 1 Pranith Kumar K 2019-03-20 09:18:57 UTC
(In reply to Amar Tumballi from comment #0)
> Description of problem:
> 
> Looks like an issue with AFR in 1x4 setup for me looking at the backtraces:
> 
> (gdb) bt
> #0 0x00007f95a054f0e0 in raise () from /lib64/libc.so.6
> #1 0x00007f95a05506c1 in abort () from /lib64/libc.so.6
> #2 0x00007f95a05476fa in __assert_fail_base () from /lib64/libc.so.6
> #3 0x00007f95a0547772 in __assert_fail () from /lib64/libc.so.6
> #4 0x00007f95a08dd0b8 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #5 0x00007f95994f0c9d in afr_frame_return () from
> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
> #6 0x00007f9599503ba1 in afr_lookup_cbk () from
> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
> #7 0x00007f9599788f3f in client4_0_lookup_cbk () from
> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so
> #8 0x00007f95a1153820 in rpc_clnt_handle_reply () from
> /usr/lib64/libgfrpc.so.0
> #9 0x00007f95a1153b6f in rpc_clnt_notify () from /usr/lib64/libgfrpc.so.0
> #10 0x00007f95a1150063 in rpc_transport_notify () from
> /usr/lib64/libgfrpc.so.0
> #11 0x00007f959aea00b2 in socket_event_handler () from
> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so
> #12 0x00007f95a13e64c3 in event_dispatch_epoll_worker () from
> /usr/lib64/libglusterfs.so.0
> #13 0x00007f95a08da559 in start_thread () from /lib64/libpthread.so.0
> #14 0x00007f95a061181f in clone () from /lib64/libc.so.6
> (gdb) thr 14
> 
> Thread 14 (Thread 0x7f9592ec7700 (LWP 6572)):
> #0 0x00007f95a08e3c4d in __lll_lock_wait () from /lib64/libpthread.so.0
> No symbol table info available.
> #1 0x00007f95a08e68b7 in __lll_lock_elision () from /lib64/libpthread.so.0
> No symbol table info available.
> #2 0x00007f95994f0c9d in afr_frame_return () from
> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
> No symbol table info available.
> #3 0x00007f9599503ba1 in afr_lookup_cbk () from
> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so
> No symbol table info available.
> #4 0x00007f9599788f3f in client4_0_lookup_cbk () from
> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so
> No symbol table info available.
> #5 0x00007f95a1153820 in rpc_clnt_handle_reply () from
> /usr/lib64/libgfrpc.so.0
> No symbol table info available.
> #6 0x00007f95a1153b6f in rpc_clnt_notify () from /usr/lib64/libgfrpc.so.0
> No symbol table info available.
> #7 0x00007f95a1150063 in rpc_transport_notify () from
> /usr/lib64/libgfrpc.so.0
> No symbol table info available.
> #8 0x00007f959aea00b2 in socket_event_handler () from
> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so
> No symbol table info available.
> #9 0x00007f95a13e64c3 in event_dispatch_epoll_worker () from
> /usr/lib64/libglusterfs.so.0
> No symbol table info available.
> #10 0x00007f95a08da559 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #11 0x00007f95a061181f in clone () from /lib64/libc.so.6
> No symbol table info available.
> 
> 
> Version-Release number of selected component (if applicable):
> 5.5 (and also 5.3, not seen in 3.x)
> 
> How reproducible:
> 100% 

I didn't find any steps to recreate this issue on the mail thread. I also ran some workloads on replica 4 and didn't find this issue. Do you know what steps lead to this crash?

> 
> 
> Additional info:
> Please refer to
> https://lists.gluster.org/pipermail/gluster-users/2019-March/036048.html &
> https://lists.gluster.org/pipermail/gluster-users/2019-February/035871.html

Comment 2 Pranith Kumar K 2019-03-20 09:23:34 UTC
> > 
> > How reproducible:
> > 100% 
> 
> I didn't find any steps to recreate this issue on the mail thread. I also
> ran some workloads on replica 4 and didn't find this issue. Do you know what
> steps lead to this crash?

Do we have symbols for this core maybe?

> 
> > 
> > 
> > Additional info:
> > Please refer to
> > https://lists.gluster.org/pipermail/gluster-users/2019-March/036048.html &
> > https://lists.gluster.org/pipermail/gluster-users/2019-February/035871.html

Comment 3 Amar Tumballi 2019-03-22 17:53:16 UTC
(gdb) p *(afr_private_t*)this->private
$5 = {lock = {spinlock = 0, mutex = {_data = {_lock = 0, _count = 0, _owner = 0, _nusers = 0, _kind = 256, _spins = 0, _elision = 0, _list = {_prev = 0x0, _next = 0x0}}, _size = '\000' <repeats 17 times>, "\001", '\000' <repeats 21 times>, __align = 0}}, child_count = 4, arbiter_count = 0, 
  children = 0x7f82ac049eb0, root_inode = 0x7f829c014388, thin_arbiter_count = 0, ta_gfid = '\000' <repeats 15 times>, ta_child_up = 0 '\000', ta_bad_child_index = 0, ta_notify_dom_lock_offset = 0, release_ta_notify_dom_lock = false, ta_in_mem_txn_count = 0, ta_on_wire_txn_count = 0, ta_waitq = {next = 0x0, 
    prev = 0x0}, ta_onwireq = {next = 0x0, prev = 0x0}, child_up = 0x7f82ac049de0 "\001\001\001\001", <incomplete sequence \360\255\272>, child_latency = 0x7f82ac049e40, local = 0x7f82ac049d80 "", pending_key = 0x7f82ac049f20, data_self_heal = 0x7f82abb4c5fd "on", data_self_heal_algorithm = 0x7f82ac015410 "full", 
  data_self_heal_window_size = 1, heal_waiting = {next = 0x7f82ac049b70, prev = 0x7f82ac049b70}, heal_wait_qlen = 128, heal_waiters = 0, healing = {next = 0x7f82ac049b88, prev = 0x7f82ac049b88}, background_self_heal_count = 8, healers = 0, metadata_self_heal = true, entry_self_heal = true, 
  metadata_splitbrain_forced_heal = false, read_child = 3, hash_mode = 1, pending_reads = 0x7f82ac049d10, favorite_child = -1, fav_child_policy = AFR_FAV_CHILD_NONE, wait_count = 1, timer = 0x0, optimistic_change_log = true, eager_lock = true, pre_op_compat = true, post_op_delay_secs = 1, quorum_count = 1, 
  vol_uuid = '\000' <repeats 36 times>, last_event = 0x7f82ac04a210, event_generation = 4, choose_local = true, did_discovery = true, sh_readdir_size = 1024, ensure_durability = true, sh_domain = 0x7f82ac04a190 "androidpolice_data3-replicate-0:self-heal", afr_dirty = 0x7f82abb4c7e4 "trusted.afr.dirty", 
  halo_enabled = false, halo_max_latency_msec = 5, halo_max_replicas = 99999, halo_min_replicas = 2, shd = {iamshd = false, enabled = true, timeout = 600, index_healers = 0x7f82ac04a7f0, full_healers = 0x7f82ac04aae0, split_brain = 0x7f82ac04add0, statistics = 0x7f82ac04cf40, max_threads = 1, wait_qlength = 1024, 
    halo_max_latency_msec = 99999}, nfsd = {iamnfsd = false, halo_max_latency_msec = 5}, consistent_metadata = false, spb_choice_timeout = 300, need_heal = false, pump_private = 0x0, use_afr_in_pump = false, locking_scheme = 0x7f82abb488ca "full", full_lock = true, esh_granular = true, consistent_io = false}

Thread 12 (Thread 0x7f82a9768700 (LWP 22871)):
#0  0x00007f82b6e2c986 in epoll_pwait () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f82b7c0139e in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 11 (Thread 0x7f82a9f69700 (LWP 22870)):
#0  0x00007f82b6e2c986 in epoll_pwait () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f82b7c0139e in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 10 (Thread 0x7f82a37fe700 (LWP 22876)):
#0  0x00007f82b70fb89d in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b3fab75b in notify_kernel_loop (data=<optimized out>) at fuse-bridge.c:4037
        len = <optimized out>
        rv = <optimized out>
        this = <optimized out>
        priv = 0x556281c00b50
        node = <optimized out>
        tmp = 0x0
        pfoh = <optimized out>
        iov_out = {
          iov_base = 0x7f82a4029ff0, 
          iov_len = 40
        }
        __FUNCTION__ = "notify_kernel_loop"
#2  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 9 (Thread 0x7f82b808d880 (LWP 22862)):
#0  0x00007f82b70f691d in pthread_join () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b7c00a5b in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00005562812f3dfc in main (argc=<optimized out>, argv=<optimized out>) at glusterfsd.c:2762
        ctx = 0x556281bb6920
        ret = 0
        cmdlinestr = "/usr/sbin/glusterfs --lru-limit=0 --process-name fuse --volfile-server=localhost --volfile-id=/androidpolice_data3 /mnt/androidpolice_data3", '\000' <repeats 3956 times>
        cmd = 0x556281bb6920
        __FUNCTION__ = "main"

Thread 8 (Thread 0x7f82a3fff700 (LWP 22875)):
#0  0x00007f82b6e234e4 in readv () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f82b7bceb59 in sys_readv () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b3fc22ad in fuse_thread_proc (data=0x556281bfbc40) at fuse-bridge.c:5031
        mount_point = 0x0
        this = 0x556281bfbc40
        priv = 0x556281c00b50
        res = <optimized out>
        iobuf = 0x7f8294040598
        finh = <optimized out>
        iov_in = {{
            iov_base = 0x7f82940fe1b0, 
            iov_len = 80
          }, {
            iov_base = 0x7f82a2b9e000, 
            iov_len = 131072
          }}
        msg = <optimized out>
        fuse_ops = 0x7f82b41d7d60 <fuse_std_ops>
        pfd = {{
            fd = 6,
events = 25, 
            revents = 1
          }, {
            fd = 8, 
            events = 25, 
            revents = 1
          }}
        __FUNCTION__ = "fuse_thread_proc"
#3  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 7 (Thread 0x7f82b279f700 (LWP 22867)):
#0  0x00007f82b70fbc56 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b7bdfd98 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b7be0a60 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#3  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 6 (Thread 0x7f82b08ad700 (LWP 22869)):
#0  0x00007f82b70fec4d in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b71018b7 in __lll_lock_elision () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00007f82abb2de3d in afr_frame_return (frame=frame@entry=0x7f82942079a8) at afr-common.c:2105
        local = 0x7f8294103cb8
        call_count = 0
#3  0x00007f82abb40d51 in afr_lookup_cbk (frame=0x7f82942079a8, cookie=<optimized out>, this=0x7f82ac015c20, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f82b08ac820, xdata=0x7f8294002668, postparent=0x7f82b08ac8c0) at afr-common.c:2951
        local = 0x7f8294103cb8
        call_count = -1
        child_index = <optimized out>
        ret = <optimized out>
        need_heal = 0 '\000'
#4  0x00007f82abdc5e1f in client4_0_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f8294021318) at client-rpc-fops_v2.c:2641
        fn = 0x7f82abb40b80 <afr_lookup_cbk>
        _parent = 0x7f82942079a8
        old_THIS = 0x7f82ac00a340
        __local = 0x7f82940251d8
        rsp = {
          op_ret = -1, 
          op_errno = 2, 
          xdata = {
            xdr_size = 28, 
            count = 1, 
            pairs = {
              pairs_len = 1, 
              pairs_val = 0x7f82ac060500
            }
          }, 
          prestat = {
            ia_gfid = '\000' <repeats 15 times>, 
            ia_flags = 0, 
            ia_ino = 0, 
            ia_dev = 0, 
            ia_rdev = 0, 
            ia_size = 0, 
            ia_blocks = 0, 
            ia_attributes = 0, 
            ia_attributes_mask = 0, 
            ia_atime = 0, 
            ia_mtime = 0, 
            ia_ctime = 0, 
            ia_btime = 0, 
            ia_atime_nsec = 0, 
            ia_mtime_nsec = 0, 
            ia_ctime_nsec = 0, 
            ia_btime_nsec = 0, 
            ia_nlink = 0, 
            ia_uid = 0, 
            ia_gid = 0, 
            ia_blksize = 0,
mode = 0
          }, 
          poststat = {
            ia_gfid = '\000' <repeats 15 times>, 
            ia_flags = 0, 
            ia_ino = 0, 
            ia_dev = 0, 
            ia_rdev = 0, 
            ia_size = 0, 
            ia_blocks = 0, 
            ia_attributes = 0, 
            ia_attributes_mask = 0, 
            ia_atime = 0, 
            ia_mtime = 0, 
            ia_ctime = 0, 
            ia_btime = 0, 
            ia_atime_nsec = 0, 
            ia_mtime_nsec = 0, 
            ia_ctime_nsec = 0, 
            ia_btime_nsec = 0, 
            ia_nlink = 0, 
            ia_uid = 0, 
            ia_gid = 0, 
            ia_blksize = 0, 
            mode = 0
          }
        }
        local = <optimized out>
        frame = 0x7f8294021318
        ret = <optimized out>
        stbuf = {
          ia_flags = 0, 
          ia_ino = 0, 
          ia_dev = 0, 
          ia_rdev = 0, 
          ia_size = 0, 
          ia_nlink = 0, 
          ia_uid = 0, 
          ia_gid = 0, 
          ia_blksize = 0, 
          ia_blocks = 0, 
          ia_atime = 0, 
          ia_mtime = 0, 
          ia_ctime = 0, 
          ia_btime = 0, 
          ia_atime_nsec = 0, 
          ia_mtime_nsec = 0, 
          ia_ctime_nsec = 0, 
          ia_btime_nsec = 0, 
          ia_attributes = 0, 
          ia_attributes_mask = 0, 
          ia_gfid = '\000' <repeats 15 times>, 
          ia_type = IA_INVAL, 
          ia_prot = {
            suid = 0 '\000', 
            sgid = 0 '\000', 
            sticky = 0 '\000', 
            owner = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            group = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            other = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }
          }
        }
        postparent = {
          ia_flags = 0, 
          ia_ino = 0,
ia_dev = 0, 
          ia_rdev = 0, 
          ia_size = 0, 
          ia_nlink = 0, 
          ia_uid = 0, 
          ia_gid = 0, 
          ia_blksize = 0, 
          ia_blocks = 0, 
          ia_atime = 0, 
          ia_mtime = 0, 
          ia_ctime = 0, 
          ia_btime = 0, 
          ia_atime_nsec = 0, 
          ia_mtime_nsec = 0, 
          ia_ctime_nsec = 0, 
          ia_btime_nsec = 0, 
          ia_attributes = 0, 
          ia_attributes_mask = 0, 
          ia_gfid = '\000' <repeats 15 times>, 
          ia_type = IA_INVAL, 
          ia_prot = {
            suid = 0 '\000', 
            sgid = 0 '\000', 
            sticky = 0 '\000', 
            owner = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            group = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            other = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }
          }
        }
        op_errno = <optimized out>
        xdata = 0x7f8294002668
        inode = 0x7f829404c648
        this = 0x7f82ac00a340
        __FUNCTION__ = "client4_0_lookup_cbk"
#5  0x00007f82b796e820 in ?? () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#6  0x00007f82b796eb6f in ?? () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#7  0x00007f82b796b063 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#8  0x00007f82b15890ce in socket_event_poll_in (notify_handled=true, this=0x7f82ac05ed00) at socket.c:2506
        ret = <optimized out>
        pollin = 0x7f82ac075ca0
        priv = 0x7f82ac05f330
        ctx = 0x556281bb6920
        ret = <optimized out>
        pollin = <optimized out>
        priv = <optimized out>
        ctx = <optimized out>
#9  socket_event_handler (fd=<optimized out>, idx=5, gen=1, data=0x7f82ac05ed00, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=<optimized out>) at socket.c:2907
        this = 0x7f82ac05ed00
        priv = 0x7f82ac05f330
        ret = <optimized out>
        ctx = <optimized out>
        socket_closed = false
        notify_handled = false
        __FUNCTION__ = "socket_event_handler"
#10 0x00007f82b7c01519 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#11 0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#12 0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 5 (Thread 0x7f82b37a1700 (LWP 22865)):
#0  0x00007f82b70ffddf in do_sigwait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b70ffe6d in sigwait () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00005562812f4293 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2306
        set = {
          __val = {18947, 0 <repeats 15 times>}
        }
        ret = <optimized out>
        sig = 0
        file = <optimized out>
#3  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 4 (Thread 0x7f82b2fa0700 (LWP 22866)):
#0  0x00007f82b6dfa040 in nanosleep () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f82b6df9f4a in sleep () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f82b7bccca2 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#3  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x7f82b3fa2700 (LWP 22864)):
#0  0x00007f82b70ff770 in nanosleep () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b7bb1d9e in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 2 (Thread 0x7f82b1f9e700 (LWP 22868)):
#0  0x00007f82b70fbc56 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f82b7bdfd98 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f82b7be0a60 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#3  0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 1 (Thread 0x7f82a8f67700 (LWP 22872)):
#0  0x00007f82b6d6a0e0 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f82b6d6b6c1 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f82b6d626fa in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007f82b6d62772 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007f82b70f80b8 in pthread_mutex_lock () from /lib64/libpthread.so.0
No symbol table info available.
#5  0x00007f82abb2de3d in afr_frame_return (frame=frame@entry=0x7f82942079a8) at afr-common.c:2105
        local = 0x7f8294103cb8
        call_count = 0
#6  0x00007f82abb40d51 in afr_lookup_cbk (frame=0x7f82942079a8, cookie=<optimized out>, this=0x7f82ac015c20, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f82a8f66820, xdata=0x7f829408e248, postparent=0x7f82a8f668c0) at afr-common.c:2951
        local = 0x7f8294103cb8
        call_count = -1
        child_index = <optimized out>
        ret = <optimized out>
        need_heal = 0 '\000'
#7  0x00007f82abdc5e1f in client4_0_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f82940a8d18) at client-rpc-fops_v2.c:2641
        fn = 0x7f82abb40b80 <afr_lookup_cbk>
        _parent = 0x7f82942079a8
        old_THIS = 0x7f82ac010200
        __local = 0x7f82940c20e8
        rsp = {
          op_ret = -1,
op_errno = 2, 
          xdata = {
            xdr_size = 28, 
            count = 1, 
            pairs = {
              pairs_len = 1, 
              pairs_val = 0x7f829c0ab040
            }
          }, 
          prestat = {
            ia_gfid = '\000' <repeats 15 times>, 
            ia_flags = 0, 
            ia_ino = 0, 
            ia_dev = 0, 
            ia_rdev = 0, 
            ia_size = 0, 
            ia_blocks = 0, 
            ia_attributes = 0, 
            ia_attributes_mask = 0, 
            ia_atime = 0, 
            ia_mtime = 0, 
            ia_ctime = 0, 
            ia_btime = 0, 
            ia_atime_nsec = 0, 
            ia_mtime_nsec = 0, 
            ia_ctime_nsec = 0, 
            ia_btime_nsec = 0, 
            ia_nlink = 0, 
            ia_uid = 0, 
            ia_gid = 0, 
            ia_blksize = 0, 
            mode = 0
          }, 
          poststat = {
            ia_gfid = '\000' <repeats 15 times>, 
            ia_flags = 0, 
            ia_ino = 0, 
            ia_dev = 0, 
            ia_rdev = 0, 
            ia_size = 0, 
            ia_blocks = 0, 
            ia_attributes = 0, 
            ia_attributes_mask = 0, 
            ia_atime = 0, 
            ia_mtime = 0, 
            ia_ctime = 0, 
            ia_btime = 0, 
            ia_atime_nsec = 0, 
            ia_mtime_nsec = 0, 
            ia_ctime_nsec = 0, 
            ia_btime_nsec = 0, 
            ia_nlink = 0, 
            ia_uid = 0, 
            ia_gid = 0, 
            ia_blksize = 0, 
            mode = 0
          }
        }
        local = <optimized out>
        frame = 0x7f82940a8d18
        ret = <optimized out>
        stbuf = {
          ia_flags = 0, 
          ia_ino = 0, 
          ia_dev = 0, 
          ia_rdev = 0, 
          ia_size = 0, 
          ia_nlink = 0, 
          ia_uid = 0, 
          ia_gid = 0, 
          ia_blksize = 0, 
          ia_blocks = 0, 
          ia_atime = 0, 
          ia_mtime = 0, 
          ia_ctime = 0, 
          ia_btime = 0, 
          ia_atime_nsec = 0,
ia_mtime_nsec = 0, 
          ia_ctime_nsec = 0, 
          ia_btime_nsec = 0, 
          ia_attributes = 0, 
          ia_attributes_mask = 0, 
          ia_gfid = '\000' <repeats 15 times>, 
          ia_type = IA_INVAL, 
          ia_prot = {
            suid = 0 '\000', 
            sgid = 0 '\000', 
            sticky = 0 '\000', 
            owner = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            group = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            other = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }
          }
        }
        postparent = {
          ia_flags = 0, 
          ia_ino = 0, 
          ia_dev = 0, 
          ia_rdev = 0, 
          ia_size = 0, 
          ia_nlink = 0, 
          ia_uid = 0, 
          ia_gid = 0, 
          ia_blksize = 0, 
          ia_blocks = 0, 
          ia_atime = 0, 
          ia_mtime = 0, 
          ia_ctime = 0, 
          ia_btime = 0, 
          ia_atime_nsec = 0, 
          ia_mtime_nsec = 0, 
          ia_ctime_nsec = 0, 
          ia_btime_nsec = 0, 
          ia_attributes = 0, 
          ia_attributes_mask = 0, 
          ia_gfid = '\000' <repeats 15 times>, 
          ia_type = IA_INVAL, 
          ia_prot = {
            suid = 0 '\000', 
            sgid = 0 '\000', 
            sticky = 0 '\000', 
            owner = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            group = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }, 
            other = {
              read = 0 '\000', 
              write = 0 '\000', 
              exec = 0 '\000'
            }
          }
        }
        op_errno = <optimized out>
        xdata = 0x7f829408e248
        inode = 0x7f829404c648
        this = 0x7f82ac010200
        __FUNCTION__ = "client4_0_lookup_cbk"
#8  0x00007f82b796e820 in ?? () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#9  0x00007f82b796eb6f in ?? () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#10 0x00007f82b796b063 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
No symbol table info available.
#11 0x00007f82b15890ce in socket_event_poll_in (notify_handled=true, this=0x7f82ac0561c0) at socket.c:2506
        ret = <optimized out>
        pollin = 0x7f829c0aab80
        priv = 0x7f82ac0567f0
        ctx = 0x556281bb6920
        ret = <optimized out>
        pollin = <optimized out>
        priv = <optimized out>
        ctx = <optimized out>
#12 socket_event_handler (fd=<optimized out>, idx=1, gen=4, data=0x7f82ac0561c0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=<optimized out>) at socket.c:2907
        this = 0x7f82ac0561c0
        priv = 0x7f82ac0567f0
        ret = <optimized out>
        ctx = <optimized out>
        socket_closed = false
        notify_handled = false
        __FUNCTION__ = "socket_event_handler"
#13 0x00007f82b7c01519 in ?? () from /usr/lib64/libglusterfs.so.0
No symbol table info available.
#14 0x00007f82b70f5559 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#15 0x00007f82b6e2c81f in clone () from /lib64/libc.so.6
No symbol table info available.

$3 = {
  op = GF_FOP_LOOKUP, 
  call_count = 2, 
  event_generation = 4, 
  open_fd_count = 0, 
  update_open_fd_count = false, 
  num_inodelks = 0, 
  update_num_inodelks = false, 
  saved_lk_owner = {
    len = 0, 
    data = '\000' <repeats 1023 times>
  }, 
  op_ret = -1, 
  op_errno = 117, 
  pending = 0x0, 
  dirty = {0, 0, 0}, 
  loc = {
    path = 0x7f8294054a90 "/uploads/wp-security-audit-log/custom-alerts.php", 
    name = 0x7f8294054aaf "custom-alerts.php", 
    inode = 0x7f829404c648, 
    parent = 0x7f82940707e8, 
    gfid = '\000' <repeats 15 times>, 
    pargfid = "\266ߞ*\340oB\255\202Ħ\177\362\244p\203"
  }, 
  newloc = {
    path = 0x0, 
    name = 0x0, 
    inode = 0x0, 
    parent = 0x0, 
    gfid = '\000' <repeats 15 times>, 
    pargfid = '\000' <repeats 15 times>
  }, 
  fd = 0x0, 
  fd_ctx = 0x0, 
  child_up = 0x7f82940858d0 "\001\001\001\001", <incomplete sequence \360\255\272>, 
  read_attempted = 0x7f829400fc70 "", 
  readfn = 0x0, 
  refreshed = false, 
  inode = 0x7f829404c648, 
  parent = 0x0, 
  parent2 = 0x0, 
  readable = 0x7f82940527d0 "", 
  readable2 = 0x7f8294123940 "", 
  read_subvol = -1, 
  refreshfn = 0x0, 
  refreshinode = 0x0, 
  refreshgfid = '\000' <repeats 15 times>, 
  pre_op_compat = false, 
  xattr_req = 0x7f829405a8e8, 
  internal_lock = {
    lk_loc = 0x0, 
    lockee_count = 0, 
    lockee = {{
        loc = {
          path = 0x0, 
          name = 0x0, 
          inode = 0x0, 
          parent = 0x0, 
          gfid = '\000' <repeats 15 times>, 
          pargfid = '\000' <repeats 15 times>
        }, 
        __xpg_basename = 0x0, 
        locked_nodes = 0x0, 
        locked_count = 0
      }, {
        loc = {
          path = 0x0, 
          name = 0x0, 
          inode = 0x0, 
          parent = 0x0, 
          gfid = '\000' <repeats 15 times>, 
          pargfid = '\000' <repeats 15 times>
        }, 
        __xpg_basename = 0x0, 
        locked_nodes = 0x0, 
        locked_count = 0
      }, {
loc = {
          path = 0x0, 
          name = 0x0, 
          inode = 0x0, 
          parent = 0x0, 
          gfid = '\000' <repeats 15 times>, 
          pargfid = '\000' <repeats 15 times>
        }, 
        __xpg_basename = 0x0, 
        locked_nodes = 0x0, 
        locked_count = 0
      }}, 
    flock = {
      l_type = 0, 
      l_whence = 0, 
      l_start = 0, 
      l_len = 0, 
      l_pid = 0, 
      l_owner = {
        len = 0, 
        data = '\000' <repeats 1023 times>
      }
    }, 
    lk_basename = 0x0, 
    lower_basename = 0x0, 
    higher_basename = 0x0, 
    lower_locked = 0 '\000', 
    higher_locked = 0 '\000', 
    locked_nodes = 0x0, 
    lower_locked_nodes = 0x0, 
    lock_count = 0, 
    entrylk_lock_count = 0, 
    lk_call_count = 0, 
    lk_expected_count = 0, 
    lk_attempted_count = 0, 
    lock_op_ret = 0, 
    lock_op_errno = 0, 
    lock_cbk = 0x0, 
    domain = 0x0
  }, 
  dict = 0x0, 
  optimistic_change_log = 0, 
  stable_write = false, 
  append_write = false, 
  cont = {
    lookup = {
      needs_fresh_lookup = false, 
      gfid_req = "\210\204\000\204I\371CL\230ʌ\310\035\021Co"
    }, 
    statfs = {
      buf_set = 0 '\000', 
      buf = {
        f_bsize = 0, 
        f_frsize = 0, 
        f_blocks = 0, 
        f_bfree = 0, 
        f_bavail = 0, 
        f_files = 0, 
        f_ffree = 0, 
        f_favail = 0, 
        f_fsid = 0, 
        f_flag = 0, 
        f_namemax = 0, 
        __f_spare = {0, 0, 0, 0, 0, 0}
      }
    }, 
    open = {
      flags = 0, 
      fd = 0x0
    }, 
    lk = {
      cmd = 0, 
      user_flock = {
        l_type = 0, 
        l_whence = 0, 
        l_start = 0, 
        l_len = 0,
l_pid = 0, 
        l_owner = {
          len = 0, 
          data = '\000' <repeats 1023 times>
        }
      }, 
      ret_flock = {
        l_type = 0, 
        l_whence = 0, 
        l_start = 0, 
        l_len = 0, 
        l_pid = 0, 
        l_owner = {
          len = 0, 
          data = '\000' <repeats 1023 times>
        }
      }, 
      locked_nodes = 0x0
    }, 
    access = {
      mask = 0, 
      last_index = 0
    }, 
    stat = {
      last_index = 0
    }, 
    fstat = {
      last_index = 0
    }, 
    readlink = {
      size = 0, 
      last_index = 0
    }, 
    getxattr = {
      name = 0x0, 
      last_index = 0, 
      xattr_len = 0
    }, 
    readv = {
      size = 0, 
      offset = 0, 
      last_index = 0, 
      flags = 0
    }, 
    opendir = {
      success_count = 0, 
      op_ret = 0, 
      op_errno = 0, 
      checksum = 0x0
    }, 
    readdir = {
      op_ret = 0, 
      op_errno = 0, 
      size = 0, 
      offset = 0, 
      dict = 0x0, 
      failed = false, 
      last_index = 0
    }, 
    inode_wfop = {
      prebuf = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0,
ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      postbuf = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }
    }, 
    writev = {
      op_ret = 0, 
      vector = 0x0, 
      iobref = 0x0, 
      count = 0,
write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      preparent = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      postparent = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL,
ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      prenewparent = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      postnewparent = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0,
ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }
    }, 
    create = {
      fd = 0x0, 
      params = 0x0, 
      flags = 0, 
      mode = 0
    }, 
    mknod = {
      dev = 0, 
      mode = 0, 
      params = 0x0
    }, 
    mkdir = {
      mode = 0, 
      params = 0x0
    }, 
    rmdir = {
      flags = 0
    }, 
    symlink = {
      params = 0x0, 
      linkpath = 0x0
    }, 
    fallocate = {
      mode = 0, 
      offset = 0, 
      len = 0
    }, 
    discard = {
      offset = 0, 
      len = 0
    }, 
    zerofill = {
      offset = 0, 
      len = 0, 
      prebuf = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0,
ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
      }, 
      postbuf = {
        ia_flags = 0, 
        ia_ino = 0, 
        ia_dev = 0, 
        ia_rdev = 0, 
        ia_size = 0, 
        ia_nlink = 0, 
        ia_uid = 0, 
        ia_gid = 0, 
        ia_blksize = 0, 
        ia_blocks = 0, 
        ia_atime = 0, 
        ia_mtime = 0, 
        ia_ctime = 0, 
        ia_btime = 0, 
        ia_atime_nsec = 0, 
        ia_mtime_nsec = 0, 
        ia_ctime_nsec = 0, 
        ia_btime_nsec = 0, 
        ia_attributes = 0, 
        ia_attributes_mask = 0, 
        ia_gfid = '\000' <repeats 15 times>, 
        ia_type = IA_INVAL, 
        ia_prot = {
          suid = 0 '\000', 
          sgid = 0 '\000', 
          sticky = 0 '\000', 
          owner = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          group = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }, 
          other = {
            read = 0 '\000', 
            write = 0 '\000', 
            exec = 0 '\000'
          }
        }
}
    }, 
    inodelk = {
      volume = 0x0, 
      cmd = 0, 
      in_cmd = 0, 
      in_flock = {
        l_type = 0, 
        l_whence = 0, 
        l_start = 0, 
        l_len = 0, 
        l_pid = 0, 
        l_owner = {
          len = 0, 
          data = '\000' <repeats 1023 times>
        }
      }, 
      flock = {
        l_type = 0, 
        l_whence = 0, 
        l_start = 0, 
        l_len = 0, 
        l_pid = 0, 
        l_owner = {
          len = 0, 
          data = '\000' <repeats 1023 times>
        }
      }, 
      xdata = 0x0
    }, 
    entrylk = {
      volume = 0x0, 
      __xpg_basename = 0x0, 
      in_cmd = ENTRYLK_LOCK, 
      cmd = ENTRYLK_LOCK, 
      type = ENTRYLK_RDLCK, 
      xdata = 0x0
    }, 
    seek = {
      offset = 0, 
      what = GF_SEEK_DATA
    }, 
    fsync = {
      datasync = 0
    }, 
    lease = {
      user_lease = {
        cmd = 0, 
        lease_type = NONE, 
        lease_id = '\000' <repeats 15 times>, 
        lease_flags = 0
      }, 
      ret_lease = {
        cmd = 0, 
        lease_type = NONE, 
        lease_id = '\000' <repeats 15 times>, 
        lease_flags = 0
      }, 
      locked_nodes = 0x0
    }
  }, 
  transaction = {
    start = 0, 
    len = 0, 
    eager_lock_on = false, 
    do_eager_unlock = false, 
    __xpg_basename = 0x0, 
    new_basename = 0x0, 
    parent_loc = {
      path = 0x0, 
      name = 0x0, 
      inode = 0x0, 
      parent = 0x0, 
      gfid = '\000' <repeats 15 times>, 
      pargfid = '\000' <repeats 15 times>
    }, 
    new_parent_loc = {
path = 0x0, 
      name = 0x0, 
      inode = 0x0, 
      parent = 0x0, 
      gfid = '\000' <repeats 15 times>, 
      pargfid = '\000' <repeats 15 times>
    }, 
    type = AFR_DATA_TRANSACTION, 
    resume_stub = 0x0, 
    owner_list = {
      next = 0x0, 
      prev = 0x0
    }, 
    wait_list = {
      next = 0x0, 
      prev = 0x0
    }, 
    pre_op = 0x0, 
    changelog_xdata = 0x0, 
    pre_op_sources = 0x0, 
    failed_subvols = 0x0, 
    dirtied = false, 
    inherited = false, 
    no_uninherit = false, 
    uninherit_done = false, 
    uninherit_value = false, 
    in_flight_sb = false, 
    in_flight_sb_errno = 0, 
    changelog_resume = 0x0, 
    main_frame = 0x0, 
    frame = 0x0, 
    wind = 0x0, 
    unwind = 0x0
  }, 
  barrier = {
    initialized = true, 
    guard = {
      __data = {
        __lock = 0, 
        __count = 0, 
        __owner = 0, 
        __nusers = 0, 
        __kind = 0, 
        __spins = 0, 
        __elision = 0, 
        __list = {
          __prev = 0x0, 
          __next = 0x0
        }
      }, 
      __size = '\000' <repeats 39 times>, 
      __align = 0
    }, 
    cond = {
      __data = {
        {
          __wseq = 0, 
          __wseq32 = {
            __low = 0, 
            __high = 0
          }
        }, 
        {
          __g1_start = 0, 
          __g1_start32 = {
            __low = 0, 
            __high = 0
          }
        }, 
        __g_refs = {0, 0}, 
        __g_size = {0, 0}, 
        __g1_orig_size = 0, 
        __wrefs = 0, 
        __g_signals = {0, 0}
      }, 
      __size = '\000' <repeats 47 times>, 
      __align = 0
}, 
    waitq = {
      next = 0x7f8294106340, 
      prev = 0x7f8294106340
    }, 
    count = 0, 
    waitfor = 0
  }, 
  xdata_req = 0x0, 
  xdata_rsp = 0x0, 
  xattr_rsp = 0x0, 
  umask = 0, 
  xflag = 0, 
  do_discovery = false, 
  replies = 0x7f82940aabe0, 
  healer = {
    next = 0x7f8294106388, 
    prev = 0x7f8294106388
  }, 
  heal_frame = 0x0, 
  need_full_crawl = false, 
  fop_lock_state = AFR_FOP_LOCK_PARALLEL, 
  is_read_txn = false, 
  inode_ctx = 0x0, 
  ta_child_up = 0 '\000', 
  ta_waitq = {
    next = 0x0, 
    prev = 0x0
  }, 
  ta_onwireq = {
    next = 0x0, 
    prev = 0x0
  }, 
  fop_state = TA_WAIT_FOR_NOTIFY_LOCK_REL, 
  ta_failed_subvol = 0, 
  is_new_entry = false
}

(gdb) p *frame->root
$3 = {{all_frames = {next = 0x7f82ac004098, prev = 0x556281bf2a30}, {next_call = 0x7f82ac004098, prev_call = 0x556281bf2a30}}, pool = 0x556281bf2a30, stack_lock = {spinlock = 0, mutex = {_data = {_lock = 0, _count = 0, _owner = 0, _nusers = 0, _kind = 256, _spins = 0, _elision = 0, _list = {_prev = 0x0, 
          _next = 0x0}}, _size = '\000' <repeats 17 times>, "\001", '\000' <repeats 21 times>, __align = 0}}, client = 0x0, unique = 11064699, state = 0x7f8294094340, uid = 30, gid = 8, pid = 22120, identifier = '\000' <repeats 107 times>, ngrps = 1, groups_small = {8, 0 <repeats 127 times>}, 
  groups_large = 0x0, groups = 0x7f8294153dbc, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, ctx = 0x556281bb6920, myframes = {next = 0x7f82941b17b8, prev = 0x7f82940d55b8}, op = 27, type = 1 '\001', tv = {tv_sec = 2226858, tv_nsec = 724965078}, err_xl = 0x7f82ac010200, error = 2, flags = 0, ctime = {
    tv_sec = 0, tv_nsec = 0}, ns_info = {hash = 0, found = false}}

Also note that I tried to check

local->replies[0] and seems that it was NULL (and 1,2,3 were all junk).

Comment 4 Artem Russakovskii 2019-05-03 22:30:15 UTC
Phew, this was a fun one!

Long story short - after weeks of debugging with the amazing Gluster team (thanks, Amar and Xavi!), we have found the root of the problem and a solution.

The crash happens on CPUs with an 'rtm' flag, in combination with slightly older versions of glibc, specifically 2.26. The bug is fixed in glibc 2.29.

For example, 3 of our machines had these CPUs (run lscpu to find out):
Model name:          Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm pti fsgsbase tsc_adjust smep erms xsaveopt arat

And the one that was crashing had this one: 
Model name:          Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat

Since the version of glibc for OpenSUSE 15.0 is currently 2.26, the easiest solution was to migrate the box to a CPU without the rtm feature, which we've now done and confirmed the crash is gone.


Before the migration, Xavi did find a workaround:
1. export GLIBC_TUNABLES=glibc.tune.hwcaps=-RTM
2. Unmount and remount.
3. Confirm the above worked: for i in $(pgrep glusterfs); do ps h -o cmd -p $i; cat /proc/$i/environ | xargs -0 -n 1 | grep "GLIBC_TUNABLES"; done


More info about this lock elision feature, as well as a quick test program can be found here: https://sourceware.org/bugzilla/show_bug.cgi?id=23275.

Here are sample runs on hardware with 'rtm' feature (crash observed) and without (no crash):

gcc -pthread test.c -o test

archon810@citadel:/tmp> ./test 
Please add a check if lock-elision is available on your architecture. The check in check_if_lock_elision_is_available () assumes, that lock-elision is enabled!

main: start 3 threads to run 2000000 iterations.
#0: started
#1: started
#2: started
.#0: pthread_mutex_destroy: failed with 16; in round=2295;
Aborted


archon810@hive:/tmp> ./test 
Please add a check if lock-elision is available on your architecture. The check in check_if_lock_elision_is_available () assumes, that lock-elision is enabled!

main: start 3 threads to run 2000000 iterations.
#0: started
#2: started
#1: started
........................................................................................................................................................................................................main: end.


Not sure how the maintainers will choose to close this issue, but I hope it'll help someone in the future, especially since we spent countless hours analyzing and debugging (hopefully, not all in vain!).

Comment 5 Amar Tumballi 2019-05-10 13:23:27 UTC
Closing as WORKSFORME in GlusterFS, as it was found to be a CPU flag issue.

Thanks to Xavi for pitching in, and Artem for the patience.


Note You need to log in before you can comment on or make changes to this bug.