Bug 1422776 - multiple glusterfsd process crashed making the complete subvolume unavailable
Summary: multiple glusterfsd process crashed making the complete subvolume unavailable
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: marker
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1422431
Blocks: 1424937
Reported: 2017-02-16 07:57 UTC by Poornima G
Modified: 2017-05-30 18:44 UTC (History)

Fixed In Version: glusterfs-3.11.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1422431
: 1424937 (view as bug list)
Environment:
Last Closed: 2017-05-30 18:44:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Poornima G 2017-02-16 07:57:36 UTC
+++ This bug was initially created as a clone of Bug #1422431 +++

Description of problem:
=======================

While running the geo-rep sanity check with md-cache options enabled, multiple brick crashes were seen during rename, which made the complete subvolume and its data unavailable.

(gdb) bt
#0  0x00007fc7b5510210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007fc7a2a96189 in upcall_inode_ctx_get (inode=inode@entry=0x0, this=this@entry=0x7fc7a4017970)
    at upcall-internal.c:231
#2  0x00007fc7a2a8b59f in upcall_local_init (frame=frame@entry=0x7fc7b419924c, 
    this=this@entry=0x7fc7a4017970, loc=loc@entry=0x7fc7b3b15498, fd=fd@entry=0x0, inode=0x0, 
    xattr=xattr@entry=0x7fc7b3931cec) at upcall.c:2263
#3  0x00007fc7a2a8f377 in up_setxattr (frame=0x7fc7b419924c, this=0x7fc7a4017970, loc=0x7fc7b3b15498, 
    dict=0x7fc7b3931cec, flags=0, xdata=0x0) at upcall.c:1688
#4  0x00007fc7b673b684 in default_setxattr_resume (frame=0x7fc7b41ee490, this=0x7fc7a4018ee0, 
    loc=0x7fc7b3b15498, dict=0x7fc7b3931cec, flags=0, xdata=0x0) at defaults.c:1646
#5  0x00007fc7b66cd64d in call_resume (stub=0x7fc7b3b15448) at call-stub.c:2508
#6  0x00007fc7a287b957 in iot_worker (data=0x7fc7a4069450) at io-threads.c:220
#7  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 50 (Thread 0x7fc790b6e700 (LWP 13042)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a8ca5555 in janitor_get_next_fd (this=0x7fc7a4006d60) at posix-helpers.c:1341
#2  posix_janitor_thread_proc (data=0x7fc7a4006d60) at posix-helpers.c:1388
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 49 (Thread 0x7fc73dffb700 (LWP 14749)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b833a1f0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b833a1f0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 48 (Thread 0x7fc73e7fc700 (LWP 14748)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8339e30) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8339e30) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 47 (Thread 0x7fc73d4f7700 (LWP 14970)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 46 (Thread 0x7fc73effd700 (LWP 14735)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8339a70) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8339a70) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 45 (Thread 0x7fc7525fa700 (LWP 14688)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83383f0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83383f0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 44 (Thread 0x7fc752dfb700 (LWP 14687)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8338030) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8338030) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 43 (Thread 0x7fc753dfd700 (LWP 14685)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83378b0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83378b0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 42 (Thread 0x7fc7502f3700 (LWP 14717)):
#0  0x00007fc7b55121bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fc7b550dd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007fc7b550dc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fc7aaf61be2 in socket_submit_reply (this=0x7fc7a410b8c0, reply=0x7fc7502f26a0)
    at socket.c:3507
#4  0x00007fc7b646be62 in rpcsvc_transport_submit (trans=trans@entry=0x7fc7a410b8c0, 
    rpchdr=rpchdr@entry=0x7fc7502f2770, rpchdrcount=rpchdrcount@entry=1, 
    proghdr=proghdr@entry=0x7fc7502f2820, proghdrcount=proghdrcount@entry=1, 
    progpayload=progpayload@entry=0x0, progpayloadcount=progpayloadcount@entry=0, 
    iobref=iobref@entry=0x7fc71021f490, priv=0x0) at rpcsvc.c:1110
#5  0x00007fc7b646d9a0 in rpcsvc_submit_generic (req=req@entry=0x7fc7a0c6b404, 
    proghdr=proghdr@entry=0x7fc7502f2820, hdrcount=hdrcount@entry=1, payload=payload@entry=0x0, 
    payloadcount=payloadcount@entry=0, iobref=iobref@entry=0x7fc71021f490) at rpcsvc.c:1294
#6  0x00007fc7a1995f19 in server_submit_reply (frame=frame@entry=0x7fc7b41ca6ac, req=0x7fc7a0c6b404, 
    arg=arg@entry=0x7fc7502f28d0, payload=payload@entry=0x0, payloadcount=payloadcount@entry=0, 
    iobref=0x7fc71021f490, iobref@entry=0x0, xdrproc=0x7fc7b625aca0 <xdr_gf_common_rsp>)
    at server.c:187
#7  0x00007fc7a19a5d36 in server_entrylk_cbk (frame=0x7fc7b41ca6ac, cookie=<optimized out>, 
    this=0x7fc7a4022fa0, op_ret=0, op_errno=0, xdata=<optimized out>) at server-rpc-fops.c:358
#8  0x00007fc7a1e05502 in io_stats_entrylk_cbk (frame=0x7fc7b41d8718, cookie=<optimized out>, 
    this=<optimized out>, op_ret=0, op_errno=0, xdata=0x0) at io-stats.c:2480
#9  0x00007fc7b6727a4d in default_entrylk_cbk (frame=0x7fc7b41c7b9c, cookie=<optimized out>, 
    this=<optimized out>, op_ret=0, op_errno=0, xdata=0x0) at defaults.c:1157
#10 0x00007fc7a32dd3fb in pl_common_entrylk (frame=frame@entry=0x7fc7b4212ee0, 
    this=this@entry=0x7fc7a4012400, volume=volume@entry=0x7fc7a4201580 "master-replicate-2", 
    inode=<optimized out>, basename=basename@entry=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", 
    cmd=cmd@entry=ENTRYLK_UNLOCK, type=type@entry=ENTRYLK_WRLCK, loc=loc@entry=0x7fc7b3aa5708, 
    fd=fd@entry=0x0, xdata=xdata@entry=0x7fc7b3940814) at entrylk.c:696
#11 0x00007fc7a32dd7ff in pl_entrylk (frame=frame@entry=0x7fc7b4212ee0, 
    this=this@entry=0x7fc7a4012400, volume=volume@entry=0x7fc7a4201580 "master-replicate-2", 
    loc=loc@entry=0x7fc7b3aa5708, basename=basename@entry=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", 
    cmd=cmd@entry=ENTRYLK_UNLOCK, type=type@entry=ENTRYLK_WRLCK, xdata=xdata@entry=0x7fc7b3940814)
    at entrylk.c:716
#12 0x00007fc7a30bb3b2 in ro_entrylk (frame=frame@entry=0x7fc7b4212ee0, 
    this=this@entry=0x7fc7a40138a0, volume=volume@entry=0x7fc7a4201580 "master-replicate-2", 
    loc=loc@entry=0x7fc7b3aa5708, basename=basename@entry=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", 
    cmd=cmd@entry=ENTRYLK_UNLOCK, type=type@entry=ENTRYLK_WRLCK, xdata=xdata@entry=0x7fc7b3940814)
    at read-only-common.c:85
#13 0x00007fc7a2eb2d02 in ro_entrylk (frame=frame@entry=0x7fc7b4212ee0, 
    this=this@entry=0x7fc7a4014eb0, volume=volume@entry=0x7fc7a4201580 "master-replicate-2", 
    loc=loc@entry=0x7fc7b3aa5708, basename=basename@entry=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", 
    cmd=cmd@entry=ENTRYLK_UNLOCK, type=type@entry=ENTRYLK_WRLCK, xdata=xdata@entry=0x7fc7b3940814)
    at read-only-common.c:85
#14 0x00007fc7b6723832 in default_entrylk (frame=frame@entry=0x7fc7b4212ee0, 
    this=this@entry=0x7fc7a4016400, volume=volume@entry=0x7fc7a4201580 "master-replicate-2", 
    loc=loc@entry=0x7fc7b3aa5708, basename=basename@entry=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", 
    cmd=cmd@entry=ENTRYLK_UNLOCK, type=type@entry=ENTRYLK_WRLCK, xdata=xdata@entry=0x7fc7b3940814)
    at defaults.c:2427
#15 0x00007fc7b6723832 in default_entrylk (frame=0x7fc7b4212ee0, this=<optimized out>, 
    volume=0x7fc7a4201580 "master-replicate-2", loc=0x7fc7b3aa5708, 
    basename=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", cmd=ENTRYLK_UNLOCK, type=ENTRYLK_WRLCK, 
    xdata=0x7fc7b3940814) at defaults.c:2427
#16 0x00007fc7b673c93f in default_entrylk_resume (frame=0x7fc7b41c7b9c, this=0x7fc7a4018ee0, 
    volume=0x7fc7a4201580 "master-replicate-2", loc=0x7fc7b3aa5708, 
    basename=0x7fc7a416f1f0 "58a29fee%%819MSNM71M", cmd=ENTRYLK_UNLOCK, type=ENTRYLK_WRLCK, 
    xdata=0x7fc7b3940814) at defaults.c:1754
#17 0x00007fc7b66cd43c in call_resume_wind (stub=0x7fc7b3aa56b8) at call-stub.c:2143
#18 0x00007fc7b66cd64d in call_resume (stub=0x7fc7b3aa56b8) at call-stub.c:2508
#19 0x00007fc7a287b957 in iot_worker (data=0x7fc7a4069450) at io-threads.c:220
#20 0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#21 0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 41 (Thread 0x7fc791bf7700 (LWP 13040)):
#0  0x00007fc7b4e47ba3 in select () from /lib64/libc.so.6
#1  0x00007fc7a3b4214a in changelog_ev_dispatch (data=0x7fc7a40aa998) at changelog-ev-handle.c:349
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 40 (Thread 0x7fc7933fa700 (LWP 13037)):
#0  0x00007fc7b550f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a3b41f03 in changelog_ev_connector (data=0x7fc7a40aa998) at changelog-ev-handle.c:202
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 39 (Thread 0x7fc792bf9700 (LWP 13038)):
#0  0x00007fc7b4e47ba3 in select () from /lib64/libc.so.6
#1  0x00007fc7a3b4214a in changelog_ev_dispatch (data=0x7fc7a40aa998) at changelog-ev-handle.c:349
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 38 (Thread 0x7fc73d5f8700 (LWP 14969)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 37 (Thread 0x7fc7501f2700 (LWP 14750)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 36 (Thread 0x7fc73d6f9700 (LWP 14752)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 35 (Thread 0x7fc7515f8700 (LWP 14708)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 34 (Thread 0x7fc7819d5700 (LWP 13477)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 33 (Thread 0x7fc7b6b81780 (LWP 13026)):
#0  0x00007fc7b550cef7 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fc7b6704a38 in event_dispatch_epoll (event_pool=0x7fc7b8326e10) at event-epoll.c:758
#2  0x00007fc7b6b9cae2 in main (argc=19, argv=<optimized out>) at glusterfsd.c:2452

Thread 32 (Thread 0x7fc7535fc700 (LWP 14686)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8337c70) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8337c70) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 31 (Thread 0x7fc7810d3700 (LWP 14675)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8337130) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8337130) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 30 (Thread 0x7fc751df9700 (LWP 14689)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83387b0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83387b0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 29 (Thread 0x7fc7514f7700 (LWP 14709)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 28 (Thread 0x7fc7503f4700 (LWP 14716)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 27 (Thread 0x7fc79136f700 (LWP 13827)):
#0  0x00007fc7b4e1766d in nanosleep () from /lib64/libc.so.6
#1  0x00007fc7b4e17504 in sleep () from /lib64/libc.so.6
#2  0x00007fc7a8ca87fc in posix_health_check_thread_proc (data=0x7fc7a4006d60) at posix-helpers.c:1809
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 26 (Thread 0x7fc7923f8700 (LWP 13039)):
#0  0x00007fc7b4e47ba3 in select () from /lib64/libc.so.6
#1  0x00007fc7a3b4214a in changelog_ev_dispatch (data=0x7fc7a40aa998) at changelog-ev-handle.c:349
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 25 (Thread 0x7fc7818d4700 (LWP 13837)):
#0  0x00007fc7b4e47ba3 in select () from /lib64/libc.so.6
#1  0x00007fc7a3b3e372 in changelog_fsync_thread (data=0x7fc7a40aa610) at changelog-helpers.c:1427
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 24 (Thread 0x7fc73d3f6700 (LWP 14971)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 23 (Thread 0x7fc7822d7700 (LWP 13836)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a3b3df5e in changelog_rollover (data=0x7fc7a40aa610) at changelog-helpers.c:1317
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 22 (Thread 0x7fc753efe700 (LWP 14684)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 21 (Thread 0x7fc7513f6700 (LWP 14710)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8338b70) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8338b70) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 20 (Thread 0x7fc7ab974700 (LWP 13030)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8336d70) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8336d70) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 19 (Thread 0x7fc7823d8700 (LWP 13175)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 18 (Thread 0x7fc73f7fe700 (LWP 14734)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83396b0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83396b0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 17 (Thread 0x7fc7a0223700 (LWP 13034)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 16 (Thread 0x7fc793fff700 (LWP 13035)):
#0  0x00007fc7b550f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a37011b3 in br_stub_signth (arg=<optimized out>) at bit-rot-stub.c:774
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 15 (Thread 0x7fc750bf5700 (LWP 14711)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b8338f30) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b8338f30) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x7fc7ac976700 (LWP 13028)):
#0  0x00007fc7b5513101 in sigwait () from /lib64/libpthread.so.0
#1  0x00007fc7b6b9fbfb in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2055
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x7fc782bd9700 (LWP 13255)):
#0  0x00007fc7b4e1766d in nanosleep () from /lib64/libc.so.6
#1  0x00007fc7b4e17504 in sleep () from /lib64/libc.so.6
#2  0x00007fc7a2a964bc in upcall_reaper_thread (data=0x7fc7a4017970) at upcall-internal.c:414
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x7fc7a0122700 (LWP 13036)):
#0  0x00007fc7b550f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a36ffc4b in br_stub_worker (data=<optimized out>) at bit-rot-stub-helpers.c:369
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x7fc753fff700 (LWP 14683)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x7fc73ffff700 (LWP 14718)):
#0  0x00007fc7b4e41c1d in read () from /lib64/libc.so.6
#1  0x00007fc7b4dd05a0 in __GI__IO_file_underflow () from /lib64/libc.so.6
#2  0x00007fc7b4dd152e in __GI__IO_default_uflow () from /lib64/libc.so.6
#3  0x00007fc7b4db42da in __GI__IO_vfscanf () from /lib64/libc.so.6
#4  0x00007fc7b4dc0d97 in fscanf () from /lib64/libc.so.6
#5  0x00007fc7b66af4ed in gf_backtrace_fillframes (buf=buf@entry=0x7fc74812f220 "")
    at common-utils.c:3883
#6  0x00007fc7b66b69b5 in gf_backtrace_save (buf=buf@entry=0x7fc74812f220 "") at common-utils.c:3924
#7  0x00007fc7b66e0a9c in synctask_yield (task=0x7fc74812ed90) at syncop.c:336
#8  0x00007fc7b66f3d57 in syncop_inodelk (subvol=0x7fc7a4018ee0, 
    volume=0x7fc7a4018570 "master-marker", loc=loc@entry=0x7fc7481582f0, cmd=cmd@entry=7, 
    lock=lock@entry=0x7fc748157e20, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0)
    at syncop.c:2992
#9  0x00007fc7a2668bcb in mq_lock (this=this@entry=0x7fc7a401a3b0, loc=loc@entry=0x7fc7481582f0, 
    l_type=l_type@entry=1) at marker-quota.c:495
#10 0x00007fc7a266bcf5 in mq_reduce_parent_size_task (opaque=0x7fc7481f4d50) at marker-quota.c:1306
#11 0x00007fc7b66e0b32 in synctask_wrap (old_task=<optimized out>) at syncop.c:375
#12 0x00007fc7b4d9fcf0 in ?? () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()

Thread 9 (Thread 0x7fc7a96b9700 (LWP 13031)):
#0  0x00007fc7b4e47590 in readv () from /lib64/libc.so.6
#1  0x00007fc7b66d1f75 in sys_readv (fd=<optimized out>, iov=<optimized out>, iovcnt=<optimized out>)
    at syscall.c:249
#2  0x00007fc7aaf60e1c in __socket_ssl_readv (this=this@entry=0x7fc7a410b8c0, 
    opvector=opvector@entry=0x7fc7a96b8bb0, opcount=opcount@entry=1) at socket.c:389
#3  0x00007fc7aaf6135c in __socket_ssl_read (count=<optimized out>, buf=<optimized out>, 
    this=0x7fc7a410b8c0) at socket.c:405
#4  __socket_cached_read (opcount=1, opvector=0x7fc7a410c438, this=0x7fc7a410b8c0) at socket.c:446
#5  __socket_rwv (this=this@entry=0x7fc7a410b8c0, vector=<optimized out>, count=count@entry=1, 
    pending_vector=pending_vector@entry=0x7fc7a410c4c8, 
    pending_count=pending_count@entry=0x7fc7a410c4d0, bytes=bytes@entry=0x7fc7a96b8cb0, 
    write=write@entry=0) at socket.c:556
#6  0x00007fc7aaf62648 in __socket_readv (bytes=0x7fc7a96b8cb0, pending_count=0x7fc7a410c4d0, 
    pending_vector=0x7fc7a410c4c8, count=1, vector=<optimized out>, this=0x7fc7a410b8c0)
    at socket.c:650
#7  __socket_read_frag (this=0x7fc7a410b8c0) at socket.c:1967
#8  __socket_proto_state_machine (pollin=<synthetic pointer>, this=0x7fc7a410b8c0) at socket.c:2140
#9  socket_proto_state_machine (pollin=<synthetic pointer>, this=0x7fc7a410b8c0) at socket.c:2247
#10 socket_event_poll_in (this=this@entry=0x7fc7a410b8c0) at socket.c:2263
#11 0x00007fc7aaf64785 in socket_event_handler (fd=<optimized out>, idx=29, data=0x7fc7a410b8c0, 
    poll_in=1, poll_out=0, poll_err=0) at socket.c:2397
#12 0x00007fc7b67045b0 in event_dispatch_epoll_handler (event=0x7fc7a96b8e80, 
    event_pool=0x7fc7b8326e10) at event-epoll.c:571
#13 event_dispatch_epoll_worker (data=0x7fc7b837a170) at event-epoll.c:674
#14 0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7fc783fff700 (LWP 13043)):
#0  0x00007fc7b550f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a8ca8b6b in posix_fsyncer_pick (this=this@entry=0x7fc7a4006d60, 
    head=head@entry=0x7fc783ffee80) at posix-helpers.c:1908
#2  0x00007fc7a8ca8df5 in posix_fsyncer (d=0x7fc7a4006d60) at posix-helpers.c:2007
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fc7ac175700 (LWP 13029)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83369b0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83369b0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7fc7ad177700 (LWP 13027)):
#0  0x00007fc7b5512bdd in nanosleep () from /lib64/libpthread.so.0
#1  0x00007fc7b66b7c66 in gf_timer_proc (data=0x7fc7b83363b0) at timer.c:176
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fc7808d2700 (LWP 14680)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7b66e3068 in syncenv_task (proc=proc@entry=0x7fc7b83374f0) at syncop.c:603
#2  0x00007fc7b66e3eb0 in syncenv_processor (thdata=0x7fc7b83374f0) at syncop.c:695
#3  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fc7a0324700 (LWP 13033)):
#0  0x00007fc7b550f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a223deab in index_worker (data=<optimized out>) at index.c:211
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fc7a198c700 (LWP 13032)):
#0  0x00007fc7b4e50d13 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fc7b6704490 in event_dispatch_epoll_worker (data=0x7fc7a4024520) at event-epoll.c:664
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fc781ad6700 (LWP 13463)):
#0  0x00007fc7b550fa82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc7a287b90c in iot_worker (data=0x7fc7a4069450) at io-threads.c:180
#2  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fc73d7fa700 (LWP 14751)):
#0  0x00007fc7b5510210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007fc7a2a96189 in upcall_inode_ctx_get (inode=inode@entry=0x0, this=this@entry=0x7fc7a4017970)
    at upcall-internal.c:231
#2  0x00007fc7a2a8b59f in upcall_local_init (frame=frame@entry=0x7fc7b419924c, 
    this=this@entry=0x7fc7a4017970, loc=loc@entry=0x7fc7b3b15498, fd=fd@entry=0x0, inode=0x0, 
    xattr=xattr@entry=0x7fc7b3931cec) at upcall.c:2263
#3  0x00007fc7a2a8f377 in up_setxattr (frame=0x7fc7b419924c, this=0x7fc7a4017970, loc=0x7fc7b3b15498, 
    dict=0x7fc7b3931cec, flags=0, xdata=0x0) at upcall.c:1688
#4  0x00007fc7b673b684 in default_setxattr_resume (frame=0x7fc7b41ee490, this=0x7fc7a4018ee0, 
    loc=0x7fc7b3b15498, dict=0x7fc7b3931cec, flags=0, xdata=0x0) at defaults.c:1646
#5  0x00007fc7b66cd64d in call_resume (stub=0x7fc7b3b15448) at call-stub.c:2508
#6  0x00007fc7a287b957 in iot_worker (data=0x7fc7a4069450) at io-threads.c:220
#7  0x00007fc7b550bdc5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fc7b4e5073d in clone () from /lib64/libc.so.6
(gdb) 
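
For triage, the crashing thread can be scanned for pointer arguments that gdb printed as 0x0. The following is only a minimal sketch, not part of the original report: the function name null_args_by_frame and the regular expressions are illustrative assumptions about gdb's text format, and SAMPLE is copied from the top frames of the backtrace above. A NULL argument is not always a bug (fd and xdata are legitimately NULL in some frames), so the result is a list of candidates to inspect.

```python
import re

# Start of a gdb frame line, e.g.
# "#1  0x00007f... in upcall_inode_ctx_get (inode=inode@entry=0x0, ..."
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(")
# "name=0x0" or "name=name@entry=0x0", i.e. a pointer argument printed as NULL.
NULL_ARG_RE = re.compile(r"\b(\w+)=(?:\w+@entry=)?0x0(?![0-9a-fx])")


def null_args_by_frame(backtrace: str) -> dict:
    """Map "#N function" to the argument names that gdb printed as 0x0."""
    result = {}
    current = None
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            current = f"#{match.group(1)} {match.group(2)}"
            result[current] = []
        if current is not None:
            # Continuation lines belong to the most recent frame.
            for name in NULL_ARG_RE.findall(line):
                if name not in result[current]:
                    result[current].append(name)
    # Keep only frames that have at least one NULL argument.
    return {frame: names for frame, names in result.items() if names}


# Top frames of the crashing thread, copied from the report.
SAMPLE = """\
#0  0x00007fc7b5510210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007fc7a2a96189 in upcall_inode_ctx_get (inode=inode@entry=0x0, this=this@entry=0x7fc7a4017970)
    at upcall-internal.c:231
#2  0x00007fc7a2a8b59f in upcall_local_init (frame=frame@entry=0x7fc7b419924c,
    this=this@entry=0x7fc7a4017970, loc=loc@entry=0x7fc7b3b15498, fd=fd@entry=0x0, inode=0x0,
    xattr=xattr@entry=0x7fc7b3931cec) at upcall.c:2263
"""

for frame, names in null_args_by_frame(SAMPLE).items():
    print(frame, names)
```

On the sample, the inode argument shows up as NULL in both upcall_local_init and upcall_inode_ctx_get, which matches the crash in pthread_spin_lock at frame #0.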


brick log suggests:

ng down connection dj.lab.eng.blr.redhat.com-10304-2017/02/14-07:19:45:645412-master-client-0-0-0
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2017-02-14 20:10:22
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.8.4
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7f890ee04b92]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f890ee0e624]
/lib64/libc.so.6(+0x35250)[0x7f890d4e8250]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7f890dc6a210]
---------
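
The crash block in the brick log can be reduced to a few fields when comparing several bricks. This is a hedged sketch only: parse_crash_block is a hypothetical helper, the patterns are assumptions based on the log lines shown above, and CRASH_LOG is an excerpt of that log. The signal number is mapped to a name with Python's standard signal module.

```python
import re
import signal

# Excerpt of the brick log crash block from the report.
CRASH_LOG = """\
signal received: 11
time of crash: 
2017-02-14 20:10:22
package-string: glusterfs 3.8.4
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7f890ee04b92]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f890ee0e624]
/lib64/libc.so.6(+0x35250)[0x7f890d4e8250]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x7f890dc6a210]
"""


def parse_crash_block(log_text: str) -> dict:
    """Extract signal, crash time, package string and stack frames."""
    info = {}
    match = re.search(r"^signal received: (\d+)", log_text, re.MULTILINE)
    if match:
        signum = int(match.group(1))
        info["signal"] = signum
        try:
            info["signal_name"] = signal.Signals(signum).name
        except ValueError:
            info["signal_name"] = "unknown"
    # The timestamp is printed on the line after "time of crash:".
    match = re.search(
        r"^time of crash:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})",
        log_text,
        re.MULTILINE,
    )
    if match:
        info["time"] = match.group(1)
    match = re.search(r"^package-string: (.+)$", log_text, re.MULTILINE)
    if match:
        info["package"] = match.group(1).strip()
    # Each frame looks like "library(symbol+0xOFF)[0xADDR]"; symbol may be empty.
    info["frames"] = re.findall(
        r"^(\S+)\((\w*)\+0x[0-9a-f]+\)\[0x[0-9a-f]+\]", log_text, re.MULTILINE
    )
    return info


crash = parse_crash_block(CRASH_LOG)
print(crash["signal_name"], crash["time"], crash["frames"][-1][1])
```

Signal 11 is SIGSEGV, and the last frame recorded in the log is pthread_spin_lock, consistent with the gdb backtrace.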


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.8.4-13.el7rhgs.x86_64


How reproducible:
=================

2/2 during sanity check


Steps carried out:
==================
1. Create master and slave volumes with md-cache options enabled
2. Create a geo-rep session and start it
3. Run the automated sanity test, which performs create, chmod, chown, chgroup, hardlink, symlink, truncate, and rename

The crash occurs on rename.
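
The file operations in step 3 can be approximated locally with a short script. This is not the automated sanity test used in the report, only an assumed stand-in: run_fop_sequence and the file names are hypothetical, and the script runs against whatever directory it is given (a temporary directory here; pointing it at a master volume mount is left to the reader). Any crash would be seen in the brick process, not in this script.

```python
import os
import tempfile


def run_fop_sequence(base_dir: str) -> list:
    """Run the listed file operations in order and return their names."""
    src = os.path.join(base_dir, "file")
    done = []

    with open(src, "w") as handle:  # create
        handle.write("data")
    done.append("create")

    os.chmod(src, 0o640)
    done.append("chmod")

    os.chown(src, os.getuid(), -1)  # change owner only (-1 keeps the group)
    done.append("chown")

    os.chown(src, -1, os.getgid())  # change group only (-1 keeps the owner)
    done.append("chgroup")

    os.link(src, os.path.join(base_dir, "file.hardlink"))
    done.append("hardlink")

    os.symlink("file", os.path.join(base_dir, "file.symlink"))
    done.append("symlink")

    os.truncate(src, 0)
    done.append("truncate")

    # Rename is the operation on which the report saw the crash.
    os.rename(src, os.path.join(base_dir, "file.renamed"))
    done.append("rename")
    return done


with tempfile.TemporaryDirectory() as workdir:
    print(run_fop_sequence(workdir))
```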

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-02-15 05:44:08 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Rahul Hinduja on 2017-02-15 05:50:57 EST ---

sosreports @: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1422431/

[root@dhcp42-7 ~]# gluster volume info master
 
Volume Name: master
Type: Distributed-Replicate
Volume ID: fffd27c9-a083-4aac-9db8-52fe70dabaab
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: :/bricks/brick0/master_brick0
Brick2: :/bricks/brick0/master_brick1
Brick3: :/bricks/brick0/master_brick2
Brick4: :/bricks/brick0/master_brick3
Brick5: :/bricks/brick1/master_brick4
Brick6: :/bricks/brick1/master_brick5
Brick7: :/bricks/brick1/master_brick6
Brick8: :/bricks/brick1/master_brick7
Brick9: :/bricks/brick2/master_brick8
Brick10: :/bricks/brick2/master_brick9
Brick11: :/bricks/brick2/master_brick10
Brick12: :/bricks/brick2/master_brick11
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.md-cache-timeout: 600
features.cache-invalidation: on
performance.cache-invalidation: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable
 
[root@dhcp42-7 ~]#

--- Additional comment from Soumya Koduri on 2017-02-15 05:54:50 EST ---

Upcall invalidations for XATTR operations were added as part of the md-cache optimizations. The crash looks similar to bug 1387204. Requesting Poornima to take a look. Thanks!

--- Additional comment from Rahul Hinduja on 2017-02-15 05:57:03 EST ---

While dev is looking into the issue, I am trying out the scenario mentioned in

https://bugzilla.redhat.com/show_bug.cgi?id=1387204#c11

--- Additional comment from Rahul Hinduja on 2017-02-15 06:09:59 EST ---

(In reply to Rahul Hinduja from comment #4)
> While dev is looking into the issue , I am trying to see the scenario
> mentioned in 
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1387204#c11

Tried the scenario. It doesn't crash with the steps mentioned in 1387204:

[root@dhcp42-7 scripts]# gluster v info 
 
Volume Name: vol
Type: Distribute
Volume ID: 1511e280-8d38-4133-bd3c-758fce3c4c6c
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.70.42.7:/rhs/brick1/b1
Options Reconfigured:
geo-replication.indexing: on
features.cache-invalidation: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable
[root@dhcp42-7 scripts]# 
[root@dhcp42-7 scripts]# gluster volume status vol
Status of volume: vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.7:/rhs/brick1/b1             49152     0          Y       53818
 
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp42-7 scripts]#

Client:
=======

[root@dj vol]# for i in {1..100}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; mv dd.$i new_file.$i ; done
[root@dj vol]#
[root@dj vol]# echo $?
0
[root@dj vol]# 


[root@dhcp42-7 scripts]# gluster volume status vol
Status of volume: vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.7:/rhs/brick1/b1             49152     0          Y       53818
 
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp42-7 scripts]# 
[root@dhcp42-7 scripts]# ls /rhs/brick1/b1/
new_file.1    new_file.24  new_file.4   new_file.55  new_file.70  new_file.86
new_file.10   new_file.25  new_file.40  new_file.56  new_file.71  new_file.87
new_file.100  new_file.26  new_file.41  new_file.57  new_file.72  new_file.88
new_file.11   new_file.27  new_file.42  new_file.58  new_file.73  new_file.89
new_file.12   new_file.28  new_file.43  new_file.59  new_file.74  new_file.9
new_file.13   new_file.29  new_file.44  new_file.6   new_file.75  new_file.90
new_file.14   new_file.3   new_file.45  new_file.60  new_file.76  new_file.91
new_file.15   new_file.30  new_file.46  new_file.61  new_file.77  new_file.92
new_file.16   new_file.31  new_file.47  new_file.62  new_file.78  new_file.93
new_file.17   new_file.32  new_file.48  new_file.63  new_file.79  new_file.94
new_file.18   new_file.33  new_file.49  new_file.64  new_file.8   new_file.95
new_file.19   new_file.34  new_file.5   new_file.65  new_file.80  new_file.96
new_file.2    new_file.35  new_file.50  new_file.66  new_file.81  new_file.97
new_file.20   new_file.36  new_file.51  new_file.67  new_file.82  new_file.98
new_file.21   new_file.37  new_file.52  new_file.68  new_file.83  new_file.99
new_file.22   new_file.38  new_file.53  new_file.69  new_file.84
new_file.23   new_file.39  new_file.54  new_file.7   new_file.85
[root@dhcp42-7 scripts]#

--- Additional comment from Poornima G on 2017-02-16 02:34:13 EST ---

The simple reproducer for this issue:
Create a plain distribute volume, enable cache-invalidation and marker feature on the server side:
gluster vol set <VOLNAME> features.cache-invalidation on
gluster vol set <VOLNAME> indexing on
gluster vol quota <VOLNAME> enable

And from the fuse mount point, create a file and rename the file. After this the bricks will crash.

The reason for the crash: on receiving a rename fop, marker_rename() stores the oldloc and newloc in its 'local' struct. Once the rename is done, the xtime marker (last updated time) is set on the file by sending a setxattr fop. When upcall receives the setxattr fop, loc->inode is NULL and it crashes. loc->inode can be NULL in only one valid case: a rename, where the inode of the new loc will be NULL. Hence, marker should have obtained the inode for new_loc and filled it in before issuing the setxattr.

Hence moving the component to marker.

This is similar to BZ: 1387204 that is already fixed, but when quota is enabled it takes a different code path. Will send the patch in marker-quota to fix the same.
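
The marker-side idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the types and the function sk_fill_newloc_inode are hypothetical, simplified stand-ins and not the real GlusterFS loc_t/inode_t or the code from the actual patch. It assumes that after a rename the inode reachable through the old loc is the one the new name refers to, and it models taking an inode reference with a plain counter.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real GlusterFS inode_t / loc_t. */
typedef struct sk_inode {
    int refcount;               /* models the inode reference count */
} sk_inode;

typedef struct sk_loc {
    const char *path;
    sk_inode   *inode;          /* NULL for the new name of a rename */
} sk_loc;

/*
 * Before a setxattr is built from the new loc, make sure it carries a
 * valid inode by borrowing the renamed file's inode from the old loc.
 * Returns 0 on success, -1 if no inode is available to borrow.
 */
int sk_fill_newloc_inode(sk_loc *newloc, const sk_loc *oldloc)
{
    if (newloc->inode != NULL)
        return 0;                   /* already populated, nothing to do */
    if (oldloc->inode == NULL)
        return -1;                  /* caller must not send the setxattr */

    oldloc->inode->refcount++;      /* stand-in for taking an inode reference */
    newloc->inode = oldloc->inode;
    return 0;
}
```

A caller would invoke sk_fill_newloc_inode() on the stored new loc once the rename has completed, and only issue the setxattr if it returns 0.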

--- Additional comment from RHEL Product and Program Management on 2017-02-16 02:41:18 EST ---

This bug report previously had all acks and release flag approved.
However since at least one of its acks has been changed, the
release flag has been reset to ? by the bugbot (pm-rhel).  The
ack needs to become approved before the release flag can become
approved again.

Comment 1 Worker Ant 2017-02-16 08:02:24 UTC
REVIEW: https://review.gluster.org/16633 (marker: Fix inode value in loc, in setxattr fop) posted (#1) for review on master by Poornima G (pgurusid)

Comment 2 Worker Ant 2017-02-16 08:04:25 UTC
REVIEW: https://review.gluster.org/16633 (marker: Fix inode value in loc, in setxattr fop) posted (#2) for review on master by Poornima G (pgurusid)

Comment 3 Worker Ant 2017-02-16 09:44:19 UTC
REVIEW: https://review.gluster.org/16633 (marker: Fix inode value in loc, in setxattr fop) posted (#3) for review on master by Poornima G (pgurusid)

Comment 4 Worker Ant 2017-02-20 05:11:46 UTC
COMMIT: https://review.gluster.org/16633 committed in master by Raghavendra G (rgowdapp) 
------
commit 73defab8be16b73241225bb1c2588a61e3e425d5
Author: Poornima G <pgurusid>
Date:   Thu Feb 16 13:05:25 2017 +0530

    marker: Fix inode value in loc, in setxattr fop
    
    On receiving a rename fop, marker_rename() stores the
    oldloc and newloc in its 'local' struct. Once the rename
    is done, the xtime marker (last updated time) is set on
    the file by sending a setxattr fop. When upcall
    receives the setxattr fop, the loc->inode is NULL and
    it crashes. The loc->inode can be NULL only in one valid
    case, i.e. in rename case where the inode of new loc
    can be NULL. Hence, marker should have filled the inode
    of the new_loc before issuing a setxattr.
    
    marker_rename_cbk was already fixed in a previous commit.
    Fixing marker_rename_done to send valid inode in this commit.
    
    Also in upcall check for NULL inode so that there is no crash.
    
    Change-Id: I3ed2a05118fed3367dfe3251ce4477310cb480d0
    BUG: 1422776
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: https://review.gluster.org/16633
    Reviewed-by: Kotresh HR <khiremat>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
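
The defensive check on the upcall side mentioned in the commit message ("in upcall check for NULL inode") might look roughly like the sketch below. It is an assumption-laden illustration, not the committed code: up_sketch_inode and up_sketch_ctx_get are hypothetical names, and the per-inode spinlock seen in the backtrace is modelled by a plain flag so the example stays self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-in for an inode that carries a lock and a context. */
typedef struct up_sketch_inode {
    int locked;                 /* models the per-inode spinlock */
    int ctx;                    /* models the per-inode upcall context */
} up_sketch_inode;

/*
 * Fetch the context of an inode. Returns 0 and stores the context in
 * *ctx_out on success. Returns -1 without touching *ctx_out when the
 * inode (or the output pointer) is NULL, instead of locking through a
 * NULL pointer.
 */
int up_sketch_ctx_get(up_sketch_inode *inode, int *ctx_out)
{
    if (inode == NULL || ctx_out == NULL)
        return -1;

    inode->locked = 1;          /* stand-in for acquiring the spinlock */
    *ctx_out = inode->ctx;
    inode->locked = 0;          /* stand-in for releasing the spinlock */
    return 0;
}
```

With such a guard, a fop arriving with a NULL inode leads to an error return that the caller can handle, rather than a SIGSEGV in the brick process.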

Comment 5 Shyamsundar 2017-05-30 18:44:10 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/

