Bug 1456696 - Multiple crashes observed on slave side coming from: dht_rmdir_cached_lookup_cbk on 3.2.0_async
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.2.0 Async
Assignee: Nithya Balachandran
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-05-30 07:47 UTC by Rahul Hinduja
Modified: 2017-06-08 09:37 UTC
CC: 5 users

Fixed In Version: glusterfs-3.8.4-18.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-08 09:37:15 UTC
Embargoed:




Links:
Red Hat Product Errata RHBA-2017:1418 (public, priority normal, status SHIPPED_LIVE): glusterfs bug fix update, last updated 2017-06-08 13:33:58 UTC

Description Rahul Hinduja 2017-05-30 07:47:35 UTC
Description of problem:
=======================

While running rm -rf (rmdir fops) on a geo-rep master setup, multiple crashes were observed on the slave side with the following backtrace:

(gdb) bt
#0  0x00007f570287bc00 in dht_rmdir_do (frame=frame@entry=0x7f570e68b1d0, this=this@entry=0x7f56fc00e5e0) at dht-common.c:7944
#1  0x00007f570287c4ab in dht_rmdir_cached_lookup_cbk (frame=frame@entry=0x7f570e68a06c, cookie=<optimized out>, this=0x7f56fc00e5e0, op_ret=0, op_errno=<optimized out>, inode=<optimized out>, 
    stbuf=stbuf@entry=0x7f56f0021410, xattr=0x7f570de29e88, parent=0x7f56f0021480) at dht-common.c:8137
#2  0x00007f5702b13056 in afr_lookup_done (frame=frame@entry=0x7f570e68ac04, this=this@entry=0x7f56fc00d670) at afr-common.c:2167
#3  0x00007f5702b13a04 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f570e68ac04, this=0x7f56fc00d670, this@entry=0x8072f15dde9c0700) at afr-common.c:2410
#4  0x00007f5702b14331 in afr_lookup_entry_heal (frame=frame@entry=0x7f570e68ac04, this=0x8072f15dde9c0700, this@entry=0x7f56fc00d670) at afr-common.c:2501
#5  0x00007f5702b1469d in afr_lookup_cbk (frame=frame@entry=0x7f570e68ac04, cookie=<optimized out>, this=0x7f56fc00d670, op_ret=<optimized out>, op_errno=<optimized out>, inode=inode@entry=0x7f56fa6af20c, 
    buf=buf@entry=0x7f56fb48e940, xdata=0x7f570de2a138, postparent=postparent@entry=0x7f56fb48e9b0) at afr-common.c:2549
#6  0x00007f5702d515dd in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f570e68b0fc) at client-rpc-fops.c:2945
#7  0x00007f571097a860 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f56fc090970, pollin=pollin@entry=0x7f56f0004e20) at rpc-clnt.c:794
#8  0x00007f571097ab4f in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f56fc0909a0, event=<optimized out>, data=0x7f56f0004e20) at rpc-clnt.c:987
#9  0x00007f57109769f3 in rpc_transport_notify (this=this@entry=0x7f56fc0a0690, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f56f0004e20) at rpc-transport.c:538
#10 0x00007f570523b314 in socket_event_poll_in (this=this@entry=0x7f56fc0a0690) at socket.c:2272
#11 0x00007f570523d7c5 in socket_event_handler (fd=<optimized out>, idx=1, data=0x7f56fc0a0690, poll_in=1, poll_out=0, poll_err=0) at socket.c:2402
#12 0x00007f5710c0a770 in event_dispatch_epoll_handler (event=0x7f56fb48ee80, event_pool=0x7f571156de10) at event-epoll.c:571
#13 event_dispatch_epoll_worker (data=0x7f56fc07bbf0) at event-epoll.c:674
#14 0x00007f570fa11dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f570f35673d in clone () from /lib64/libc.so.6
(gdb) 
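For anyone triaging similar slave-side cores: a backtrace like the one above can be pulled non-interactively by running gdb in batch mode against the glusterfs binary. This is a generic sketch, not taken from this report; the core path is a placeholder, and glusterfs-debuginfo must be installed for the file:line resolution shown above.

# hypothetical core path; the actual location depends on abrt/kernel.core_pattern
gdb /usr/sbin/glusterfs /path/to/core.12345 -batch \
    -ex 'bt' \
    -ex 'frame 0' \
    -ex 'info locals'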



Version-Release number of selected component (if applicable):
=============================================================

glusterfs-server-3.8.4-18.2.el7rhgs.x86_64
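As a sanity check when reproducing, the installed build on both master and slave nodes can be confirmed with a standard rpm query (nothing bug-specific):

rpm -qa 'glusterfs*'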


How reproducible:
=================

Always


Steps to Reproduce:
===================
1. Setup geo-rep between master and slave
2. Create data on master 
3. Perform rm -rf on master (a command-level sketch follows below)
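A hedged sketch of these steps using the gluster CLI; the volume names (mastervol, slavevol), slave host (slavehost), and mount point (/mnt/master) are placeholders, not values from this report:

# 1. create and start a geo-rep session (assumes ssh keys / pem setup is already done)
gluster volume geo-replication mastervol slavehost::slavevol create push-pem
gluster volume geo-replication mastervol slavehost::slavevol start

# 2. create a directory tree on the master mount
mkdir -p /mnt/master/dir{1..100}
for d in /mnt/master/dir*; do touch "$d"/file{1..10}; done

# 3. remove it all while geo-rep is syncing to the slave
rm -rf /mnt/master/*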

Actual results:
===============

Multiple glusterfs processes crashed on the slave.

Comment 8 Rahul Hinduja 2017-06-03 10:22:30 UTC
Verified the same case with build glusterfs-3.8.4-18.4.el7rhgs.x86_64.

No cores were observed on the slave, and the sync completed for all fops, including rmdir. Moving this bug to the verified state.
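For reference, the two verification points can be re-checked roughly as follows; core locations vary by system configuration, and the session names are the same placeholders used above:

# look for fresh cores on the slave nodes
ls -l /core.* 2>/dev/null

# confirm the geo-rep session is healthy and has drained pending changes
gluster volume geo-replication mastervol slavehost::slavevol status detail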

Comment 10 errata-xmlrpc 2017-06-08 09:37:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1418

