Bug 1694595

Summary: gluster fuse mount crashed when deleting 2TB image file from RHV Manager UI
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: sharding
Assignee: Krutika Dhananjay <kdhananj>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: urgent
Priority: urgent
Version: rhgs-3.4
CC: amukherj, bkunal, jahernan, kdhananj, pasik, rhs-bugs, sabose, sheggodu, storage-qa-internal, ykaul
Target Release: RHGS 3.5.0
Hardware: x86_64
OS: Linux
Fixed In Version: glusterfs-6.0-5
Type: Bug
Clones: 1694604, 1696136
Last Closed: 2019-10-30 12:20:50 UTC
Bug Blocks: 1694604, 1696136, 1696807

Description SATHEESARAN 2019-04-01 08:32:45 UTC
Description of problem:
------------------------
When deleting a 2TB image file, the gluster fuse mount process crashed.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS 3.4.4 ( glusterfs-3.12.2-47.el7rhgs )

How reproducible:
-----------------
1/1

Steps to Reproduce:
-------------------
1. Create an image file of 2TB from the RHV Manager UI
2. Delete the same image file after it is created successfully
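
For reference, a minimal repro sketch outside the RHV Manager UI, assuming the crash can be reproduced by preallocating and then deleting a large file directly on the gluster fuse mount (the mount path and file name below are hypothetical):

#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical gluster fuse mount path and file name. */
    const char *path = "/mnt/glustervol/test-2t.img";
    const off_t size = 2LL * 1024 * 1024 * 1024 * 1024; /* 2 TiB */

    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Preallocate the full 2 TiB so that every shard actually exists. */
    int err = posix_fallocate(fd, 0, size);
    if (err != 0) {
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
        close(fd);
        return EXIT_FAILURE;
    }
    close(fd);

    /* Deleting the file exercises the shard translator's unlink path,
       which is where the reported crash occurred. */
    if (unlink(path) != 0) {
        perror("unlink");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}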

Actual results:
---------------
Fuse mount crashed

Expected results:
-----------------
The image file should be deleted successfully, with no fuse mount crash.

Comment 1 SATHEESARAN 2019-04-01 08:33:14 UTC
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2019-04-01 07:57:53
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7fc72c186b9d]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fc72c191114]
/lib64/libc.so.6(+0x36280)[0x7fc72a7c2280]
/usr/lib64/glusterfs/3.12.2/xlator/features/shard.so(+0x9627)[0x7fc71f8ba627]
/usr/lib64/glusterfs/3.12.2/xlator/features/shard.so(+0x9ef1)[0x7fc71f8baef1]
/usr/lib64/glusterfs/3.12.2/xlator/cluster/distribute.so(+0x3ae9c)[0x7fc71fb15e9c]
/usr/lib64/glusterfs/3.12.2/xlator/cluster/replicate.so(+0x9e8c)[0x7fc71fd88e8c]
/usr/lib64/glusterfs/3.12.2/xlator/cluster/replicate.so(+0xb79b)[0x7fc71fd8a79b]
/usr/lib64/glusterfs/3.12.2/xlator/cluster/replicate.so(+0xc226)[0x7fc71fd8b226]
/usr/lib64/glusterfs/3.12.2/xlator/protocol/client.so(+0x17cbc)[0x7fc72413fcbc]
/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fc72bf2ca00]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x26b)[0x7fc72bf2cd6b]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fc72bf28ae3]
/usr/lib64/glusterfs/3.12.2/rpc-transport/socket.so(+0x7586)[0x7fc727043586]
/usr/lib64/glusterfs/3.12.2/rpc-transport/socket.so(+0x9bca)[0x7fc727045bca]
/lib64/libglusterfs.so.0(+0x8a870)[0x7fc72c1e5870]
/lib64/libpthread.so.0(+0x7dd5)[0x7fc72afc2dd5]
/lib64/libc.so.6(clone+0x6d)[0x7fc72a889ead]

Comment 4 Krutika Dhananjay 2019-04-11 15:28:51 UTC
Upstream patches - https://review.gluster.org/q/topic:%22ref-1696136%22+(status:open%20OR%20status:merged)

Patch https://review.gluster.org/c/glusterfs/+/22517 fixes the original bug reported by Satheesaran.
I also identified another bug while debugging this original issue; it is fixed here: https://review.gluster.org/c/glusterfs/+/22507
The commit message and the .t should explain what it fixes and how to hit this crash.

There's also a third crash I found while reading the code that is harder to hit: it occurs only when the lru list is filled with a mix of shards from > 160-170 different vm images per hypervisor, each of them > 6GB in size, and all of them are created with preallocation in parallel and then immediately deleted in parallel.
I have yet to fix it because it's a harder problem to solve: the very shards that are required in a deletion operation could end up evicting and inode_unlink()ing the other participant shards of the same image, leading to incorrect behaviour. In my tests, at most I could see a crash, but the unlink itself succeeded. I'm surprised the unlink even worked; I need to debug why.
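
To make that scenario concrete, here is a simplified, hypothetical sketch, not the actual shard translator code and with all names made up, of how a bounded lru cache can evict entries that an in-flight delete still needs:

#include <stdio.h>
#include <stdlib.h>

#define LRU_CAPACITY 3          /* stand-in for the shard lru limit */
#define SHARDS_IN_DELETE 6      /* shards participating in one delete */

struct shard_entry {
    int shard_num;
    int evicted;                /* the real code would inode_unlink/free it */
};

struct lru_cache {
    struct shard_entry *slots[LRU_CAPACITY];
    int count;
};

/* Adds a shard to the cache; when full, evicts the oldest entry without
 * asking whether an in-flight operation still needs it -- the essence of
 * the scenario described in the comment above. */
static struct shard_entry *cache_add(struct lru_cache *c, int shard_num)
{
    if (c->count == LRU_CAPACITY) {
        c->slots[0]->evicted = 1;
        for (int i = 1; i < LRU_CAPACITY; i++)
            c->slots[i - 1] = c->slots[i];
        c->count--;
    }
    struct shard_entry *e = calloc(1, sizeof(*e));
    e->shard_num = shard_num;
    c->slots[c->count++] = e;
    return e;
}

int main(void)
{
    struct lru_cache cache = { 0 };
    struct shard_entry *participants[SHARDS_IN_DELETE];

    /* Resolve all shards of one image for a single delete. */
    for (int i = 0; i < SHARDS_IN_DELETE; i++)
        participants[i] = cache_add(&cache, i);

    /* Resolving the later shards pushed the earlier ones out: the delete
       would now be operating on entries the cache has already discarded. */
    for (int i = 0; i < SHARDS_IN_DELETE; i++)
        if (participants[i]->evicted)
            printf("shard %d evicted while its delete was in flight\n", i);

    return 0;
}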

Moving the bz to POST in any case.

Comment 5 Atin Mukherjee 2019-06-03 13:50:17 UTC
Krutika - we need to get this bug fixed in the early stage of development. I see that there's one patch which is pending review. Can we please ensure this patch is merged and backports are done so that this BZ can move to ON_QA in the next build of RHGS 3.5.0?

Comment 6 Krutika Dhananjay 2019-06-03 14:26:02 UTC
(In reply to Atin Mukherjee from comment #5)
> Krutika - we need to get this bug fixed in the early stage of development. I
> see that there's one patch which is pending review. Can we please ensure
> this patch is merged and backports are done so that this BZ can move to
> ON_QA in the next build of RHGS 3.5.0?

Ack. Will ping Xavi for review.

Comment 12 SATHEESARAN 2019-07-03 16:59:47 UTC
Tested with RHVH 4.3.5 (based on RHEL 7.7) with glusterfs-6.0-7, using 2 test scenarios:

1. Created multiple preallocated raw images with an aggregate size exceeding 2TB
and deleted them all together (concurrently)

2. Created multiple 2TB preallocated raw images and deleted them concurrently

In both of the above scenarios, the deletion of VM images was smooth;
no issues were seen, all hosts were operational, and the DC was fully functional.
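
For reference, a rough sketch of the concurrent-deletion part of the scenarios above, assuming the images can be deleted directly from the fuse mount (the mount path, image names, and image count are hypothetical):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_IMAGES 4

/* Hypothetical preallocated raw images on the gluster fuse mount. */
static const char *images[NUM_IMAGES] = {
    "/mnt/glustervol/image-0.raw",
    "/mnt/glustervol/image-1.raw",
    "/mnt/glustervol/image-2.raw",
    "/mnt/glustervol/image-3.raw",
};

static void *delete_image(void *arg)
{
    const char *path = arg;
    if (unlink(path) != 0)
        perror(path);
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_IMAGES];

    /* Fire all deletions at once so the shard translator handles
       several multi-terabyte unlinks concurrently. */
    for (int i = 0; i < NUM_IMAGES; i++)
        pthread_create(&threads[i], NULL, delete_image, (void *)images[i]);

    for (int i = 0; i < NUM_IMAGES; i++)
        pthread_join(threads[i], NULL);

    return 0;
}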

Comment 14 errata-xmlrpc 2019-10-30 12:20:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249