Bug 1662368

Summary: [ovirt-gluster] Fuse mount crashed while deleting a 1 TB image file from ovirt
Product: [Community] GlusterFS
Component: sharding
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Keywords: Triaged
Reporter: Krutika Dhananjay <kdhananj>
Assignee: Krutika Dhananjay <kdhananj>
QA Contact: bugs <bugs>
CC: bugs, guillaume.pavese, rhs-bugs, sankarshan, sasundar, storage-qa-internal
Fixed In Version: glusterfs-6.0
Clone Of: 1662059
Clones: 1665803
Bug Blocks: 1662059, 1665803
Environment: RHV-RHGS Integration
Last Closed: 2019-03-25 16:32:53 UTC

Description Krutika Dhananjay 2018-12-28 02:08:01 UTC
+++ This bug was initially created as a clone of Bug #1662059 +++

Description of problem:
-----------------------

Attempts were made to reproduce the customer scenario, in which large disks residing on gluster volumes are deleted from ovirt. During one such attempt, it was found that the fuse mount process had crashed.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
ovirt 4.0.5
gluster-master

How reproducible:
-----------------
1/1 
Hit it once.

Steps to Reproduce:
-------------------
1. An ovirt storage domain is configured to use a gluster arbitrated replicate volume
with sharding enabled (a volume-setup sketch follows these steps)
2. Create disk of size 1TB from ovirt Manager UI
3. Delete the disk from ovirt Manager UI
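
For reference, a minimal sketch of how such a volume could be provisioned (volume name and brick paths taken from the volume info in the comment below; the "group virt" profile is an assumption about how sharding and the other virt-store options were enabled, not something stated in this report):

    # gluster volume create imstore replica 3 arbiter 1 \
          server1:/gluster_bricks/vol1/im1 \
          server2:/gluster_bricks/vol1/im1 \
          server3:/gluster_bricks/vol1/im1
    # gluster volume set imstore group virt
    # gluster volume set imstore features.shard on
    # gluster volume start imstore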

Actual results:
---------------
The gluster fuse mount process crashed on one of the hypervisors

Expected results:
-----------------
No gluster process should crash

--- Additional comment from SATHEESARAN on 2018-12-25 18:03:56 UTC ---

1. Gluster cluster info
------------------------
There are 3 nodes in the gluster cluster

2. Gluster volume info
----------------------
[root@rhsqa-grafton7-nic2 ~]# gluster volume info imstore
 
Volume Name: imstore
Type: Replicate
Volume ID: 878eb828-0735-4ce8-a2b3-c52a757ee1b2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: server1:/gluster_bricks/vol1/im1
Brick2: server2:/gluster_bricks/vol1/im1
Brick3: server3:/gluster_bricks/vol1/im1 (arbiter)
Options Reconfigured:
performance.strict-o-direct: on
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 30
cluster.granular-entry-heal: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
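
Note: features.shard-block-size is not among the reconfigured options above, which suggests the volume is using the default shard size of 64 MB. If needed, this can be confirmed with:

    # gluster volume get imstore features.shard-block-size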



4. Test info
-------------
File that was deleted: /rhev/data-center/mnt/glusterSD/server1\:_imstore/3d4f163a-c6e0-476e-a726-bd780e0d1b83/images/075c6ffd-318c-4108-8405-ccf8078c1e16/b62a4640-f02a-4aa1-b249-cfc4cb2f7f59
GFID of this file is: 3d231d2b-4fff-4c03-b593-70befaf77296

Before deleting the file:
[root@server1 ~]# ls /gluster_bricks/vol1/im1/.shard/ |grep 3d231 | wc -l
16383
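
For context, this count matches what sharding would produce for a fully allocated 1 TB image, assuming the default 64 MB shard-block-size: 1 TiB / 64 MiB = 16384 blocks, of which block 0 is stored in the base file itself, leaving 16383 numbered shards under .shard.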

While the deletion is in progress:
[root@server1 ~]# ls /gluster_bricks/vol1/im1/.shard/ |grep 3d231 | wc -l
3983

After the fuse mount crash there were some ghost shards left behind, but after 15 minutes
there were no ghost shards:
[root@server1 ~]# ls /gluster_bricks/vol1/im1/.shard/.remove_me/
[root@server1 ~]# ls /gluster_bricks/vol1/im1/.shard/ |grep 3d231 | wc -l
0

--- Additional comment from SATHEESARAN on 2018-12-26 06:27:00 UTC ---

(In reply to Krutika Dhananjay from comment #5)
> So there is no core dump and I can't tell much from just the logs.
> 
> From
> [root@dhcp37-127 ~]# cat /proc/sys/kernel/core_pattern 
> |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %e %i
> 
> Seems like this should be set to a valid path for us to get the core dump.
> 
> Would be great if you can change this value to a meaningful path and
> recreate the issue.
> 
> -Krutika
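
(For reference, one way to do that is to point core_pattern at a plain file, for example the path below, and raise the core size limit in the shell that starts the glusterfs mount process:

    # sysctl -w kernel.core_pattern=/var/tmp/core.%e.%p.%t
    # ulimit -c unlimited
)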

I could reproduce the issue consistently outside of the ovirt-gluster setup, with 3 gluster servers and 1 client.

1. Create 5 VM image files on the fuse mounted gluster volume using qemu-img command
    # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm1.img 10G
    # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm2.img 7G
    # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm3.img 5G
    # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm4.img 4G

2. Delete the files from the mount
    # rm -rf /mnt/testdata/*

The above step hits the crash almost consistently.

I will reinstall the required debug packages and will provide the setup for debugging

--- Additional comment from SATHEESARAN on 2018-12-26 06:33:43 UTC ---

Backtrace from the core file

Core was generated by `/usr/sbin/glusterfs --volfile-server=10.70.37.152 --volfile-id=/volume1 /mnt/te'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fdb3d8eb53e in shard_unlink_block_inode (local=local@entry=0x7fdb2400a400, shard_block_num=<optimized out>) at shard.c:2945
2945	                        base_ictx->fsync_count--;
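
For reference, a backtrace like the one above can be pulled from the core by opening it with gdb against the glusterfs binary, with the matching debuginfo packages installed (the core path follows the example core_pattern shown earlier; <pid> and <time> are placeholders):

    # debuginfo-install glusterfs
    # gdb /usr/sbin/glusterfs /var/tmp/core.glusterfs.<pid>.<time>
    (gdb) bt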


--- Additional comment from Krutika Dhananjay on 2018-12-26 14:23:28 UTC ---

(In reply to SATHEESARAN from comment #6)
> (In reply to Krutika Dhananjay from comment #5)
> > So there is no core dump and I can't tell much from just the logs.
> > 
> > From
> > [root@dhcp37-127 ~]# cat /proc/sys/kernel/core_pattern 
> > |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %e %i
> > 
> > Seems like this should be set to a valid path for us to get the core dump.
> > 
> > Would be great if you can change this value to a meaningful path and
> > recreate the issue.
> > 
> > -Krutika
> 
> I could reproduce the issue consistently outside of RHV-RHGS setup.
> With 3 RHGS servers and 1 client.
> 
> 1. Create 5 VM image files on the fuse mounted gluster volume using qemu-img
> command
>     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm1.img 10G
>     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm2.img 7G
>     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm3.img 5G
>     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm4.img 4G
> 
> 2. Delete the files from the mount
>     # rm -rf /mnt/testdata/*
> 
> The above step hits the crash almost consistently.
> 
> I will reinstall the required debug packages and will provide the setup for
> debugging


Is the mountpoint in step 1 different from the one used in step 2? In step 1, files are created under /mnt/test/, but the rm -rf is done from /mnt/testdata/.

-Krutika

--- Additional comment from SATHEESARAN on 2018-12-26 15:38:21 UTC ---

(In reply to Krutika Dhananjay from comment #8)
> (In reply to SATHEESARAN from comment #6)
> > (In reply to Krutika Dhananjay from comment #5)
> > > So there is no core dump and I can't tell much from just the logs.
> > > 
> > > From
> > > [root@dhcp37-127 ~]# cat /proc/sys/kernel/core_pattern 
> > > |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %e %i
> > > 
> > > Seems like this should be set to a valid path for us to get the core dump.
> > > 
> > > Would be great if you can change this value to a meaningful path and
> > > recreate the issue.
> > > 
> > > -Krutika
> > 
> > I could reproduce the issue consistently outside of RHV-RHGS setup.
> > With 3 RHGS servers and 1 client.
> > 
> > 1. Create 5 VM image files on the fuse mounted gluster volume using qemu-img
> > command
> >     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm1.img 10G
> >     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm2.img 7G
> >     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm3.img 5G
> >     # qemu-img create -f qcow2 -o preallocation=full /mnt/test/vm4.img 4G
> > 
> > 2. Delete the files from the mount
> >     # rm -rf /mnt/testdata/*
> > 
> > The above step hits the crash almost consistently.
> > 
> > I will reinstall the required debug packages and will provide the setup for
> > debugging
> 
> 
> Is the mountpoint in step 1 different from the one used in 2? In step 1,
> files are created under /mnt/test/. But the rm -rf is done from
> /mnt/testdata/
> 
> -Krutika

I did it from the same mount; no different mounts were involved.

Comment 1 Worker Ant 2018-12-28 02:10:20 UTC
REVIEW: https://review.gluster.org/21946 (features/shard: Assign fop id during background deletion to prevent excessive logging) posted (#1) for review on master by Krutika Dhananjay

Comment 2 Worker Ant 2018-12-28 15:43:45 UTC
REVIEW: https://review.gluster.org/21957 (features/shard: Fix launch of multiple synctasks for background deletion) posted (#1) for review on master by Krutika Dhananjay

Comment 3 Worker Ant 2019-01-08 12:09:39 UTC
REVIEW: https://review.gluster.org/21946 (features/shard: Assign fop id during background deletion to prevent excessive logging) posted (#7) for review on master by Xavi Hernandez

Comment 4 Worker Ant 2019-01-11 08:36:32 UTC
REVIEW: https://review.gluster.org/21957 (features/shard: Fix launch of multiple synctasks for background deletion) posted (#7) for review on master by Xavi Hernandez

Comment 5 Worker Ant 2019-01-14 06:39:47 UTC
REVIEW: https://review.gluster.org/22018 (features/shard: Assign fop id during background deletion to prevent excessive logging) posted (#1) for review on release-5 by Krutika Dhananjay

Comment 6 Worker Ant 2019-01-14 06:42:50 UTC
REVISION POSTED: https://review.gluster.org/22018 (features/shard: Assign fop id during background deletion to prevent excessive logging) posted (#2) for review on release-5 by Krutika Dhananjay

Comment 7 Shyamsundar 2019-03-25 16:32:53 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/