Bug 1568758 - Block delete times out for blocks created of very large size
Summary: Block delete times out for blocks created of very large size
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: sharding
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Krutika Dhananjay
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On: 1520882
Blocks: 1503143 1696807
 
Reported: 2018-04-18 08:59 UTC by Sweta Anandpara
Modified: 2019-10-30 12:20 UTC (History)

Fixed In Version: glusterfs-6.0-2
Doc Type: Bug Fix
Doc Text:
Deleting a file with a large number of shards timed out because unlink operations occurred on all shards in parallel, which led to contention on the .shard directory. Timeouts resulted in failed deletions and stale shards remaining in the .shard directory. Shard deletion is now a background process that deletes one batch of shards at a time, to control contention on the .shard directory and prevent timeouts. The size of shard deletion batches is controlled with the features.shard-deletion-rate option, which is set to 100 by default.
Clone Of:
Environment:
Last Closed: 2019-10-30 12:19:38 UTC
Embargoed:
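
The batching behaviour described in the Doc Text can be illustrated with a minimal Python sketch. It is not the shard translator's implementation (which is part of glusterfs itself); the function names `deletion_batches` and `delete_in_background` and the `unlink` callback are hypothetical, and only the default batch size of 100 comes from the Doc Text.

```python
from typing import Callable, Iterator, List

# Default of features.shard-deletion-rate according to the Doc Text above.
DEFAULT_DELETION_RATE = 100


def deletion_batches(shard_ids: List[int],
                     rate: int = DEFAULT_DELETION_RATE) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `rate` shard ids (hypothetical helper)."""
    if rate <= 0:
        raise ValueError("rate must be a positive integer")
    for start in range(0, len(shard_ids), rate):
        yield shard_ids[start:start + rate]


def delete_in_background(shard_ids: List[int],
                         unlink: Callable[[int], None],
                         rate: int = DEFAULT_DELETION_RATE) -> int:
    """Unlink shards one batch at a time and return the number of batches used.

    Only one batch is in flight at any moment, which is the property that
    limits contention on the shared directory.
    """
    batches = 0
    for batch in deletion_batches(shard_ids, rate):
        for shard in batch:
            unlink(shard)  # stand-in for the real unlink of one shard
        batches += 1
    return batches


if __name__ == "__main__":
    removed: List[int] = []
    used = delete_in_background(list(range(1, 251)), removed.append)
    print(f"removed {len(removed)} shards in {used} batches")
```

Volume options are normally changed with `gluster volume set <VOLNAME> <option> <value>`, so the batch size would presumably be tuned that way; the valid range for the option is not stated in this report.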


Attachments
Verification logs on rhgs3.5.0 (85.31 KB, text/plain)
2019-07-02 07:17 UTC, Sweta Anandpara


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1522624 0 high CLOSED [GSS]shard files present even after deleting vm from the rhev 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2019:3249 0 None None None 2019-10-30 12:20:03 UTC

Internal Links: 1522624

Description Sweta Anandpara 2018-04-18 08:59:53 UTC
Description of problem:
======================

I had a 6-node cluster with a 1*3 volume 'ozone' created on node1, node2 and node3. Brick multiplexing was enabled on the setup, and the volume option group 'gluster-block' was applied to the volume.

A few (<10) blocks had been created while verifying bz 1514344 and bz 1545049. I started deleting the blocks one by one, and the 'gluster-block delete' command timed out for block 'ob10'. Unable to see anything amiss in the logs under /var/log/gluster-block or in /var/log/messages, I restarted all the services and tried again. This time the command succeeded for a few blocks and then timed out again for block 'ob9'.

Looking for what the two blocks that timed out have in common: both are very large, 1E and 1P.
The entire system slows down after the command fails, I suppose because internally it keeps trying to complete the delete and is not able to get through. In other words, every 'gluster-block' command issued after the failure takes 2-3 minutes to show its output. Restarting the gluster-block daemon does get the system back to normal.

I was not able to gather much from the logs; maybe we need better logging there.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.12.2-7
tcmu-runner-1.2.0-18
gluster-block-0.2.1-17


How reproducible:
================
3:3


Steps to Reproduce:
==================
1. Create a block of 1K, 1M, 1G, 1T, 1P, 1E on a replica 3 volume
2. Execute 'gluster-block delete' on all the blocks created above


Actual results:
==============
Block delete succeeds for blocks of sizes 1K, 1M, 1G, 1T, but times out on 1P and 1E.


Expected results:
================
Either block delete should succeed immediately, or creation of such large blocks should be disallowed if it affects functionality.



Additional info:
===============
Sosreports and gluster-block logs will be copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/swetas/<bugnumber>

Comment 3 Pranith Kumar K 2018-04-18 09:08:58 UTC
Sweta,
   Could you disable sharding and redo this test? I suspect that this has to do with the sharding xlator taking a lot of time to delete the individual shards. Krutika is working on doing unlinks in the background as part of https://bugzilla.redhat.com/show_bug.cgi?id=1520882 for 3.4.0.

Comment 4 Pranith Kumar K 2018-04-18 09:09:59 UTC
(In reply to Pranith Kumar K from comment #3)
> Sweta,
>    Could you disable sharding and redo this test? I suspect that this has
> to do with the sharding xlator taking a lot of time to delete the individual
> shards. Krutika is working on doing unlinks in the background as part of
> https://bugzilla.redhat.com/show_bug.cgi?id=1520882 for 3.4.0.

Please note that you need to both create and delete the block volume while sharding is disabled for us to confirm that the delay was introduced because of sharding.
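
To put the sharding hypothesis in rough numbers, the sketch below estimates how many shard-sized pieces each test size spans. It rests on two assumptions that this report does not confirm: that the size suffixes are binary units (1P = 2^50 bytes, 1E = 2^60 bytes), and that the 'gluster-block' option group sets a 64 MiB shard block size. The names `SHARD_BLOCK_SIZE`, `UNITS` and `shard_count` are illustrative only.

```python
# Assumption: shard block size of 64 MiB (not stated in this report).
SHARD_BLOCK_SIZE = 64 * 2**20

# Assumption: size suffixes are binary multiples.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30,
         "T": 2**40, "P": 2**50, "E": 2**60}


def shard_count(size: str, shard_size: int = SHARD_BLOCK_SIZE) -> int:
    """Return how many shard-sized pieces are needed to cover `size` (e.g. '1P')."""
    value, unit = int(size[:-1]), size[-1]
    total_bytes = value * UNITS[unit]
    return -(-total_bytes // shard_size)  # ceiling division


if __name__ == "__main__":
    for size in ("1K", "1M", "1G", "1T", "1P", "1E"):
        print(f"{size}: {shard_count(size):,} shard-sized pieces")
```

Under these assumptions the jump from 1T to 1P is a factor of 1024 in the number of pieces, which would be consistent with deletes succeeding up to 1T and timing out beyond it if the delete has to visit every potential shard.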

Comment 14 Krutika Dhananjay 2018-10-31 04:48:40 UTC
Note: The fixes to this issue have been merged upstream: https://review.gluster.org/#/q/status:merged+project:glusterfs+branch:master+topic:ref-1568521

Moving this bug to POST state.

Comment 18 SATHEESARAN 2018-11-29 12:14:43 UTC
The fix for this issue is already merged, and the other bug, BZ 1520882, is ON_QA.
It would make sense to move this bug to ON_QA as well, since the fix addresses this issue too.

Why has this bug not been moved to ON_QA?

Comment 19 Krutika Dhananjay 2018-11-29 13:19:57 UTC
(In reply to SATHEESARAN from comment #18)
> The fix for this issue is already merged, and the other bug, BZ 1520882, is
> ON_QA.
> It would make sense to move this bug to ON_QA as well, since the fix
> addresses this issue too.
> 
> Why has this bug not been moved to ON_QA?

Ok. I don't completely understand the process, but shouldn't this be done only when all 3 acks are in place? Let me know if that is not the case.

-Krutika

Comment 29 Sweta Anandpara 2019-07-02 07:17:40 UTC
Created attachment 1586539
Verification logs on rhgs3.5.0

Comment 37 errata-xmlrpc 2019-10-30 12:19:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249

