1288509 – rm -rf is taking very long time

Summary: rm -rf is taking very long time

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	tier
Sub Component:
Version:	rhgs-3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	RHGS 3.1.2
Assignee:	Bug Updates Notification Mailing List
QA Contact:	RajeshReddy
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-12-04 13:00 UTC by RajeshReddy
Modified:	2016-09-17 15:34 UTC
CC List:	11 users
Fixed In Version:	glusterfs-3.7.5-15
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-03-01 06:00:54 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2016:0193	0	normal	SHIPPED_LIVE	Red Hat Gluster Storage 3.1 update 2	2016-03-01 10:20:36 UTC

Description RajeshReddy 2015-12-04 13:00:21 UTC

Description of problem:
===============
rm -rf is taking a very long time.

Version-Release number of selected component (if applicable):
================
glusterfs-server-3.7.5-9


How reproducible:


Steps to Reproduce:
==============
1. Create a 2x2 volume, mount it on a client using FUSE, create a directory, and create 50k files in it.
2. Attach 2x2 hot bricks to the volume, then create a new directory and create around 20k files in it.
3. Kill all the brick processes and restart the volume using the force option.
4. While files are being demoted, run rm -rf *.

Actual results:
==========
rm -rf * took more than 2 hours.


Expected results:
==========
Deletion should not take this long.
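
To put a number on the deletion time in step 4, a small helper along these lines could be used. This is only a sketch: the function name time_rm, the testdir subdirectory and the file.N naming are made up for illustration, and on a real run the first argument would be the FUSE mount (for example /mnt/test_tier) rather than a local directory.

```shell
#!/bin/sh
# Sketch: create COUNT empty files under DIR/testdir, then time "rm -rf" on them.
# time_rm, testdir and the file.N naming are illustrative, not from the report.
time_rm() {
    dir=$1
    count=$2
    mkdir -p "$dir/testdir" || return 1

    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/testdir/file.$i"      # create an empty file
        i=$((i + 1))
    done

    start=$(date +%s)
    rm -rf "$dir/testdir"
    end=$(date +%s)

    # Print the elapsed wall-clock time in whole seconds.
    echo "$((end - start))"
}

# Example (not run here): time_rm /mnt/test_tier 50000
```

The helper reports whole seconds only, which is coarse but sufficient when the difference of interest is minutes versus hours.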


Additional info:
=============
[root@rhs-client19 test_tier-tier-dht]# gluster vol info test_tier 
 
Volume Name: test_tier
Type: Tier
Volume ID: 9bca8ffb-d47c-4636-95ab-2cfc58da422e
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick5/test_tier_hot4
Brick2: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick5/test_tier_hot4
Brick3: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick4/test_tier_hot3
Brick4: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick4/test_tier_hot3
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick7/test_tier_hot1
Brick6: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick7/test_tier_hot1
Brick7: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick6/test_tier_hot2
Brick8: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick6/test_tier_hot2
Options Reconfigured:
cluster.tier-mode: test
features.ctr-enabled: on
performance.readdir-ahead: on

Client name: vertigo.lab.eng.blr.redhat.com
Mount: /mnt/test_tier

Comment 2 RajeshReddy 2015-12-04 13:04:26 UTC

sosreport available at /home/repo/sosreports/bug.1288509 on rhsqe-repo.lab.eng.blr.redhat.com

Comment 3 RajeshReddy 2015-12-08 10:53:10 UTC

Able to reproduce the issue without step 3 (killing all the brick processes and restarting the volume using the force option).

Comment 6 RajeshReddy 2015-12-11 12:19:44 UTC

I was not running rename operations

Comment 7 Joseph Elwin Fernandes 2015-12-20 04:22:18 UTC

http://review.gluster.org/12972

Comment 8 Joseph Elwin Fernandes 2015-12-22 15:32:58 UTC

https://code.engineering.redhat.com/gerrit/#/c/64372/1

Options provided for good performance:

gluster vol set <VOLNAME> features.ctr-sql-db-wal-autocheckpoint 25000
gluster vol set <VOLNAME> features.ctr-sql-db-cachesize 12500

Run gluster vol set help for details.
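
Applied to a concrete volume, the two settings might be wrapped as below. This is a hedged sketch rather than a recommended procedure: apply_ctr_tuning and the GLUSTER variable are hypothetical names introduced here (GLUSTER exists only so the commands can be dry-run with echo), and the values are the ones quoted in this comment.

```shell
#!/bin/sh
# Sketch: set the two CTR database options from this comment on one volume.
# GLUSTER and apply_ctr_tuning are illustrative names, not part of glusterfs.
GLUSTER=${GLUSTER:-gluster}

apply_ctr_tuning() {
    vol=$1
    [ -n "$vol" ] || { echo "usage: apply_ctr_tuning VOLNAME" >&2; return 2; }

    "$GLUSTER" volume set "$vol" features.ctr-sql-db-wal-autocheckpoint 25000 &&
    "$GLUSTER" volume set "$vol" features.ctr-sql-db-cachesize 12500
}

# Dry run (prints the arguments instead of calling gluster):
#   GLUSTER=echo; apply_ctr_tuning test_tier
```

Chaining the two calls with && stops after the first failure, so a rejected option is not hidden by a later success.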

Comment 10 RajeshReddy 2015-12-23 13:18:10 UTC

Deleting 50k files took more than two hours.

Comment 11 Dan Lambright 2015-12-23 16:29:25 UTC

Can you turn off ctr and rerun the test?

gluster volume set <VOLNAME> features.ctr-enabled off

Comment 13 RajeshReddy 2015-12-24 07:28:01 UTC

Tested with build glusterfs-server-3.7.5-13. After setting features.ctr-sql-db-cachesize to 12500 and features.ctr-sql-db-wal-autocheckpoint to 25000, removal of 50k files took 6m, which is clearly good performance.

I need to repeat the same tests with a build that has the above settings as default values, so please move this bug to ON_QA once a build has the proper default values.

Comment 14 Joseph Elwin Fernandes 2015-12-24 14:38:57 UTC

Sure. Thanks :)

Comment 15 Joseph Elwin Fernandes 2015-12-24 14:45:24 UTC

Sure

Comment 16 Joseph Elwin Fernandes 2015-12-31 03:59:26 UTC

https://code.engineering.redhat.com/gerrit/64642

Comment 18 RajeshReddy 2016-01-06 12:45:42 UTC

Tested with the 3.7.5-14 build. After creating a new tiered volume, I checked the default values of features.ctr-sql-db-wal-autocheckpoint and features.ctr-sql-db-cachesize. They have not been changed, so I am marking this bug as failed QA.


[root@tettnang afr1x2_tier_bug]# rpm -qa | grep glusterfs 
glusterfs-3.7.5-14.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-14.el7rhgs.x86_64
glusterfs-fuse-3.7.5-14.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-14.el7rhgs.x86_64
glusterfs-api-3.7.5-14.el7rhgs.x86_64
glusterfs-rdma-3.7.5-14.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-14.el7rhgs.x86_64
glusterfs-server-3.7.5-14.el7rhgs.x86_64
glusterfs-ganesha-3.7.5-14.el7rhgs.x86_64
glusterfs-cli-3.7.5-14.el7rhgs.x86_64
glusterfs-libs-3.7.5-14.el7rhgs.x86_64
glusterfs-api-devel-3.7.5-14.el7rhgs.x86_64
glusterfs-devel-3.7.5-14.el7rhgs.x86_64
[root@tettnang afr1x2_tier_bug]# gluster vol get perform_create all | grep cachesize
features.ctr-sql-db-cachesize           1000                                    
[root@tettnang afr1x2_tier_bug]# gluster vol get perform_create all | grep ctr-sql-db-wal-autocheckpoint
features.ctr-sql-db-wal-autocheckpoint  1000                                    
[root@tettnang afr1x2_tier_bug]#
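
The manual grep checks above could be folded into one pass/fail helper. The following is a sketch under assumptions: check_option is a hypothetical name, and it assumes the two-column "option value" layout that gluster vol get prints in the transcript above.

```shell
#!/bin/sh
# Sketch: read "gluster vol get VOLNAME all" output on stdin and succeed only
# if option KEY is present with value WANT. check_option is an illustrative name.
check_option() {
    awk -v key="$1" -v want="$2" '
        $1 == key {
            found = 1
            if ($2 != want) bad = 1
            print $1, $2            # echo what was actually found
        }
        END {
            if (found && !bad) exit 0
            exit 1
        }
    '
}

# Example (not run here):
#   gluster vol get perform_create all | check_option features.ctr-sql-db-cachesize 12500
```

Against the output shown in this comment, such a check would fail for both options, since each reports 1000 instead of the intended default.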

Comment 19 Joseph Elwin Fernandes 2016-01-06 17:57:40 UTC

https://code.engineering.redhat.com/gerrit/64971

Comment 21 RajeshReddy 2016-01-08 11:58:24 UTC

Deletion of 50k files took 2m, and features.ctr-sql-db-cachesize and features.ctr-sql-db-wal-autocheckpoint now default to 12500 and 25000 respectively, so marking this bug as verified.
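
For a rough sense of scale, the timings reported in this bug can be turned into files per second. The sketch below uses integer shell arithmetic. files_per_sec is a hypothetical helper, and since the original run took "more than" two hours, its figure is an upper bound.

```shell
#!/bin/sh
# Sketch: whole files deleted per second, given a file count and elapsed seconds.
# files_per_sec is an illustrative name; results are truncated to integers.
files_per_sec() {
    echo "$(( $1 / $2 ))"
}

files_per_sec 50000 7200   # 2 hours, original report        -> 6
files_per_sec 50000 360    # 6 minutes, options set by hand  -> 138
files_per_sec 50000 120    # 2 minutes, new default values   -> 416
```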

Comment 23 errata-xmlrpc 2016-03-01 06:00:54 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html
