Bug 1718152 - remove (rm -rf) of about 3.5 million entries per client took about 4.5 days
Status: CLOSED DUPLICATE of bug 1651048
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.5
Hardware: Unspecified
OS: Unspecified
medium
high
Assignee: Xavi Hernandez
QA Contact: Bala Konda Reddy M
Reported: 2019-06-07 06:00 UTC by Nag Pavan Chilakam
Modified: 2020-01-20 12:01 UTC
Last Closed: 2020-01-20 12:01:40 UTC


Attachments
profile output for comment#18 (79.31 KB, text/plain)
2019-07-08 06:27 UTC, Nag Pavan Chilakam

Description Nag Pavan Chilakam 2019-06-07 06:00:51 UTC
Description of problem:
=======================
Removal of deep directory trees and files is taking a significantly long time.

For example, I had about 10 directories (one per client), each of which hosted about 50 deep directories with about 70,000 files and directories in each (each directory is an untarred Linux kernel image), plus about 100 log files.

From each client I issued an rm -rf of one first-level directory (each client works on a different set of files, with no overlap). It took about 6560 minutes, i.e. about 4 days and 13 hours, to complete the task.

Though not an apples-to-apples comparison, this is a very long time, considering that an untarred Linux directory gets deleted in less than 2 seconds on my desktop (so 50 would be deleted in under two minutes).

[root@dhcp42-60 dhcp42-60.lab.eng.blr.redhat.com]# time rm -rf dir* tar* untar*
rm: cannot remove ‘dir.41/linux-5.1.3’: Directory not empty

real    6562m49.361s
user    0m48.952s
sys     6m53.090s


Note: I don't have numbers for how long this took in previous releases; I am raising this to track the slow rm -rf issue with glusterfs.
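
The elapsed time and entry count quoted in this description can be cross-checked with a short calculation. The following Python sketch is illustrative only: it converts the `real` value reported by `time` into days, hours and minutes, and derives an average deletion rate, assuming the approximate figures from this report (about 50 directories of about 70,000 entries each per client). The helper names `parse_real` and `breakdown` are hypothetical.

```python
import re


def parse_real(real: str) -> float:
    """Convert a `time` value such as '6562m49.361s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if m is None:
        raise ValueError(f"unrecognised time format: {real!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def breakdown(seconds: float):
    """Split a duration in seconds into whole days, hours and minutes."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    return days, hours, rem // 60


# Figures taken from the report; the entry count is the reporter's estimate.
elapsed = parse_real("6562m49.361s")
entries = 50 * 70_000  # about 3.5 million entries per client

days, hours, minutes = breakdown(elapsed)
rate = entries / elapsed

print(f"{days}d {hours}h {minutes}m, about {rate:.1f} entries/s")
```

With these inputs the run works out to 4 days, 13 hours and 22 minutes, or roughly 8.9 entries deleted per second per client.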

Version-Release number of selected component (if applicable):
===================
6.0.3 on rhel77




Steps to Reproduce:
1. Had a 4x3 volume on a 4-node cluster with quotas enabled.
2. Ran some tests as detailed in the steps at https://bugzilla.redhat.com/show_bug.cgi?id=1715438#c0

3. Added a 5th node to the cluster.
4. Issued a replace-brick to replace all bricks hosted by n1 with the new n5.
5. Around the same time, the below IOs were going on from each client:
 a. Append top output to a file every 2 minutes (one file per client).
 b. Had about 50 directories, each holding an untarred Linux image, in each of about 10 directories (one per client), on which I ran rm -rf as above (including some log files, totalling about 100 additional files per client), so each client was deleting about 3.5 million files separately (no overlap).



Actual results:
=============
rm -rf of about 3.5 million files took about 4.5 days on each client.

Expected results:
================
rm -rf taking this much time is not something an end user would like.


Volume Name: nftvol
Type: Distributed-Replicate
Volume ID: bb49cec5-d750-4e2f-a332-8da43efaf2d3
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-rb1
Brick2: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick3: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick4: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
Brick5: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
Brick6: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b2
Brick7: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b3
Brick8: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
Brick9: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-rb3
Brick10: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Brick11: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-rb4
Brick12: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet


sosreports and logs to follow

Comment 3 Nag Pavan Chilakam 2019-06-07 06:45:59 UTC
sosreports and logs @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1718152/

Comment 4 Nag Pavan Chilakam 2019-06-07 06:48:51 UTC
[root@rhs-gp-srv7 ~]# gluster v heal nftvol info 
Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-rb1
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.76/level5.87 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.76 
/IOs/new-samedir/level1.1/level2.1/level3.45/level4.75/level5.12 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.45/level4.75 
/IOs/new-samedir/level1.1/level2.2/level3.34/level4.80/level5.62 
<gfid:190f7dca-29d3-46e6-b798-aab6781a3c86> - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.64/level4.65/level5.35 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.64/level4.65 
/IOs/new-samedir/level1.1/level2.2/level3.79/level4.30/level5.87 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.79/level4.30 
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.17 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.28 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.41 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.96 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.54/level4.27/level5.56 
/IOs/new-samedir/level1.1/level2.3/level3.84/level4.55/level5.69 - Is in split-brain
Status: Connected
Number of entries: 17

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
/IOs/new-samedir/level1.1/level2.2/level3.34/level4.80/level5.62 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.34/level4.80 
/IOs/new-samedir/level1.1/level2.2/level3.64/level4.65/level5.35 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.64/level4.65 
/IOs/new-samedir/level1.1/level2.2/level3.79/level4.30/level5.87 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.79/level4.30 
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.17 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.28 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.41 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.96 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.54/level4.27 
/IOs/new-samedir/level1.1/level2.3/level3.54/level4.27/level5.56 
/IOs/new-samedir/level1.1/level2.3/level3.84/level4.55/level5.69 - Is in split-brain
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13 
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13/level5.9 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.76/level5.87 
/IOs/kernel/dhcp43-21.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/scsi 
/IOs/new-samedir/level1.1/level2.1/level3.45/level4.75/level5.12 
<gfid:17e99f85-9526-4ab0-aa39-22eb932732be> - Is in split-brain
Status: Connected
Number of entries: 20

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13/level5.9 
/IOs/new-samedir/level1.1/level2.2/level3.34/level4.80/level5.62 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.34/level4.80 
/IOs/new-samedir/level1.1/level2.2/level3.64/level4.65/level5.35 
<gfid:5405eaf8-2589-42dc-afde-1400e4da45ca> - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.79/level4.30/level5.87 
<gfid:8f772dfe-d7d8-4af4-bddd-2960881972ac> - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.17 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.28 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.41 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.20/level4.30/level5.96 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.54/level4.27 
/IOs/new-samedir/level1.1/level2.3/level3.54/level4.27/level5.56 
/IOs/new-samedir/level1.1/level2.3/level3.84/level4.55/level5.69 - Is in split-brain
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.76/level5.87 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.76 
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13 
/IOs/new-samedir/level1.1/level2.1/level3.45/level4.75 
/IOs/new-samedir/level1.1/level2.1/level3.45/level4.75/level5.12 - Is in split-brain
Status: Connected
Number of entries: 21

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.35 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.50 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.59 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.3 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.30/level4.29/level5.98 
<gfid:cb27b874-41b8-4909-b14b-6216c5f78b10> - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.60/level4.50/level5.30 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.60/level4.50 
/IOs/new-samedir/level1.1/level2.4/level3.33/level4.5/level5.77 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.33/level4.5 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.30 - Is in split-brain
Status: Connected
Number of entries: 14

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.50 
/IOs/new-samedir/level1.1/level2.2/level3.30/level4.29/level5.98 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.30/level4.29 
/IOs/new-samedir/level1.1/level2.3/level3.60/level4.50/level5.30 
<gfid:bf2c8e61-59c5-4c37-a0d1-6f67b1f4bae1> - Is in split-brain
<gfid:1e7c2e51-86b9-490f-acd5-519eb42cd4ff> - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.33/level4.5/level5.77 
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.35 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.3 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.59 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.30 - Is in split-brain
Status: Connected
Number of entries: 15

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.35 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.59 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.3 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.68/level5.30 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.50 
/IOs/new-samedir/level1.1/level2.2/level3.30/level4.29/level5.98 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.30/level4.29 
/IOs/new-samedir/level1.1/level2.3/level3.60/level4.50/level5.30 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.60/level4.50 
/IOs/new-samedir/level1.1/level2.4/level3.33/level4.5/level5.77 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.33/level4.5 
Status: Connected
Number of entries: 12

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b3
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.17 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.83 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.97/level4.30/level5.76 
<gfid:8ba6523c-65fc-44ba-a2fd-cd983ba737ea> - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.38/level4.94/level5.5 
<gfid:f207a315-a5a1-4f17-8a34-bdeb67d5a184> - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.5/level4.67/level5.97 
<gfid:1ca8bc74-2b0e-4e88-a9b6-76b2b978e127> - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.44 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.91 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.37 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.34 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.100 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.5 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.23 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.75 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.68/level5.62 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.68 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.44 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.44/level5.9 
/IOs/new-samedir/level1.1/level2.1/level3.33/level4.34/level5.76 
/IOs/new-samedir/level1.1/level2.1/level3.33/level4.34 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.46 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.46/level5.53 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.65/level5.23 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.65 
/IOs/new-samedir/level1.1/level2.1/level3.27/level4.45 
/IOs/new-samedir/level1.1/level2.1/level3.27/level4.45/level5.85 
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char 
/IOs/new-samedir/level1.1/level2.1/level3.35/level4.37/level5.36 
Status: Connected
Number of entries: 34

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.83 - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66/level5.100 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.37 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.75 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.5 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.17 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.23 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8/level5.34 
/IOs/new-samedir/level1.1/level2.2/level3.97/level4.30/level5.76 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.97/level4.30 
/IOs/new-samedir/level1.1/level2.3/level3.38/level4.94/level5.5 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.38/level4.94 
/IOs/new-samedir/level1.1/level2.4/level3.5/level4.67/level5.97 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.5/level4.67 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.44 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.91 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.8 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.68/level5.62 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.68 
/IOs/new-samedir/level1.1/level2.1/level3.33/level4.34/level5.76 
/IOs/new-samedir/level1.1/level2.1/level3.33/level4.34 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.46 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.46/level5.53 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.65/level5.23 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.65 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.44/level5.9 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.44 
/IOs/new-samedir/level1.1/level2.1/level3.27/level4.45 
/IOs/new-samedir/level1.1/level2.1/level3.27/level4.45/level5.85 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.66 
/IOs/new-samedir/level1.1/level2.1/level3.35/level4.37 
/IOs/new-samedir/level1.1/level2.1/level3.35/level4.37/level5.36 
Status: Connected
Number of entries: 34

Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-rb3
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.46/level5.53 
/IOs/new-samedir/level1.1/level2.1/level3.3/level4.65/level5.23 
/IOs/new-samedir/level1.1/level2.1/level3.27/level4.45/level5.85 
/IOs/new-samedir/level1.1/level2.1/level3.30/level4.68/level5.62 
/IOs/new-samedir/level1.1/level2.1/level3.33/level4.34/level5.76 
/IOs/new-samedir/level1.1/level2.1/level3.35/level4.37/level5.36 
/IOs/new-samedir/level1.1/level2.1/level3.35/level4.37 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.44/level5.9 
/IOs/new-samedir/level1.1/level2.2/level3.97/level4.30/level5.76 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.97/level4.30 
/IOs/new-samedir/level1.1/level2.3/level3.38/level4.94/level5.5 - Is in split-brain
/IOs/new-samedir/level1.1/level2.3/level3.38/level4.94 
/IOs/new-samedir/level1.1/level2.4/level3.5/level4.67/level5.97 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.5/level4.67 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.23 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.44 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.40/level5.91 - Is in split-brain
Status: Connected
Number of entries: 17

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.44 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7 
/IOs/new-samedir/level1.1/level2.1/level3.1 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91/level5.15 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91 
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64/level5.88 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37/level5.98 - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37 
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.38/linux-5.1.3/Documentation/devicetree/bindings 
/IOs/kernel/dhcp43-18.lab.eng.blr.redhat.com/dir.42/linux-5.1.3/tools 
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.38/linux-5.1.3/arch/arm/mach-pxa/include/mach 
Status: Connected
Number of entries: 12

Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-rb4
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91/level5.15 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91 
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64/level5.88 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37/level5.98 - Is in split-brain
Status: Connected
Number of entries: 6

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.4 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.1 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.2 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.3 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.5 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.46 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.47 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.48 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.50 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.51 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.49 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.52 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.53 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.54 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.55 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.56 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.57 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.58 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.59 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.60 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.61 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.62 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.63 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.64 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.65 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.69 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.71 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.73 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.74 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.76 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.77 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.80 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.81 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.84 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.86 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.88 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.91 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.93 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.94 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.97 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.98 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.66 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.67 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.68 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.70 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.72 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.75 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.78 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.79 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.82 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.83 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.85 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.87 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.89 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.90 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.92 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.95 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.96 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.99 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.100 
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91/level5.15 
<gfid:41023c2e-c5c3-452a-8796-3f8629561aaf> - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64/level5.88 
<gfid:cde95c92-fa52-4671-86dc-51b7c78583d9> - Is in split-brain
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37 
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37/level5.98 
<gfid:b9db1153-996b-4a5a-8530-181e8892b9c6> - Is in split-brain
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.67/level5.44 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.1 
/IOs/kernel/dhcp42-166.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/tools/testing/selftests/powerpc 
/IOs/kernel/dhcp43-21.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/scsi 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.6 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.7 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.8 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.9 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.10 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.11 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.12 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.13 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.14 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.15 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.16 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.17 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.18 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.19 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.20 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.21 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.22 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.23 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.24 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.25 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.26 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.27 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.28 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.29 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.30 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.31 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.32 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.33 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.34 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.35 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.36 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.37 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.39 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.40 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.38 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.41 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.42 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.43 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.44 
/IOs/new-samedir/level1.1/level2.1/level3.1/level4.7/level5.45 
Status: Connected
Number of entries: 112
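
The heal info listing above is long, so a per-brick tally is easier to read. The following Python sketch is one possible way to summarise it, assuming only the output layout visible in this comment (a `Brick ...` header, one entry per line starting with `/` or `<gfid:`, and a trailing `- Is in split-brain` marker). The function name `summarize_heal_info` is hypothetical, and the sample is the `nftvol-rb4` section copied from the output above.

```python
SAMPLE = """\
Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-rb4
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91/level5.15 - Is in split-brain
/IOs/new-samedir/level1.1/level2.1/level3.61/level4.91
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64/level5.88 - Is in split-brain
/IOs/new-samedir/level1.1/level2.2/level3.57/level4.64
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37
/IOs/new-samedir/level1.1/level2.4/level3.9/level4.37/level5.98 - Is in split-brain
Status: Connected
Number of entries: 6
"""


def summarize_heal_info(text: str) -> dict:
    """Count listed entries and split-brain entries for each brick."""
    summary = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start of a new brick section.
            brick = line[len("Brick "):]
            summary[brick] = {"pending": 0, "split_brain": 0}
        elif brick and (line.startswith("/") or line.startswith("<gfid:")):
            # An entry line: either a path or a bare gfid.
            summary[brick]["pending"] += 1
            if line.endswith("- Is in split-brain"):
                summary[brick]["split_brain"] += 1
    return summary


result = summarize_heal_info(SAMPLE)
for name, counts in result.items():
    print(name, counts)
```

Feeding the full output of the command through the same function would give one line per brick, which makes it easier to see which replica sets carry the most split-brain entries.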

Comment 5 Amar Tumballi 2019-06-10 04:41:48 UTC
If the volumes are still present, is it possible to collect 'gluster volume profile info'?

Considering there were 'create' and also unlink etc. operations, I suspect multiple factors are causing the performance hit.

* Number of entrylk() calls from dht (each entry creation takes an entrylk(), and if there are parallel operations, the performance may actually be very degraded).
* Backend disk performance when a lot of creates and deletes are happening.
  - When such a large number of operations are happening on the system, even the backend filesystem is going to cause a slowdown, mainly because it too needs to handle fragmentation and other related block processing.

* I also see split-brain related logs, and having many of them would have stalled the unlink process altogether.

Comment 6 Nag Pavan Chilakam 2019-06-12 05:48:20 UTC
(In reply to Amar Tumballi from comment #5)
> If the volumes are still present, is it possible to collect 'gluster volume
> profile info'?
I didn't collect the profile info.
However, I still have the volume available. Will collecting profile info at this moment help, given that the test is completed?
> 
> Considering there was 'create' and also unlink etc operation I suspect
> multiple factors causing performance hit.
> 
> * Number of entrylk() from dht (each entry creation takes an entrylk(), and
> if there is parallel operation, the performance may actually be very
> degraded.
> * Backend disk performance when lot of create and deletes are happening.
>   - When such a large number of operations are happening on the system, even
> backend filesystems are going to cause a slow down, mainly as even that
> needs to handle the fragmentation, and other related block processing.
> 
> * I also see split-brain related logs, and having many of them would have
> stalled the unlink process altogether.
Split-brain issues shouldn't have anything to do with this, as the affected entries are hosted under a completely different parent directory.

Comment 12 Nag Pavan Chilakam 2019-06-21 06:00:03 UTC
Volume ID: bb49cec5-d750-4e2f-a332-8da43efaf2d3
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-rb1
Brick2: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-rb1
Brick3: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick4: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-rb2
Brick5: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
Brick6: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b2
Brick7: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b3
Brick8: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
Brick9: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-rb3
Brick10: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Brick11: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-rb4
Brick12: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-rb4
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
cluster.shd-max-threads: 48
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on

Comment 18 Nag Pavan Chilakam 2019-07-08 06:25:42 UTC
FYI,
Even without enabling quotas, the same operation took close to 2 days (less than half of what it took with quotas; however, 2 days is still a long time).
[root@dhcp42-60 dhcp42-60.lab.eng.blr.redhat.com]# date;time rm -rf * ;date
Fri Jul  5 17:39:39 IST 2019

real    2580m9.247s
user    0m21.598s
sys     5m50.614s
Sun Jul  7 12:39:49 IST 2019
[root@dhcp42-60 dhcp42-60.lab.eng.blr.redhat.com]# 
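
The two runs can be compared numerically. The following Python sketch is illustrative only: it checks that the `date` stamps printed around the command agree with the `real` value, and expresses the no-quota run as a fraction of the earlier run with quotas enabled. The timestamps and durations are copied from this bug; the variable names are hypothetical.

```python
from datetime import datetime

# `date` output before and after the rm -rf without quotas (both IST).
start = datetime(2019, 7, 5, 17, 39, 39)
end = datetime(2019, 7, 7, 12, 39, 49)
wall_seconds = (end - start).total_seconds()

# `real` values reported by `time` for the two runs, in seconds.
no_quota_seconds = 2580 * 60 + 9.247      # real 2580m9.247s
with_quota_seconds = 6562 * 60 + 49.361   # real 6562m49.361s

ratio = no_quota_seconds / with_quota_seconds

print(f"wall clock between date stamps: {wall_seconds / 3600:.1f} h")
print(f"no-quota run as a fraction of quota run: {ratio:.2f}")
```

The date stamps and the `real` value agree to within a second (about 43 hours), and the run without quotas took roughly 39% of the time of the run with quotas.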

[root@rhs-gp-srv5 ~]# gluster v info
 
Volume Name: nftvol
Type: Distributed-Replicate
Volume ID: cb9567c5-051c-4cf9-bf8e-1cc7f5bfc129
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1
Brick2: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1
Brick3: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1
Brick4: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv2
Brick5: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv2
Brick6: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv2
Brick7: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv3
Brick8: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv3
Brick9: rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv3
Brick10: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4
Brick11: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4
Brick12: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4
Options Reconfigured:
server.event-threads: 8
client.event-threads: 8
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
cluster.shd-max-threads: 24
features.uss: enable
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off
[root@rhs-gp-srv5 ~]# gluster v status
Status of volume: nftvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/g
luster/brick1/nftvol-sv1                    49152     0          Y       18581
Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/g
luster/brick1/nftvol-sv1                    49152     0          Y       18696
Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/g
luster/brick1/nftvol-sv1                    49152     0          Y       18636
Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/g
luster/brick1/nftvol-sv2                    49152     0          Y       18639
Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/g
luster/brick2/nftvol-sv2                    49153     0          Y       18601
Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/g
luster/brick2/nftvol-sv2                    49153     0          Y       18716
Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/g
luster/brick2/nftvol-sv3                    49153     0          Y       18657
Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/g
luster/brick2/nftvol-sv3                    49153     0          Y       18659
Brick rhs-gp-srv5.lab.eng.blr.redhat.com:/g
luster/brick3/nftvol-sv3                    49154     0          Y       18621
Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/g
luster/brick3/nftvol-sv4                    49154     0          Y       18736
Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/g
luster/brick3/nftvol-sv4                    49154     0          Y       18677
Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/g
luster/brick3/nftvol-sv4                    49154     0          Y       18679
Snapshot Daemon on localhost                49155     0          Y       18722
Self-heal Daemon on localhost               N/A       N/A        Y       18642
Snapshot Daemon on rhs-gp-srv9.lab.eng.blr.
redhat.com                                  49155     0          Y       18745
Self-heal Daemon on rhs-gp-srv9.lab.eng.blr
.redhat.com                                 N/A       N/A        Y       18700
Snapshot Daemon on rhs-gp-srv7.lab.eng.blr.
redhat.com                                  49155     0          Y       18804
Self-heal Daemon on rhs-gp-srv7.lab.eng.blr
.redhat.com                                 N/A       N/A        Y       18757
Snapshot Daemon on rhs-gp-srv8.lab.eng.blr.
redhat.com                                  49155     0          Y       18743
Self-heal Daemon on rhs-gp-srv8.lab.eng.blr
.redhat.com                                 N/A       N/A        Y       18698
 
Task Status of Volume nftvol
------------------------------------------------------------------------------
There are no active volume tasks
 



[root@rhs-gp-srv5 ~]# uname -a
Linux rhs-gp-srv5.lab.eng.blr.redhat.com 3.10.0-1049.el7.x86_64 #1 SMP Mon May 20 18:49:46 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@rhs-gp-srv5 ~]# rpm -qa|grep gluster
glusterfs-libs-6.0-7.el7rhgs.x86_64
glusterfs-fuse-6.0-7.el7rhgs.x86_64
glusterfs-client-xlators-6.0-7.el7rhgs.x86_64
glusterfs-api-6.0-7.el7rhgs.x86_64
glusterfs-cli-6.0-7.el7rhgs.x86_64
glusterfs-6.0-7.el7rhgs.x86_64
glusterfs-server-6.0-7.el7rhgs.x86_64
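The "4 x 3 = 12" layout in the volume info above can be read as four replica sets of three bricks each. The following is a minimal sketch, assuming that consecutive bricks in the listed order form one replica set (the brick directory suffixes sv1 to sv4 are consistent with this). It groups the bricks and checks that no replica set places two copies on the same host.

```python
# Brick list copied from the `gluster v info` output in this comment
bricks = [
    "rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1",
    "rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1",
    "rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv1",
    "rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-sv2",
    "rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv2",
    "rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv2",
    "rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv3",
    "rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-sv3",
    "rhs-gp-srv5.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv3",
    "rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4",
    "rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4",
    "rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-sv4",
]

REPLICA_COUNT = 3  # from "Number of Bricks: 4 x 3 = 12"

# Assumption: consecutive bricks, in listed order, make up one replica set
replica_sets = [
    bricks[i:i + REPLICA_COUNT] for i in range(0, len(bricks), REPLICA_COUNT)
]

for index, members in enumerate(replica_sets, start=1):
    hosts = [brick.split(":", 1)[0] for brick in members]
    distinct = len(set(hosts)) == len(hosts)
    short_hosts = ", ".join(host.split(".", 1)[0] for host in hosts)
    print(f"replica set {index}: {short_hosts} (distinct hosts: {distinct})")
```

Every unlink or rmdir has to be applied on all three bricks of the replica set that holds the entry, which is one reason a delete on this volume costs more than a local delete.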

Comment 19 Nag Pavan Chilakam 2019-07-08 06:27:10 UTC
Created attachment 1588247 [details]
profile output for comment#18

Comment 20 Nag Pavan Chilakam 2019-07-09 06:12:20 UTC
Based on c#18, I am proposing this as a blocker again, since the impact of quota is only about 2x and not 7x-10x as stated in c#10.
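The quota impact can be estimated directly from the two reported timings. This is a minimal sketch: the quota-enabled value is the `real` time from the bug description, the quota-disabled value is from c#18, and `to_seconds` is a hypothetical helper. It assumes both runs removed a comparable data set, which the report does not strictly confirm.

```python
import re

def to_seconds(real: str) -> float:
    """Convert a bash `time` value such as '6562m49.361s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", real.strip())
    if m is None:
        raise ValueError(f"unrecognised time format: {real!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

with_quota = to_seconds("6562m49.361s")    # bug description, quota enabled
without_quota = to_seconds("2580m9.247s")  # c#18, quota not enabled

# Assumption: both runs deleted a comparable number of entries
ratio = with_quota / without_quota

print(f"with quota:    {with_quota / 86400:.2f} days")
print(f"without quota: {without_quota / 86400:.2f} days")
print(f"slowdown attributable to quota: about {ratio:.1f}x")
```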

Comment 24 Amar Tumballi 2019-07-12 04:54:28 UTC
Thanks, Xavi, for pitching in here. I agree with Xavi's analysis. In general, I would also say that our 'performance' priority should be on create and write/read rather than on delete/rmdir/rename.

I will keep this bug open so that we keep it in mind when we design solutions for this problem.

