Bug 1994593 - Granular entry self-heal is taking more time than full entry self heal for creation and deletion workloads
Summary: Granular entry self-heal is taking more time than full entry self heal for creation and deletion workloads
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.z Batch Update 6
Assignee: Karthik U S
QA Contact: Pranav Prakash
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-08-17 13:37 UTC by Karthik U S
Modified: 2022-01-27 14:26 UTC
CC: 9 users

Fixed In Version: glusterfs-6.0-60
Doc Type: Bug Fix
Doc Text:
Previously, granular entry self-heal took more time than full entry self-heal when a large number of entry heals were pending due to creation- and deletion-heavy workloads. With this update, the extra lookup used to delete the stale index is removed from the granular entry self-heal code path, which improves heal performance for creation- and deletion-heavy workloads when granular entry self-heal is enabled.
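For reference, granular entry self-heal is toggled per volume; a minimal sketch of checking and switching it (hedged, assuming the standard cluster.granular-entry-heal option and the gluster heal subcommand names):
# Check whether granular entry self-heal is currently enabled
gluster volume get <VOLNAME> cluster.granular-entry-heal
# Enable (or disable) granular entry self-heal for the volume
gluster volume heal <VOLNAME> granular-entry-heal enable
gluster volume heal <VOLNAME> granular-entry-heal disable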
Clone Of:
Environment:
Last Closed: 2022-01-27 14:26:32 UTC
Embargoed:




Links
Red Hat Product Errata RHBA-2022:0315 (last updated 2022-01-27 14:26:56 UTC)

Description Karthik U S 2021-08-17 13:37:17 UTC
Description of the issue:
Clone of upstream issue: https://github.com/gluster/glusterfs/issues/2611

The number of lookups in granular entry self-heal is very high compared to full entry self-heal. Here are the numbers for the following workload:

Once a replica 3 volume (r3) is created and mounted:
gluster volume profile r3 start
pushd /mnt/r3
mkdir d
cd d
# Kill one brick process so the files created next will need entry self-heal
kill -9 $(gluster volume status | grep Brick | awk '{print $NF}' | head -1)
# Create 100000 files while that brick is down
for i in {1..100000}; do touch $i; done
gluster volume profile r3 info incremental
# Bring the killed brick back online so self-heal can start
gluster volume start r3 force
popd
Once the heal completes, take one more profile info incremental.
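A minimal sketch of that final step (hedged; assumes the standard heal-info and profile subcommands, and the output file name here is only an illustration):
# Confirm no entries are pending heal on any brick
gluster volume heal r3 info summary
# Capture the post-heal profile sample for comparison
gluster volume profile r3 info incremental > post-heal-profile.txt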

(Columns: file, %-latency, avg-latency, min-latency, max-latency, no. of calls, fop)

base-full-heal.txt:      52.72  38828.24  ns  20130.00  ns  31109878.00  ns  300024   LOOKUP
base-full-heal.txt:      65.47  41799.10  ns  10173.00  ns  2497157.00   ns  400022   LOOKUP
base-full-heal.txt:      66.68  70149.87  ns  34635.00  ns  1554337.00   ns  200017   LOOKUP

base-granular-heal.txt:  72.06  57450.77  ns  20245.00  ns  23499908.00  ns  800010   LOOKUP
base-granular-heal.txt:  79.99  57926.16  ns  12702.00  ns  12933708.00  ns  900008   LOOKUP
base-granular-heal.txt:  82.11  69301.03  ns  27820.00  ns  12533029.00  ns  700006   LOOKUP

This happens because there is a check for a stale index before triggering the actual heal, which issues extra lookups. These lookups go through the AFR xlator, which may in turn attempt metadata heal and so on, so the number of lookups increases even further.
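A rough way to compare the aggregate LOOKUP counts between the two runs (hedged; assumes the profile outputs were saved as base-full-heal.txt and base-granular-heal.txt in the line format shown above, where the call count is the second-to-last field):
for f in base-full-heal.txt base-granular-heal.txt; do
    # Sum the call counts of every LOOKUP line in each profile capture
    awk -v file="$f" '$NF == "LOOKUP" {sum += $(NF-1)} END {print file, "total LOOKUP calls:", sum}' "$f"
done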

Comment 1 Karthik U S 2021-08-17 13:40:38 UTC
Upstream patch: https://github.com/gluster/glusterfs/pull/2612

Comment 19 errata-xmlrpc 2022-01-27 14:26:32 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (glusterfs bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0315

