Bug 1273348

Summary: [Tier]: lookup from client takes too long {~7m for 18k files}
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rahul Hinduja <rhinduja>
Component: tier Assignee: Dan Lambright <dlambrig>
Status: CLOSED ERRATA QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: asrivast, jbyers, nchilaka, rhs-bugs, sankarshan, sarumuga, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.7.5-7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1273333 Environment:
Last Closed: 2016-03-01 05:43:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1273333    
Bug Blocks: 1260783, 1260923    
Attachments:
logs find command (Flags: none)

Comment 3 Rahul Hinduja 2015-11-02 12:38:54 UTC
This bug affects geo-replication + tiering testing.

Testing of geo-replication revolves around the following steps:

1. Create a set of files/directories in large numbers on the master volume.
2. Let those files sync to the slave.
3. Calculate the checksum on the master volume and the slave volume.

Step 3 involves a lookup, and when there are many entries the arequal checksum takes too long to verify the sync.

As the number of files on the master grows with each operation, the time for the arequal calculation increases as well.
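
For context, the checksum comparison is typically done along these lines (a sketch only; the arequal-checksum utility and the mount paths /mnt/master and /mnt/slave are assumptions, not taken from this report):

# run against the master mount and the slave mount, then compare the outputs
arequal-checksum -p /mnt/master
arequal-checksum -p /mnt/slave

Each invocation walks the whole tree, which is why the run time grows with the file count.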

Comment 4 Dan Lambright 2015-11-16 13:57:51 UTC
Can someone tell me where to find the "crefi" utility?

Comment 5 Nag Pavan Chilakam 2015-11-18 11:47:00 UTC
Dan, as mentioned in scrum, we can use a plain dd or touch command to create a significant number of files; the crefi utility is not needed.
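
For example, something along these lines would create an ~18k file data set (a sketch; the mount path /mnt/master/testdir and the counts are placeholders):

# 18k empty files
mkdir -p /mnt/master/testdir
for i in $(seq 1 18000); do touch /mnt/master/testdir/file.$i; done

# or small data files with dd
for i in $(seq 1 18000); do dd if=/dev/zero of=/mnt/master/testdir/file.$i bs=4k count=1 2>/dev/null; done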

Comment 6 Dan Lambright 2015-11-18 15:46:12 UTC
RCA: the overhead of doing a readdir (essentially internal lookups) appears to scale with the number of subvolumes, and a tiered volume has more subvolumes.

Comment 7 Dan Lambright 2015-11-18 23:09:58 UTC
One question: the bug shows that when tiering is disabled the time is 3.11 minutes, and when tiering is enabled it is 7.41 minutes. I would like to know whether the jump in time is due to data being demoted between the nodes.

Can you repeat the test and, while the "find . | xargs stat" is running, issue "gluster vol tier <volname> status" in another window? Check whether the counters are increasing while the find command runs.

Wait until all data has moved to the cold tier and repeat the test. Monitor the counters to be sure they are not increasing and the system is stable. Are the times different, or just as bad?
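
In other words, roughly the following (a sketch; <volname> is a placeholder):

# window 1, on the client mount
time find . | xargs stat > /dev/null

# window 2, on a server node, repeated while the find runs
watch -n 5 gluster volume tier <volname> status

If the promote/demote counters keep climbing during the find, the slowdown may be coming from data movement rather than from the lookups themselves.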

I have seen moderate performance degradation while demotion is happening on a 2-node system, hot tier 6x2 and cold tier 6x2.

Comment 8 Saravanakumar 2015-11-19 11:44:40 UTC
Updating my observation:

18k empty files were created and synced to the slave (geo-rep setup).

==================================================
Non-tiered volume: 3x2

[root@gfvm3 non-tierd_volume]# echo 3 > /proc/sys/vm/drop_caches
[root@gfvm3 non-tierd_volume]# time ls
..
real    0m10.130s
user    0m0.194s
sys    0m0.345s
[root@gfvm3 non-tierd_volume]#

==================================================
Tiered volume: 3x2 cold tier, 2x2 hot tier

[root@gfvm3 tierd_volume]# echo 3 > /proc/sys/vm/drop_caches
[root@gfvm3 tierd_volume]# time ls
..

real    0m18.290s
user    0m0.206s
sys    0m0.372s
==================================================

Comment 9 Saravanakumar 2015-11-19 11:45:57 UTC
(In reply to Saravanakumar from comment #8)
> Updating my observation:

This is observed with the readdirp-to-cold-tier-only patch applied (http://review.gluster.org/#/c/12530/).

Comment 10 Saravanakumar 2015-11-23 08:11:23 UTC
Created attachment 1097590 [details]
logs find command

Executed the following command with and without tiering:
#time find . | xargs stat

Following are my observations.

WITHOUT TIERING:

real	0m3.126s
user	0m0.324s
sys	0m0.580s


WITH TIERING:

real	0m7.822s
user	0m0.506s
sys	0m0.889s
------------------------
Please find the complete log of all commands executed in the attachment.

Comment 12 Rahul Hinduja 2015-11-24 11:03:57 UTC
Verified with the build: glusterfs-3.7.5-7.el7rhgs.x86_64

Volume type: Tiered

Number of files: 17077 {Actual data files with total size of 5G}

Time taken in each case {find . | xargs stat}:

Case 1: Default CTR {enabled} and watermarks enabled {mode=cache}

real	1m13.305s
user	0m1.220s
sys	0m2.686s

Case 2: Set watermark low and hi to 10 and 60 respectively

real	1m16.308s
user	0m1.233s
sys	0m2.769s

Case 3: Set watermark low and hi to 10 and 20 respectively

real	1m12.880s
user	0m1.204s
sys	0m2.551s

Case 4: Set the tier mode to test {mode=test}

real	1m36.250s
user	0m1.239s
sys	0m2.743s

Case 5: Disabled CTR

real	1m36.958s
user	0m1.246s
sys	0m2.661s
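
For reference, the settings used in the above cases map roughly to the following volume options (a sketch; <volname> is a placeholder, and the option names assume the standard RHGS 3.1 tiering options rather than being taken from this report):

# cases 2 and 3: adjust the watermarks
gluster volume set <volname> cluster.watermark-low 10
gluster volume set <volname> cluster.watermark-hi 60

# case 4: switch the tier mode from cache to test
gluster volume set <volname> cluster.tier-mode test

# case 5: disable CTR
gluster volume set <volname> features.ctr-enabled off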

In all the above cases, the time taken is approximately the same. Moving the bug to the verified state.

Comment 14 errata-xmlrpc 2016-03-01 05:43:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html