Bug 1441417
Summary: | slow performance with directory operations on Fuse | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Raghavendra Bhat <rabhat> |
Component: | md-cache | Assignee: | Michael Adam <madam> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Vivek Das <vdas> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | rhgs-3.2 | CC: | amukherj, bkunal, ccalhoun, guillaume.pavese, jahernan, ndevos, olim, pdhange, rabhat, rgowdapp, rhs-bugs, skoduri, storage-qa-internal, vbellur |
Target Milestone: | --- | Keywords: | Performance, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | glfs-readdirplus | ||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-12-09 16:55:10 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1478411, 1492996, 1644389 | ||
Bug Blocks: | 1474007 |
Comment 3
Poornima G
2017-04-13 09:41:31 UTC
As this is a SEV1 case, can we get an update that we can pass on to the customer? Needinfo for comment 13.

With regard to 3.2 and the md-cache changes, that is pretty much the performance. There will be further enhancements in future releases.

Poornima, are you saying that the customer will not get better performance now, and that all they can do is adjust the "network.inode-lru-limit" value over a few trials? Is that all we can fine-tune in the current situation? What enhancements are in the next release pipeline? Do we have an existing bug/RFE for them?

Raghu, can you update the BZ with the latest performance figures? Please also mention all the tuning that was done (like enabling md-cache, changing the lru limit, etc.) to improve performance, and provide the tuned values. -Bipin

What does this mean: "..., however due to file locking issue this is not supported."? Are they trying to locally mount a volume?

With regard to the md-cache feature, there is not much more we can do. With some more information we can find where the 20s is being spent; I'm not sure we will definitely be able to reduce it, but we can check whether some configurations can be tweaked.
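For reference, the "network.inode-lru-limit" tuning mentioned above is set per volume with the gluster CLI. A minimal sketch follows; the volume name `myvol` and the value `200000` are placeholders for illustration, not values taken from this case:

```shell
# Raise the server-side inode LRU limit so that cached inodes (and the
# md-cache entries tied to them) are not evicted prematurely.
# The value 200000 is only an example; tune it over a few trials.
gluster volume set myvol network.inode-lru-limit 200000

# Confirm the option took effect.
gluster volume get myvol network.inode-lru-limit
```

These commands require a running Gluster management daemon, so they must be run on a server node, not on the client mount.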
Can you please provide the profile info, client and server side, when run on FUSE, for a fresh-mount run and for subsequent runs:

# gluster volume stop <volname> ; gluster volume start <volname>
# (FUSE mount)
# gluster volume profile <volname> start
# time find /mnt/shared/asterisk/var/spool/asterisk/voicemail/default-gluster/* -name *.txt | wc
# gluster volume profile <volname> info    (capture the output of this)
# setfattr -n trusted.io-stats-dump -v /tmp/fuse-fresh-mount.txt /your/mountpoint
# gluster volume profile <volname> info clear
# time find /mnt/shared/asterisk/var/spool/asterisk/voicemail/default-gluster/* -name *.txt | wc
# gluster volume profile <volname> info    (capture the output of this)
# setfattr -n trusted.io-stats-dump -v /tmp/fuse-subsequent-run.txt /your/mountpoint

Please provide the client profiles (/tmp/fuse-subsequent-run.txt* and /tmp/fuse-fresh-mount.txt*) and the server profiles (output captured from both profile info commands).

Also, this bug looks similar to 1457269.

Here is the analysis: On a fresh-mount run, the most time is consumed by the readdirp call, followed by lookups (these are due to the replica setup). On subsequent runs, the most time is consumed by readdirp, and the rest of the fops take negligible time. There is no common configuration that can further reduce the readdirp calls and give better performance.

Some workarounds that can be tried:
- Use readdir instead of readdirp, if the use case is a find command that looks for the name alone and not the type and other information. Note that this setting can slow down use cases where find is followed by reading the files or doing any operation on all of them. [1] explains how to do this.
- Increase entry-timeout and attribute-timeout (mount options; see man mount.glusterfs) to a large value. Note that this doesn't decrease the time taken on a fresh mount, but it definitely helps on subsequent runs.
Also, these timeouts can safely be set to higher values only if there are no multiple mounts accessing the same set of files at the same time. Note that the above workarounds should be tried on a test setup and not directly on production systems. We would also like to know how many mounts the customer has.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-March/030148.html

Since the customer case is closed, can we close this bug and use https://bugzilla.redhat.com/show_bug.cgi?id=1644389 to track directory listing performance?
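The two workarounds above can be sketched as client-side mount invocations. This is an illustrative sketch only: the server name `server1`, volume `myvol`, mount point, and the 600-second timeout are placeholders, and the option names should be verified against man mount.glusterfs for the deployed version:

```shell
# Workaround 1: have the FUSE client issue readdir instead of readdirp,
# useful when find only needs entry names, not type/stat information.
mount -t glusterfs -o use-readdirp=no server1:/myvol /mnt/shared

# Workaround 2: let the kernel cache dentries and attributes longer.
# Only safe when no other mount accesses the same files concurrently;
# it helps subsequent runs, not the first run on a fresh mount.
mount -t glusterfs -o entry-timeout=600,attribute-timeout=600 \
    server1:/myvol /mnt/shared
```

As the comment notes, try these on a test setup first, never directly on production systems.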