Bug 1559084
Summary: | [EC] Read performance of EC volume exported over gNFS is significantly lower than write performance | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ashish Pandey <aspandey>
Component: | disperse | Assignee: | Ashish Pandey <aspandey>
Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.4 | CC: | bturner, bugs, jahernan, pkarampu, rhinduja, rhs-bugs, sheggodu, storage-qa-internal, ubansal
Target Milestone: | --- | |
Target Release: | RHGS 3.4.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | glusterfs-3.12.2-6 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1557906 | Environment: |
Last Closed: | 2018-09-04 06:44:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1554743, 1557906, 1558352 | |
Bug Blocks: | 1503137, 1557904 | |
Description
Ashish Pandey
2018-03-21 16:39:25 UTC
On-QA validation on 3.12.2-11: as the P0 cases (i.e., the cases developed for testing this bug) are passing, moving to VERIFIED.

##### TEST PLAN ###########

With the read-policy changed to gfid-hash, read performance has improved (the timings below are for a 10 MB file).

tc#1 --> PASS (P0)
md5sum timings for identical files (they are copies of each other):

```
[root@dhcp35-72 dd]# time md5sum file.3        # with the now-default gfid-hash
e84853d61440dada29a64406f17de488  file.3

real    0m7.080s
user    0m0.195s
sys     0m0.080s

[root@dhcp35-72 dd]# time md5sum file.4        # with round-robin
e84853d61440dada29a64406f17de488  file.4

real    0m43.652s
user    0m0.207s
sys     0m0.297s
```

tc#2 --> PASS (P0)
Check the default value of read-policy; it must be gfid-hash.

tc#3 --> PASS (P1)
Try setting read-policy to different values; only round-robin or gfid-hash must be accepted:

```
[root@dhcp35-9 glusterfs]# gluster v get general all | grep gfid
cluster.randomize-hash-range-by-gfid    off
storage.build-pgfid                     off
storage.gfid2path                       on
storage.gfid2path-separator             :
disperse.read-policy                    gfid-hash
[root@dhcp35-9 glusterfs]# gluster v gset general disperse.read-policy
unrecognized word: gset (position 1)
[root@dhcp35-9 glusterfs]# gluster v gset general disperse.read-policy add
unrecognized word: gset (position 1)
[root@dhcp35-9 glusterfs]# gluster v set general disperse.read-policy add
volume set: failed: option read-policy add: 'add' is not valid (possible options are round-robin, gfid-hash.)
```

tc#4 --> PASS (P1)
Read the same file from multiple clients; this should have no impact, as both clients read from the same set of bricks. However, raised an RFE: BZ#1583662 - RFE: load-balance reads even when the read-policy is set to gfid-hash when multiple clients read the same file.

tc#5 --> PASS (P2)
Softlink to a file and read it? No problem, as it still reads from the source file.

tc#6 --> PASS (P0)
Have a file being read, and when one of the hashed bricks goes down, no EIO must be seen, as the non-hashed brick must start to serve the data. Tested the above even with the NFS client cache disabled (passed), and also checked with 2 bricks down.

tc#7 --> PASS, but can be improved (P2)
Once the hashed brick comes up, check whether it starts to serve the data again. Result: yes; for this reason, raised BZ#1583643 - avoid switching back to the gfid-hashed brick once it is online (up) and instead continue reads from the non-hashed brick.

```
[root@dhcp35-126 dispersevol1]# dd if=big-dd//10mb of=/dev/null bs=1024 count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 570.747 s, 17.9 MB/s
```

tc#8 --> PASS (P2)
Bringing down a brick that is not hashed should not impact the read.

tc#9:
Raised BZ#1583643 - avoid switching back to the gfid-hashed brick once it is online (up) and instead continue reads from the non-hashed brick. Also raised BZ#1583667 - nfs logs flooded with "Connection refused); disconnecting socket" even after the brick is up, due to stale sockets.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
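The key property behind the gfid-hash results above is that reads of a given file are pinned deterministically to one brick chosen from its GFID, rather than rotated round-robin. A minimal sketch of that idea (this is not Gluster's actual implementation; the function name, hashing scheme, and sample GFID are illustrative assumptions):

```python
import uuid

def pick_read_brick(gfid: str, num_bricks: int) -> int:
    # Illustrative assumption: reduce the 128-bit GFID to a brick index.
    # The mapping is deterministic, so every read of the same file is
    # served by the same brick, keeping that brick's cache/read-ahead warm.
    return int.from_bytes(uuid.UUID(gfid).bytes, "big") % num_bricks

gfid = "9a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9"  # made-up GFID for illustration
print(pick_read_brick(gfid, 6))  # same brick index on every call
```

A round-robin policy, by contrast, spreads successive reads of one file across bricks, which is consistent with the slower md5sum timing recorded for file.4 above.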