Bug 230803
Summary: | Fix GFS readahead setting | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Wendy Cheng <nobody+wcheng> | ||||
Component: | gfs | Assignee: | Wendy Cheng <nobody+wcheng> | ||||
Status: | CLOSED ERRATA | QA Contact: | Dean Jansa <djansa> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 4 | CC: | hlawatschek, kanderso, mathieu.avila, rkenna | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | RHBA-2007-0998 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2007-11-21 21:14:22 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Wendy Cheng
2007-03-02 23:11:26 UTC
The comment is correct. It has been an oversight of GFS implementation. A quick fix might be in fill_super(), right after GFS registers the diaper device, the readahead value can be pulled from the ra_pages of backing_dev_info. I'll try to get the test patch done today. well, after reading the source code carefully, I see diaper device does get its backing_dev_info from the real device (in diaper_get()): 363 diaper = bdget_disk(gd, 0); 364 if (!diaper) 365 goto fail_remove; 366 367 down(&diaper->bd_sem); 368 if (!diaper->bd_openers) { 369 diaper->bd_disk = gd; 370 diaper->bd_contains = diaper; 371 bd_set_size(diaper, 0x7FFFFFFFFFFFFFFFULL); 372 diaper->bd_inode->i_data.backing_dev_info = &gd->queue-> backing_dev_info; 373 } else 374 printk("GFS: diaper: reopening\n"); This is further confirmed by tracing from vfs super block in the entry of gfs_write, into s_bdev->bd_inode->i_mapping by examining the run time kernel memory: crash> struct backing_dev_info 0x10119b1e6e0 struct backing_dev_info { ra_pages = 32, state = 0, memory_backed = 0, congested_fn = 0xffffffffa033beb0, congested_data = 0x1007dbcda80, unplug_io_fn = 0xffffffff8024e2c2 <blk_backing_dev_unplug>, unplug_io_data = 0x10119b1e490 } So GFS doesn't miss any readahead. However, 32 may be good for general purpose readahead, not for large sequential write ? We happen to have a customer complaning about large sequential read performance. Will give out a tunable to see whether it could improve the number. If proved useful, we'll take it for RHEL 4.5 Or, depending on the device, the driver may not set readahead for its queue at all. So a tunable would be even more proper. VFS layer happens to offer a backing device info block in its super block structure to allow filesystem to override the default setting of its under- lying block device. After adding this function into GFS as a tunable (seq_readahead), a quick and dirty dd test with 2G file shows the following result on my test cluster: default dd read: 27.76s 2048 readahead: 15.19s. Will continue this testing. Created attachment 150135 [details]
Tentative patch for sequential read performance fix
The above patch is not finalized yet - needs some clean up but is good for test runs. ok, after few loops, it is un-mistakenly 2x performance gain. Mathieu, thanks for bringing this issue up ! Moving to proposed for 4.6 - too late in the cycle for 4.5 Our TAM just tested this patch out Sorry, hit submit key too soon.. This patch seems to significantly help IPTV read performance. The tunable is set by "gfs_tool settune <mount> seq_readahead <val>" command. The <val> we use is subject to the file size. Assuming 2G file with 2048 readahead is good, in IPTV test runs, with file size between 400-500M, we use 256 readahead. The change made us passing ext3 number and TAM seems to be very pleased with the results. Code checked into CVS. Move bugzilla into modified state. An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0998.html |