Bug 223787 - RFE: GFS2: Don't keep everything in memory at once when expanding dir exhash
Summary: RFE: GFS2: Don't keep everything in memory at once when expanding dir exhash
Alias: None
Product: Fedora
Classification: Fedora
Component: GFS-kernel
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Steve Whitehouse
QA Contact:
Depends On: 307091
Reported: 2007-01-22 14:22 UTC by Steve Whitehouse
Modified: 2018-04-11 13:24 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Last Closed: 2011-06-02 17:16:19 UTC
Type: ---

Attachments
dmesg showing the issue (35.01 KB, application/octet-stream)
2009-06-17 15:37 UTC, Steve Whitehouse

Description Steve Whitehouse 2007-01-22 14:22:44 UTC
The expansion code for the directory exhash table currently works like this:

1. vmalloc a chunk of memory
2. read the current hash table into it
3. write out a new version doubling the size

This requires allocating a large amount of memory and can also increase
latency, since all of the reading is done before any of the writing. A better
algorithm looks like this:

1. Expand the directory's data blocks to the newly required size, allocating
blocks as required
2. Use two "pointers": one starts at the new end of the file, the other at the
original end of the file
3. Copy the data from original to new, moving towards the start of the file; the
"new" pointer will catch up with the "original" pointer only on the final copy
4. Update the i_size to indicate that the operation is complete
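
As a rough illustration of steps 2 and 3, here is a minimal userspace sketch, not the actual GFS2 code. It assumes an in-memory array stands in for the directory's data blocks, and that doubling maps each original entry to two adjacent entries in the new table (an assumption not spelled out in this report). The name exhash_double_in_place is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: double a hash table in place.
 *
 * tbl must have room for 2 * old_len entries; only the first old_len
 * are valid on entry. Assumption: each original entry is duplicated
 * into two adjacent slots of the doubled table.
 */
void exhash_double_in_place(uint64_t *tbl, size_t old_len)
{
    assert(tbl != NULL);

    size_t rd = old_len;      /* "original" pointer: one past the old end */
    size_t wr = 2 * old_len;  /* "new" pointer: one past the new end */

    /*
     * Work backwards towards the start. Entry rd is written to slots
     * 2*rd + 1 and 2*rd, which never lie below rd, so no unread entry
     * is overwritten. The two pointers meet only when rd reaches 0.
     */
    while (rd > 0) {
        uint64_t ptr = tbl[--rd];

        tbl[--wr] = ptr;
        tbl[--wr] = ptr;
    }
}
```

In the real filesystem the reads and writes would go through directory blocks rather than a flat array, and the size update in step 4 would follow once the loop completes. The point of the sketch is only that no second copy of the table is needed.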

Potentially we might be able to increase the maximum size of the hash table,
since the limiting factor appears to be set only by the maximum size of memory
that it is reasonable to allocate using vmalloc. If this is done then we need
to check that it will remain backward compatible, but it would seem reasonable
that this should be the case.

Comment 1 Steve Whitehouse 2007-02-02 15:27:12 UTC
Move priority to low as this is really a performance thing rather than correctness.

Comment 2 Matěj Cepl 2008-07-30 16:04:57 UTC
nothing to triage

Comment 3 Steve Whitehouse 2009-06-17 11:22:45 UTC
There are places in the dir code where we are using GFP_NOFAIL for allocations which might be larger than order 0. In the latest upstream kernels this is causing warnings to appear, as this is not allowed.

We will need to review the memory allocations in the dir code to ensure that this doesn't happen, and we should look at fixing this bug at the same time as it's all related.

Comment 4 Steve Whitehouse 2009-06-17 15:37:50 UTC
Created attachment 348275
dmesg showing the issue

The full stack and warning are shown at the end of this attachment.

Comment 5 Steve Whitehouse 2011-06-02 17:16:19 UTC
This was done upstream some time ago.
