Bug 223787 - RFE: GFS2: Don't keep everything in memory at once when expanding dir exhash
Product: Fedora
Classification: Fedora
Component: GFS-kernel
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Steve Whitehouse
Keywords: FutureFeature
Depends On: 307091
Reported: 2007-01-22 09:22 EST by Steve Whitehouse
Modified: 2011-06-02 13:16 EDT

Doc Type: Enhancement
Last Closed: 2011-06-02 13:16:19 EDT

Attachments
dmesg showing the issue (35.01 KB, application/octet-stream)
2009-06-17 11:37 EDT, Steve Whitehouse

Description Steve Whitehouse 2007-01-22 09:22:44 EST
The expansion code for the directory exhash table currently works like this:

1. vmalloc a chunk of memory
2. read the current hash table into it
3. write out a new version doubling the size

This involves being able to allocate a lot of memory, and potentially also
increases latency by doing all the reading and then all the writing. A better
algorithm looks like this:

1. Expand the directory's data blocks to the newly required size allocating
blocks as required
2. Use two "pointers": one starts at the new end of the file, the other at the
original end of the file.
3. Copy the data from original to new, moving towards the start of the file. The
"new" pointer will catch up with the "original" pointer only on the final copy
4. Update the i_size to indicate that the operation is complete
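
The four steps above can be sketched in user space as follows. This is only an illustration, not the GFS2 code: a Python list stands in for the directory's hash table blocks, the function name `double_in_place` is hypothetical, and it assumes that doubling an exhash table writes each original entry into two adjacent slots of the new table.

```python
def double_in_place(table):
    """Double an exhash-style table in place, copying from the tail backwards.

    Assumption: doubling duplicates every original entry into two adjacent
    slots of the new table (entry i ends up at slots 2*i and 2*i + 1).
    """
    old_len = len(table)

    # Step 1: grow the backing storage to the new size. This stands in for
    # allocating the extra directory data blocks.
    table.extend([None] * old_len)

    # Step 2: two cursors, one at the new end and one at the original end.
    src = old_len
    dst = 2 * old_len

    # Step 3: copy towards the start. The destination slots are always at or
    # beyond the source slot, so no unread entry is ever overwritten; the two
    # cursors only meet when src reaches slot 0 on the final copy.
    while src > 0:
        src -= 1
        dst -= 2
        entry = table[src]
        table[dst] = entry
        table[dst + 1] = entry

    # Step 4: in the filesystem, this is the point where i_size would be
    # updated to publish the new table size.
    return table


if __name__ == "__main__":
    print(double_in_place(["a", "b", "c"]))
```

Only one entry is held outside the table at any time, so the peak extra memory no longer depends on the size of the hash table.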

Potentially we might be able to increase the maximum size of the hash table,
since the limiting factor appeared to be set only by the maximum size of memory
that it was reasonable to allocate using vmalloc. If this is done then we need
to check that it will remain backward compatible, but it would seem reasonable
that this should be the case.
Comment 1 Steve Whitehouse 2007-02-02 10:27:12 EST
Move priority to low as this is really a performance thing rather than correctness.
Comment 2 Matěj Cepl 2008-07-30 12:04:57 EDT
nothing to triage
Comment 3 Steve Whitehouse 2009-06-17 07:22:45 EDT
There are places in the dir code where we are using GFP_NOFAIL for allocations which might be larger in size than order 0. In the latest upstream kernels this is causing warnings to appear as this is not allowed.

We will need to review the memory allocations in the dir code to ensure that this doesn't happen, and we should look at fixing this bug at the same time as it's all related.
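
For reference, the "order" of an allocation is the power of two of contiguous pages it needs, so any request bigger than one page is above order 0. The sketch below works this out for hash tables of various depths. It assumes 4096-byte pages and 8-byte hash table entries; the function names are hypothetical and only mirror the arithmetic, not any kernel API.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
ENTRY_SIZE = 8     # assumption: one 64-bit block pointer per hash table entry


def allocation_order(size_bytes, page_size=PAGE_SIZE):
    """Smallest order such that 2**order contiguous pages hold size_bytes."""
    pages = max(1, (size_bytes + page_size - 1) // page_size)
    return (pages - 1).bit_length()


def hash_table_order(depth):
    """Order needed to hold a whole table of 2**depth entries in one allocation."""
    return allocation_order(ENTRY_SIZE << depth)


if __name__ == "__main__":
    # Print depth, table size in bytes, and the resulting allocation order.
    for depth in (8, 9, 10, 12, 16):
        print(depth, ENTRY_SIZE << depth, hash_table_order(depth))
```

Under these assumptions, a table fits in a single page only up to depth 9; reading a deeper table into one buffer needs a higher-order allocation.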
Comment 4 Steve Whitehouse 2009-06-17 11:37:50 EDT
Created attachment 348275 [details]
dmesg showing the issue

The full stack and warning are shown at the end of this attachment.
Comment 5 Steve Whitehouse 2011-06-02 13:16:19 EDT
This was done upstream some time ago.
