Red Hat Bugzilla – Bug 223783
GFS2: Deallocate blocks in directory hash chains when empty
Last modified: 2016-10-07 08:21 EDT
When entries are removed from an exhash directory, leaving an otherwise empty
hash chain block, the block should be deallocated unless it is the first block
in that particular hash chain.
Actually I suppose that it would be possible to remove that too, but the other
code paths would need to be updated to know what a zero pointer meant in the
hash table. Since there can be potentially several pointers to the same hashed
block, that would also need to be taken into account. So it might be simpler
(and we'd still get most of the benefit) to simply leave the first block in the
chain and only remove blocks which are empty and further down the chain.
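The simpler approach can be sketched with a toy in-memory model. This is only an illustration, not the GFS2 code: struct toy_leaf, toy_prune_chain and toy_leaf_new are hypothetical names, the "next" pointer stands in for lf_next, and real code would work on disk buffers inside a transaction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy in-memory stand-in for a leaf block; NOT the on-disk gfs2_leaf. */
struct toy_leaf {
    unsigned entries;        /* live directory entries in this block */
    struct toy_leaf *next;   /* plays the role of lf_next; NULL ends the chain */
};

/*
 * Unlink and free every empty block after the head of one hash chain.
 * The head is kept even when empty, so the hash table never needs a
 * zero pointer. Returns the number of blocks freed.
 */
static unsigned toy_prune_chain(struct toy_leaf *head)
{
    unsigned freed = 0;
    struct toy_leaf *prev = head;

    assert(head != NULL);
    while (prev->next != NULL) {
        struct toy_leaf *cur = prev->next;

        if (cur->entries == 0) {
            prev->next = cur->next;  /* bypass the empty block */
            free(cur);
            freed++;
        } else {
            prev = cur;              /* keep non-empty blocks */
        }
    }
    return freed;
}

/* Hypothetical helper for building a chain to experiment with. */
static struct toy_leaf *toy_leaf_new(unsigned entries, struct toy_leaf *next)
{
    struct toy_leaf *lf = malloc(sizeof(*lf));

    assert(lf != NULL);
    lf->entries = entries;
    lf->next = next;
    return lf;
}
```

Because the head block is never touched, every hash table pointer that refers to it stays valid, which is the point of leaving the first block in place.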
Reassigning to myself. I whipped up an untested prototype in
about an hour. I did this for the sake of speeding up deletes:
If we can ensure that all leaf chain blocks are deallocated at
the time a directory is removed, in theory, we can save ourselves
a lot of work reading in leaf blocks to deallocate them. Instead,
we can just create a matrix of leaf blocks to free, and free them,
making the process a lot quicker. If we do that, the delete code would no
longer set leaf block pointers to 0 as it frees them. Some might
look at that as a file system integrity feature we'd be removing.
However, maybe someday we could leverage it to create an undelete
tool for gfs2.
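As a rough sketch of the idea above: if no chained blocks remain when the directory is removed, the set of blocks to free can be gathered from the hash table alone, without reading any leaf. The function below is a toy model, not the prototype patch. toy_collect_leaves is a hypothetical name, the table is a plain array of block numbers, and it assumes that pointers to the same leaf sit next to each other in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the distinct leaf block numbers referenced by a toy hash table.
 * Assumption: duplicate pointers to one leaf are contiguous, so comparing
 * against the last collected block is enough to skip them.
 * Returns the number of block numbers written to out (at most out_cap).
 */
static size_t toy_collect_leaves(const uint64_t *table, size_t slots,
                                 uint64_t *out, size_t out_cap)
{
    size_t n = 0;
    uint64_t prev = 0;  /* 0 is treated as "no block" in this toy model */

    assert(table != NULL && out != NULL);
    for (size_t i = 0; i < slots; i++) {
        if (table[i] == 0 || table[i] == prev)
            continue;       /* empty slot or duplicate of the last leaf */
        if (n == out_cap)
            break;          /* caller's buffer is full */
        out[n++] = table[i];
        prev = table[i];
    }
    return n;
}
```

The caller would then free the collected blocks in one pass. Note that nothing here writes to the table itself, which matches the trade-off mentioned above: the pointers are left as they were rather than being zeroed.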
Created attachment 1206978 [details]
Early prototype patch
This is the early prototype patch I whipped up. It seems to work.
I ran a quick test where I restored the metadata set called
lots2.5million.meta.gz, which Ben was using for his nfs cookie
issue, so I'm reasonably sure it has leaf chain blocks.
An fsck.gfs2 afterward came up clean, and rmmod showed nothing
leftover for gfs2 in slab.
Created attachment 1207635 [details]
Second prototype patch - bug fixed
This version fixes a bug, and it's been tested with lf_next.
Created attachment 1208142 [details]
Ben's tool for creating 10000 files with the same hash value
This is Ben Marzinski's script for creating ten thousand files
all of which have the same gfs2 hash value. This forces GFS2
to use several "lf_next" chain leaf blocks.
Created attachment 1208143 [details]
Third prototype patch
Almost the same as the previous patch, except that the trans_add_meta
is now done before the buffer is modified, as it technically should be.