When entries are removed from an exhash directory, leaving an otherwise empty hash chain block, that block should be deallocated unless it's the first block in that particular hash chain. Actually, I suppose it would be possible to remove the first block too, but the other code paths would need to be updated to know what a zero pointer in the hash table means. Since several pointers can potentially refer to the same hashed block, that would also need to be taken into account. So it might be simpler (and we'd still get most of the benefit) to leave the first block in the chain and only remove empty blocks further down the chain.
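To make the proposed rule concrete, here is a minimal sketch of "keep the first block, drop empty blocks further down the chain". It is a toy in-memory model, not gfs2 code: `struct leaf`, `leaf_new` and `prune_empty_tail` are hypothetical names invented for illustration, and a real implementation would follow on-disk block numbers inside a transaction rather than C pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy in-memory stand-in for a leaf block; NOT the gfs2 on-disk layout. */
struct leaf {
    unsigned entries;   /* live directory entries held by this block */
    struct leaf *next;  /* next block in the chain, NULL at the end */
};

/* Hypothetical helper: allocate a block and link it in front of `next`. */
static struct leaf *leaf_new(unsigned entries, struct leaf *next)
{
    struct leaf *lf = malloc(sizeof(*lf));

    assert(lf != NULL);
    lf->entries = entries;
    lf->next = next;
    return lf;
}

/*
 * Unlink and free every empty block that comes after the head.
 * The head is always kept, even when it is empty, so the hash table
 * never has to hold a zero pointer. Returns the number of blocks freed.
 */
static unsigned prune_empty_tail(struct leaf *head)
{
    unsigned freed = 0;
    struct leaf *prev = head;

    assert(head != NULL);
    while (prev->next != NULL) {
        struct leaf *cur = prev->next;

        if (cur->entries == 0) {
            prev->next = cur->next;  /* splice the empty block out */
            free(cur);
            freed++;
        } else {
            prev = cur;              /* block still in use, move on */
        }
    }
    return freed;
}
```

Because the head is never freed, none of the hash table pointers that refer to it need to change, which sidesteps the "several pointers to the same block" problem described above.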
Reassigning to myself. I whipped up an untested prototype in about an hour. I did this for the sake of speeding up deletes: if we can ensure that all leaf chain blocks are already deallocated by the time a directory is removed, then in theory we can save ourselves a lot of work reading in leaf blocks just to deallocate them. Instead, we can build a list of leaf blocks to free and free them, making the process a lot quicker. If we do that, the code would no longer set leaf block pointers to 0 as it frees them. Some might see that as removing a file system integrity feature. However, maybe someday we could leverage it to create an undelete tool for gfs2.
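One way to read the "list of leaf blocks to free" idea: if no chain blocks remain, every leaf block is reachable directly from the hash table, so the list can be built from the table alone without reading any leaf. The sketch below is only an illustration under that assumption. `collect_leaf_blocks` is a hypothetical name, the table is a plain array of block numbers, and it assumes that pointers to the same leaf sit next to each other in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build the list of distinct leaf block numbers referenced by a hash
 * table, without looking inside any leaf block.
 *
 * Assumption (labeled, not taken from the patch): duplicate pointers to
 * the same leaf are adjacent, so comparing with the previous slot is
 * enough to skip them. `out` must have room for `nptrs` entries.
 * Returns the number of block numbers written to `out`.
 */
static size_t collect_leaf_blocks(const uint64_t *hash_table, size_t nptrs,
                                  uint64_t *out)
{
    size_t count = 0;

    assert(hash_table != NULL && out != NULL);
    for (size_t i = 0; i < nptrs; i++) {
        if (i > 0 && hash_table[i] == hash_table[i - 1])
            continue;                 /* same leaf as the previous slot */
        out[count++] = hash_table[i];
    }
    return count;
}
```

The resulting array could then be handed to whatever routine frees blocks in bulk. Note that the hash table itself is left untouched here, which matches the point above about no longer zeroing the pointers.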
Created attachment 1206978
Early prototype patch

This is the early prototype patch I whipped up. It seems to work. I ran a quick test where I restored the metadata set called lots2.5million.meta.gz, which Ben was using for his nfs cookie issue, so I'm reasonably sure it has leaf chain blocks. An fsck.gfs2 afterward came up clean, and rmmod showed nothing leftover for gfs2 in slab.

Created attachment 1207635
Second prototype patch - bug fixed

This version fixes a bug, and it's been tested with lf_next file systems.

Created attachment 1208142
Ben's tool for creating 10000 files with the same hash value

This is Ben Marzinski's script for creating ten thousand files all of which have the same gfs2 hash value. This forces GFS2 to use several "lf_next" chain leaf blocks.

Created attachment 1208143
Third prototype patch

Almost the same as the previous patch, but the trans_add_meta should technically be done before the buffer is modified.