Reading directory leaf blocks can be slow, so we should read ahead these blocks where possible. fadvise() is probably the best way to do that. Adding such readahead made a big difference to kernel performance, so I'd like to see something similar in fsck too. I also wonder whether doing an fadvise(FADV_RANDOM) at the start of fsck might be a good plan, since it seems quite likely that a fair few of the accesses will be random; or maybe that needs setting on a per-pass basis. Either way, I think we should consider adding those hints where required.
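For illustration, here is a minimal sketch of the kind of readahead hint I have in mind, assuming the block numbers of the leaf blocks are already known before they are read. The helper name readahead_blocks() and its parameters are hypothetical and not taken from the fsck source; only posix_fadvise(2) and POSIX_FADV_WILLNEED are real.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the fsck source): hint to the kernel that
 * the given file system blocks will be read soon, so that it can start
 * the I/O before the caller issues the actual reads.
 *
 * Returns 0 on success or the error number reported by posix_fadvise(),
 * which returns the error directly rather than setting errno.
 */
int readahead_blocks(int fd, const uint64_t *blocks, size_t count,
                     uint32_t bsize)
{
    assert(bsize > 0);

    for (size_t i = 0; i < count; i++) {
        off_t offset = (off_t)(blocks[i] * bsize);
        int err = posix_fadvise(fd, offset, (off_t)bsize,
                                POSIX_FADV_WILLNEED);
        if (err != 0)
            return err;
    }
    return 0;
}
```

The idea would be to call something like this with the leaf block pointers collected from a directory's hash table before walking them, then read the blocks as usual. Whether that helps in practice would need measuring.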
Some initial observations. A few tests using posix_fadvise(2) with FADV_RANDOM or FADV_SEQUENTIAL at the start seem to suggest that FADV_SEQUENTIAL makes very little difference (a very tiny speed-up in my tests) and FADV_RANDOM causes a substantial slow-down. This is with a newly created file system in which a million files have been created under one subdirectory, so not a hugely realistic test, but the slow-down does make me concerned that using posix_fadvise() in this way could be advantageous with some usage patterns and harmful with others.
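For reference, a whole-device hint of the kind tested above can be sketched as follows. The function name advise_whole_file() is hypothetical; the call itself is plain posix_fadvise(2), where a length of 0 means "to the end of the file". Per the Linux man page, POSIX_FADV_SEQUENTIAL doubles the readahead window and POSIX_FADV_RANDOM disables file readahead entirely, which may go some way to explaining the slow-down.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the fsck source): apply one access-pattern
 * hint to the whole of an open file or block device.
 *
 * 'advice' is expected to be POSIX_FADV_NORMAL, POSIX_FADV_SEQUENTIAL or
 * POSIX_FADV_RANDOM. An offset of 0 with a length of 0 covers everything
 * from the start to the end of the file.
 *
 * Returns 0 on success or the error number reported by posix_fadvise().
 */
int advise_whole_file(int fd, int advice)
{
    return posix_fadvise(fd, 0, 0, advice);
}
```

A per-pass variant would simply call this again with a different advice value at the start of each pass, resetting to POSIX_FADV_NORMAL where no particular pattern is expected.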
Created attachment 581231: Patch to avoid rereading directory blocks in check_leaf_blks
I'm currently creating some huge directories on the exxon cluster to test this patch more thoroughly, but in the meantime perhaps you'd like to take a look at it and see if anything stands out.
As per last week's meeting I'm going to abandon the above patch as my tests didn't show any significant performance impact (with or without memory pressure).
Closing this one as per last week's meeting.