Description of problem:
The performance of squashfuse is very poor for multi-core applications. For details, please see https://github.com/apptainer/apptainer/issues/665

We are about to release a version of apptainer (which I maintain in EPEL and Fedora) that depends heavily on squashfuse. The upstream squashfuse_ll multithreading patch at https://github.com/vasi/squashfuse/pull/70 makes a huge difference to performance, so much so that we have decided, for now, to include a patched version of squashfuse_ll in the apptainer distribution. We would of course much prefer that it come directly from the EPEL/Fedora version of the squashfuse package.

Version-Release number of selected component (if applicable):
0.1.102

How reproducible:
Very

Steps to Reproduce:
Please see the above issue for details.

Actual results:
With the detailed benchmark on a 16-core el7 node, reading from a local disk takes 6:23, epel squashfuse by itself takes 41:11, and epel squashfuse_ll takes 13:06.

Expected results:
I expect results much more comparable to local disk, and in fact with the multithreaded patch the benchmark time goes down to 6:35, nearly the local disk time.

Additional info:
Would you consider upgrading to squashfuse 0.1.105, including the multithreaded patch, and compiling it with `--enable-multithreading`? The upstream maintainer does not seem to be in any hurry to include the patch; he hasn't even commented on the thread, which has been available for several months.
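For a rough sense of scale, the timings quoted above can be converted into slowdown factors relative to local disk. The following is a minimal sketch that uses only the four numbers from this report; the labels and the `to_seconds` helper are illustrative names, not part of any benchmark tooling.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '6:23' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Benchmark wall-clock times as quoted in the report (minutes:seconds).
timings = {
    "local disk": "6:23",
    "epel squashfuse": "41:11",
    "epel squashfuse_ll": "13:06",
    "squashfuse_ll + multithreading patch": "6:35",
}

baseline = to_seconds(timings["local disk"])

# Slowdown factor of each configuration relative to local disk.
slowdown = {name: to_seconds(t) / baseline for name, t in timings.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x local disk")
```

By this arithmetic, plain squashfuse is roughly 6.5 times slower than local disk and squashfuse_ll roughly 2 times slower, while the patched build lands within a few percent of local disk.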
In general, I'm not a huge fan of the idea of shipping something that isn't upstream. I'd rather avoid maintaining a fork here.
I understand, I would be the same, but this makes such a huge difference and the upstream isn't moving on it. Do you have any influence on the upstream owner?
I'm afraid I have no influence upstream. I agree, this is actually my biggest annoyance with squashfuse, but it makes me very uncomfortable to carry a 1k-line patch that is unapproved and indeed unreviewed by the upstream maintainer. Let's keep pressure there. Perhaps someone needs to offer to help maintain the upstream project?
By the way, since your experience seems to contrast with the metrics shared in https://github.com/vasi/squashfuse/pull/70#issuecomment-1186259602, you might consider adding your own. Right now it actually looks like the multithreaded version performs quite a bit worse without a crazy number of threads.
I did add my own metrics in the github issue that I linked from my comment on that PR.
I doubt vasi will look at that. Obviously reviews are few and far between. It would be worthwhile to do anything we can to:
1. Make it an easy review so we don't need too many (slow) passes
2. Make it look like a worthwhile PR to review
We didn't write the patch, but we could help with (1) by potentially reviewing it and trying to make sure it's in such a shape that, once vasi gets to it, it takes as few passes as possible. (2) is easier: show that the patch is actually worthwhile. If I'm vasi, taking a quick look at PRs, the current comments on that particular one make it look like it drags overall performance down. I'm not sure that would seem worth a closer look with limited time.
I don't understand what you mean. Why do you doubt vasi will look at what I posted in the PR? I showed an amazing improvement with the PR, and a benchmark showing it to be basically equivalent to the kernel squashFS. One person posted something that looked like a decrease in performance at low numbers of threads, but never posted the methodology even when asked by the author of the PR.