Bug 2124630 - Please include multi-threading in squashfuse_ll
Summary: Please include multi-threading in squashfuse_ll
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: squashfuse
Version: epel7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kyle Fazzari
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-09-06 16:20 UTC by Dave Dykstra
Modified: 2022-09-17 18:34 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-16 16:24:52 UTC
Type: Bug
Embargoed:


Description Dave Dykstra 2022-09-06 16:20:32 UTC
Description of problem:

The performance of squashfuse turns out to be very poor for multi-core applications.  For details, please see:

https://github.com/apptainer/apptainer/issues/665

We are about to release a version of apptainer (which I maintain in EPEL and Fedora) that depends heavily on squashfuse.  The upstream squashfuse_ll multithreading patch at

https://github.com/vasi/squashfuse/pull/70

makes a huge difference in performance, so much so that we have decided, for now, to include a patched version of squashfuse_ll in the apptainer distribution.  We would of course much prefer that it come directly from the EPEL/Fedora version of the squashfuse package.

Version-Release number of selected component (if applicable):

0.1.102

How reproducible:

Very

Steps to Reproduce:

Please see the above issue for details.

Actual results:

With the benchmark detailed in the above issue, run on a 16-core el7 node, reading from a local disk takes 6:23, EPEL squashfuse by itself takes 41:11, and EPEL squashfuse_ll takes 13:06.

Expected results:

I expect results much more comparable to local disk. In fact, with the multithreaded patch the benchmark time goes down to 6:35, nearly the local disk time.
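To put the four reported timings in proportion, here is a minimal sketch that converts each one to a single unit and expresses it as a multiple of the local-disk time. It assumes the figures are in minutes:seconds (the report does not state the unit, though the ratios are the same if they are hours:minutes). The helper `to_seconds` and the dictionary labels are illustrative names, not part of squashfuse or apptainer.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to seconds (unit is an assumption)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Timings as quoted in this report; the labels are illustrative.
timings = {
    "local disk": "6:23",
    "epel squashfuse": "41:11",
    "epel squashfuse_ll": "13:06",
    "squashfuse_ll + multithreading patch": "6:35",
}

baseline = to_seconds(timings["local disk"])

# Slowdown factor of each configuration relative to local disk.
slowdowns = {name: to_seconds(t) / baseline for name, t in timings.items()}

for name, factor in slowdowns.items():
    print(f"{name}: {timings[name]} ({factor:.2f}x local disk)")
```

By this arithmetic, EPEL squashfuse is roughly 6.5 times slower than local disk and EPEL squashfuse_ll about 2.1 times slower, while the patched squashfuse_ll is within about 3% of the local-disk time.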

Additional info:

Would you consider upgrading to squashfuse 0.1.105, including the multithreaded patch, and compiling it with `--enable-multithreading`?  The upstream maintainer does not seem to be in any hurry to include the patch; he hasn't even commented on the pull request, which has been open for several months.

Comment 1 Kyle Fazzari 2022-09-06 16:23:31 UTC
In general, I'm not a huge fan of the idea of shipping something that isn't upstream. I'd rather avoid maintaining a fork here.

Comment 2 Dave Dykstra 2022-09-06 16:28:26 UTC
I understand, and I would feel the same, but this makes such a huge difference and upstream isn't moving on it.  Do you have any influence on the upstream owner?

Comment 3 Kyle Fazzari 2022-09-16 16:24:52 UTC
I'm afraid I have no influence upstream. I agree, this is actually my biggest annoyance with squashfuse, but it makes me very uncomfortable to carry a 1k-line patch that is unapproved and indeed unreviewed by the upstream maintainer. Let's keep up the pressure there. Perhaps someone needs to offer to help maintain the upstream project?

Comment 4 Kyle Fazzari 2022-09-16 16:31:21 UTC
By the way, since your experience seems to contrast with the metrics shared in https://github.com/vasi/squashfuse/pull/70#issuecomment-1186259602, you might consider adding your own. Right now it actually looks like the multithreaded version performs quite a bit worse without a crazy number of threads.

Comment 5 Dave Dykstra 2022-09-16 19:44:42 UTC
I did add my own metrics in the github issue that I linked from my comment on that PR.

Comment 6 Kyle Fazzari 2022-09-16 19:53:23 UTC
I doubt vasi will look at that. Obviously reviews are few and far between. Anything we can do to:

1. Make it an easy review so we don't need too many (slow) passes
2. Make it look like a worthwhile PR to review

would be worthwhile. We didn't write the patch, but we could help with (1) by reviewing it and trying to get it into such a shape that, once vasi gets to it, it takes as few passes as possible. (2) is easier: show that the patch is actually worthwhile. If I'm vasi, taking a quick look at PRs, the current comments on that particular one make it look like it drags overall performance down. I'm not sure that would seem worth a closer look with limited time.

Comment 7 Dave Dykstra 2022-09-17 18:34:41 UTC
I don't understand what you mean. Why do you doubt vasi will look at what I posted in the PR?  I showed an amazing improvement with the PR, and a benchmark showing it to be basically equivalent to the kernel squashFS.  One person posted something that looked like a decrease in performance at low numbers of threads, but never posted a methodology even when asked by the author of the PR.
