Bug 2013000
Summary: RFE: allow nbdkit to fix the effective URL for mirrored sites
Product: [Community] Virtualization Tools
Component: nbdkit
Status: CLOSED UPSTREAM
Severity: unspecified
Priority: unspecified
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Alexander Wels <awels>
Assignee: Richard W.M. Jones <rjones>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: ptoscano, rjones
Type: Bug
Last Closed: 2021-10-19 20:57:36 UTC
Attachments: test.c (attachment 1832139)

Description
Alexander Wels
2021-10-11 19:16:29 UTC
Created attachment 1832139: test.c

To be clear about this, are we talking about nbdkit-curl-plugin? qemu's curl plugin is something different.

Anyway, it is indeed true that:

(a) nbdkit-curl-plugin will make many small byte-range requests from different threads, and
(b) if the source issues a redirect then we could be redirected to a different mirror on each request, and
(c) if one request fails then the whole thing will fail.

There are a few possible ways to work around this:

(1) https://libguestfs.org/nbdkit-retry-filter.1.html can be inserted in the chain and it will restart the plugin if a failure happens. It's likely to be quite a sledgehammer fix; it might be possible to improve the curl plugin to do something less aggressive.

(2) Prefetch the URL yourself, which will resolve the mirror, check the mirror works and retry, then pass the resolved URL to nbdkit.

I wrote something similar to (2) a while back when this same question came up previously, see attached program.

I am sorry, yes I mean the nbdkit-curl-plugin. I have a PR that implements (2), and IMO it works great. However, in the review I got comments that nbdkit should be doing the prefetch instead of us, which is the reason I opened this bug. Let me see if (1) can work for our use case.

For (1) I think the easiest thing would be if there's a curl option to "pin" the redirection to a single mirror (although you'd still have a problem if the mirror it happened to choose was broken). I don't see much here that looks like it could help: https://curl.se/libcurl/c/curl_easy_setopt.html It might be worth asking the curl developers if they've considered this case.

If the mirror picked is the broken one, that is not a huge problem for us. The import will fail and be retried using the mirror URL, and hopefully we will get a different mirror that works. The problem manifests itself because each time we read a small byte range, there is a chance we hit the broken mirror. So if we have 10 mirrors, and 1 is broken, and during an import we read 1000 different byte ranges, it is almost guaranteed that at some point we get the broken mirror. Does the retry filter retry just the one byte range if it fails, or does it retry the entire import? I am hoping just the one byte range. If so, I think that will make it sufficiently robust for our purposes.

The retry filter is going to be a big (too big) hammer here. It will actually reload the entire plugin if any request fails. I think what you actually want is more like some way to pin the redirect to a fixed URL, i.e. the first time any range is requested, we get the resolved URL from curl and use that URL in future requests. (This would be opt-in through a new command line option.) I'm open to an implementation of this in nbdkit-curl-plugin provided it's not going to be too invasive.

Yes, that is basically what I am asking for. And if the one it resolves to is the broken one, it will fail, which is fine since that is exactly what would have happened before anyway. A flag to pin it would be a great solution for us.

Patch posted: https://listman.redhat.com/archives/libguestfs/2021-October/thread.html#00048

Can you tell us what the specific objections are to doing the prefetch outside of nbdkit (as you said in comment 2)?

Personally I think it is fine for us to pre-resolve the URL to a real mirror and then pass that to nbdkit; I did exactly that in this PR: https://github.com/kubevirt/containerized-data-importer/pull/1981 However, other members of my team feel that pre-resolving the URL ourselves goes against the way HTTP works, and that the responsibility should reside in nbdkit.

Alternative proposal of a new nbdkit-retry-request-filter: https://listman.redhat.com/archives/libguestfs/2021-October/thread.html#00084