Description of problem:
When using nbdkit with qemu-img-curl to download and convert an image, it should only resolve a redirect once. For instance, if I am downloading from a mirrored source like download.fedoraproject.org, it will follow the redirect to a different mirror many times during the download. I am not sure if the problem is in nbdkit or the qemu-img-curl plugin.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start a conversion with nbdkit with the qemu-img-curl plugin where the source URL is one of the mirrored URLs, like download.fedoraproject.org, AND one of the mirrors is flaky.

Actual results:
Each time nbdkit makes a request (each byte range, I think) and the URL is a mirrored URL like the one above, there is a chance that the returned mirror is down or invalid. This will break the import immediately.

Expected results:
Well, there are IMO 2 possible solutions:
1. Resolve the mirrored URL to a real mirror first, and then only use that real mirror for all the requests.
2. If the returned mirror fails, retry the original URL, and hope you get a new mirror that does work.

Additional info:
We use nbdkit with the qemu-img-curl plugin extensively in Containerized Data Importer, and whenever a Fedora or CentOS mirror is flaky, we get bug reports because the import fails. The real problem is the flaky mirror, but because there are so many chances of getting the bad mirror returned during an import, it is almost guaranteed that the import fails. If one of the two suggested solutions is implemented, it should make the whole import more resilient.
Created attachment 1832139: test.c

To be clear about this, are we talking about nbdkit-curl-plugin? qemu's curl plugin is something different.

Anyway, it is indeed true that:
(a) nbdkit-curl-plugin will make many small byte-range requests from different threads, and
(b) if the source issues a redirect then we could redirect to a different mirror on each request, and
(c) if one request fails then the whole thing will fail.

There are a few possible ways to work around this:
(1) https://libguestfs.org/nbdkit-retry-filter.1.html can be inserted in the chain and it will restart the plugin if a failure happens. It's likely to be quite a sledgehammer fix; it might be possible to improve the curl plugin to do something less aggressive.
(2) Prefetch the URL yourself, which will resolve the mirror, check the mirror works and retry if it does not, then pass the resolved URL to nbdkit.

I wrote something similar to (2) a while back when this same question came up previously, see attached program.
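For illustration only, workaround (2) can be sketched roughly as follows. This is not the attached test.c; it is a minimal Python sketch that assumes a HEAD request is enough to both follow the redirect and check that the chosen mirror responds. The function names `resolve_once` and `resolve_working_mirror` and the `attempts` parameter are hypothetical.

```python
import urllib.request
from typing import Callable, Optional


def resolve_once(url: str, timeout: float = 10.0) -> str:
    """Follow redirects with a HEAD request and return the final URL.

    Raises an OSError subclass (URLError, HTTPError, timeout) if the
    mirror that was picked does not answer successfully.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


def resolve_working_mirror(
    url: str,
    attempts: int = 5,
    resolver: Callable[[str], str] = resolve_once,
) -> str:
    """Ask the redirector again until it hands out a mirror that works."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            # Each call goes back to the original (mirrored) URL, so a
            # failure gives the redirector a chance to pick another mirror.
            return resolver(url)
        except OSError as exc:
            last_error = exc
    raise RuntimeError(f"no working mirror after {attempts} attempts") from last_error
```

The URL returned by `resolve_working_mirror` would then be passed to nbdkit in place of the mirrored URL, so that every byte-range request goes to the same mirror.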
I am sorry, yes, I mean the nbdkit-curl-plugin. I have a PR that implements (2), and IMO it works great. However, in the review I got comments that nbdkit should be doing the prefetch instead of us, which is the reason I opened this bug. Let me see if (1) can work for our use case.
For (1) I think the easiest thing would be if there's a curl option to "pin" the redirection to a single mirror (although you'd still have a problem if the mirror it happened to choose was broken). I don't see much here that looks like it could help: https://curl.se/libcurl/c/curl_easy_setopt.html

It might be worth asking the curl developers if they've considered this case.
If the mirror picked is the broken one, that is not a huge problem for us. The import will fail and be retried using the mirror URL, and hopefully we will get a different mirror that works. The problem manifests itself because each time we read a small byte range, there is a chance we hit the broken mirror. So if we have 10 mirrors, 1 of which is broken, and we read 1000 different byte ranges during an import, it is almost guaranteed that at some point we get the broken mirror. Does the retry filter retry just the one byte range that failed, or does it retry the entire import? I am hoping just the one byte range. If so, I think that will make it sufficiently robust for our purposes.
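To put a number on the example above: a small sketch of the arithmetic, assuming (this is an assumption, not something the redirector guarantees) that every request independently gets a uniformly random mirror. The function name is hypothetical.

```python
def chance_all_requests_succeed(mirrors: int, broken: int, requests: int) -> float:
    """Probability that none of `requests` independent picks hits a broken mirror."""
    per_request_ok = (mirrors - broken) / mirrors
    return per_request_ok ** requests


# 10 mirrors, 1 broken, 1000 byte-range requests, a new mirror per request.
p_unpinned = chance_all_requests_succeed(mirrors=10, broken=1, requests=1000)

# Same mirrors, but the mirror is chosen once and reused for every request.
p_pinned = chance_all_requests_succeed(mirrors=10, broken=1, requests=1)

print(f"mirror re-chosen on every request: {p_unpinned:.3g}")
print(f"mirror chosen once:                {p_pinned:.3g}")
```

Under these assumptions the unpinned import succeeds with probability 0.9^1000, which is on the order of 10^-46, while choosing the mirror once succeeds 9 times out of 10.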
The retry filter is going to be a big (too big) hammer here. It will actually reload the entire plugin if any request fails. I think what you actually want is more like some way to pin the redirect to a fixed URL, i.e. the first time any range is requested, we get the resolved URL from curl and use that URL in future requests. (This would be opt-in through a new command line option.) I'm open to an implementation of this in nbdkit-curl-plugin provided it's not going to be too invasive.
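The pinning idea described above can be sketched as follows. This is a rough Python model of the logic only, not nbdkit-curl-plugin code: the `PinnedSource` class and the `Fetcher` callback are hypothetical, and the callback is assumed to report the URL that was finally served (with libcurl that information is available through `CURLINFO_EFFECTIVE_URL`). Locking for concurrent requests is deliberately left out.

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher: (url, offset, length) -> (effective_url, data).
# `effective_url` is the URL that actually served the data after redirects.
Fetcher = Callable[[str, int, int], Tuple[str, bytes]]


class PinnedSource:
    """Resolve the redirect on the first request and reuse it afterwards."""

    def __init__(self, url: str, fetch: Fetcher) -> None:
        self.original_url = url
        self.pinned_url: Optional[str] = None
        self._fetch = fetch

    def read(self, offset: int, length: int) -> bytes:
        # Before anything is pinned, go through the redirector; afterwards
        # talk to the mirror that answered the first request.
        target = self.pinned_url or self.original_url
        effective_url, data = self._fetch(target, offset, length)
        if self.pinned_url is None:
            self.pinned_url = effective_url
        return data
```

If the first request lands on a broken mirror, the read fails just as it would without pinning, and a retry of the whole import starts from the original URL again.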
Yes that is basically what I am asking for. And if the one it resolves to is the broken one, it will fail, which is fine since that is exactly what would have happened before anyway. A flag to pin it would be a great solution for us.
Patch posted: https://listman.redhat.com/archives/libguestfs/2021-October/thread.html#00048

Can you tell us what the specific objections are to doing the prefetch outside of nbdkit (as you said in comment 2)?
Personally I think it is fine for us to pre-resolve the URL to a real mirror and then pass that to nbdkit. I did exactly that in this PR: https://github.com/kubevirt/containerized-data-importer/pull/1981 However, other members of my team feel that pre-resolving the URL ourselves and then passing it to nbdkit goes against the way HTTP works, and that this responsibility should reside in nbdkit.
Alternative proposal of a new nbdkit-retry-request-filter: https://listman.redhat.com/archives/libguestfs/2021-October/thread.html#00084
Fixed upstream and in 1.29.1 by: https://gitlab.com/nbdkit/nbdkit/-/commit/73ff1ad1bf11988949509ba299a5454f4397f952 https://libguestfs.org/nbdkit-retry-request-filter.1.html
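For readers comparing this with the retry filter discussed earlier in this thread (which reloads the whole plugin on any failure), retrying a single request can be sketched like this. It is a conceptual Python illustration only and does not reflect how the filter is implemented; `read_with_retry`, `retries` and `delay` are hypothetical names.

```python
import time
from typing import Callable


def read_with_retry(
    read: Callable[[int, int], bytes],
    offset: int,
    length: int,
    retries: int = 2,
    delay: float = 2.0,
) -> bytes:
    """Retry one failed byte-range read without restarting anything else."""
    for attempt in range(retries + 1):
        try:
            return read(offset, length)
        except OSError:
            if attempt == retries:
                # Out of retries: let the caller see the original error.
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")
```

With a mirrored URL, each retry goes back through the redirector, so a request that hit a broken mirror has a chance of being served by a different one.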