Bug 2044633
| Summary: | [RFE] Ensure layer isn't unnecessarily re-pushed in c/image | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Robb Manes <rmanes> |
| Component: | buildah | Assignee: | Aditya R <arajan> |
| Status: | CLOSED MIGRATED | QA Contact: | atomic-bugs <atomic-bugs> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 8.5 | CC: | cpippin, dornelas, dwalsh, ltitov, mitr, nalin, pthomas, tsweeney, umohnani, vrothber |
| Target Milestone: | rc | Keywords: | FutureFeature, MigratedToJIRA |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-09-11 18:37:44 UTC | Type: | Story |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Robb Manes 2022-01-24 20:53:52 UTC
The layer blobs have to be reconstructed when we push them. While we can reconstruct an uncompressed layer blob fairly reliably, the copy of a layer blob that we pull from a registry is compressed, and recompressing it will frequently produce a compressed blob whose digest differs from that of the compressed blob we originally pulled. When base images and built images live in different registries, each registry's compressed version of a given uncompressed layer blob may therefore have a different digest.

To handle cases like this, and the general "is the blob already there" case, clients maintain a blob info cache: they record the digests of the blobs they've "seen", which repositories they saw those blobs in, and the correlation between uncompressed digests and their (possibly multiple) compressed digests.

When the first client made its second push attempt, its cache contained a record indicating that, even though it would ordinarily need to recompress the uncompressed blob it had locally, the registry already held a blob with the same content in compressed form. The first client updated the manifest it was preparing to write so that the manifest referenced that compressed blob, and then skipped recompressing and uploading. The second client had none of that information about the destination repository; once it had checked the repository for a blob with the digest of the uncompressed blob and found none, it had exhausted what it knew about that repository, so it pushed the blob it had, recompressing as it went.
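The bookkeeping described above can be sketched in Go. This is a minimal illustration of the idea, not the real c/image `BlobInfoCache` interface; the type and method names here are hypothetical.

```go
package main

import "fmt"

// Digest stands in for a content digest string such as "sha256:...".
// (A hypothetical simplification of the real go-digest type.)
type Digest string

// blobInfoCache is a sketch of the cache the bug describes: which
// compressed variants correspond to an uncompressed blob, and which
// repositories each compressed variant has been seen in.
type blobInfoCache struct {
	compressedVariants map[Digest]map[Digest]bool // uncompressed -> compressed variants
	knownLocations     map[Digest]map[string]bool // compressed -> repositories seen in
}

func newBlobInfoCache() *blobInfoCache {
	return &blobInfoCache{
		compressedVariants: map[Digest]map[Digest]bool{},
		knownLocations:     map[Digest]map[string]bool{},
	}
}

// RecordSeen notes that a compressed variant of an uncompressed blob
// was observed in a repository (e.g. while pulling a base image).
func (c *blobInfoCache) RecordSeen(uncompressed, compressed Digest, repo string) {
	if c.compressedVariants[uncompressed] == nil {
		c.compressedVariants[uncompressed] = map[Digest]bool{}
	}
	c.compressedVariants[uncompressed][compressed] = true
	if c.knownLocations[compressed] == nil {
		c.knownLocations[compressed] = map[string]bool{}
	}
	c.knownLocations[compressed][repo] = true
}

// ReusableDigest returns a compressed digest already known to exist in
// destRepo, if any, so the client can reference it in the manifest
// instead of recompressing and re-uploading the layer.
func (c *blobInfoCache) ReusableDigest(uncompressed Digest, destRepo string) (Digest, bool) {
	for compressed := range c.compressedVariants[uncompressed] {
		if c.knownLocations[compressed][destRepo] {
			return compressed, true
		}
	}
	return "", false
}

func main() {
	cache := newBlobInfoCache()
	cache.RecordSeen("sha256:uncomp", "sha256:gzipA", "registry.example.com/base")

	// First client's situation: a reusable compressed variant is known.
	d, ok := cache.ReusableDigest("sha256:uncomp", "registry.example.com/base")
	fmt.Println(d, ok)

	// Second client's situation: no record for the destination repo,
	// so it falls back to recompressing and pushing.
	_, ok = cache.ReusableDigest("sha256:uncomp", "other.example.com/app")
	fmt.Println(ok)
}
```

The proposed tweak in the next comment amounts to relaxing `ReusableDigest` so that it also tries compressed variants seen in *other* registries, verifying their presence with an existence check against the destination.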
We could tweak the cache so that, instead of checking the registry for a blob with the same digest as the uncompressed layer blob it wants to push, and then checking for compressed blobs it has previously seen in that registry whose digests correspond to the uncompressed layer blob's digest, a client would check the registry for blobs with digests matching any compressed version of that blob it had seen anywhere, including the registry from which it pulled a base image used to build the image it is now attempting to push to a different registry. That check could fail more often, of course, so it could be slower for more people, but I think it would avoid pushing the layer in the case you're laying out.

Yes; a simple existence check for known compressed digests (with some heuristic limit on the number of checks) would work in this case. (Alternatively, right now, `types.SystemContext.DockerRegistryPushPrecomputeDigests` could _in fewer cases_ avoid an upload: only when we happen to compress the blob to exactly the on-registry representation, and that circumstance can change at any time for any reason, so it can't be relied upon if users _need_ the efficiency for some reason. On a fast network it's also _slower_ than uploading the data and having the registry detect a duplicate. So I don't recommend using or exposing this option.)

@mitr My customer is proposing this approach:
- before the actual layer push, perform a query to check whether the layer exists in the target repo, as described in the HTTP API docs: https://docs.docker.com/registry/spec/api/#pushing-an-image

See under "Existing Layers": "The existence of a layer can be checked via a HEAD request to the blob store API. ... When this response is received, the client can assume that the layer is already available in the registry under the given name and should take no further action to upload the layer."
This check should make sense for a 'compressed-on-destination-registry' blob. Also, can you elaborate more on the subtleties you mentioned? Thanks in advance!

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker; all future work related to this report will be managed there. The migrated issue is linked in the "Links" section, and its key begins with "RHEL-" followed by an integer.