Bug 1605072
| Summary: | skopeo image copy fails on first attempt within Jenkins pipeline | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Luke Stanton <lstanton> | |
| Component: | Image Registry | Assignee: | Alexey Gladkov <agladkov> | |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Dongbo Yan <dyan> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 3.6.0 | CC: | aos-bugs, bparees, jokerman, lstanton, mitr, mmccomas, mpatel | |
| Target Milestone: | --- | |||
| Target Release: | 3.6.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1745743 (view as bug list) | Environment: | ||
| Last Closed: | 2018-09-27 16:31:37 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1745743, 1805500 | |||
Description
Luke Stanton
2018-07-20 06:21:56 UTC
Moving this to containers as they *might* own skopeo; I'm not sure. If not them, hopefully they can push it in the right direction.

Miloslav, have you ever seen something like this in skopeo?

No, I haven’t ever seen anything like this. What registry implementation is this? Anything else unusual, perhaps the backing layer storage?

Overall, this looks as if the layer storage were somehow unable to read a layer immediately after successfully writing it and confirming to the client that it has been written, although the layer becomes readable a few seconds later. (The “manifest blob unknown” error is lacking a bit of detail, but the most likely explanation is that while uploading a manifest, the registry is checking whether all referenced blobs (layers, config…) exist already, and finds that they don’t.)

Looking at skopeo_-_manifestunknownlog.txt, compare the handling of layer blobs b538cc6febe635e011f69d724aa31744ad50a0caee5347221874afa25629ca51, 944b324912445e934ad17a152e23805fb75fe70e7b5bf6775d83420376fb43c9, and 96eb74fb2f1d0f1ea94247cdcc4f11dc6df79ebca46a9af12cf26c726b709c9b:

b538… is not present when the command runs for the first time, so it is uploaded; on the second invocation, it is detected as already present at the destination.

944b… is likewise not present when the command runs for the first time, and is uploaded; on the second invocation, it is _detected as missing_, so the command starts to upload it again, and at that time re-checks the presence at the destination [which is an inefficiency in skopeo, arguably], and _then_ the server reports that the layer already exists.

96eb… is not present the first time and is uploaded, then is not present the second time (in _both_ checks), and is uploaded anew.

The 944b… layer seems to rule out an overzealous GC: the client merely asks twice “do you have this layer” and is told “no”; less than a second later the server changes its mind and replies “yes”.
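The hypothesis in this comment (a blob that cannot be read right after a confirmed write, so the manifest check reports it as unknown) can be illustrated with a toy model. This is only a sketch: `LaggyBlobStore`, `push_image`, the fixed `visibility_delay`, and the layer names are all hypothetical, and they do not reflect the registry's real code or the per-layer timing seen in the attached log.

```python
class LaggyBlobStore:
    """Toy blob store whose reads lag behind writes (hypothetical model)."""

    def __init__(self, visibility_delay):
        # Seconds that must pass before a written blob becomes readable.
        self.visibility_delay = visibility_delay
        self._written_at = {}  # digest -> time of the first write

    def put(self, digest, now):
        # Keep the earliest write time; a re-upload does not reset visibility.
        self._written_at.setdefault(digest, now)

    def exists(self, digest, now):
        written = self._written_at.get(digest)
        return written is not None and now - written >= self.visibility_delay


def push_image(store, layers, now):
    """Upload layers the store reports missing, then validate the manifest.

    Returns (ok, missing): the manifest is accepted only if every
    referenced layer is readable at the time of the check.
    """
    for digest in layers:
        if not store.exists(digest, now):
            store.put(digest, now)
    missing = [d for d in layers if not store.exists(d, now)]
    return (not missing, missing)


store = LaggyBlobStore(visibility_delay=2.0)
layers = ["layer-a", "layer-b", "layer-c"]  # placeholder names, not real digests

first = push_image(store, layers, now=0.0)   # blobs written but not yet readable
second = push_image(store, layers, now=5.0)  # earlier writes are now visible

print(first)
print(second)
```

In this model the first push is rejected because the manifest check runs before the writes become visible, while a later push succeeds without any GC being involved, which is the shape of the failure described for the first and second skopeo invocations.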
This was reported against 3.6. Do we know if it occurs with any newer version?

I'm not sure if it occurs in any newer versions. This is the only case that I'm aware of where the error has shown up. NFS is being used as the backing storage.
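Since the copy reportedly fails only on the first attempt, a pipeline step could retry it. The following is a minimal sketch of such a wrapper, not something proposed in this report: `copy_with_retry` and its `attempts`, `delay`, `run`, and `sleep` parameters are hypothetical names, and the only real command assumed is `skopeo copy SOURCE DESTINATION`. A retry only masks the symptom; it does not address whatever causes the storage to report a freshly written layer as missing.

```python
import subprocess
import time


def copy_with_retry(src, dest, attempts=3, delay=5.0,
                    run=subprocess.run, sleep=time.sleep):
    """Run `skopeo copy src dest`, retrying when it exits non-zero.

    Returns the number of the attempt that succeeded. `run` and `sleep`
    are injectable so the helper can be exercised without skopeo installed.
    """
    for attempt in range(1, attempts + 1):
        result = run(["skopeo", "copy", src, dest])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give the backing storage time to expose the uploaded layers.
            sleep(delay)
    raise RuntimeError(f"skopeo copy failed after {attempts} attempts")
```

A caller would pass transport-qualified image names, for example `copy_with_retry("docker://SOURCE_IMAGE", "docker://DEST_IMAGE")`, where both image references are placeholders.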