Bug 2026104 - [sig-imageregistry][Feature:ImageAppend] Image append should create images by appending them [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
Summary: [sig-imageregistry][Feature:ImageAppend] Image append should create images by...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.10
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 4.10.0
Assignee: Oleg Bulatov
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-11-23 18:55 UTC by Devan Goodwin
Modified: 2022-03-11 18:15 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
job=periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi
Last Closed: 2022-03-11 18:15:11 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Devan Goodwin 2021-11-23 18:55:47 UTC
[sig-imageregistry][Feature:ImageAppend] Image append should create images by appending them [Skipped:Disconnected] [Suite:openshift/conformance/parallel]

is failing intermittently in CI particularly for metal IPI jobs, see:
https://sippy.ci.openshift.org/sippy-ng/tests/4.10/analysis?test=%5Bsig-imageregistry%5D%5BFeature%3AImageAppend%5D%20Image%20append%20should%20create%20images%20by%20appending%20them%20%5BSkipped%3ADisconnected%5D%20%5BSuite%3Aopenshift%2Fconformance%2Fparallel%5D



A sample failing prow job is: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi/1463138904453419008

Error is a bit peculiar:

Uploading 11.21MB ...
error: uploading the source layer sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622 failed: Patch "https://image-registry.openshift-image-registry.svc:5000/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9": unknown blob
<----end of log for "append-test"/"test"


Which shows up for about 5% of metal-ipi jobs: 

https://search.ci.openshift.org/?search=error%3A+uploading+the+source+layer&maxAge=48h&context=1&type=build-log&name=4.10.*metal&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job


TRT is rating sev high as this is the biggest failure present in metal ipi testing.

Comment 1 Oleg Bulatov 2021-11-24 09:46:11 UTC
Registry logs for the PATCH request:

time="2021-11-23T15:28:16.018598399Z" level=info msg="authorized request" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d
time="2021-11-23T15:28:16.384323363Z" level=error msg="client disconnected during blob PATCH" contentLength=-1 copied=0 error="stream error: stream ID 35; CANCEL" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d
time="2021-11-23T15:28:16.391734534Z" level=error msg="response completed with error" err.code=unknown err.detail="client disconnected" err.message="unknown error" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" http.response.contenttype="application/json; charset=utf-8" http.response.duration=381.874577ms http.response.status=500 http.response.written=89 openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d
time="2021-11-23T15:28:16.391815734Z" level=info msg=response go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=2d1478ae-2ac8-4f68-a382-89e3d121f85d http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" http.response.contenttype="application/json; charset=utf-8" http.response.duration=381.994871ms http.response.status=500 http.response.written=89

So the registry claims that the client has disconnected during the PATCH request, from the client logs it seems it has received 404 from the registry.

Comment 2 Oleg Bulatov 2021-11-24 12:40:58 UTC
Ok, the client logs are misleading.

It opens a reader for the blob, a writer, and calls io.Copy. The error "unknown blob" comes from the reader, so nothing is copied, and the writer is closed without sending any data.

So the registry sees a PATCH request without body.

The real error is

time="2021-11-23T15:28:16.38317368Z" level=error msg="Error statting blob sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622 in remote repository \"quay.io/openshift-release-dev/ocp-v4.0-art-dev\": Head \"https://quay.io/v2/openshift-release-dev/ocp-v4.0-art-dev/blobs/sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622\": error parsing HTTP 429 response body: invalid character '<' looking for beginning of value: \"<html>\\r\\n<head><title>429 Too Many Requests</title></head>\\r\\n<body bgcolor=\\\"white\\\">\\r\\n<center><h1>429 Too Many Requests</h1></center>\\r\\n<hr><center>nginx/1.12.1</center>\\r\\n</body>\\r\\n</html>\\r\\n\"" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=8c7a1c9c-38a8-4b7b-8d4f-22a42b8ca0da http.request.method=GET http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/openshift/tools/blobs/sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.digest="sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622" vars.name=openshift/tools

and BZ 1923536 hides this problem even deeper.

Comment 3 Oleg Bulatov 2021-11-24 12:59:59 UTC
https://github.com/openshift/library-go/pull/1226/files should help with error reporting, the error will happen on opening the blob.

Comment 4 Oleg Bulatov 2021-12-14 20:01:43 UTC
The registry problem is tracked at BZ 1923536, moving to `oc` for the error reporting part.


Note You need to log in before you can comment on or make changes to this bug.