[sig-imageregistry][Feature:ImageAppend] Image append should create images by appending them [Skipped:Disconnected] [Suite:openshift/conformance/parallel] is failing intermittently in CI particularly for metal IPI jobs, see: https://sippy.ci.openshift.org/sippy-ng/tests/4.10/analysis?test=%5Bsig-imageregistry%5D%5BFeature%3AImageAppend%5D%20Image%20append%20should%20create%20images%20by%20appending%20them%20%5BSkipped%3ADisconnected%5D%20%5BSuite%3Aopenshift%2Fconformance%2Fparallel%5D A sample failing prow job is: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi/1463138904453419008 Error is a bit peculiar: Uploading 11.21MB ... error: uploading the source layer sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622 failed: Patch "https://image-registry.openshift-image-registry.svc:5000/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9": unknown blob <----end of log for "append-test"/"test" Which shows up for about 5% of metal-ipi jobs: https://search.ci.openshift.org/?search=error%3A+uploading+the+source+layer&maxAge=48h&context=1&type=build-log&name=4.10.*metal&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job TRT is rating sev high as this is the biggest failure present in metal ipi testing.
Registry logs for the PATCH request: time="2021-11-23T15:28:16.018598399Z" level=info msg="authorized request" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d time="2021-11-23T15:28:16.384323363Z" level=error msg="client disconnected during blob PATCH" contentLength=-1 copied=0 error="stream error: stream ID 35; CANCEL" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d time="2021-11-23T15:28:16.391734534Z" level=error msg="response completed with error" err.code=unknown err.detail="client disconnected" err.message="unknown error" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=63504da0-debe-4ba8-90d2-b085023c5fea http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" http.response.contenttype="application/json; charset=utf-8" http.response.duration=381.874577ms http.response.status=500 http.response.written=89 openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.name=e2e-test-image-append-mq6km/test vars.uuid=1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d time="2021-11-23T15:28:16.391815734Z" level=info msg=response go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=2d1478ae-2ac8-4f68-a382-89e3d121f85d http.request.method=PATCH http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/e2e-test-image-append-mq6km/test/blobs/uploads/1bcbb4fe-89c3-4eae-a5a7-61d7ff91491d?_state=mv8sd1oQjoALpA7ob_-hQrlHrhn_ArSH9UZsDlHMjEV7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtbXE2a20vdGVzdCIsIlVVSUQiOiIxYmNiYjRmZS04OWMzLTRlYWUtYTVhNy02MWQ3ZmY5MTQ5MWQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMTEtMjNUMTU6Mjg6MTYuMDA1NTg2MjI1WiJ9" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" http.response.contenttype="application/json; charset=utf-8" http.response.duration=381.994871ms http.response.status=500 http.response.written=89 So the registry claims that the client has disconnected during the PATCH request, from the client logs it seems it has received 404 from the registry.
Ok, the client logs are misleading. It opens a reader for the blob, a writer, and calls io.Copy. The error "unknown blob" comes from the reader, so nothing is copied, and the writer is closed without sending any data. So the registry sees a PATCH request without body. The real error is time="2021-11-23T15:28:16.38317368Z" level=error msg="Error statting blob sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622 in remote repository \"quay.io/openshift-release-dev/ocp-v4.0-art-dev\": Head \"https://quay.io/v2/openshift-release-dev/ocp-v4.0-art-dev/blobs/sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622\": error parsing HTTP 429 response body: invalid character '<' looking for beginning of value: \"<html>\\r\\n<head><title>429 Too Many Requests</title></head>\\r\\n<body bgcolor=\\\"white\\\">\\r\\n<center><h1>429 Too Many Requests</h1></center>\\r\\n<hr><center>nginx/1.12.1</center>\\r\\n</body>\\r\\n</html>\\r\\n\"" go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=8c7a1c9c-38a8-4b7b-8d4f-22a42b8ca0da http.request.method=GET http.request.remoteaddr="10.131.0.39:43436" http.request.uri="/v2/openshift/tools/blobs/sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622" http.request.useragent="oc/4.10.0 (linux/amd64) kubernetes/af85060" openshift.auth.user="system:serviceaccount:e2e-test-image-append-mq6km:builder" vars.digest="sha256:d8df2ecc7eef38298649da03c160f94011670149e6727dd4be1e5b6969d37622" vars.name=openshift/tools and BZ 1923536 hides this problem even deeper.
https://github.com/openshift/library-go/pull/1226/files should help with error reporting, the error will happen on opening the blob.
The registry problem is tracked at BZ 1923536, moving to `oc` for the error reporting part.
It passed in recent job: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi/1478024133454336000 and no shows up for metal-ipi jobs: https://search.ci.openshift.org/?search=error%3A+uploading+the+source+layer&maxAge=168h&context=1&type=build-log&name=4.10.*metal&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job