Description of problem:
While investigating issues from migrating the release-controller, and the imagestreams it monitors, from the api.ci cluster (3.11) to the app.ci cluster (4.6), we observed that the publicDockerImageRepository status field isn't always populated.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
At the time, this was occurring continuously, so much so that we had to add a safeguard to the release-controller to prevent the issue from failing releases: https://github.com/openshift/release-controller/pull/240

Actual results:
The publicDockerImageRepository field wasn't populated until some point after the imagestream was created and potentially used by downstream processes.

Expected results:
The publicDockerImageRepository field should always be populated.
It is always populated, but only after some time: when a new ImageStream is created, a client that has a Watch open will see the ImageStream without this status field.
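A client-side guard against this behaviour might look like the following Python sketch. It is not the code from the release-controller pull request mentioned in the description; `public_repository` and `wait_for_public_repository` are hypothetical helpers, and `fetch` stands in for whatever call the client uses to re-read the ImageStream. The field name `publicDockerImageRepository` is taken from the events quoted later in this report.

```python
import time
from typing import Callable, Optional


def public_repository(image_stream: dict) -> Optional[str]:
    """Return status.publicDockerImageRepository, or None if absent or empty."""
    status = image_stream.get("status") or {}
    return status.get("publicDockerImageRepository") or None


def wait_for_public_repository(
    fetch: Callable[[], dict],  # hypothetical: re-reads the ImageStream as a dict
    attempts: int = 5,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Re-read the ImageStream until the field appears or attempts run out."""
    for attempt in range(attempts):
        repo = public_repository(fetch())
        if repo:
            return repo
        # Sleep only between attempts, not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None
```

A caller would treat a `None` result as "not ready yet" and skip or defer the ImageStream instead of failing outright. The retry count and delay are arbitrary illustrative defaults.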
Sending to Adam. Reducing severity as, to my knowledge, there is no production outage or data loss involved. Please comment if there is.
Sending to Oleg, as his team owns ImageStreams.
This completely breaks the consistency of any client built on an image stream watch. I'm bumping it back to high until I get a determination of why it broke; it completely broke the release controller. From the moment the public route is created, there is NEVER any reason for this value to be empty unless the user deletes all the routes. Any flakiness here probably points to a broken operator.
There is no other place in our API where we would accept "sometimes the API returns incorrect info to a watch". Every such case has been a significant bug that imposes significant costs on clients, and therefore we fix it.
So far I haven't been able to reproduce this problem myself, but I can occasionally observe it on build02. To observe events I used:

curl '.../apis/image.openshift.io/v1/imagestreams?resourceVersion=...&watch=true'

publicDockerImageRepository is usually populated, except in some events that have type="ADDED". I don't see any pattern: the previous and subsequent events in the same watch request have this field correctly populated. Example of an incorrect event:

{"type":"ADDED","object":{"kind":"ImageStream","apiVersion":"image.openshift.io/v1","metadata":{"name":"pipeline","namespace":"ci-op-kimvmsws","selfLink":"/apis/image.openshift.io/v1/namespaces/ci-op-kimvmsws/imagestreams/pipeline","uid":"b8dfe90b-40b8-42a9-ab0e-26b8b8f63440","resourceVersion":"204693272","generation":1,"creationTimestamp":"2021-01-06T15:06:46Z","managedFields":[{"manager":"ci-operator","operation":"Update","apiVersion":"image.openshift.io/v1","time":"2021-01-06T15:06:46Z","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:lookupPolicy":{"f:local":{}}}}}]},"spec":{"lookupPolicy":{"local":true}},"status":{"dockerImageRepository":"image-registry.openshift-image-registry.svc:5000/ci-op-kimvmsws/pipeline"}}}

There are also events with tags (both in spec and status) but without publicDockerImageRepository.
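For anyone who wants to spot these events without reading the raw stream, here is a minimal Python sketch that filters watch output for ImageStream events lacking the field. It assumes the watch response arrives as one JSON object per line; `events_missing_public_repo` and `describe` are hypothetical names, not part of any OpenShift tooling.

```python
import json
from typing import Iterable, Iterator


def events_missing_public_repo(lines: Iterable[str]) -> Iterator[dict]:
    """Yield ImageStream watch events without status.publicDockerImageRepository."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        obj = event.get("object") or {}
        if obj.get("kind") != "ImageStream":
            continue  # ignore anything that is not an ImageStream object
        status = obj.get("status") or {}
        if not status.get("publicDockerImageRepository"):
            yield event


def describe(event: dict) -> str:
    """Summarise an event as 'TYPE namespace/name rv=resourceVersion'."""
    meta = event["object"].get("metadata", {})
    return "{} {}/{} rv={}".format(
        event.get("type"),
        meta.get("namespace"),
        meta.get("name"),
        meta.get("resourceVersion"),
    )
```

Piping the curl output into a script that calls `describe` on each event yielded from `sys.stdin` would print one summary line per affected event, which makes it easier to check whether only ADDED events are involved.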
I was able to reproduce it on 4.7-nightly and 4.6.0. I cannot reproduce it on 4.5.24. So apparently it's a regression in 4.6.
Verified on 4.7.0-0.nightly-2021-02-03-165316:

status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/wzheng1/rails-postgresql-example
  publicDockerImageRepository: default-route-openshift-image-registry.apps.wsun47kuryr.0204-kdj.qe.rhcloud.com/wzheng1/rails-postgresql-example
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633