Description of problem:
Failed to force the deployment to use local name lookup.

Version-Release number of selected component (if applicable):
[root@dhcp-140-138 ~]# oc version
Client Version: 4.4.0-0.nightly-2020-02-10-035806
Kubernetes Version: v1.17.1

How reproducible:
Always

Steps to Reproduce:
1. Create an ImageStream: `oc tag openshift/deployment-example:v1 --source=docker app:v1`
2. Create a deployment that uses the ImageStream: `oc create deployment app --image=app:v1`
3. Set the deployment to use local image lookup: `oc set image-lookup deployment/app`

Actual results:
3. The deployment failed with an error:

[root@dhcp-140-138 ~]# oc describe deployment/app
Name:                   app
Namespace:              zhouy
CreationTimestamp:      Mon, 10 Feb 2020 16:46:13 +0800
Labels:                 app=app
Annotations:            deployment.kubernetes.io/revision: 2879
Selector:               app=app
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=app
  Annotations:  alpha.image.policy.openshift.io/resolve-names: *
  Containers:
   app:
    Image:        app:v1
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type         Status  Reason
  ----         ------  ------
  Available    False   MinimumReplicasUnavailable
  Progressing  True    NewReplicaSetCreated
OldReplicaSets:  app-858f464854 (1/1 replicas created), app-b797c6d6d (1/1 replicas created)

`oc get event` shows:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_app-db87ffc99-cnf8j_zhouy_f02be028-d947-4306-81d3-e9a39ea32faa_0(4da67f3b9df5232326a7df46ddabfc5aba08241e2c0b70d9c76d6ab87ec2cd8e): Multus: error adding pod to network "openshift-sdn": delegateAdd: cannot set "openshift-sdn" interface name to "eth0": validateIfName: no net namespace /proc/725663/ns/net found: failed to Statfs "/proc/725663/ns/net": no such file or directory

Expected results:
3. Pod running

Additional info:
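For reference, step 3 sets an opt-in annotation on the Deployment's pod template (visible in the `oc describe` output above). A minimal sketch of the resulting fragment:

```yaml
# Annotation added by `oc set image-lookup deployment/app`
# (on the pod template only, not on the Deployment object itself):
spec:
  template:
    metadata:
      annotations:
        alpha.image.policy.openshift.io/resolve-names: "*"
```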
Can you get the YAML for the Deployment, ReplicaSet, and the Pod?
Created attachment 1662384 [details] deployment and rs
The deployment churns and old pods are deleted quickly, so it's hard for me to capture the pod's YAML.
Is it creating ReplicaSets in a loop? It looks that way to me from `deployment.kubernetes.io/revision: 2879` and from the dump. Also, the RS has an injected image that is already resolved: `image: openshift/deployment-example@sha256:c505b916f7e5143a356ff961f2c21aee40fbd2cd906c1e3feeb8d5e978da284b`. I thought the image plugin is supposed to resolve at the Pod level... I wonder if this is broken on previous releases too.
The image resolve admission plugin works (opened https://github.com/openshift/origin/pull/24530 to prove it), but `oc set image-lookup deploy/app` sets the annotation only on the template, not on the object itself, which is what triggers the DeepEqual hotloop.
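To illustrate the hotloop mechanism described above, here is a minimal Python sketch (not actual controller code; the data model is deliberately simplified): a mutating admission step rewrites the pod template's image tag to a digest, so the controller's desired template never equals any stored ReplicaSet template, and a new ReplicaSet is created on every sync.

```python
import copy

def admission_resolve(template):
    """Stand-in for the image policy admission plugin: rewrites the
    image tag to the digest reference seen in the dump above."""
    mutated = copy.deepcopy(template)
    if mutated["image"] == "app:v1":
        mutated["image"] = ("openshift/deployment-example@sha256:"
                            "c505b916f7e5143a356ff961f2c21aee40fbd2cd906c1e3feeb8d5e978da284b")
    return mutated

desired = {"image": "app:v1"}   # what the Deployment spec asks for
replica_sets = []               # what is actually stored after admission

# Each sync, the controller looks for a ReplicaSet whose template equals
# the desired template; the mutated copies never match, so it creates a
# new ReplicaSet every time (one per sync).
for _ in range(5):
    if not any(rs == desired for rs in replica_sets):
        replica_sets.append(admission_resolve(desired))

print(len(replica_sets))  # → 5
```

This is why the revision number climbed to 2879: every reconcile pass produced another ReplicaSet.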
The oc client has been setting the annotation only on the template for at least 3 years: https://github.com/openshift/oc/blame/master/pkg/cli/set/imagelookup.go#L236-L251 I wonder what else might have changed to make this stop working; I will check on 4.2 to see if it works there.
1. We don't see the Deployment recreation loop on 4.2.
2. On 4.2, the annotation is set only on the template (see comment 5), just as on 4.4.
3. There seems to be no difference between the Deployments created across revisions during the loop on 4.4 (which is weird).
4. The Deployment's `status.collisionCount` increases constantly during the loop on 4.4.
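The rising `status.collisionCount` in point 4 is consistent with the admission mutation: the Deployment controller names each ReplicaSet from a hash of the desired pod template (plus the collision count), and when the stored ReplicaSet under that name holds a different (mutated) template, the controller treats it as a hash collision and bumps the counter. A minimal sketch under these assumptions (not real controller code; the hash function is a stand-in):

```python
import hashlib

def template_hash(template, collision_count):
    # Hypothetical stand-in for the controller's pod-template-hash.
    data = repr(sorted(template.items())) + str(collision_count)
    return hashlib.sha256(data.encode()).hexdigest()[:10]

def admission_resolve(template):
    # Hypothetical stand-in for the image policy plugin (tag -> digest).
    mutated = dict(template)
    mutated["image"] = ("openshift/deployment-example@sha256:"
                        "c505b916f7e5143a356ff961f2c21aee40fbd2cd906c1e3feeb8d5e978da284b")
    return mutated

desired = {"image": "app:v1"}
existing = {}        # RS name -> stored (mutated) template
collision_count = 0

for _ in range(6):
    name = "app-" + template_hash(desired, collision_count)
    if name in existing:
        if existing[name] != desired:
            # Stored RS was mutated by admission: looks like a hash
            # collision, so collisionCount is bumped and we retry.
            collision_count += 1
            continue
    else:
        existing[name] = admission_resolve(desired)

print(collision_count)  # → 3 (one bump per create/collide pair)
```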
I have patched oc to also add the annotation on the Deployment itself, together with the template. That did not solve it.
oc patch deploy/redis -p '{"spec":{"template":{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}}}}}' --type=merge

or

oc patch deploy/redis -p '{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}},"spec":{"template":{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}}}}}' --type=merge

Both cause the same problem.
This is a problem with mutators in ImagePolicy admission, working on a fix atm.
fyi, I have found out that it doesn't matter whether the annotation is on the object itself or on the template; either one registers the object for resolution. The real issue here is that the admission plugin was incorrectly skipping updates that were enabling resolution. More details in the referenced PRs. CronJobs also weren't registered, which we fixed as well.
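The skipped-update bug described above can be sketched as follows (a hypothetical simplification, not the actual plugin code from the referenced PRs): deciding from the *old* object skips an update that turns resolution on, whereas deciding from the *new* incoming object handles it correctly.

```python
RESOLVE_ANNOTATION = "alpha.image.policy.openshift.io/resolve-names"

def wants_resolution(obj):
    """True if the object or its pod template opts into local lookup."""
    return (RESOLVE_ANNOTATION in obj.get("annotations", {})
            or RESOLVE_ANNOTATION in obj.get("template", {}).get("annotations", {}))

def should_resolve_update(old, new):
    # Buggy behavior (per this report): skip because the OLD object
    # had not opted in, even though the update enables resolution.
    buggy = wants_resolution(old)
    # Fixed behavior: decide from the NEW (incoming) object.
    fixed = wants_resolution(new)
    return buggy, fixed

old = {"annotations": {}, "template": {"annotations": {}}}
new = {"annotations": {}, "template": {"annotations": {RESOLVE_ANNOTATION: "*"}}}
print(should_resolve_update(old, new))  # → (False, True)
```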
Confirmed with the latest payload, 4.4.0-0.nightly-2020-02-24-105333; can't reproduce the issue now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581