Bug 1801095 - Failed to force the deployment to use local name lookup
Summary: Failed to force the deployment to use local name lookup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-controller-manager
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Maciej Szulik
QA Contact: zhou ying
URL:
Whiteboard: workloads
Depends On: 1805155
Blocks:
Reported: 2020-02-10 09:16 UTC by zhou ying
Modified: 2020-05-13 21:57 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-13 21:57:16 UTC
Target Upstream Version:
Embargoed:


Attachments
deployment and rs (1.14 KB, application/gzip)
2020-02-11 08:08 UTC, zhou ying


Links
System ID Private Priority Status Summary Last Updated
Github openshift apiserver-library-go pull 23 0 None closed Bug 1801095: add all supported versions for pod mutators 2020-10-12 12:30:14 UTC
Github openshift apiserver-library-go pull 24 0 None closed Bug 1801095: Fix imagepolicyresolve plugin to resolve when enabled on an existing object 2020-10-12 12:30:14 UTC
Github openshift origin pull 24530 0 None closed Bug 1805155: Fix image resolve plugin on updates and add tests 2020-10-12 12:30:14 UTC
Github openshift origin pull 24571 0 None closed [release-4.4] Bug 1801095: Fix image resolve plugin on updates and add tests 2020-10-12 12:30:15 UTC
Red Hat Product Errata RHBA-2020:0581 0 None None None 2020-05-13 21:57:17 UTC

Description zhou ying 2020-02-10 09:16:48 UTC
Description of problem:
Failed to force the deployment to use local name lookup

Version-Release number of selected component (if applicable):
[root@dhcp-140-138 ~]# oc version 
Client Version: 4.4.0-0.nightly-2020-02-10-035806
Kubernetes Version: v1.17.1

How reproducible:
always

Steps to Reproduce:
1. Create ImageStream:
  `oc tag openshift/deployment-example:v1 --source=docker app:v1`
2. Create a deployment that uses the ImageStream:
  `oc create deployment app --image=app:v1`
3. Set the deployment to use local image lookup:
  `oc set image-lookup deployment/app`


Actual results:
3. The deployment fails with an error:
[root@dhcp-140-138 ~]#  oc describe deployment/app
Name:                   app
Namespace:              zhouy
CreationTimestamp:      Mon, 10 Feb 2020 16:46:13 +0800
Labels:                 app=app
Annotations:            deployment.kubernetes.io/revision: 2879
Selector:               app=app
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=app
  Annotations:  alpha.image.policy.openshift.io/resolve-names: *
  Containers:
   app:
    Image:        app:v1
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    True    NewReplicaSetCreated
OldReplicaSets:  app-858f464854 (1/1 replicas created), app-b797c6d6d (1/1 replicas created)

`oc get event`
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_app-db87ffc99-cnf8j_zhouy_f02be028-d947-4306-81d3-e9a39ea32faa_0(4da67f3b9df5232326a7df46ddabfc5aba08241e2c0b70d9c76d6ab87ec2cd8e): Multus: error adding pod to network "openshift-sdn": delegateAdd: cannot set "openshift-sdn" interface name to "eth0": validateIfName: no net namespace /proc/725663/ns/net found: failed to Statfs "/proc/725663/ns/net": no such file or directory

Expected results:
3. The pod is running.


Additional info:

Comment 1 Tomáš Nožička 2020-02-10 17:14:46 UTC
Can you get the YAML for the deployment, the replicaset, and the pod?

Comment 2 zhou ying 2020-02-11 08:08:59 UTC
Created attachment 1662384 [details]
deployment and rs

Comment 3 zhou ying 2020-02-11 08:09:48 UTC
The deployment keeps rolling out new pods and the old pods are deleted, so it's hard for me to get the pod's YAML.

Comment 4 Tomáš Nožička 2020-02-11 09:49:23 UTC
Is it creating RSs in a loop? It looks that way to me, judging by deployment.kubernetes.io/revision: 2879 and the dump.

Also, the RS has an injected image that is already resolved to:
image: openshift/deployment-example@sha256:c505b916f7e5143a356ff961f2c21aee40fbd2cd906c1e3feeb8d5e978da284b

I thought the image plugin was supposed to resolve at the Pod level...

I wonder whether it is broken on previous releases too.

Comment 5 Tomáš Nožička 2020-02-12 16:31:41 UTC
The image resolve admission plugin works (I opened https://github.com/openshift/origin/pull/24530 to prove it), but `oc set image-lookup deploy/app` sets the annotation only on the template, not on the object itself, which is what triggers the DeepEqual hot loop.

Comment 6 Ricardo Maraschini 2020-02-13 12:08:44 UTC
The oc client has set the annotation only on the template for at least 3 years:

https://github.com/openshift/oc/blame/master/pkg/cli/set/imagelookup.go#L236-L251

I wonder what else might have changed to make this stop working. I will check whether it works on 4.2.

Comment 7 Ricardo Maraschini 2020-02-13 13:14:49 UTC
1. The Deployment recreation loop does not occur on 4.2.
2. On 4.2 the annotation is set only on the template (see comment 5), the same as on 4.4.
3. During the loop on 4.4, the Deployments created at different revisions appear to be identical, which is odd.
4. During the loop on 4.4, the Deployment's `status.collisionCount` increases constantly.
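
Points 3 and 4 would be consistent with a reconcile loop in which the controller never finds a ReplicaSet whose stored template equals the Deployment's template, because the image in the stored copy was rewritten at admission time. The following is a toy model in Python, not the controller's actual code: `template_hash`, `admit`, `reconcile`, `simulate`, and the digest value are all illustrative assumptions.

```python
import hashlib

def template_hash(template, collision_count):
    """Toy stand-in for hashing the pod template plus the collision count."""
    payload = repr(sorted(template.items())) + ":" + str(collision_count)
    return hashlib.sha256(payload.encode()).hexdigest()[:10]

def admit(template, resolved):
    """Hypothetical admission step: rewrite the image to its resolved form."""
    mutated = dict(template)
    mutated["image"] = resolved.get(template["image"], template["image"])
    return mutated

def reconcile(deployment, replica_sets, resolved):
    """One controller pass; returns True once a matching ReplicaSet exists."""
    wanted = deployment["template"]
    if any(rs["template"] == wanted for rs in replica_sets.values()):
        return True
    name = "app-" + template_hash(wanted, deployment["collision_count"])
    if name in replica_sets:
        # The name is taken by a ReplicaSet with a different template:
        # record a collision so the next pass derives a new name.
        deployment["collision_count"] += 1
        return False
    # The stored template is whatever admission returned, not what was sent.
    replica_sets[name] = {"template": admit(wanted, resolved)}
    return False

def simulate(passes, resolved):
    """Run up to `passes` reconcile passes and return the final state."""
    deployment = {"template": {"image": "app:v1"}, "collision_count": 0}
    replica_sets = {}
    for _ in range(passes):
        if reconcile(deployment, replica_sets, resolved):
            break
    return deployment, replica_sets
```

In this model, `simulate(6, {"app:v1": "example@sha256:abc"})` ends with three ReplicaSets and a collision count of 3, while `simulate(6, {})` (no rewriting at admission) settles after a single ReplicaSet with no collisions.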

Comment 8 Ricardo Maraschini 2020-02-13 13:16:18 UTC
I have patched oc to also add the annotation on the deployment itself, in addition to the template. That did not solve the problem.

Comment 9 Ricardo Maraschini 2020-02-13 13:27:22 UTC
oc patch deploy/redis -p '{"spec":{"template":{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}}}}}' --type=merge

or 

oc patch deploy/redis -p '{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}},"spec":{"template":{"metadata":{"annotations":{"alpha.image.policy.openshift.io/resolve-names":"*"}}}}}' --type=merge

both cause the same problem.
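
For anyone scripting the reproduction, the two merge-patch payloads above can be generated rather than typed by hand. This is a minimal sketch using only the Python standard library; the helper name `lookup_patch` and its parameter are made up for illustration and are not part of oc.

```python
import json

RESOLVE_ANNOTATION = "alpha.image.policy.openshift.io/resolve-names"

def lookup_patch(include_object_metadata=False):
    """Build the JSON merge patch that sets the resolve-names annotation.

    By default only the pod template is annotated; with
    include_object_metadata=True the Deployment itself is annotated too.
    """
    annotations = {RESOLVE_ANNOTATION: "*"}
    patch = {"spec": {"template": {"metadata": {"annotations": annotations}}}}
    if include_object_metadata:
        patch = {"metadata": {"annotations": dict(annotations)}, **patch}
    # Compact separators reproduce the single-line form used with `oc patch -p`.
    return json.dumps(patch, separators=(",", ":"))
```

The returned string can be passed as the `-p` argument of `oc patch ... --type=merge`.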

Comment 11 Maciej Szulik 2020-02-13 14:20:39 UTC
This is a problem with mutators in ImagePolicy admission, working on a fix atm.

Comment 13 Tomáš Nožička 2020-02-20 08:46:41 UTC
FYI, I have found that it doesn't matter whether the annotation is on the object itself or on the template, as long as the object is registered for resolution. The real issue here is that the admission plugin was incorrectly skipping updates that enabled resolution. More details are in the referenced PRs. CronJobs also weren't registered, which we fixed as well.
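
The skipped-update behaviour described above can be illustrated with a small decision function. This is a sketch under assumptions, not the plugin's code: the report does not show the actual condition, so the "images unchanged" shortcut, the object layout, and the function names below are hypothetical.

```python
RESOLVE_ANNOTATION = "alpha.image.policy.openshift.io/resolve-names"

def resolve_enabled(obj):
    """True if the (simplified) object asks for name resolution."""
    return obj.get("annotations", {}).get(RESOLVE_ANNOTATION) == "*"

def should_resolve_skipping(old, new):
    """Assumed faulty logic: any update with unchanged images is skipped."""
    if not resolve_enabled(new):
        return False
    if old is not None and old["images"] == new["images"]:
        return False  # also skips the update that just enabled resolution
    return True

def should_resolve_fixed(old, new):
    """Assumed corrected logic: an update that enables resolution is handled."""
    if not resolve_enabled(new):
        return False
    if old is None:  # create
        return True
    newly_enabled = not resolve_enabled(old)
    return newly_enabled or old["images"] != new["images"]
```

With `old = {"annotations": {}, "images": ["app:v1"]}` and `new` identical except for the added annotation, which mirrors what `oc set image-lookup` does to an existing object, the first function declines to resolve and the second resolves.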

Comment 15 zhou ying 2020-02-25 05:27:49 UTC
Confirmed with the latest payload, 4.4.0-0.nightly-2020-02-24-105333: the issue can no longer be reproduced.

Comment 17 errata-xmlrpc 2020-05-13 21:57:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

