Description of problem:
When backing up a DeploymentConfig with more than one revision within a project, e.g. `oc get dc,rc -o yaml > backup.yaml`, and then trying to restore it, two things occur:

1. The RCs are not created, since they contain an ownerReferences section referring to a non-existent DeploymentConfig: the UID does not match the newly created DeploymentConfig object.
2. If the latest revision was X, where X > 1 (e.g. 3), and the ownerReferences field is removed from the RC object definitions, the RC objects are created (since they no longer refer to a non-existent DC), but the DC's revision is set to 1 instead of X (3 in this case). That means that when `oc rollout latest <dc>` is executed, it reports a successful rollout but nothing happens (only the DC revision is bumped) until the command has been run three times; on the fourth run it actually triggers a new rollout.

Version-Release number of selected component (if applicable):
Reproduced in 3.7 and 3.11.

How reproducible:
Always

Steps to Reproduce:
1. oc new-project test
2. oc new-app <URL>
3. oc rollout latest test
4. oc get rc (this should show two RCs)
5. oc get rc,dc -o yaml > backup.yaml
6. oc delete all --all
7. oc create -f backup.yaml

Actual results:
First, the ReplicationController objects are not created, since they contain an ownerReferences field that refers to a non-existent DC object through its UID. Second, if this field is removed from the ReplicationController object definitions and the RCs get created, the DeploymentConfig's revision number is set to 1 instead of X (where X was the latest revision when the dump was taken).

Expected results:
It should be possible to restore from a dump file, e.g. backup.yaml, and get all objects created with the proper revision numbers in place.

Additional info:
Upstream issue: https://github.com/openshift/origin/issues/20729
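For reference, a quick way to observe the revision mismatch after step 7 (just a sketch; it assumes the DC created by `oc new-app` is named hello-openshift, as in the comments below):

  # latest revision recorded in the dump (taken before the delete)
  grep latestVersion backup.yaml

  # latest revision on the restored DC - ends up as 1 instead of the backed-up value
  oc get dc hello-openshift -o jsonpath='{.status.latestVersion}'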
You need to strip the ownerRefs manually or with a tool before re-creating the objects from the dump. (Or create a separate BZ targeted at the CLI to help you do that with e.g. `oc create --strip-ownerrefs`. This is a general issue applicable to upstream Kubernetes as well.) The "import" will then be fine and the RCs get adopted. We have an issue there that makes you do dummy rollouts, which we are going to fix here.
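Until such a flag exists, a minimal sketch of stripping the ownerReferences at dump time (this assumes jq is available and dumps to JSON instead of YAML; any equivalent YAML tooling would do):

  oc get dc,rc -o json | jq 'del(.items[].metadata.ownerReferences)' > backup.json
  # after the original objects were deleted:
  oc create -f backup.json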
Hi Joel, yes, having a tool to strip ownerReferences from object dumps is an RFE.
https://github.com/openshift/origin/pull/22324
The issue can still be reproduced on the latest OCP 3.11:

[zhouying@dhcp-140-138 ~]$ oc version
oc v3.11.105
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://vm-10-0-77-161.hosted.upshift.rdu2.redhat.com:8443
openshift v3.11.104
kubernetes v1.11.0+d4cacc0

[zhouying@dhcp-140-138 ~]$ oc create -f bakkkk.yaml
deploymentconfig.apps.openshift.io/hello-openshift created
Error from server (Forbidden): replicationcontrollers "hello-openshift-1" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: no RBAC policy matched, <nil>
Error from server (Forbidden): replicationcontrollers "hello-openshift-2" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: no RBAC policy matched, <nil>

[zhouying@dhcp-140-138 ~]$ oc get rc
No resources found.

[zhouying@dhcp-140-138 ~]$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   0          1         0         config,image(hello-openshift:latest)

[zhouying@dhcp-140-138 ~]$ oc get po
No resources found.

And the DC can't be rolled out because no related ImageStream was created:

[zhouying@dhcp-140-138 ~]$ oc rollout latest dc/hello-openshift
Error from server (BadRequest): cannot trigger a deployment for "hello-openshift" because it contains unresolved images
Hi all, any update here?
> Error from server (Forbidden): replicationcontrollers "hello-openshift-1" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: no RBAC policy matched

You need to delete the owner references manually before recreating the RCs (the UIDs would differ anyway).
Hi Tomáš,
When I delete the owner references, the creation succeeds, but the DC loses the RCs:

[yinzhou@192 ~]$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   0          1         0         config,image(hello-openshift:latest)

[yinzhou@192 ~]$ oc get rc
NAME                DESIRED   CURRENT   READY   AGE
hello-openshift-1   0         0         0       30m
hello-openshift-2   1         1         1       30m

Is this by design?
[root@dhcp-140-138 oc-client]# oc get rc
NAME                DESIRED   CURRENT   READY   AGE
hello-openshift-1   0         0         0       4h
hello-openshift-2   1         1         1       4h

[root@dhcp-140-138 oc-client]# oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   0          1         0         config,image(hello-openshift:latest)

[root@dhcp-140-138 oc-client]# oc get rc hello-openshift-2 -o template --template='{{.metadata.ownerReferences}}'
[map[apiVersion:apps.openshift.io/v1 blockOwnerDeletion:true controller:true kind:DeploymentConfig name:hello-openshift uid:915b4cb8-b98d-11e9-afaa-fa163e39231b]]

When I recreate the DC and RC, the DC failed to adopt the RC.
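For comparison, the UID that the ownerReference would need to point at can be read off the recreated DC in the same template style (a quick check, not part of the original output):

  oc get dc hello-openshift -o template --template='{{.metadata.uid}}'

Since the DC was recreated, its UID cannot match the one captured in the dump, so a stale controller reference on the RC blocks adoption.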
Created attachment 1601729 [details]
Dump of the DC and RC after recreation.
dc.status.latestVersion shouldn't be 0. I'll spin up a cluster and look into it.
I have just tried it with a build of ose v3.11.104-1+95ffd35 and the adoption worked fine for me. It almost looks like the cluster didn't have the new patch. I wonder why there is oc version v3.11.105 and openshift API v3.11.104 - not that oc would matter.

git tag --contains cfd91671c9c96552bd0c52e3bb7ccd8e86e3246f
v3.11.101-1
v3.11.102-1
v3.11.103-1
v3.11.104-1
v3.11.105-1
v3.11.106-1
v3.11.107-1
v3.11.108-1
v3.11.109-1
v3.11.110-1
v3.11.111-1
v3.11.112-1
v3.11.113-1
v3.11.114-1
v3.11.115-1
v3.11.116-1
v3.11.117-1
v3.11.118-1
v3.11.119-1
v3.11.120-1
v3.11.121-1
v3.11.122-1
v3.11.123-1
v3.11.124-1
v3.11.125-1
v3.11.126-1
v3.11.127-1
v3.11.128-1
v3.11.129-1
v3.11.130-1
v3.11.131-1
v3.11.132-1
v3.11.133-1
v3.11.134-1
v3.11.135-1
v3.11.136-1

Also, there is an e2e test at https://github.com/openshift/origin/blob/a3dcfc0040cd5c6b1bda6e7d0d93192a39b5d473/test/extended/deployments/deployments.go#L1549 which should hopefully cover it, if you want to look at differences.

Can you try the approach shown in https://bugzilla.redhat.com/show_bug.cgi?id=1741133#c0 ? Otherwise I'd need to see the master controllers logs, or possibly have the QA cluster left alive so I can investigate there.
Double confirmed with:

[zhouying@dhcp-140-138 test-bugs]$ oc version
oc v3.11.136
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ci-vm-10-0-151-207.hosted.upshift.rdu2.redhat.com:8443
openshift v3.11.136
kubernetes v1.11.0+d4cacc0

When I follow the steps:
1. oc new-project test
2. oc new-app <URL>
3. oc rollout latest test
4. oc get rc (this should show two RCs)
5. oc get rc,dc -o yaml > backup.yaml
6. oc delete all --all
7. Delete the owner references from the backup yaml file
8. oc create -f backup.yaml

the result is that the DC loses adoption of the RCs:

[zhouying@dhcp-140-138 test-bugs]$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   0          1         0         config,image(hello-openshift:latest)

[zhouying@dhcp-140-138 test-bugs]$ oc get rc
NAME                DESIRED   CURRENT   READY   AGE
hello-openshift-1   0         0         0       59s
hello-openshift-2   1         1         1       59s

When I follow the steps from https://bugzilla.redhat.com/show_bug.cgi?id=1741133#c0, dc.status.latestVersion is still 2, not the expected 3.
3.11 is closed for non-critical fixes; this might have been fixed since then. Moving to QA to test against our current code base.
[root@dhcp-140-138 roottest]# oc version -o yaml
clientVersion:
  buildDate: "2020-05-23T15:25:26Z"
  compiler: gc
  gitCommit: 44354e2c9621e62b46d1854fd2d868f46fcdffff
  gitTreeState: clean
  gitVersion: 4.5.0-202005231517-44354e2
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64

1) oc create deploymentconfig dctest --image=openshift/hello-openshift
2) oc rollout latest dc/dctest
3) oc get rc,dc -o yaml > /tmp/backup.yaml
4) oc delete all --all
5) delete the owner references from the backup yaml for the rc
6) oc create -f /tmp/backup.yaml
   replicationcontroller/dctest-1 created
   replicationcontroller/dctest-2 created
   deploymentconfig.apps.openshift.io/dctest created
7) [root@dhcp-140-138 roottest]# oc get dc
   NAME     REVISION   DESIRED   CURRENT   TRIGGERED BY
   dctest   1          1         0         config

[root@dhcp-140-138 roottest]# oc describe dc/dctest
Name:           dctest
Namespace:      zhouydc
Created:        About a minute ago
Labels:         <none>
Annotations:    <none>
Latest Version: 1
Selector:       deployment-config.name=dctest
Replicas:       1
Triggers:       Config
Strategy:       Rolling
Template:
Pod Template:
  Labels:       deployment-config.name=dctest
  Containers:
   default-container:
    Image:        openshift/hello-openshift
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>

Latest Deployment: <none>

Events:
  Type     Reason                    Age                 From                         Message
  ----     ------                    ----                ----                         -------
  Warning  DeploymentCreationFailed  28s (x14 over 69s)  deploymentconfig-controller  Couldn't deploy version 1: replicationcontrollers "dctest-1" already exists
The DC's .status.latestVersion is 1, not the expected 3.
Zhou Ying, thanks for re-verifying on our newest release; it looks like this needs to be investigated. I'll try to schedule some time to look into it. Adding UpcomingSprint, as I was fully occupied with bugs of higher priority.
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.
This should be fixed by now in 4.7 since with k8s 1.20 we've got improved GC which matches resources by their UIDs.
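If you want to double-check the UID-based adoption after a restore, something like this should show matching values once the controller has adopted the RCs (a sketch, reusing the dctest names from the reproducer above):

  oc get dc dctest -o jsonpath='{.metadata.uid}'
  oc get rc dctest-2 -o jsonpath='{.metadata.ownerReferences[0].uid}'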
Maciej Szulik: Checked with the latest build:

[root@dhcp-140-138 ~]# oc version
Client Version: 4.7.0-202102032256.p0-c66c03f
Server Version: 4.7.0-0.nightly-2021-02-03-165316
Kubernetes Version: v1.20.0+e761892

[root@dhcp-140-138 ~]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-02-03-165316   True        False         83m     Cluster version is 4.7.0-0.nightly-2021-02-03-165316

See the steps:
1) oc create deploymentconfig dctest --image=openshift/hello-openshift
2) oc rollout latest dc/dctest
3) oc get rc,dc -o yaml > /tmp/backup.yaml
4) oc delete all --all
5) delete the owner references from the backup yaml for the rc
6) oc create -f /tmp/backup.yaml
   replicationcontroller/dctest-1 created
   replicationcontroller/dctest-2 created
   deploymentconfig.apps.openshift.io/dctest created
7) [root@dhcp-140-138 ~]# oc get dc
   NAME     REVISION   DESIRED   CURRENT   TRIGGERED BY
   dctest   2          1         1         config

[root@dhcp-140-138 ~]# oc describe dc/dctest
Name:           dctest
Namespace:      zhouyt
Created:        24 seconds ago
Labels:         <none>
Annotations:    <none>
Latest Version: 2
Selector:       deployment-config.name=dctest
Replicas:       1
Triggers:       Config
Strategy:       Rolling
Template:
Pod Template:
  Labels:       deployment-config.name=dctest
  Containers:
   default-container:
    Image:        openshift/hello-openshift
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>

Deployment #2 (latest):
  Name:         dctest-2
  Created:      26 seconds ago
  Status:       Complete
  Replicas:     1 current / 1 desired
  Selector:     deployment-config.name=dctest,deployment=dctest-2,deploymentconfig=dctest
  Labels:       openshift.io/deployment-config.name=dctest
  Pods Status:  1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Deployment #1:
  Created:      26 seconds ago
  Status:       Complete
  Replicas:     0 current / 0 desired

Events: <none>

But dc.status.latestVersion is still 2 - is this expected?
Maciej Szulik: Please ignore my last question; no issue now. I will move this to verified status.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633