Bug 1539529
Summary: | `oc apply --force` will delete resource when failing to apply | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Xingxing Xia <xxia> |
Component: | oc | Assignee: | Juan Vallejo <jvallejo> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Xingxing Xia <xxia> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.9.0 | CC: | aos-bugs, dmace, hasha, jokerman, mfojtik, mmccomas, smunilla, xxia |
Target Milestone: | --- | Flags: | xxia: needinfo- |
Target Release: | 3.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | v3.9.0-0.42.0 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-06-18 18:17:45 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Xingxing Xia
2018-01-29 06:36:15 UTC
This is actually how --force is supposed to work:

> --force=false: Delete and re-create the specified resource, when PATCH encounters conflict and has retried for 5 times

However, the resource is supposed to also be recreated. Is it not being recreated after it has been deleted?

Alright, I think I got to the bottom of this. Because you are saving the object by piping the entirety of the existing one into a file (from step 2)...

> 2. When pod is deployed running, user gets dc yaml
> $ oc get dc hello -o yaml > mydc.yaml

...the resourceVersion of the object is also being saved in "mydc.yaml". When you perform the apply command using what's stored in "mydc.yaml", you are patching a bunch of fields on the existing object, including the metadata.resourceVersion field that was also saved in step 2. Since you are sending patch bytes to the server containing a different resourceVersion than what now exists on the object in the server, you are causing a conflict error here [1].

Normally this would just result in the command failing with no side effects. However, the apply command sees this conflict error as its cue to keep retrying the patch five more times [2]. When this fails, it sees that the error did not end up as nil after those five attempts, goes on to delete the object, and creates a new one from the contents of "mydc.yaml" [3]. It is then that the object is deleted but not recreated, because aside from the conflict error caused by leaving in the "resourceVersion" field, the change you made to "mydc.yaml" was also invalid. Normally the invalid change would be caught here [1] when sending the patch bytes to the server, but because the object also had a different "resourceVersion" than expected (this field should not have been present in "mydc.yaml" in the first place), the conflict error was returned instead, causing us to end up back at [3].
There is another error here: were you to remove the "resourceVersion" from "mydc.yaml", you would be left with a non-conflict error. But because the error's kind is not checked before deleting and creating the new object [4], you would end up in a similar situation to the scenario discussed above.

I have opened a patch that attempts to address both of these situations [5]. If `--force` is provided, its effects only take place if the final error returned by the server after trying to patch is actually a conflict error, and not just any error. Additionally, if a user accidentally leaves in a resourceVersion field (or decides to include it for whatever reason), causing an actual conflict error, AND the patch that is provided also ends up being invalid, [5] would attempt to restore the original object when the new one inevitably fails to be created.

1. https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/apply.go#L660
2. https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/apply.go#L667
3. https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/apply.go#L678
4. https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/apply.go#L677
5. https://github.com/openshift/origin/pull/18337

Dan, I'm wondering if it is worth keeping the part of the PR [1] that accounts for a resource that is both invalid and causes a conflict error. Should we expect users not to include a resourceVersion field? Also, could such a case be hit in any other way?

1. https://github.com/openshift/origin/pull/18337/files#diff-cb112dedd27138a42778ba4c699ff68fR702

(In reply to Juan Vallejo from comment #1)
> This is actually how --force is supposed to work:
>
> --force=false: Delete and re-create the specified resource, when PATCH encounters conflict and has retried for 5 times
>
> However, the resource is supposed to also be recreated.
> Is it not being recreated after it has been deleted?

Yeah, I checked the help info of '--force' when reporting and found it is not recreated after deletion, thus reported :)

Upstream PR [1] has merged. Waiting on Origin PR [2].

1. https://github.com/kubernetes/kubernetes/pull/58991/files
2. https://github.com/openshift/origin/pull/18337

Origin PR has merged. Moving to ON_QA.

PR just newly merged. Will verify when the new upcoming OCP 3.9 version is available.

Tested v3.9.0-0.42.0 per comment 0 steps. Step 3 gives the pasted prompt, but now does not delete the resource. Issue is fixed.