Bug 2008926
Summary: | [sig-api-machinery] API data in etcd should be stored at the correct location and version for all resources [Serial] [Suite:openshift/conformance/serial] | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Devan Goodwin <dgoodwin> |
Component: | kube-apiserver | Assignee: | Stefan Schimanski <sttts> |
Status: | CLOSED ERRATA | QA Contact: | Rahul Gangwar <rgangwar> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.10 | CC: | aos-bugs, mfojtik, rgangwar, sippy, sttts, wking, xxia |
Target Milestone: | --- | ||
Target Release: | 4.10.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-03-10 16:13:56 UTC | Type: | Bug |
Description
Devan Goodwin
2021-09-29 13:55:45 UTC
The test fails with:

fail [github.com/openshift/origin/test/extended/etcd/etcd_storage_path.go:518]: test failed: failed to clean up etcd: &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"Status", APIVersion:"v1"}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:"", RemainingItemCount:(*int64)(nil)}, Status:"Failure", Message:"nodes \"node1\" not found", Reason:"NotFound", Details:(*v1.StatusDetails)(0xc0017c0840), Code:404}}

That is, the test is unable to clean up after it runs because the node it created is already gone. The audit logs show that the AWS node controller removes the node object:

{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"84279591-3c64-41df-918d-c5e0a0ed06e3","stage":"ResponseComplete","requestURI":"/api/v1/nodes/node1","verb":"delete","user":{"username":"system:serviceaccount:kube-system:node-controller","uid":"81dd1878-5c8a-402a-8600-0e8d88a253a8","groups":["system:serviceaccounts","system:serviceaccounts:kube-system","system:authenticated"]},"sourceIPs":["10.0.199.142"],"userAgent":"kube-controller-manager/v1.22.1 (linux/amd64) kubernetes/91b30ca/system:serviceaccount:kube-system:node-controller","objectRef":{"resource":"nodes","name":"node1","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","code":200},"requestReceivedTimestamp":"2021-09-29T04:18:49.289256Z","stageTimestamp":"2021-09-29T04:18:49.310260Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:controller:node-controller\" of ClusterRole \"system:controller:node-controller\" to ServiceAccount \"node-controller/kube-system\""}}

This node object is essential for the test to verify proper serialization of the object.
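For anyone repeating the audit-log check, here is a minimal Go sketch of the search: it scans newline-delimited audit events and reports which user successfully deleted a given object. The helper name findDeleters and the trimmed-down auditEvent struct are illustrative assumptions, not part of the origin test suite; only the JSON field names come from the event quoted above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// auditEvent keeps only the audit.k8s.io/v1 Event fields needed here.
// It is a hypothetical subset, not the upstream type.
type auditEvent struct {
	Verb string `json:"verb"`
	User struct {
		Username string `json:"username"`
	} `json:"user"`
	ObjectRef struct {
		Resource string `json:"resource"`
		Name     string `json:"name"`
	} `json:"objectRef"`
	ResponseStatus struct {
		Code int `json:"code"`
	} `json:"responseStatus"`
}

// findDeleters returns, in log order, the usernames whose delete request
// for resource/name completed with HTTP 200.
func findDeleters(log, resource, name string) []string {
	var users []string
	sc := bufio.NewScanner(strings.NewReader(log))
	// Audit lines can be long; raise the scanner's per-line limit to 1 MiB.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		var ev auditEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
			continue // skip lines that are not valid JSON
		}
		if ev.Verb == "delete" &&
			ev.ObjectRef.Resource == resource &&
			ev.ObjectRef.Name == name &&
			ev.ResponseStatus.Code == 200 {
			users = append(users, ev.User.Username)
		}
	}
	return users
}

func main() {
	// Abbreviated copy of the event quoted in the description.
	log := `{"kind":"Event","apiVersion":"audit.k8s.io/v1","verb":"delete","user":{"username":"system:serviceaccount:kube-system:node-controller"},"objectRef":{"resource":"nodes","name":"node1","apiVersion":"v1"},"responseStatus":{"code":200}}`
	for _, u := range findDeleters(log, "nodes", "node1") {
		fmt.Println(u)
	}
}
```

Against a real audit log, the same filter would be fed the kube-apiserver audit file contents instead of the inline string.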
The node controller deletes nodes that are unknown to the cloud provider; compare k8s.io/cloud-provider/controllers/nodelifecycle:

klog.V(2).Infof("deleting node since it is no longer present in cloud provider: %s", node.Name)

Maybe one can delay that by setting the node to ready. Then at least the node is kept until it is set to not ready. This might be enough for it to survive long enough.

Checked the prow CI jobs; the junit test passed:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443778864760229888 : [sig-api-machinery] API data in etcd should be stored at the correct location and version for all resources [Serial] [Suite:openshift/conformance/serial] 22s

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443718659464761344 : [sig-api-machinery] API data in etcd should be stored at the correct location and version for all resources [Serial] [Suite:openshift/conformance/serial] 12s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
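As a footnote to the node controller analysis above, the following Go sketch models the deletion decision as this report understands it: a node whose Ready condition is true is skipped, and any other node is deleted only when the cloud provider has no matching instance. The node type, the shouldDelete function, and the instance name are hypothetical simplifications for illustration; this is not the upstream controller code, and the real controller works on v1.Node objects and provider IDs.

```go
package main

import "fmt"

// node is a hypothetical, stripped-down stand-in for a Kubernetes Node.
type node struct {
	Name  string
	Ready bool // true when the Ready condition has status "True"
}

// shouldDelete models the assumed decision: Ready nodes are left alone;
// any other node is deleted if the cloud provider does not know it.
func shouldDelete(n node, existsInCloud func(name string) bool) bool {
	if n.Ready {
		return false
	}
	return !existsInCloud(n.Name)
}

func main() {
	// Hypothetical set of instances known to the cloud provider.
	cloud := map[string]bool{"real-worker": true}
	exists := func(name string) bool { return cloud[name] }

	for _, n := range []node{
		{Name: "node1", Ready: false},       // test fixture, unknown to the cloud
		{Name: "node1", Ready: true},        // same fixture, marked Ready
		{Name: "real-worker", Ready: false}, // real instance that is not Ready
	} {
		fmt.Printf("%s ready=%t delete=%t\n", n.Name, n.Ready, shouldDelete(n, exists))
	}
}
```

Under this model, marking the test's node1 as Ready keeps it out of the deletion path until something sets it to not ready, which is the delay suggested in the analysis.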