Bug 2078778
| Summary: | [4.11] `oc get ValidatingWebhookConfiguration,MutatingWebhookConfiguration` fails and causes "apiserver panic'd...http2: panic serving xxx.xx.xxx.21:49748: cannot deep copy int" when the AllRequestBodies audit profile is used. |
|---|---|
| Product: | OpenShift Container Platform |
| Reporter: | jmekkatt |
| Component: | kube-apiserver |
| Assignee: | Abu Kashem <akashem> |
| Status: | CLOSED ERRATA |
| QA Contact: | jmekkatt |
| Severity: | high |
| Priority: | high |
| Version: | 4.11 |
| CC: | aojeagar, mfojtik, xxia |
| Target Milestone: | --- |
| Target Release: | 4.11.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Last Closed: | 2022-08-10 11:08:24 UTC |
| Type: | Bug |
Description

jmekkatt 2022-04-26 07:51:27 UTC

Setting it to blocker+: the panic must not happen, and the request should succeed. Upstream fix in progress: https://github.com/kubernetes/kubernetes/pull/110408. Once it merges, we will pick it up in o/kubernetes.

Installed the latest OCP version, which includes the fix.
```
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-15-222801   True        False         6m59s   Cluster version is 4.11.0-0.nightly-2022-06-15-222801
```
Patched the audit profile to "AllRequestBodies".
```
$ oc patch apiserver cluster -p '{"spec": {"audit": {"profile": "AllRequestBodies"}}}' --type merge
apiserver.config.openshift.io/cluster patched
$ oc get apiserver/cluster -ojson | jq .spec.audit
{
  "profile": "AllRequestBodies"
}
```
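The same merge patch shape is reused for every profile tested in this report; a small helper (hypothetical, not part of any cluster tooling) shows how the payload passed via `-p` is constructed, so it can be sanity-checked before handing it to `oc patch apiserver cluster --type merge`:

```shell
# Hypothetical helper: build the merge patch for a given audit profile.
make_audit_patch() {
  printf '{"spec": {"audit": {"profile": "%s"}}}' "$1"
}

# Print the payload for each profile exercised in this verification.
for profile in AllRequestBodies WriteRequestBodies None; do
  make_audit_patch "$profile"
  echo
done
```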
Once the new revisions rolled out, tried to get both ValidatingWebhookConfiguration and MutatingWebhookConfiguration objects. Both actions succeeded.
```
$ oc get ValidatingWebhookConfiguration
NAME                               WEBHOOKS   AGE
alertmanagerconfigs.openshift.io   1          36m
autoscaling.openshift.io           2          44m
machine-api                        2          45m
multus.openshift.io                1          47m
performance-addon-operator         1          47m
prometheusrules.openshift.io       1          36m
snapshot.storage.k8s.io            1          46m
test-validating-cfg2               1          21s
$ oc get MutatingWebhookConfiguration
NAME                 WEBHOOKS   AGE
machine-api          2          45m
test-mutating-cfg2   1          35s
```
Checked the kube-apiserver logs for panic messages; none were found.
```
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-1.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-0.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-2.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
```
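The three per-pod grep checks can be folded into one reusable check. A minimal sketch, written against a saved log file so the check itself needs no cluster access; on a live cluster the log would first be collected with the `oc logs` commands shown above:

```shell
# Minimal sketch: scan a saved kube-apiserver log file for panic messages.
# On a live cluster the log would be collected first, e.g.:
#   oc logs "$pod" -n openshift-kube-apiserver > "$pod.log"
check_for_panics() {
  if grep -qi "panic" "$1"; then
    echo "panic found in $1"
    return 1
  fi
  echo "no panics in $1"
}

# Exercise the check against a sample log line (no cluster needed).
printf 'I0616 07:51:27.000000 audit event written\n' > sample.log
check_for_panics sample.log   # prints "no panics in sample.log"
```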
Repeated the same steps with the audit profile patched to "WriteRequestBodies" and then "None"; everything worked as expected.
```
$ oc patch apiserver cluster -p '{"spec": {"audit": {"profile": "WriteRequestBodies"}}}' --type merge
apiserver.config.openshift.io/cluster patched
$ oc get apiserver/cluster -ojson | jq .spec.audit
{
  "profile": "WriteRequestBodies"
}
$ oc get ValidatingWebhookConfiguration
NAME                               WEBHOOKS   AGE
alertmanagerconfigs.openshift.io   1          77m
autoscaling.openshift.io           2          85m
machine-api                        2          85m
multus.openshift.io                1          87m
performance-addon-operator         1          88m
prometheusrules.openshift.io       1          77m
snapshot.storage.k8s.io            1          86m
$ oc get MutatingWebhookConfiguration
NAME          WEBHOOKS   AGE
machine-api   2          85m
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-1.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-0.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-2.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc patch apiserver cluster -p '{"spec": {"audit": {"profile": "None"}}}' --type merge
apiserver.config.openshift.io/cluster patched
$ oc get apiserver/cluster -ojson | jq .spec.audit
{
  "profile": "None"
}
$ oc get ValidatingWebhookConfiguration
NAME                               WEBHOOKS   AGE
alertmanagerconfigs.openshift.io   1          103m
autoscaling.openshift.io           2          111m
machine-api                        2          112m
multus.openshift.io                1          114m
performance-addon-operator         1          114m
prometheusrules.openshift.io       1          103m
snapshot.storage.k8s.io            1          113m
$ oc get MutatingWebhookConfiguration
NAME          WEBHOOKS   AGE
machine-api   2          112m
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-1.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-0.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
$ oc logs kube-apiserver-jmekkatt-mpo-jzr7v-master-2.c.openshift-qe.internal -n openshift-kube-apiserver | grep -i "panic"
```
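The verification cycle above (patch the profile, list both webhook configurations, grep the logs) repeats once per profile. A dry-run sketch of that loop, which only echoes the commands so the sequence can be reviewed before running it against a cluster (the kube-apiserver pod names vary per cluster and are left as a placeholder):

```shell
# Dry-run sketch: echo the verification commands for each audit profile.
verify_profile() {
  profile=$1
  echo "oc patch apiserver cluster -p '{\"spec\": {\"audit\": {\"profile\": \"$profile\"}}}' --type merge"
  echo "oc get ValidatingWebhookConfiguration"
  echo "oc get MutatingWebhookConfiguration"
  echo "oc logs <kube-apiserver-pod> -n openshift-kube-apiserver | grep -i panic"
}

for p in AllRequestBodies WriteRequestBodies None; do
  verify_profile "$p"
done
```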
Hence the issue is fixed in the tested version; moving the ticket to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069