Bug 1894342 - oauth-apiserver logs many "[SHOULD NOT HAPPEN] failed to update managedFields for ... OAuthClient ... no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oauth-apiserver
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Maru Newby
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks: 1894345
Reported: 2020-11-04 05:04 UTC by Xingxing Xia
Modified: 2021-03-04 21:10 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1894345
Environment:
Last Closed: 2021-02-24 15:30:03 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift oauth-apiserver pull 26 | 0 | None | closed | register api groups to `legacyscheme.Scheme` early | 2021-02-15 10:02:53 UTC
Github | openshift origin pull 25652 | 0 | None | closed | Add e2e testing of server-side apply for openshift types | 2021-02-15 10:02:54 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:30:33 UTC

Description Xingxing Xia 2020-11-04 05:04:35 UTC
Description of problem:
oauth-apiserver logs many:
... [SHOULD NOT HAPPEN] failed to update managedFields for ... OAuthClient ... no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient

kube-apiserver and openshift-apiserver logs don't have any "SHOULD NOT HAPPEN" errors.

Version-Release number of selected component (if applicable):
4.6.1 and 4.7.0-0.nightly-2020-11-03-002310

How reproducible:
Always

Steps to Reproduce:
1. Install fresh env
2. Check logs: oc logs <pod_name> -n openshift-oauth-apiserver

Actual results:
2. The oauth-apiserver logs contain many of the errors above:
$ oc get pod -n openshift-oauth-apiserver
apiserver-f6d8b6c75-5p5wh   1/1     Running   0          10h ...
$ oc logs apiserver-f6d8b6c75-5p5wh -n openshift-oauth-apiserver > logs/oauth-apiserver-f6d8b6c75-5p5wh.log
$ grep "SHOULD NOT HAPPEN" logs/oauth-apiserver-f6d8b6c75-5p5wh.log | wc -l
1421
$ grep "SHOULD NOT HAPPEN.*OAuthClient" logs/oauth-apiserver-f6d8b6c75-5p5wh.log | wc -l
1421
All are for the OAuthClient type. For this 10h-old pod, that is 10*3600s / 1421 ≈ 25s between errors on average.
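
The rate estimate above can be reproduced with a small helper. This is only a sketch: the function name `avg_interval_seconds` is hypothetical, and it assumes the pod age is known in whole or fractional hours and that the error count is non-zero.

```shell
# Average number of seconds between errors, given pod age in hours and
# the number of matching log lines. Hypothetical helper for illustration.
avg_interval_seconds() {
  awk -v hours="$1" -v count="$2" 'BEGIN { printf "%.1f\n", hours * 3600 / count }'
}

# Values taken from the report above: a 10h-old pod with 1421 errors.
avg_interval_seconds 10 1421
```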

$ vi logs/oauth-apiserver-f6d8b6c75-5p5wh.log
...
2020-11-03T11:00:56.262382600Z E1103 11:00:56.262311       1 fieldmanager.go:175] [SHOULD NOT HAPPEN] failed to update managedFields for /, Kind=: failed to convert new object (oauth.openshift.io/v1, Kind=OAuthClient) to smd typed: no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient
2020-11-03T11:01:26.200228617Z E1103 11:01:26.200173       1 fieldmanager.go:175] [SHOULD NOT HAPPEN] failed to update managedFields for /, Kind=: failed to convert new object (oauth.openshift.io/v1, Kind=OAuthClient) to smd typed: no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient
2020-11-03T11:01:56.182477945Z E1103 11:01:56.182380       1 fieldmanager.go:175] [SHOULD NOT HAPPEN] failed to update managedFields for /, Kind=: failed to convert new object (oauth.openshift.io/v1, Kind=OAuthClient) to smd typed: no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient
2020-11-03T11:02:56.241959494Z E1103 11:02:56.241629       1 fieldmanager.go:175] [SHOULD NOT HAPPEN] failed to update managedFields for /, Kind=: failed to convert new object (oauth.openshift.io/v1, Kind=OAuthClient) to smd typed: no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient
2020-11-03T11:03:05.874222841Z E1103 11:03:05.874103       1 fieldmanager.go:175] [SHOULD NOT HAPPEN] failed to update managedFields for /, Kind=: failed to convert new object (oauth.openshift.io/v1, Kind=OAuthClient) to smd typed: no corresponding type for oauth.openshift.io/v1, Kind=OAuthClient
...
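
To confirm that every error names the same type without paging through the log, the lines can be grouped by the GroupVersionKind they report. The following is a rough sketch: the function name `summarize_should_not_happen` is hypothetical, and it assumes the message format shown in the excerpt above ("no corresponding type for <group/version>, Kind=<Kind>").

```shell
# Read a log on stdin and print "<count> <group/version>, Kind=<Kind>"
# for each type named in a "[SHOULD NOT HAPPEN]" line.
# Hypothetical helper; assumes the message format shown in this report.
summarize_should_not_happen() {
  grep -F '[SHOULD NOT HAPPEN]' \
    | grep -oE 'no corresponding type for [^ ]+, Kind=[A-Za-z]+' \
    | sed -e 's/^no corresponding type for //' \
    | sort | uniq -c \
    | awk '{ c = $1; $1 = ""; sub(/^ /, ""); print c, $0 }'
}

# Possible usage against a live pod (not run here):
#   oc logs <pod_name> -n openshift-oauth-apiserver | summarize_should_not_happen
```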

Expected results:
The oauth-apiserver logs should not contain "[SHOULD NOT HAPPEN]" errors. Per the conversation with Dev in #qe-group-b, Dev will fix it.

Additional info:
oc logs <pod_name> -n openshift-apiserver | grep "SHOULD NOT HAPPEN" # none
oc logs <pod_name> -n openshift-kube-apiserver | grep "SHOULD NOT HAPPEN" # none

Comment 1 Maru Newby 2020-11-05 06:13:09 UTC
I ran oauth-apiserver locally with telepresence and delve, and then executed the proposed apply tests [1] against the cluster. I observed that the reported error was due to the fieldManager's typeConverter having an unpopulated gvks map [2]. This map is supposed to be set on api resource registration from the `x-kubernetes-group-version-kind` field of each openapi model.

I retrieved the openapi spec from a 4.7 cluster:

$ oc get --raw /openapi/v2

I retrieved the openapi spec from the cluster's oauth-apiserver:

$ oc port-forward -n openshift-oauth-apiserver $(oc get pods -n openshift-oauth-apiserver --no-headers | head -1 | cut -d' ' -f1) 8443:8443
$ curl -H "Authorization: Bearer $(oc whoami -t)" -Lk "https://localhost:8443/openapi/v2"

Comparing the results revealed that the oauth-apiserver openapi spec was missing `x-kubernetes-group-version-kind` from its definitions:

cluster openapi:

"com.github.openshift.api.oauth.v1.OAuthClient": {
  <snip>
  "x-kubernetes-group-version-kind": [
    {
      "group": "",
      "kind": "OAuthClient",
      "version": "v1"
    },
    {
      "group": "oauth.openshift.io",
      "kind": "OAuthClient",
      "version": "v1"
    }
  ]
}

oauth openapi:

"com.github.openshift.api.oauth.v1.OAuthClient": {
  <snip>
}
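
A quick way to spot the difference described above is to count how often the extension key appears in each spec. This is a minimal sketch that uses plain text matching rather than a JSON parser; the function name `count_gvk_extensions` is hypothetical.

```shell
# Count occurrences of the "x-kubernetes-group-version-kind" key in an
# OpenAPI document read from stdin. Text match only, no JSON parsing.
# Hypothetical helper for illustration.
count_gvk_extensions() {
  grep -o '"x-kubernetes-group-version-kind"' | awk 'END { print NR }'
}

# Possible usage with the commands from this comment (not run here):
#   oc get --raw /openapi/v2 | count_gvk_extensions
#   curl -H "Authorization: Bearer $(oc whoami -t)" -Lk "https://localhost:8443/openapi/v2" | count_gvk_extensions
```

Based on the comparison above, the second command would be expected to print 0 on an affected oauth-apiserver, since its definitions lacked the extension.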


1: https://github.com/openshift/origin/pull/25652
2: https://github.com/openshift/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/fieldmanager/internal/typeconverter.go#L87

----------------------------

Debugging instructions:

- disabled CVO management of the auth operator as per [3]

- scaled down the auth operator deployment

$ oc scale deploy authentication-operator -n openshift-authentication-operator --replicas=0

- retrieved the set of arguments currently used to run the oauth apiserver

$ oc get deploy apiserver -n openshift-oauth-apiserver -o yaml

- computed the telepresence arguments necessary to proxy all nodes in a target cluster

$ ALSO_PROXY="$(oc get machines -A -o json | jq -jr '.items[] | .status.addresses[0].address | @text "--also-proxy=\(.) "')"

- ran telepresence to get a shell with the environment of the oauth-apiserver

$ telepresence --namespace=openshift-oauth-apiserver --swap-deployment=apiserver --mount=/tmp/tel_root ${ALSO_PROXY} --run bash

- symlinked `/tmp/tel_root/{configmaps,secrets}` to /var/run/

- ran oauth-apiserver with `dlv debug` and the current set of arguments (previously sourced from the deployment) in the telepresence shell

3:  https://github.com/openshift/enhancements/blob/master/enhancements/operator-dev-doc.md#option-a---start-with-a-running-cluster
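
The symlink step in the debugging instructions could look roughly like the sketch below. The exact commands used were not recorded in this comment, so the function name `link_mounted_dirs` and the assumption that the links live directly under the destination directory are illustrative only.

```shell
# Symlink the configmaps and secrets directories from a mount root into a
# destination directory. Hypothetical helper; the exact layout used in the
# debugging session is an assumption.
link_mounted_dirs() {
  src_root="$1"
  dst_root="$2"
  for name in configmaps secrets; do
    # -n avoids descending into an existing link; -f replaces it.
    ln -sfn "$src_root/$name" "$dst_root/$name"
  done
}

# Possible usage inside the telepresence shell (not run here):
#   link_mounted_dirs /tmp/tel_root /var/run
```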

Comment 2 Maru Newby 2020-11-06 00:38:15 UTC
It should be possible to verify this bug manually, but I'm also adding e2e testing that I'll ensure is backported to 4.6.

Comment 4 Xingxing Xia 2020-11-06 10:58:38 UTC
Verified in 4.7.0-0.nightly-2020-11-06-010750: the oauth-apiserver logs no longer contain "SHOULD NOT HAPPEN" errors.

Comment 7 errata-xmlrpc 2021-02-24 15:30:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

