Bug 1609463
Summary: | oc get on a custom resource is fetching discovery every time | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> |
Component: | oc | Assignee: | Maciej Szulik <maszulik> |
Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> |
Severity: | low | Docs Contact: | |
Priority: | medium | ||
Version: | 3.11.0 | CC: | aos-bugs, deads, jokerman, maszulik, mfojtik, mmccomas |
Target Milestone: | --- | ||
Target Release: | 4.1.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
Discovery data was ignored during some kubectl invocations.
Consequence:
Every operation against a custom resource downloaded the entire discovery data.
Fix:
The code was refactored to always use cached discovery data.
Result:
Discovery data is fetched less frequently.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2019-06-04 10:40:22 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Clayton Coleman
2018-07-28 05:50:18 UTC
It appears that when the Kind for a custom resource is looked up by the resource builder [1], the discovery RESTMapper returns a `no matches for GVK` error every time. This, in turn, causes the cachedDiscoveryClient's cache to be invalidated [2] every time we attempt to look up GVK information for custom resources. Apparently, for any resource (not just custom resources), the discovery client's cache is always stale at this step [3], which is what prompts the "no matches for GVK" error in the case of a custom resource. It is only after the cache is invalidated and we attempt to "discover" the custom resource a second time that we end up successfully discovering it. Not sure why this happens. David, I was hoping you would have some insight on this.

1. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/genericclioptions/resource/builder.go#L692
2. https://github.com/kubernetes/client-go/blob/master/restmapper/discovery.go#L233
3. https://github.com/kubernetes/client-go/blob/master/restmapper/discovery.go#L232

It also appears that when we attempt to look up the Kind for a custom resource (I am using [1] as my example), by the time we get to [2], the partiallySpecifiedResource contains an incorrect Group and Version, but the correct Resource. It is only after the cache has been invalidated that we end up with the correct GVR.
For example:

```
$ oc get foo.samplecontroller.k8s.io/example-foo --loglevel 5
# GVR passed to [2]: {G: "k8s.io" V: "samplecontroller" R: "foo"}
I0810 15:20:26.841015   26443 discovery.go:215] Invalidating discovery information
# GVR passed to [2]: {G: "k8s.io" V: "samplecontroller" R: "foo"}
# GVR passed to [2]: {G: "samplecontroller.k8s.io" V: "" R: "foo"}
I0810 15:20:26.930256   26443 get.go:443] no kind is registered for the type v1beta1.Table in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"
NAME          REPLICAS
example-foo   1
```

Not sure why we end up passing correct GVR information to the priority RESTMapper _after_ invalidating caches.

1. https://gist.github.com/soltysh/b1c38b1660eea4a4c4741b722774aede
2. https://github.com/kubernetes/apimachinery/blob/master/pkg/api/meta/priority.go#L92

This happens with kubectl as well... we should open an upstream issue to track it. Interestingly, a partial name with no group (oc get foo/example-foo) does not trigger it.

Using the same CRD linked in comment 2, I am able to confirm locally that this problem only occurs when a Resource is not the same as an object's Kind:

```
$ oc get DeploymentConfig.apps.openshift.io/mydc --loglevel 5
# Calculated GVR: openshift.io/apps, Resource=DeploymentConfig
# cache is not invalidated while trying to look up KindFor
NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
pictre2   1          1         1         config,image(pictre2:latest)
```

```
$ oc get Foo.samplecontroller.k8s.io/example-foo --loglevel 5
# Calculated GVR: k8s.io/samplecontroller, Resource=Foo
# cache _is_ invalidated while looking up KindFor
I0813 13:35:00.363452   30946 discovery.go:215] Invalidating discovery information
I0813 13:35:00.443568   30946 get.go:443] no kind is registered for the type v1beta1.Table in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"
NAME          REPLICAS
example-foo   1
```

Because there is no resource "Foo", this line [1] gets a "not found" error, invalidates caches, and then does a live discovery on a retry.
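The misparsed GVR above follows from how the argument is split: the argument `foo.samplecontroller.k8s.io` is first read as `resource.version.group` by splitting on the first two dots. Below is a minimal, self-contained Go sketch of that splitting; it is a simplified reimplementation for illustration, not the actual apimachinery `schema.ParseResourceArg` code, and the `GVR` type and function name are made up for this sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// GVR stands in for schema.GroupVersionResource in this sketch.
type GVR struct{ Group, Version, Resource string }

// splitResourceArg is a simplified reimplementation of how an argument of
// the form "resource.version.group" is first interpreted: split on the first
// two dots. A CRD name like "foo.samplecontroller.k8s.io" is therefore
// initially misread as resource "foo" in version "samplecontroller" of
// group "k8s.io", which matches the GVR seen in the log above.
func splitResourceArg(arg string) GVR {
	parts := strings.SplitN(arg, ".", 3)
	gvr := GVR{Resource: parts[0]}
	if len(parts) > 1 {
		gvr.Version = parts[1]
	}
	if len(parts) > 2 {
		gvr.Group = parts[2]
	}
	return gvr
}

func main() {
	fmt.Printf("%+v\n", splitResourceArg("foo.samplecontroller.k8s.io"))
	// {Group:k8s.io Version:samplecontroller Resource:foo}
}
```

Only after this first guess fails to match anything does the lookup fall back to treating the whole suffix as the group, yielding the correct `{G: "samplecontroller.k8s.io" R: "foo"}`.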
This only appears to happen when a fully qualified "Kind.Group/name" format is given. Since we attempt to look up GVR information first, the "Kind" is always assumed to be the Resource name until a failure occurs. Lowering the severity of this bug, as not specifying the fully qualified Kind.Group/name format does not cause the cache to be invalidated:

```
$ oc get foos --loglevel 5
I0813 13:49:05.217210    5468 get.go:443] no kind is registered for the type v1beta1.Table in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"
NAME          REPLICAS
example-foo   1
```

1. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/genericclioptions/resource/builder.go#L692

https://github.com/openshift/origin/pull/22020 and a few other PRs landed to fix this issue. Moving to QA.

This is because you're passing the full name; if you pass the name of the resource (it can even be the short name), it will get matched and won't re-fetch discovery.

Confirmed that with the short name, discovery is not re-fetched:

```
[root@preserve-master-yinzhou auth]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-27-213933   True        False         27m     Cluster version is 4.0.0-0.nightly-2019-02-27-213933
```

```
[root@dhcp-140-138 ~]# oc get dc/ruby-ex --loglevel=6
I0304 10:40:47.214075   17326 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:40:47.215367   17326 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:40:47.225579   17326 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:40:48.359508   17326 round_trippers.go:405] GET https://api.qe-yinzhou.qe.devcluster.openshift.com:6443/apis/apps.openshift.io/v1/namespaces/zhouy/deploymentconfigs/ruby-ex 200 OK in 1133 milliseconds
I0304 10:40:48.360464   17326 get.go:558] no kind is registered for the type v1beta1.Table in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"
NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
ruby-ex   1          1         1         config,image(ruby-ex:latest)
```

```
[root@dhcp-140-138 ~]# oc get deploymentconfig/ruby-ex --loglevel=6
I0304 10:41:22.256238   17343 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:41:22.257534   17343 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:41:22.266430   17343 loader.go:359] Config loaded from file /root/.kube/config
I0304 10:41:23.275211   17343 round_trippers.go:405] GET https://api.qe-yinzhou.qe.devcluster.openshift.com:6443/apis/apps.openshift.io/v1/namespaces/zhouy/deploymentconfigs/ruby-ex 200 OK in 1008 milliseconds
I0304 10:41:23.276064   17343 get.go:558] no kind is registered for the type v1beta1.Table in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"
NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
ruby-ex   1          1         1         config,image(ruby-ex:latest)
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
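For reference, the stale-cache, invalidate-and-retry flow described in the comments above can be sketched in Go. This is an illustrative toy, not client-go's actual `DeferredDiscoveryRESTMapper` API: all type and field names here are invented, and discovery is reduced to a plural-resource-to-Kind map so the expensive full re-fetch can be counted:

```go
package main

import (
	"errors"
	"fmt"
)

// staticMapper resolves plural resource names to Kinds from a fixed table,
// simulating one snapshot of discovery data.
type staticMapper map[string]string

func (m staticMapper) KindFor(resource string) (string, error) {
	if kind, ok := m[resource]; ok {
		return kind, nil
	}
	return "", errors.New("no matches for " + resource)
}

// deferredMapper mimics the invalidate-and-retry behavior: on a lookup miss
// it discards the cached snapshot, re-fetches discovery, and retries once.
type deferredMapper struct {
	fetches int                 // counts simulated full discovery downloads
	fetch   func() staticMapper // stands in for an expensive discovery fetch
	cached  staticMapper
}

func (d *deferredMapper) KindFor(resource string) (string, error) {
	if d.cached == nil {
		d.fetches++
		d.cached = d.fetch()
	}
	kind, err := d.cached.KindFor(resource)
	if err != nil {
		fmt.Println("invalidating discovery information")
		d.fetches++
		d.cached = d.fetch() // the re-download this bug hit on every CRD operation
		return d.cached.KindFor(resource)
	}
	return kind, nil
}

func main() {
	live := staticMapper{"deploymentconfigs": "DeploymentConfig", "foos": "Foo"}
	d := &deferredMapper{fetch: func() staticMapper { return live }}
	// Seed a stale cache that predates the CRD, like stale on-disk data.
	d.cached = staticMapper{"deploymentconfigs": "DeploymentConfig"}

	kind, _ := d.KindFor("foos") // miss -> invalidate -> re-fetch -> retry
	fmt.Println(kind, "discovery fetches:", d.fetches)
}
```

The fix that landed amounts to making the cached snapshot actually get consulted (and stay valid) across invocations, so the miss-invalidate-refetch path stops firing on every CRD lookup.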