Bug 1880787

Summary: No description for Provisioning CRD for `oc explain`
Product: OpenShift Container Platform Reporter: Jason Boxman <jboxman>
Component: Cloud Compute Assignee: Angus Salkeld <asalkeld>
Cloud Compute sub component: BareMetal Provider QA Contact: Shelly Miron <smiron>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: agarcial, asalkeld, beth.white, brad, fabian, fsimonce, rbartal, stbenjam, wking, yjoseph, zbitter
Version: 4.6 Keywords: Triaged, UserExperience
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Previously `oc explain Provisioning` did not produce the details that were in the CRD as we were using an old version of the CRD definition. Fix: The version of the CRD has been updated and now `oc explain Provisioning` works as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:18:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jason Boxman 2020-09-20 00:23:01 UTC
Description of problem:

There is no description for the Provisioning CRD in `oc explain`.


How reproducible:

Always.


Steps to Reproduce:

`oc explain Provisioning`

Actual results:

KIND:     Provisioning
VERSION:  metal3.io/v1alpha1

DESCRIPTION:
     <empty>



Expected results:

A description.

Comment 1 Zane Bitter 2020-09-21 19:03:16 UTC
This CRD is in the MAO repo, so changing the component:

https://github.com/openshift/machine-api-operator/blob/master/install/0000_30_machine-api-operator_04_metal3provisioning.crd.yaml

I assume this file is automatically generated from the controller in the cluster-baremetal-operator (which is not part of the 4.6 release). To fix it we need to either:
* regenerate it using API version apiextensions.k8s.io/v1 (instead of v1beta1 - use the --crd-version=v1 flag to operator-sdk, not sure about kubebuilder); or
* manually edit in "x-kubernetes-preserve-unknown-fields: false" to make it into a structural schema (see https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema).
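
As a rough illustration of the two conditions discussed above, the sketch below inspects a CRD manifest that has already been loaded into a Python dict and lists reasons why `oc explain` might come back with an empty description. The function name `explain_blockers` is hypothetical. It assumes that a schema is only published for `oc explain` when pruning is enabled (spec.preserveUnknownFields is false) and the version carries an openAPIV3Schema with a top-level description. It is not how the operator or the API server performs this check.

```python
from typing import Any, Dict, List


def explain_blockers(crd: Dict[str, Any]) -> List[str]:
    """Return reasons why `oc explain` might show an empty description (hypothetical helper)."""
    reasons: List[str] = []
    spec = crd.get("spec", {})
    is_v1beta1 = crd.get("apiVersion") == "apiextensions.k8s.io/v1beta1"

    # v1beta1 defaults spec.preserveUnknownFields to true; v1 defaults it to false.
    preserve = spec.get("preserveUnknownFields", is_v1beta1)
    if preserve:
        reasons.append("spec.preserveUnknownFields is true (pruning disabled)")

    # v1beta1 manifests may carry a single top-level schema under spec.validation.
    top_level = spec.get("validation", {}).get("openAPIV3Schema")
    for version in spec.get("versions", []):
        schema = version.get("schema", {}).get("openAPIV3Schema") or top_level
        name = version.get("name", "<unnamed>")
        if not schema:
            reasons.append(f"version {name} has no openAPIV3Schema")
        elif not schema.get("description"):
            reasons.append(f"version {name} schema has no top-level description")
    return reasons
```

The dict could come from something like `json.loads` applied to the output of `oc get crd provisionings.metal3.io -o json` (the CRD name here is an assumption based on the kind and group shown in this report). An empty list means none of the checked conditions apply.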

Comment 4 Angus Salkeld 2020-11-03 08:28:30 UTC
This commit will fix the issue: https://github.com/openshift/cluster-baremetal-operator/commit/f38ec83878090ea18cf0a9482a710b35332df44d
If testing, remember to completely delete the CRD first and re-create it. When I tested, running an update did not cause 'oc explain' to
show the new details (maybe it's cached somewhere?).

Comment 5 Zane Bitter 2020-11-03 14:29:39 UTC
(In reply to Angus Salkeld from comment #4)
> When I tested, running an update did not cause 'oc explain' to
> show the new details (maybe it's cached somewhere?).

There's a note about that here: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#field-pruning

"CRDs converted from apiextensions.k8s.io/v1beta1 to apiextensions.k8s.io/v1 might lack structural schemas, and spec.preserveUnknownFields might be true.

For migrated CustomResourceDefinitions where spec.preserveUnknownFields is set, pruning is not enabled and you can store arbitrary data. For best compatibility, you should update your custom resources to meet an OpenAPI schema, and you should set spec.preserveUnknownFields to false for the CustomResourceDefinition itself."

It sounds like because the default has flipped, simply updating the CRD with a v1 version generated by kubebuilder (which does not explicitly assert the new default) will leave the preserveUnknownFields at its previous setting. (I tend to think we shouldn't be too worried about this.)

Comment 6 Shelly Miron 2020-11-09 08:51:40 UTC
[kni@provisionhost-0-0 ~]$ oc version

Client Version: 4.7.0-0.nightly-2020-10-27-051128
Server Version: 4.7.0-0.nightly-2020-10-27-051128
Kubernetes Version: v1.19.0+e67f5dc

Before re-creating the CRD:
----------------------------------

[kni@provisionhost-0-0 ~]$ oc explain Provisioning

KIND:     Provisioning
VERSION:  metal3.io/v1alpha1

DESCRIPTION:
     <empty>

After re-creating the CRD:
----------------------------------

[kni@provisionhost-0-0 ~]$ oc explain Provisioning

KIND:     Provisioning
VERSION:  metal3.io/v1alpha1

DESCRIPTION:
     Provisioning contains configuration used by the Provisioning service
     (Ironic) to provision baremetal hosts. Provisioning is created by the
     OpenShift installer using admin or user provided information about the
     provisioning network and the NIC on the server that can be used to PXE boot
     it. This CR is a singleton, created by the installer and currently only
     consumed by the cluster-baremetal-operator to bring up and update
     containers in a metal3 cluster.

FIELDS:
   apiVersion	<string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources

   kind	<string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds

   metadata	<Object>
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec	<Object>
     ProvisioningSpec defines the desired state of Provisioning

   status	<Object>
     ProvisioningStatus defines the observed state of Provisioning


Is this considered fixed even though the CRD needs to be deleted and re-created before the description shows up?

Comment 8 Stephen Benjamin 2020-12-02 12:32:46 UTC
I think it failed QA because MAO still has the older CRD which is in the process of being removed.

CVO has a very interesting behavior where it's constantly applying the MAO CRD, and then shortly afterwards applying the one from CBO. There's a very short window where you'll get the older one, which can't display the description when using `oc explain`.

Comment 9 Stephen Benjamin 2020-12-02 12:33:13 UTC
*** Bug 1903275 has been marked as a duplicate of this bug. ***

Comment 10 Stephen Benjamin 2020-12-02 12:37:01 UTC
This is affecting OCP release payloads, as there's a test to check if each CRD has a description. When it catches it in that short window when the older CRD is present, the test is failing and blocking the release.

Comment 12 Zane Bitter 2020-12-04 01:42:09 UTC
(In reply to Stephen Benjamin from comment #10)
> This is affecting OCP release payloads, as there's a test to check if each
> CRD has a description. When it catches it in that short window when the
> older CRD is present, the test is failing and blocking the release.

I don't know that it's a short window; I think it's whichever is created first wins.

So now that the MAO is no longer installing the CRD, this should Just Work(TM) on fresh deployments. But it quite possibly still won't work after an upgrade from 4.6.

Comment 13 Stephen Benjamin 2020-12-04 01:47:35 UTC
> I don't know that it's a short window; I think it's whichever is created first wins.

If you had both MAO and CBO with the CRD in their manifests, CVO alternates between applying the two over and over again. It's visible in the logs for CVO. If you sit with `watch oc explain`, you'll periodically catch the MAO one not showing a description. It's short, so maybe 1 out of ~50 times.
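
A minimal sketch of how that sampling could be quantified, assuming the `oc explain` output format shown earlier in this report. The helper names `description_is_empty` and `empty_fraction` are hypothetical and are not part of the release test mentioned in comment 10.

```python
from typing import Iterable


def description_is_empty(explain_output: str) -> bool:
    """True if the DESCRIPTION section of `oc explain` output is `<empty>` or missing."""
    lines = explain_output.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "DESCRIPTION:":
            # The description body is the indented text up to the next blank line.
            body = []
            for following in lines[index + 1:]:
                if not following.strip():
                    break
                body.append(following.strip())
            return not body or body == ["<empty>"]
    return True  # no DESCRIPTION section at all


def empty_fraction(samples: Iterable[str]) -> float:
    """Fraction of sampled outputs whose description was empty."""
    results = [description_is_empty(sample) for sample in samples]
    return sum(results) / len(results) if results else 0.0
```

Each sample could be collected in a loop with `subprocess.run(["oc", "explain", "Provisioning"], capture_output=True, text=True).stdout`. If the window really is short, the fraction of empty samples should stay small.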

Comment 14 Shelly Miron 2020-12-06 11:58:38 UTC
Verified on version (in disconnected env & connected env):


[kni@provisionhost-0-0 ~]$ oc version
Client Version: 4.7.0-0.nightly-2020-12-04-013308
Server Version: 4.7.0-0.nightly-2020-12-04-013308
Kubernetes Version: v1.19.2+ad738ba


[kni@provisionhost-0-0 ~]$ oc explain Provisioning
KIND:     Provisioning
VERSION:  metal3.io/v1alpha1

DESCRIPTION:
     Provisioning contains configuration used by the Provisioning service
     (Ironic) to provision baremetal hosts. Provisioning is created by the
     OpenShift installer using admin or user provided information about the
     provisioning network and the NIC on the server that can be used to PXE boot
     it. This CR is a singleton, created by the installer and currently only
     consumed by the cluster-baremetal-operator to bring up and update
     containers in a metal3 cluster.

FIELDS:
   apiVersion	<string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources

   kind	<string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds

   metadata	<Object>
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec	<Object>
     ProvisioningSpec defines the desired state of Provisioning

   status	<Object>
     ProvisioningStatus defines the observed state of Provisioning

Comment 16 Angus Salkeld 2021-01-17 23:03:19 UTC
Doc Text: Previously `oc explain Provisioning` did not produce the details that were in the CRD. The version of the CRD has been updated and now `oc explain Provisioning` works as expected.

Comment 17 Angus Salkeld 2021-01-18 11:00:57 UTC
Doc type: bug

Comment 19 errata-xmlrpc 2021-02-24 15:18:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633