Bug 1956611
| Summary: | OLM CRD schema validation failing against CRs where the value of a string field is a blank string | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | James Harrington <jaharrin> |
| Component: | OLM | Assignee: | Kevin Rizza <krizza> |
| OLM sub component: | OLM | QA Contact: | kuiwang |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | urgent | CC: | ankithom, bmontgom, cblecker, dgoodwin, ecordell, gshereme, jdiaz, krizza, scuppett, travi |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 23:06:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
James Harrington
2021-05-04 04:29:23 UTC
Tests were conducted on a fresh 4.8.0 cluster to demonstrate that OLM rejects a CRD update when an existing CR uses a map[string]string in which one of the values is the empty string — valid data that Kubernetes itself accepts.
This can be tested easily with Hive without needing to provision clusters or do any major configuration.
First subscribe to an old version of Hive:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive-sub
  namespace: openshift-operators
spec:
  channel: alpha
  name: hive-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
  startingCSV: hive-operator.v1.0.19
Then approve the generated InstallPlan:
kubectl -n openshift-operators patch installplan install-dc4fv -p '{"spec":{"approved":true}}' --type merge
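The InstallPlan name is generated per cluster, so install-dc4fv above is specific to this run; a minimal lookup-then-approve sketch, with <generated-name> as a placeholder:
# List the generated InstallPlans, then approve the one targeting hive-operator.v1.0.19
oc -n openshift-operators get installplan
oc -n openshift-operators patch installplan <generated-name> -p '{"spec":{"approved":true}}' --type merge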
I now have my Hive CRDs installed (kg and k below are shell aliases for kubectl get and kubectl):
❯ kg crd | grep machinepools
machinepools.hive.openshift.io 2021-05-04T12:19:08Z
We don't need any real data here; we can just create a MachinePool that isn't linked to a real cluster, and nothing will happen in Hive. This MachinePool has a spec.labels entry with an empty string value.
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
  creationTimestamp: null
  name: f1-worker
  namespace: default
spec:
  labels:
    node-role.kubernetes.io: infra
    node-role.kubernetes.io/infra: ""
  clusterDeploymentRef:
    name: f1
  name: worker
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
  replicas: 3
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
❯ k apply -f myconfig/machinepool-empty-label.yaml
machinepool.hive.openshift.io/f1-worker created
Everything worked fine. Now let's try to upgrade to a newer Hive bundle by approving the next InstallPlan that was automatically created:
❯ kg installplan
NAME            CSV                     APPROVAL   APPROVED
install-d26wv   hive-operator.v1.1.0    Manual     false
install-dc4fv   hive-operator.v1.0.19   Manual     true
❯ kg installplan install-d26wv -o yaml
This shows:
message: 'error validating existing CRs against new CRD''s schema: machinepools.hive.openshift.io:
error validating custom resource against new schema &apiextensions.CustomResourceValidation{OpenAPIV3Schema:(*apiextensions.JSONSchemaProps)(0xc015cc9800)}:
[].spec.labels.node-role.kubernetes.io/infra: Invalid value: "null": spec.labels.node-role.kubernetes.io/infra
in body must be of type string: "null"'
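For the record, the failure message can also be pulled straight from the InstallPlan's conditions; a sketch, assuming (as observed here) that OLM reports the failure on the Installed condition:
# Extract just the validation failure message from the stuck InstallPlan
oc -n openshift-operators get installplan install-d26wv \
  -o jsonpath='{.status.conditions[?(@.type=="Installed")].message}'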
The CSV is now stuck:
❯ kg csv
NAME                    DISPLAY                      VERSION   REPLACES                PHASE
hive-operator.v1.0.19   Hive for Red Hat OpenShift   1.0.19    hive-operator.v1.0.18   Replacing
hive-operator.v1.1.0    Hive for Red Hat OpenShift   1.1.0     hive-operator.v1.0.19   Pending
Let's delete the MachinePool and see whether we can update the Hive OLM bundle:
❯ k delete machinepool -n default f1-worker
The InstallPlan did not retry even after 10 minutes or so, so I deleted my Subscription and CSVs to try again.
Reapply the subscription, approve the old version, let it install, then approve the latest version (this time with no "bad" CR in etcd, yet).
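Spelled out, that reset looks roughly like this (a sketch; CSV names are taken from above, and <subscription-manifest.yaml> is a placeholder for the Subscription shown earlier):
# Remove the Subscription and both CSVs, then start over from the Subscription manifest
oc -n openshift-operators delete subscription hive-sub
oc -n openshift-operators delete csv hive-operator.v1.0.19 hive-operator.v1.1.0
oc apply -f <subscription-manifest.yaml>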
We now have the latest Hive installed:
❯ kg csv
NAME                   DISPLAY                      VERSION   REPLACES                PHASE
hive-operator.v1.1.0   Hive for Red Hat OpenShift   1.1.0     hive-operator.v1.0.19   Succeeded
Now let's apply our "bad" CR and see if Kube is OK with it:
❯ k apply -f myconfig/machinepool-empty-label.yaml
machinepool.hive.openshift.io/f1-worker created
I believe this indicates a problem with the validation OLM performs: it rejects CRD updates over existing CRs that Kubernetes itself considers valid.
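For context, here is roughly the relevant slice of the MachinePool schema; an abridged sketch, not copied verbatim from the bundle. An empty string is a valid value for type: string under OpenAPI v3, which is why the apiserver accepts the CR:
# Abridged sketch of the CRD's openAPIV3Schema for spec.labels (a map[string]string)
properties:
  spec:
    properties:
      labels:
        additionalProperties:
          type: string
        type: object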
We could desperately use a workaround here, if one is possible.
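One candidate workaround, assuming it is acceptable to delete (or relabel) the offending CRs before approving the upgrade; the jq filter below is illustrative, not something from this report:
# Find MachinePools whose spec.labels contain an empty-string value,
# i.e. the CRs that trip OLM's pre-upgrade validation
oc get machinepools.hive.openshift.io -A -o json \
  | jq -r '.items[] | select((.spec.labels // {}) | to_entries | any(.value == "")) | "\(.metadata.namespace)/\(.metadata.name)"'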
The PR is against the upstream repository; we'd need it merged there and pulled downstream. LGTM
--
[root@preserve-olm-env 1956611]# oc get pod -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-6f7dcb85cb-s2ncl   1/1     Running   0          32m
olm-operator-74cc8c4bdc-xdftq       1/1     Running   0          32m
packageserver-d96c94dd5-4ptws       1/1     Running   0          30m
packageserver-d96c94dd5-vtxf6       1/1     Running   0          30m
[root@preserve-olm-env 1956611]# oc exec catalog-operator-6f7dcb85cb-s2ncl -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.17.0
git commit: 9498948b664cdc43ab11581b77bbf1d9e5264692
[root@preserve-olm-env 1956611]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-08-025039   True        False         16m     Cluster version is 4.8.0-0.nightly-2021-05-08-025039
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# cat sub.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive-sub
  namespace: openshift-operators
spec:
  channel: alpha
  name: hive-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
  startingCSV: hive-operator.v1.0.19
[root@preserve-olm-env 1956611]# oc apply -f sub.yaml
subscription.operators.coreos.com/hive-sub created
[root@preserve-olm-env 1956611]# oc get ip -n openshift-operators
NAME            CSV                     APPROVAL   APPROVED
install-dxmm8   hive-operator.v1.0.19   Manual     false
[root@preserve-olm-env 1956611]# oc -n openshift-operators patch installplan install-dxmm8 -p '{"spec":{"approved":true}}' --type merge
installplan.operators.coreos.com/install-dxmm8 patched
[root@preserve-olm-env 1956611]# oc get ip -n openshift-operators
NAME            CSV                     APPROVAL   APPROVED
install-dxmm8   hive-operator.v1.0.19   Manual     true
install-zj7cb   hive-operator.v1.1.0    Manual     false
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# oc get csv -n openshift-operators
NAME                    DISPLAY                      VERSION   REPLACES                PHASE
hive-operator.v1.0.19   Hive for Red Hat OpenShift   1.0.19    hive-operator.v1.0.18   Succeeded
[root@preserve-olm-env 1956611]# oc get crd | grep machinepools
machinepools.hive.openshift.io 2021-05-08T09:51:16Z
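As a sanity check (not part of the original steps), the served schema for spec.labels can be inspected directly; a sketch assuming an apiextensions.k8s.io/v1 CRD:
# Show the openAPIV3Schema fragment for spec.labels on the installed CRD
oc get crd machinepools.hive.openshift.io \
  -o jsonpath='{.spec.versions[*].schema.openAPIV3Schema.properties.spec.properties.labels}'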
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# cat cr.yaml
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
  creationTimestamp: null
  name: f1-worker
  namespace: default
spec:
  labels:
    node-role.kubernetes.io: infra
    node-role.kubernetes.io/infra: ""
  clusterDeploymentRef:
    name: f1
  name: worker
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
  replicas: 3
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
[root@preserve-olm-env 1956611]# oc apply -f cr.yaml
machinepool.hive.openshift.io/f1-worker created
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# oc get MachinePool
NAME        POOLNAME   CLUSTERDEPLOYMENT   REPLICAS
f1-worker   worker     f1                  3
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# oc get MachinePool f1-worker -o yaml
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"hive.openshift.io/v1","kind":"MachinePool","metadata":{"annotations":{},"creationTimestamp":null,"name":"f1-worker","namespace":"default"},"spec":{"clusterDeploymentRef":{"name":"f1"},"labels":{"node-role.kubernetes.io":"infra","node-role.kubernetes.io/infra":""},"name":"worker","platform":{"aws":{"rootVolume":{"iops":100,"size":22,"type":"gp2"},"type":"m4.xlarge"}},"replicas":3,"taints":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/infra"}]}}
  creationTimestamp: "2021-05-08T09:53:20Z"
  generation: 1
  name: f1-worker
  namespace: default
  resourceVersion: "38334"
  uid: f46ec58b-d92d-4cac-8ed2-dfd4cb4ad9b6
spec:
  clusterDeploymentRef:
    name: f1
  labels:
    node-role.kubernetes.io: infra
    node-role.kubernetes.io/infra: ""
  name: worker
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
  replicas: 3
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# oc -n openshift-operators patch installplan install-zj7cb -p '{"spec":{"approved":true}}' --type merge
installplan.operators.coreos.com/install-zj7cb patched
[root@preserve-olm-env 1956611]# oc get ip -n openshift-operators
NAME            CSV                     APPROVAL   APPROVED
install-dxmm8   hive-operator.v1.0.19   Manual     true
install-zj7cb   hive-operator.v1.1.0    Manual     true
[root@preserve-olm-env 1956611]# oc get ip install-zj7cb -n openshift-operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  creationTimestamp: "2021-05-08T09:51:17Z"
  generateName: install-
  generation: 2
  labels:
    operators.coreos.com/hive-operator.openshift-operators: ""
  name: install-zj7cb
  namespace: openshift-operators
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: Subscription
    name: hive-sub
    uid: f0a03a20-dbe3-4027-8baf-ba808cb3bcc2
  resourceVersion: "39128"
  uid: 3c7ee83e-8be8-4386-afdb-840f9f3d2288
spec:
  approval: Manual
  approved: true
  clusterServiceVersionNames:
  - hive-operator.v1.1.0
  generation: 2
status:
  bundleLookups:
  - catalogSourceRef:
      name: community-operators
      namespace: openshift-marketplace
    identifier: hive-operator.v1.1.0
  ...
  - resolving: hive-operator.v1.1.0
    resource:
      group: rbac.authorization.k8s.io
      kind: ClusterRoleBinding
manifest: '{"kind":"ConfigMap","name":"26e5fa87dc5e412cf0af5e2820356a92ecf83031f5b9b832abc110b5485cc62","namespace":"openshift-marketplace","catalogSourceName":"community-operators","catalogSourceNamespace":"openshift-marketplace","replaces":"hive-operator.v1.0.19","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"HiveConfig\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterClaim\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterState\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterImageSet\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"SelectorSyncSet\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"MachinePool\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterPool\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterDeprovision\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"SyncIdentityProvider\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"Checkpoint\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterProvision\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hiveinternal.openshift.io\",\"kind\":\"ClusterSyncLease\",\"version\":\"v1alpha1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"DNSZone\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterDeployment\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"SyncSet\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"ClusterRelocate\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"MachinePoolNameLease\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hiveinternal.openshift.io\",\"kind\":\"ClusterSync\",\"version\":\"v1alpha1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"hive.openshift.io\",\"kind\":\"SelectorSyncIdentityProvider\",\"version\":\"v1\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"hive-operator\",\"version\":\"1.1.0\"}}]}"}'
      name: hive-operator.v1.1.0-687dbf574d
      sourceName: community-operators
      sourceNamespace: openshift-marketplace
      version: v1
    status: Created
  startTime: "2021-05-08T09:55:03Z"
[root@preserve-olm-env 1956611]#
[root@preserve-olm-env 1956611]# oc get csv -n openshift-operators
NAME                   DISPLAY                      VERSION   REPLACES                PHASE
hive-operator.v1.1.0   Hive for Red Hat OpenShift   1.1.0     hive-operator.v1.0.19   Succeeded
[root@preserve-olm-env 1956611]#
--
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438