Bug 2033579 - SRO cannot update the special-resource-lifecycle ConfigMap if the data field is undefined
Summary: SRO cannot update the special-resource-lifecycle ConfigMap if the data field is undefined
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Special Resource Operator
Version: 4.10
Hardware: All
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Quentin Barrand
QA Contact: liqcui
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-17 10:26 UTC by Quentin Barrand
Modified: 2022-03-10 16:34 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:34:34 UTC
Target Upstream Version:
Embargoed:




Links
- GitHub openshift-psap/special-resource-operator pull 197 (open): "Bug 2033579: allow the ConfigMap's data field to be empty", last updated 2021-12-17 14:29:16 UTC
- GitHub openshift/special-resource-operator pull 86 (open): "Bug 2033579: Allow the ConfigMap's data field to be empty", last updated 2021-12-17 15:02:07 UTC
- Red Hat Product Errata RHSA-2022:0056, last updated 2022-03-10 16:34:51 UTC

Description Quentin Barrand 2021-12-17 10:26:47 UTC
Description of problem:

Because of a change in [1], when the special-resource-lifecycle ConfigMap's data field is undefined, SRO cannot update it and fails the reconciliation.

[1]: https://github.com/openshift/special-resource-operator/pull/84/files#diff-8b98b331c5d4acbeb7274c68973d20900daaed47c8d8f3e62ba39284379166bbL48-R41
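
The linked pull requests (openshift-psap#197 upstream, openshift#86 downstream) are titled "allow the ConfigMap's data field to be empty". A minimal Go sketch of that kind of nil-data guard, using the standard k8s.io/api types; the helper names here are hypothetical and this is not the actual PR diff:

package lifecycle

import corev1 "k8s.io/api/core/v1"

// ensureData is a hypothetical helper illustrating the guard described by the
// linked PRs; it is not the actual SRO code. A ConfigMap whose data field was
// never set deserializes with Data == nil, and writing into a nil map panics
// in Go, so any update path has to tolerate the nil case.
func ensureData(cm *corev1.ConfigMap) {
	if cm.Data == nil {
		cm.Data = map[string]string{}
	}
}

// setLifecycleEntry records a key/value pair, treating undefined data as empty.
func setLifecycleEntry(cm *corev1.ConfigMap, key, value string) {
	ensureData(cm)
	cm.Data[key] = value
}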


Steps to Reproduce:
0. Have SRO running
1. Install the infoscale recipe: https://github.com/openshift/special-resource-operator/tree/master/charts/infoscale

Actual results:

 2021-12-17T02:56:07.849Z        INFO    infoscale-vtas          RECONCILE REQUEUE: Could not reconcile chart    {"error": "cannot reconcile hardware states: failed to create state templates/3000-driver-container.yaml: after CRUD hooks failed: could not wait for resource: Waiting too long for resource: error or data not found: <nil> "}

Expected results:

The ConfigMap is updated and the reconciliation proceeds as expected.
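
The "error or data not found: <nil>" message suggests the wait logic treats an absent data field as an error rather than as an empty map. A hedged sketch of the tolerant handling, assuming unstructured access via k8s.io/apimachinery (hypothetical function, not the actual SRO wait code):

package lifecycle

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// lifecycleData reads the data field of an unstructured ConfigMap, treating an
// absent field as an empty map instead of an error. Hypothetical function: the
// real SRO wait code may be structured differently.
func lifecycleData(obj *unstructured.Unstructured) (map[string]string, error) {
	data, found, err := unstructured.NestedStringMap(obj.Object, "data")
	if err != nil {
		// The field exists but has an unexpected type; this is a real error.
		return nil, fmt.Errorf("reading data field: %w", err)
	}
	if !found {
		// Undefined data is equivalent to an empty ConfigMap, not a failure.
		return map[string]string{}, nil
	}
	return data, nil
}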

Comment 3 liqcui 2021-12-30 02:50:16 UTC
We don't have an infoscale environment or a Veritas image registry account, so we ran into an image pull issue and could not run the e2e test ourselves; we need to ask the partner to help with the e2e testing.

Detailed Testing Results:

[ocpadmin@ec2-18-217-45-133 infoscale-vtas-0.0.1]$ oc get pods -n infoscale-vtas
NAME                                                   READY   STATUS         RESTARTS   AGE
infoscale-vtas-licensing-controller-69787566f7-74jjs   0/1     ErrImagePull   0          2m29s
[ocpadmin@ec2-18-217-45-133 infoscale-vtas-0.0.1]$ oc describe pod infoscale-vtas-licensing-controller-69787566f7-74jjs -n infoscale-vtas
Name:         infoscale-vtas-licensing-controller-69787566f7-74jjs
Namespace:    infoscale-vtas
Priority:     0
Node:         ip-10-0-139-35.us-east-2.compute.internal/10.0.139.35
Start Time:   Thu, 30 Dec 2021 02:12:24 +0000
Labels:       app=infoscale-vtas-licensing-controller
              pod-template-hash=69787566f7
              specialresource.openshift.io/owned=true
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.12"
                    ],
Volumes:
  kube-api-access-xfd49:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              IS-cluster1=true
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  2m37s  default-scheduler  0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  92s    default-scheduler  0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Normal   Scheduled         15s    default-scheduler  Successfully assigned infoscale-vtas/infoscale-vtas-licensing-controller-69787566f7-74jjs to ip-10-0-139-35.us-east-2.compute.internal
  Normal   AddedInterface    13s    multus             Add eth0 [10.128.2.12/23] from openshift-sdn
  Normal   Pulling           13s    kubelet            Pulling image "veritas/infoscale-license:8.0.0.0000-rhel8"
  Warning  Failed            12s    kubelet            Failed to pull image "veritas/infoscale-license:8.0.0.0000-rhel8": rpc error: code = Unknown desc = reading manifest 8.0.0.0000-rhel8 in docker.io/veritas/infoscale-license: errors:
denied: requested access to the resource is denied
unauthorized: authentication required
  Warning  Failed   12s  kubelet  Error: ErrImagePull
  Normal   BackOff  12s  kubelet  Back-off pulling image "veritas/infoscale-license:8.0.0.0000-rhel8"
  Warning  Failed   12s  kubelet  Error: ImagePullBackOff
[ocpadmin@ec2-18-217-45-133 infoscale-vtas-0.0.1]$ oc get configmap -n infoscale-vtas
NAME                                   DATA   AGE
kube-root-ca.crt                       1      3m19s
openshift-service-ca.crt               1      3m19s
sh.helm.hooks.pre-install              0      3m13s
sh.helm.release.v1.infoscale-vtas.v1   1      3m13s
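
Note that the sh.helm.hooks.pre-install ConfigMap above shows 0 in the DATA column, which is exactly the empty-data case this bug covers. A hedged client-go sketch that reproduces that condition in isolation (the name and namespace are illustrative, not part of the test run above):

package repro

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createEmptyConfigMap creates a ConfigMap whose data field is left unset, the
// same shape as sh.helm.hooks.pre-install above.
func createEmptyConfigMap(ctx context.Context, client kubernetes.Interface) error {
	cm := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "empty-data-example",
			Namespace: "infoscale-vtas",
		},
		// Data is intentionally omitted; it round-trips through the API
		// server as an absent field and deserializes as a nil map.
	}
	_, err := client.CoreV1().ConfigMaps(cm.Namespace).Create(ctx, cm, metav1.CreateOptions{})
	return err
}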

Comment 4 Quentin Barrand 2022-01-07 13:11:53 UTC
Veritas confirmed that the fix solves the problem: https://coreos.slack.com/archives/C02358PSC03/p1641554833000100?thread_ts=1639735346.216700&cid=C02358PSC03

Comment 7 errata-xmlrpc 2022-03-10 16:34:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

