Bug 1986311

Summary: SRO crashes when an incorrect chart is applied
Product: OpenShift Container Platform
Reporter: Sebastian Scheinkman <sscheink>
Component: Special Resource Operator
Assignee: dagray
Status: CLOSED ERRATA
QA Contact: liqcui
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.9
CC: aos-bugs
Target Milestone: ---
Target Release: 4.9.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-10-18 17:41:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Sebastian Scheinkman 2021-07-27 09:10:16 UTC
Description of problem:
The SRO is a generic operator that allows multiple users to create recipes to deploy applications.

Currently there is a bug in the SRO: if one of the recipes is incorrect (for example, one of its resources cannot be applied), the whole operator crashes.


How reproducible:
100%

Steps to Reproduce:
1. Create a chart with a YAML error
2. Apply a recipe that uses that chart

Actual results:
The operator crashes and stops reconciling the other recipes, even though those recipes are fine.

Expected results:
The SRO should update the status section of the recipe that has the issue, to let the user know about it, and continue to reconcile the other CRs.
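
To illustrate the expected behaviour, here is a minimal sketch in Python (not the operator's actual Go code). The Recipe class, reconcile_all, fake_apply and the "Ready" state string are all hypothetical names chosen for this example. The only point it makes is that a failure is caught per recipe, written to that recipe's status, and the loop carries on with the remaining recipes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Recipe:
    """Hypothetical stand-in for a SpecialResource CR."""
    name: str
    chart: str
    status: Dict[str, str] = field(default_factory=dict)


def reconcile_all(recipes: List[Recipe],
                  apply_chart: Callable[[Recipe], None]) -> List[str]:
    """Reconcile every recipe; a failure in one must not stop the others.

    Returns the names of the recipes that failed and should be requeued.
    """
    requeue: List[str] = []
    for recipe in recipes:
        try:
            apply_chart(recipe)
            recipe.status["state"] = "Ready"  # hypothetical success state
        except Exception as err:
            # Isolate the failure: report it on this recipe only and move on.
            recipe.status["state"] = f"Cannot reconcile: {err}"
            requeue.append(recipe.name)
    return requeue


def fake_apply(recipe: Recipe) -> None:
    """Hypothetical apply step that rejects a chart marked as broken."""
    if "broken" in recipe.chart:
        raise ValueError('missing required field "uri"')
```

With one broken and one valid recipe, reconcile_all returns only the broken recipe's name, and the valid recipe still ends up reconciled.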

Comment 1 liqcui 2021-08-10 13:38:01 UTC
Verify Steps:
1. Deploy SRO from the master branch of https://github.com/openshift/special-resource-operator/ with TAG=master make deploy
2. Edit the file charts/example/simple-kmod-0.0.1/simple-kmod.yaml in that repo to introduce a mistake in the YAML, for instance by replacing the driverContainer field with a nonexistent field:
sed -i 's/driverContainer/driverErrorInFile/g' charts/example/simple-kmod-0.0.1/simple-kmod.yaml
3. Create the simple-kmod CR using this "bad" yaml.
VERSION=0.0.1 REPO=example SPECIALRESOURCE=simple-kmod make
4. Ensure that the operator pod in -n openshift-special-resource-operator does not report an error (it stays Running).
5. Ensure that the issue is reported in the CR Status field from:
 oc describe sr/simple-kmod 
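
The check in step 4 could be scripted roughly as follows. This is only a sketch: find_pod and operator_is_healthy are hypothetical helpers, and the parsing assumes the plain-text column layout printed by oc get pods -A (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```python
from typing import Dict, Optional


def find_pod(get_pods_output: str, name_prefix: str) -> Optional[Dict[str, str]]:
    """Return the first pod row whose NAME column starts with name_prefix.

    Assumes whitespace-separated columns in the order
    NAMESPACE NAME READY STATUS RESTARTS AGE.
    """
    for line in get_pods_output.splitlines():
        cols = line.split()
        if len(cols) >= 6 and cols[1].startswith(name_prefix):
            return {
                "namespace": cols[0],
                "name": cols[1],
                "ready": cols[2],
                "status": cols[3],
                "restarts": cols[4],
            }
    return None


def operator_is_healthy(pod: Optional[Dict[str, str]]) -> bool:
    """True if the pod is Running and all of its containers are ready."""
    if pod is None:
        return False
    ready, total = pod["ready"].split("/")
    return pod["status"] == "Running" and ready == total
```

The text passed to find_pod would be the captured output of oc get pods -A; a restart count of 0 in the returned row indicates the operator did not crash after the bad chart was applied.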

Verify Result:
$ oc logs -f special-resource-controller-manager-59695c877c-8fgxp -n openshift-special-resource-operator

2021-08-10T03:10:47.197Z        INFO    simple-kmod     KernelAffine: ClusterUpgradeInfo        {"kernel": "4.18.0-305.10.2.el8_4.x86_64", "os": "8.4", "cluster": "4.9", "driverToolkitImage": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b9bc5e270c0d41ed695d8fb5e8bf17802ffa552c7c6b932c1e73e5fe6559c5d3"}
2021-08-10T03:10:48.092Z        INFO    warning         OnError: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(BuildConfig.spec.source.git): missing required field "uri" in com.github.openshift.api.build.v1.GitBuildSource  
2021-08-10T03:10:48.113Z        INFO    simple-kmod     RECONCILE REQUEUE: Could not reconcile chart    {"error": "Cannot reconcile hardware states: Failed to create state: templates/0000-buildconfig.yaml: unable to build kubernetes objects from release manifest: error validating \"\": error validating data: ValidationError(BuildConfig.spec.source.git): missing required field \"uri\" in com.github.openshift.api.build.v1.GitBuildSource"}

$ oc get pods -A |grep special
openshift-special-resource-operator                special-resource-controller-manager-59695c877c-8fgxp                  2/2     Running     0          15m

$ oc create -f charts/example/simple-kmod-0.0.1/simple-kmod.yaml
specialresource.sro.openshift.io/simple-kmod created
$ oc describe sr/simple-kmod
Name:         simple-kmod
.................
  Namespace:      simple-kmod
  Set:
    API Version:  sro.openshift.io/v1beta1
    Build Args:
      Name:   KMODVER
      Value:  SRO
    Kind:     Values
    Kmod Names:
      simple-kmod
      simple-procfs-kmod
Status:
  State:  Cannot reconcile hardware states: Failed to create state: templates/0000-buildconfig.yaml: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(BuildConfig.spec.source.git): missing required field "uri" in com.github.openshift.api.build.v1.GitBuildSource
Events:   <none>
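
The State message above is a chain of wrapped errors, so the useful parts (which template failed, and which field is missing) are easy to overlook. A rough sketch of pulling them out is shown here; summarize_state is a hypothetical helper, and the regular expressions assume the message keeps the shape seen in this report.

```python
import re
from typing import Dict, Optional


def summarize_state(state: str) -> Dict[str, Optional[str]]:
    """Extract the failing template, schema path and missing field, if present."""
    # Chart template that failed, e.g. templates/0000-buildconfig.yaml
    template = re.search(r"(templates/\S+?\.yaml)", state)
    # Object path inside ValidationError(...)
    path = re.search(r"ValidationError\(([^)]+)\)", state)
    # Name of the required field that was not set
    missing = re.search(r'missing required field "([^"]+)"', state)
    return {
        "template": template.group(1) if template else None,
        "path": path.group(1) if path else None,
        "missing_field": missing.group(1) if missing else None,
    }
```

Applied to the State string from this comment, it would point at templates/0000-buildconfig.yaml, the path BuildConfig.spec.source.git and the missing field uri.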

Comment 4 errata-xmlrpc 2021-10-18 17:41:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759