Bug 2089344 - Failed to deploy simple-kmod
Summary: Failed to deploy simple-kmod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Special Resource Operator
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: yevgeny shnaidman
QA Contact: Constantin Vultur
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-23 13:25 UTC by Constantin Vultur
Modified: 2022-08-10 11:13 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:13:27 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                            Private  Priority  Status  Summary                                           Last Updated
Github                  openshift special-resource-operator pull 217  0        None      open    Bug 2089344: fix the Reconcilition function logs  2022-05-25 09:06:20 UTC
Red Hat Product Errata  RHSA-2022:5069                                0        None      None    None                                              2022-08-10 11:13:46 UTC

Description Constantin Vultur 2022-05-23 13:25:56 UTC
Description of problem:
simple-kmod failed to deploy with message:

"error":"failed to reconcile SpecialResource '/simple-kmod': %!w(<nil>)"

oc describe specialresource/simple-kmod shows:

    Message:               failed to deploy SpecialResource's chart: failed to reconcile SpecialResource /simple-kmod: cannot reconcile hardware states: failed to create state templates/1000-driver-container.yaml: failed to create resources from yaml for chart simple-kmod: failed to create object from YAML: failed to execute after crud for object simple-kmod/simple-kmod-driver-container-e383247e62b56585: failed to wait for resource, object simple-kmod/simple-kmod-driver-container-e383247e62b56585: waiting too long for resource DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585: lifecycle availability of the DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585 is not verified yet: old pod simple-kmod/simple-kmod-driver-container-e383247e62b56585-gvkrj is still running 
    Reason:                FailedToDeployChart


Version-Release number of selected component (if applicable):
release-4.11

How reproducible:


Steps to Reproduce:
1. deploy operator
2. deploy simple-kmod
3.

Actual results:
The operator logs show:
{"level":"info","ts":1653310541.6059291,"logger":"controller.specialresource","msg":"WARNING: RECONCILE REQUEUE: Could not reconcile chart for SpecialResource","reconciler group":"sro.openshift.io","reconciler kind":"SpecialResource","name":"simple-kmod","namespace":"","error":"failed to reconcile SpecialResource /simple-kmod: cannot reconcile hardware states: failed to create state templates/1000-driver-container.yaml: failed to create resources from yaml for chart simple-kmod: failed to create object from YAML: failed to execute after crud for object simple-kmod/simple-kmod-driver-container-e383247e62b56585: failed to wait for resource, object simple-kmod/simple-kmod-driver-container-e383247e62b56585: waiting too long for resource DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585: lifecycle availability of the DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585 is not verified yet: old pod simple-kmod/simple-kmod-driver-container-e383247e62b56585-gvkrj is still running "}
{"level":"error","ts":1653310541.6060023,"logger":"controller.specialresource","msg":"Reconciler error","reconciler group":"sro.openshift.io","reconciler kind":"SpecialResource","name":"simple-kmod","namespace":"","error":"failed to reconcile SpecialResource '/simple-kmod': %!w(<nil>)","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
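
The two entries above are structured JSON: the first carries the real cause, while the second carries only the nil-wrapped message. For triage, a small filter can flag entries whose `error` field contains the `%!w(<nil>)` marker. This is a minimal sketch; the `logEntry` type and `hasNilWrap` function are hypothetical names, and only the fields visible in the log lines above are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry mirrors a subset of the fields seen in the operator log lines.
type logEntry struct {
	Level string `json:"level"`
	Msg   string `json:"msg"`
	Name  string `json:"name"`
	Error string `json:"error"`
}

// hasNilWrap reports whether a JSON log line carries an error string
// that contains fmt's marker for a nil %w operand.
func hasNilWrap(line string) (bool, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return false, fmt.Errorf("decode log line: %w", err)
	}
	return strings.Contains(e.Error, "%!w(<nil>)"), nil
}

func main() {
	// Shortened copy of the "Reconciler error" entry from the report.
	line := `{"level":"error","msg":"Reconciler error","name":"simple-kmod",` +
		`"error":"failed to reconcile SpecialResource '/simple-kmod': %!w(<nil>)"}`

	flagged, err := hasNilWrap(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nil-wrapped error:", flagged)
}
```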


oc describe specialresource/simple-kmod gives:

    Message:               failed to deploy SpecialResource's chart: failed to reconcile SpecialResource /simple-kmod: cannot reconcile hardware states: failed to create state templates/1000-driver-container.yaml: failed to create resources from yaml for chart simple-kmod: failed to create object from YAML: failed to execute after crud for object simple-kmod/simple-kmod-driver-container-e383247e62b56585: failed to wait for resource, object simple-kmod/simple-kmod-driver-container-e383247e62b56585: waiting too long for resource DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585: lifecycle availability of the DaemonSet simple-kmod/simple-kmod-driver-container-e383247e62b56585 is not verified yet: old pod simple-kmod/simple-kmod-driver-container-e383247e62b56585-gvkrj is still running 
    Reason:                FailedToDeployChart


Status of the deployment:

# oc get all -n simple-kmod
NAME                                                      READY   STATUS      RESTARTS   AGE
pod/simple-kmod-driver-build-e383247e62b56585-1-build     0/1     Completed   0          14m
pod/simple-kmod-driver-container-e383247e62b56585-gvkrj   1/1     Running     0          14m
pod/simple-kmod-driver-container-e383247e62b56585-sfsc4   1/1     Running     0          14m

NAME                                                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                                                                 AGE
daemonset.apps/simple-kmod-driver-container-e383247e62b56585   2         2         2       2            2           feature.node.kubernetes.io/kernel-version.full=4.18.0-305.40.2.el8_4.x86_64,node-role.kubernetes.io/worker=   14m

NAME                                                                       TYPE     FROM         LATEST
buildconfig.build.openshift.io/simple-kmod-driver-build-e383247e62b56585   Docker   Git@master   1

NAME                                                                   TYPE     FROM          STATUS     STARTED          DURATION
build.build.openshift.io/simple-kmod-driver-build-e383247e62b56585-1   Docker   Git@4cdff09   Complete   14 minutes ago   2m55s

NAME                                                          IMAGE REPOSITORY                                                                            TAGS                            UPDATED
imagestream.image.openshift.io/simple-kmod-driver-container   image-registry.openshift-image-registry.svc:5000/simple-kmod/simple-kmod-driver-container   v4.18.0-305.40.2.el8_4.x86_64   11 minutes ago
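
In the output above, the DaemonSet row reports 2 in every counter column (DESIRED, CURRENT, READY, UP-TO-DATE, AVAILABLE), even though the operator still treats availability as unverified. A minimal sketch of a counter-based rollout check follows, for comparison. The `daemonSetCounts` type and `rolledOut` function are hypothetical, and this is not the operator's lifecycle check, which, according to the error message, also considers whether an old pod is still running.

```go
package main

import "fmt"

// daemonSetCounts holds the counter columns printed by `oc get daemonset`.
type daemonSetCounts struct {
	Desired, Current, Ready, UpToDate, Available int
}

// rolledOut reports whether every counter matches a non-zero desired count.
func rolledOut(c daemonSetCounts) bool {
	return c.Desired > 0 &&
		c.Current == c.Desired &&
		c.Ready == c.Desired &&
		c.UpToDate == c.Desired &&
		c.Available == c.Desired
}

func main() {
	// Values copied from the DaemonSet row in the report.
	reported := daemonSetCounts{Desired: 2, Current: 2, Ready: 2, UpToDate: 2, Available: 2}
	fmt.Println("rolled out by counters:", rolledOut(reported))
}
```

By these counters alone the DaemonSet is fully rolled out, which is consistent with the pods shown as Running.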

Expected results:
no error for deployment

Additional info:

Comment 3 Constantin Vultur 2022-06-10 11:44:44 UTC
With a 4.11 bundle build and a simple-kmod deployment, this problem is no longer seen.

Marking as Verified

Comment 5 errata-xmlrpc 2022-08-10 11:13:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

