Bug 1965030
| Summary: | Controller inventory container has memory leaks and restarts if OpenShift Virtualization is not installed on the cluster | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Migration Toolkit for Virtualization | Reporter: | Franco Bladilo <fbladilo> | ||||
| Component: | Operator | Assignee: | Franco Bladilo <fbladilo> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Tzahi Ashkenazi <tashkena> | ||||
| Severity: | low | Docs Contact: | Avital Pinnick <apinnick> | ||||
| Priority: | low | ||||||
| Version: | 2.0.0 | CC: | apinnick, fdupont, istein | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 2.2.0 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2021-12-09 19:20:45 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Created attachment 1787321
Prometheus chart showing the leak over 1 week

Attached Prometheus memory stats for the controller pod.
@fbladilo, can you change the Ansible role to create the "host" provider only if CNV is installed in the cluster?

Please verify with build 2.2.0-1 / iib:104622, on OCP 4.9. To test, install the MTV operator on an OpenShift cluster where CNV is *NOT* installed. Once MTV is installed, the OpenShift Virtualization provider named "host" should not be present.

Tested for 20 hours with MTV-2.2.0-87 on f06-h36: no memory leaks. Memory consumption of the forklift-controller pod during those 20 hours was 178 MB and stable, and no restarts occurred on the forklift-controller pod:

    [kni@f06-h36-000-r640 root]$ oc get pods -nopenshift-mtv
    NAME                                        READY   STATUS    RESTARTS   AGE
    forklift-controller-67fd84598-kwkx9         2/2     Running   0          20h
    forklift-must-gather-api-5979b5b97c-gctn2   1/1     Running   0          20h
    forklift-operator-7867b4cd45-s7zpv          1/1     Running   0          20h
    forklift-ui-b86f47d86-mfdff                 1/1     Running   0          20h
    forklift-validation-6c5b5697fb-82xpj        1/1     Running   0          20h

No host provider is present on the OCP cluster:

    [kni@f06-h36-000-r640 root]$ oc get providers -A
    No resources found

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (MTV 2.2.0 Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:5066
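The restart check in the verification comment above can also be scripted. The following is a minimal sketch, assuming the default column layout of `oc get pods` (NAME, READY, STATUS, RESTARTS, AGE); the helper name `restart_counts` is hypothetical and is not part of MTV or `oc`. The sample text is the pod listing quoted in the comment.

```python
def restart_counts(oc_output: str) -> dict:
    """Map pod name to restart count, parsed from plain `oc get pods` output.

    Assumes whitespace-separated columns with a header row that contains
    NAME and RESTARTS, as in the default table output.
    """
    lines = [ln for ln in oc_output.strip().splitlines() if ln.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    name_col = header.index("NAME")
    restarts_col = header.index("RESTARTS")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        counts[fields[name_col]] = int(fields[restarts_col])
    return counts


# Pod listing copied from the verification comment.
SAMPLE_PODS = """
NAME                                        READY   STATUS    RESTARTS   AGE
forklift-controller-67fd84598-kwkx9         2/2     Running   0          20h
forklift-must-gather-api-5979b5b97c-gctn2   1/1     Running   0          20h
forklift-operator-7867b4cd45-s7zpv          1/1     Running   0          20h
forklift-ui-b86f47d86-mfdff                 1/1     Running   0          20h
forklift-validation-6c5b5697fb-82xpj        1/1     Running   0          20h
"""

if __name__ == "__main__":
    restarted = {pod: n for pod, n in restart_counts(SAMPLE_PODS).items() if n > 0}
    print("pods with restarts:", restarted)
```

On a live cluster, the same function could be fed the captured output of `oc get pods -nopenshift-mtv`; a non-empty result would indicate that some pod has restarted.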
Description of problem:
The inventory container seems to be leaking slowly when MTV 2.0.0 is deployed on a cluster without CNV installed. The inventory container logs for the controller continuously show an error about the missing VirtualMachine kind. MTV is configured only with the OCP host provider (default). Error below:

{"level":"info","ts":1622041766.0487552,"logger":"provider|8jmmm","msg":"Reconcile ended.","provider":"konveyor-forklift/host","reQ":3}
{"level":"error","ts":1622041767.0330288,"logger":"controller-runtime.source","msg":"if kind is a CRD, it should be installed before calling Start","kind":"VirtualMachine.kubevirt.io","error":"no matches for kind \"VirtualMachine\" in version \"kubevirt.io/v1alpha3\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/opt/app-root/pkg/mod/github.com/go-logr/zapr.0/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start\n\t/opt/app-root/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/source/source.go:117\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/opt/app-root/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/opt/app-root/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:184\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/opt/app-root/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/manager/internal.go:676"}
{"level":"info","ts":1622041769.0490327,"logger":"provider|f6hn4","msg":"Reconcile started.","provider":"konveyor-forklift/host"}
{"level":"info","ts":1622041771.7529714,"logger":"provider","msg":"Connection test succeeded."}

Version-Release number of selected component (if applicable):
2.0.0, OCP 4.7

How reproducible:
Always

Steps to Reproduce:
1. Deploy MTV or Forklift upstream on OCP without CNV
2. Create forkliftcontroller CR and wait for deployment to finish
3. Keep deployment running and watch the restart count on controller pod
4. Examine controller pod to see the OOM terminations for inventory

Actual results:

Expected results:
It should not leak memory or cause restarts

Additional info:
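To gauge how often the inventory container logs the missing-kind error, the JSON log lines can be tallied by kind. The following is a minimal sketch, assuming one JSON object per log line as in the excerpt above; `count_missing_kind_errors` is a hypothetical helper name, and the sample lines are taken from the excerpt with the stacktrace field left out for brevity.

```python
import json
from collections import Counter


def count_missing_kind_errors(log_text):
    """Count error-level log entries that carry a "kind" field, grouped by kind.

    Lines that are blank or are not valid JSON objects are skipped.
    """
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("level") == "error" and "kind" in entry:
            counts[entry["kind"]] += 1
    return counts


# Shortened sample based on the log excerpt in the description (stacktrace omitted).
SAMPLE_LOG = r'''
{"level":"info","ts":1622041766.0487552,"logger":"provider|8jmmm","msg":"Reconcile ended.","provider":"konveyor-forklift/host","reQ":3}
{"level":"error","ts":1622041767.0330288,"logger":"controller-runtime.source","msg":"if kind is a CRD, it should be installed before calling Start","kind":"VirtualMachine.kubevirt.io","error":"no matches for kind \"VirtualMachine\" in version \"kubevirt.io/v1alpha3\""}
{"level":"info","ts":1622041769.0490327,"logger":"provider|f6hn4","msg":"Reconcile started.","provider":"konveyor-forklift/host"}
{"level":"info","ts":1622041771.7529714,"logger":"provider","msg":"Connection test succeeded."}
'''

if __name__ == "__main__":
    for kind, n in count_missing_kind_errors(SAMPLE_LOG).items():
        print(kind, n)
```

Run against a longer capture of the inventory container log, a steadily growing count for `VirtualMachine.kubevirt.io` would match the continuous error described in this report.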