Bug 2063789 - [ MTV ] forklift-controller restarted 10 times during idle and warm migrations; exited on "Observed a panic : runtime error: index out of range [0] with length 0"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Migration Toolkit for Virtualization
Classification: Red Hat
Component: Controller
Version: 2.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 2.3.0
Assignee: Jeff Ortel
QA Contact: Tzahi Ashkenazi
Docs Contact: Richard Hoch
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-14 11:45 UTC by Tzahi Ashkenazi
Modified: 2022-04-04 18:04 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-04 18:04:02 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | konveyor forklift-controller pull 422 | 0 | None | open | Bug 2063789: Check len of pod.Status.ContainerStatuses rather than checking for nil | 2022-03-14 14:21:41 UTC
Red Hat Product Errata | RHEA-2022:1183 | 0 | None | None | None | 2022-04-04 18:04:08 UTC

Description Tzahi Ashkenazi 2022-03-14 11:45:58 UTC
Description of problem:

The forklift-controller pod restarted 10 times while idle and during warm migrations.

[root@f01-h14-000-r640 ~]# oc get pods
NAME                                        READY   STATUS    RESTARTS       AGE
forklift-controller-854bbdd985-cfnkj        2/2     Running   10 (18h ago)   3d23h
forklift-must-gather-api-7fd458f97f-zthsh   1/1     Running   0              6d21h
forklift-operator-7648895549-5zhxc          1/1     Running   0              6d21h
forklift-ui-6668df84db-xpbm9                1/1     Running   0              6d21h
forklift-validation-5d48f67559-tx8kp        1/1     Running   0              6d21h


From the controller pod log:

E0313 16:56:27.034720       1 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 752 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x2b1ea40, 0xc00100f260)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:48 +0x86
panic(0x2b1ea40, 0xc00100f260)
	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).updateCopyProgress(0xc0039357b8, 0xc0046d57a0, 0xc004f10990, 0xc004f10990, 0x1)
	/remote-source/app/pkg/controller/plan/migration.go:1248 +0x1245
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).execute(0xc0039357b8, 0xc0046d57a0, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/migration.go:660 +0x2bfd
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).Run(0xc0039357b8, 0xb2d05e00, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/migration.go:171 +0x138
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Reconciler).execute(0xc004a8f740, 0xc0059ef800, 0x0, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/controller.go:405 +0x854
github.com/konveyor/forklift-controller/pkg/controller/plan.Reconciler.Reconcile(0x30cd890, 0xc000d9a740, 0x30eea18, 0xc000420540, 0xc0007bc810, 0xc0004c5270, 0xd, 0xc000e49890, 0x28, 0x0, ...)
	/remote-source/app/pkg/controller/plan/controller.go:263 +0x6e5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0010aa120, 0x2965e00, 0xc005febe80, 0x0)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:244 +0x2a9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0010aa120, 0x203000)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:218 +0xb0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(...)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:197
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000f74e00)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000f74e00, 0x308c720, 0xc0005285d0, 0x1, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000f74e00, 0x3b9aca00, 0x0, 0x1, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc000f74e00, 0x3b9aca00, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:179 +0x3d6
panic: runtime error: index out of range [0] with length 0 [recovered]
	panic: runtime error: index out of range [0] with length 0

goroutine 752 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:55 +0x109
panic(0x2b1ea40, 0xc00100f260)
	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).updateCopyProgress(0xc0039357b8, 0xc0046d57a0, 0xc004f10990, 0xc004f10990, 0x1)
	/remote-source/app/pkg/controller/plan/migration.go:1248 +0x1245
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).execute(0xc0039357b8, 0xc0046d57a0, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/migration.go:660 +0x2bfd
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Migration).Run(0xc0039357b8, 0xb2d05e00, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/migration.go:171 +0x138
github.com/konveyor/forklift-controller/pkg/controller/plan.(*Reconciler).execute(0xc004a8f740, 0xc0059ef800, 0x0, 0x0, 0x0)
	/remote-source/app/pkg/controller/plan/controller.go:405 +0x854
github.com/konveyor/forklift-controller/pkg/controller/plan.Reconciler.Reconcile(0x30cd890, 0xc000d9a740, 0x30eea18, 0xc000420540, 0xc0007bc810, 0xc0004c5270, 0xd, 0xc000e49890, 0x28, 0x0, ...)
	/remote-source/app/pkg/controller/plan/controller.go:263 +0x6e5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0010aa120, 0x2965e00, 0xc005febe80, 0x0)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:244 +0x2a9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0010aa120, 0x203000)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:218 +0xb0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(...)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:197
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000f74e00)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000f74e00, 0x308c720, 0xc0005285d0, 0x1, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000f74e00, 0x3b9aca00, 0x0, 0x1, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc000f74e00, 0x3b9aca00, 0xc00077e000)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.4/pkg/internal/controller/controller.go:179 +0x3d6
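
The trace above is printed twice: once under "Observed a panic" (from logPanic) and once under "panic: ... [recovered]", where HandleCrash appears at the top of the stack. That is consistent with a crash handler that logs the recovered panic and then panics again, which ends the process and leads to a pod restart. The sketch below shows that log-then-re-panic pattern with the standard library only. The names handleCrash, worker, and run are hypothetical stand-ins for illustration, not the apimachinery or forklift-controller code.

```go
package main

import "fmt"

// handleCrash is a hypothetical stand-in for a crash handler that logs a
// recovered panic and, when reallyCrash is true, panics again so the
// process still terminates. It must be deferred directly for recover to work.
func handleCrash(reallyCrash bool, logf func(string)) {
	if r := recover(); r != nil {
		logf(fmt.Sprintf("Observed a panic: %v", r))
		if reallyCrash {
			panic(r) // re-panic: this is the second trace in the log
		}
	}
}

// worker runs one unit of work under the crash handler.
func worker(reallyCrash bool, logf func(string), work func()) {
	defer handleCrash(reallyCrash, logf)
	work()
}

// run triggers an index-out-of-range panic inside worker. It returns the
// logged lines and whether the panic escaped the crash handler.
func run(reallyCrash bool) (logs []string, escaped bool) {
	defer func() {
		if r := recover(); r != nil {
			escaped = true
		}
	}()
	worker(reallyCrash, func(s string) { logs = append(logs, s) }, func() {
		var statuses []int
		_ = statuses[0] // index out of range [0] with length 0
	})
	return logs, false
}

func main() {
	logs, escaped := run(true)
	fmt.Println(logs, "escaped:", escaped)
}
```

In this sketch, run(true) logs the panic once and still lets it escape, while run(false) logs it and swallows it. In a real controller the escaped panic is not caught by anything, so the container exits and is restarted.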



Version-Release number of selected component (if applicable):
MTV 2.3.0-37
CNV 4.10.0-648
OCP 4.10
RHEV 4.4.10-7-001
cloud10


Additional info:

The log can be found here:
https://drive.google.com/drive/folders/1nIzzyqn1MhhTQYZf8zUdmxRHMLobPzcw?usp=sharing

Comment 1 Ilanit Stein 2022-03-15 06:51:59 UTC
Fixed in MTV-2.3.0-38. Use OpenShift 4.10 iib:193826.
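
The linked pull request is titled "Check len of pod.Status.ContainerStatuses rather than checking for nil". The sketch below illustrates why that distinction matters in Go: an empty but non-nil slice passes a nil check, so indexing element 0 still panics with "index out of range [0] with length 0". The types podStatus and containerStatus and the functions unsafeFirst, safeFirst, and panics are hypothetical stand-ins for illustration; this is not the code from migration.go or from the pull request.

```go
package main

import "fmt"

// containerStatus and podStatus are hypothetical stand-ins for the
// Kubernetes types; only the fields needed for the illustration are kept.
type containerStatus struct {
	Name         string
	RestartCount int32
}

type podStatus struct {
	ContainerStatuses []containerStatus
}

// unsafeFirst guards with a nil check only. An empty but non-nil slice
// passes the guard, and indexing [0] then panics.
func unsafeFirst(s podStatus) (containerStatus, bool) {
	if s.ContainerStatuses != nil {
		return s.ContainerStatuses[0], true
	}
	return containerStatus{}, false
}

// safeFirst guards on length, which covers both nil and empty slices.
func safeFirst(s podStatus) (containerStatus, bool) {
	if len(s.ContainerStatuses) > 0 {
		return s.ContainerStatuses[0], true
	}
	return containerStatus{}, false
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if r := recover(); r != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	// An empty, non-nil slice: the case the nil check does not cover.
	empty := podStatus{ContainerStatuses: []containerStatus{}}
	fmt.Println("nil check panics on empty slice:", panics(func() { unsafeFirst(empty) }))
	_, ok := safeFirst(empty)
	fmt.Println("len check found a status:", ok)
}
```

Because len of a nil slice is 0 in Go, the length check handles both the nil and the empty case with a single condition.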

Comment 2 Tzahi Ashkenazi 2022-03-27 09:14:50 UTC
Tested and verified with MTV 2.3.0-38.
Over 68 hours, the controller pod did not restart, both while idle and across several cycles of warm migration.


[root@f01-h14-000-r640]# oc get pods
NAME                                                                       READY   STATUS    RESTARTS        AGE
forklift-controller-7895674668-hg9nv                                       2/2     Running   0               2d20h
forklift-must-gather-api-7fd458f97f-9dv4m                                  1/1     Running   0               2d20h
forklift-operator-7d9bbd5575-xw2kb                                         1/1     Running   0               2d20h
forklift-ui-59d55999fc-sjrhz                                               1/1     Running   0               2d20h
forklift-validation-5d48f67559-p4kj4                                       1/1     Running   0               2d20h

Comment 6 errata-xmlrpc 2022-04-04 18:04:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (MTV 2.3.0 images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:1183

