Bug 1779748

Summary: hostpath-provisioner-operator failing with SIGSEGV after CR creation
Product: Container Native Virtualization (CNV)
Reporter: Lukas Bednar <lbednar>
Component: Installation
Assignee: Adam Litke <alitke>
Status: CLOSED ERRATA
QA Contact: Lukas Bednar <lbednar>
Severity: urgent
Priority: urgent
Version: 2.2.0
CC: alitke, awels, cnv-qe-bugs, fdeutsch, ncredi, stirabos, talayan, ycui
Keywords: TestBlocker
Target Release: 2.2.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: hostpath-provisioner-operator-container-v2.2.0-7 hco-bundle-registry-container-v2.2.0-70
Last Closed: 2020-01-30 16:27:33 UTC
Type: Bug

Description Lukas Bednar 2019-12-04 16:02:33 UTC
Description of problem:

I am trying to deploy the hostpath provisioner (HPP) using the HPP operator that ships with HCO for CNV 2.2.
I am using the latest HCO bundle, 163.0.0.

I deployed CNV 2.2 from the Marketplace, and I see the hostpath-provisioner-operator pod running:

[cnv-qe-jenkins@cnv-executor-lbednar2 ~]$ oc get pods --all-namespaces | grep hostpa
openshift-cnv                                           hostpath-provisioner-operator-74dc75578d-75mmt                    1/1     Running     0          11m

Then I create the HPP CR [1], and the operator pod goes into the Error state:
[1] https://raw.githubusercontent.com/kubevirt/hostpath-provisioner-operator/master/deploy/hostpathprovisioner_cr.yaml

[cnv-qe-jenkins@cnv-executor-lbednar2 ~]$ oc get pods --all-namespaces | grep hostpa
openshift-cnv                                           hostpath-provisioner-operator-74dc75578d-75mmt                    0/1     Error       4          14m

The operator logs (full log under "Additional info") show:

    /opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/asm_amd64.s:1337
    panic: runtime error: invalid memory address or nil pointer dereference [recovered]
            panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x12bb60d]


Version-Release number of selected component (if applicable):
cnv-2.2.0
HCO bundle 163.0.0


How reproducible:


Steps to Reproduce:
1. Deploy CNV from the Marketplace.
2. Create the HPP CR [1].

Actual results: The HPP operator pod fails with SIGSEGV and enters the Error state.


Expected results: HPP is deployed by the HPP operator.


Additional info:
[cnv-qe-jenkins@cnv-executor-lbednar2 ~]$ oc logs -n openshift-cnv hostpath-provisioner-operator-74dc75578d-75mmt
{"level":"info","ts":1575469059.5784547,"logger":"cmd","msg":"Go Version: go1.12.12"}
{"level":"info","ts":1575469059.5785656,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1575469059.5785706,"logger":"cmd","msg":"Version of operator-sdk: v0.11.0"}
{"level":"info","ts":1575469059.5794225,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1575469062.418997,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1575469062.4190483,"logger":"leader","msg":"Continuing as the leader."}
{"level":"info","ts":1575469065.2271235,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1575469065.2276523,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2278988,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2280571,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2282245,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2284064,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2289288,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.229043,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"hostpathprovisioner-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1575469065.2293298,"logger":"cmd","msg":"Starting the Cmd."}
E1204 14:17:45.255169       1 reflector.go:270] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:126: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
{"level":"info","ts":1575469065.329994,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"hostpathprovisioner-controller"}
{"level":"info","ts":1575469065.4303026,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"hostpathprovisioner-controller","worker count":1}
{"level":"info","ts":1575469065.4304929,"logger":"controller_hostpathprovisioner","msg":"Reconciling HostPathProvisioner","Request.Namespace":"","Request.Name":"hostpath-provisioner"}
E1204 14:17:45.430814       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:522
/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/signal_unix.go:390
/go/src/kubevirt.io/hostpath-provisioner-operator/pkg/controller/hostpathprovisioner/controller.go:202
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/asm_amd64.s:1337
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x12bb60d]

goroutine 662 [running]:
kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x105
panic(0x1488ae0, 0x268bae0)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:522 +0x1b5
kubevirt.io/hostpath-provisioner-operator/pkg/controller/hostpathprovisioner.(*ReconcileHostPathProvisioner).Reconcile(0xc0004af000, 0x0, 0x0, 0xc0002453e0, 0x14, 0x26a69e0, 0xc0006ee240, 0xc000670d88, 0xc000670db8)
	/go/src/kubevirt.io/hostpath-provisioner-operator/pkg/controller/hostpathprovisioner/controller.go:202 +0x5fd
kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000015720, 0x14e5e00, 0xc0004bbf00, 0x14e5e00)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216 +0x146
kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000015720, 0xc0001ade00)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192 +0xb5
kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000015720)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171 +0x2b
kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000397940)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000397940, 0x3b9aca00, 0x0, 0xc000000001, 0xc0009b60c0)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc000397940, 0x3b9aca00, 0xc0009b60c0)
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/go/src/kubevirt.io/hostpath-provisioner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157 +0x311

Comment 1 Alexander Wels 2019-12-04 18:09:34 UTC
The problem is that the version.txt file is missing from the operator container. The operator reads that file to determine its own version (needed for upgrades, for instance). I suspect the file is not being created during the container build.

Comment 3 Adam Litke 2019-12-09 19:15:11 UTC
Both Alexander and I were able to deploy successfully using the following images which are currently attached to the errata and specified in the HCO bundle:

hco-bundle-registry-container-v2.2.0-70
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1030906


registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-hostpath-provisioner-rhel8-operator:v2.2.0-7
registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-hostpath-provisioner-rhel8:v2.2.0-5


Therefore, I think your issue must be due to the use of stale images.  

Additionally, the segfault from your logs (/go/src/kubevirt.io/hostpath-provisioner-operator/pkg/controller/hostpathprovisioner/controller.go:202) references the following code in broken versions:

        cr.Status.OperatorVersion = *versionString

and in fixed versions that line is:

                MarkCrDeploying(cr, "DeployStarted", "Started Deployment")

which cannot segfault.
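
To make the failure mode concrete: dereferencing a nil *string, as in the broken line quoted above, panics with "invalid memory address or nil pointer dereference". The sketch below shows a guarded variant of that assignment. The types customResource and status and the function setOperatorVersion are hypothetical stand-ins written for this example, not the operator's real types, and this is not the fix that shipped.

```go
package main

import (
	"errors"
	"fmt"
)

// status and customResource are hypothetical stand-ins for the CR types.
type status struct {
	OperatorVersion string
}

type customResource struct {
	Status status
}

// setOperatorVersion copies the version into the CR status. It returns an
// error when versionString is nil instead of dereferencing it, which would
// panic with a nil pointer dereference.
func setOperatorVersion(cr *customResource, versionString *string) error {
	if versionString == nil {
		return errors.New("operator version is unknown (nil version string)")
	}
	cr.Status.OperatorVersion = *versionString
	return nil
}

func main() {
	cr := &customResource{}

	// Nil version: reported as an error, no panic.
	if err := setOperatorVersion(cr, nil); err != nil {
		fmt.Println("reconcile error:", err)
	}

	// Non-nil version: the status field is updated.
	v := "v2.2.0"
	if err := setOperatorVersion(cr, &v); err == nil {
		fmt.Println("operator version:", cr.Status.OperatorVersion)
	}
}
```

With a guard like this, a reconcile loop could return the error and let the request be retried, rather than crashing the pod.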

Comment 4 Tareq Alayan 2019-12-18 13:44:01 UTC
Verified. We are able to activate the HPP.

Comment 6 errata-xmlrpc 2020-01-30 16:27:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307