Bug 1991860

Summary: Insights Operator panics with "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
Product: OpenShift Container Platform
Reporter: Jan Chaloupka <jchaloup>
Component: Insights Operator
Assignee: Tomas Remes <tremes>
Status: CLOSED ERRATA
QA Contact: Dmitry Misharov <dmisharo>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.9
CC: aos-bugs, inecas, mitr, mklika, stbenjam, tremes
Target Milestone: ---
Target Release: 4.9.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment: Undiagnosed panic detected in pod job=periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-compact-serial=all
Last Closed: 2021-10-18 17:45:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jan Chaloupka 2021-08-10 10:54:51 UTC
Description of problem:
```
E0808 19:36:23.134328       1 recent_metrics.go:67] Unable to retrieve most recent metrics: Get "https://prometheus-k8s.openshift-monitoring.svc:9091/federate?match%5B%5D=etcd_object_counts&match%5B%5D=cluster_installer&match%5B%5D=namespace%3Acontainer_cpu_usage_seconds_total%3Asum_rate&match%5B%5D=namespace%3Acontainer_memory_usage_bytes%3Asum&match%5B%5D=vsphere_node_hw_version_total&match%5B%5D=virt_platform": context canceled
E0808 19:36:23.134371       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 347 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1eca3a0, 0x3323d10)
	/go/src/github.com/openshift/insights-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/insights-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1eca3a0, 0x3323d10)
	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/openshift/insights-operator/pkg/record.(*Record).Filename(0xc0007932c8, 0x4146d8, 0x37)
	/go/src/github.com/openshift/insights-operator/pkg/record/record.go:25 +0x2e
github.com/openshift/insights-operator/pkg/recorder.(*Recorder).has(0xc0005aa640, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/openshift/insights-operator/pkg/recorder/recorder.go:147 +0x2f
github.com/openshift/insights-operator/pkg/recorder.(*Recorder).Record(0xc0005aa640, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/openshift/insights-operator/pkg/recorder/recorder.go:56 +0x1fa
github.com/openshift/insights-operator/pkg/gather.CollectAndRecordGatherer(0x24a8d28, 0xc000823560, 0x2473188, 0xc0007696b0, 0x2463880, 0xc0005aa640, 0x24730c0, 0xc000a7d760, 0xc000380430, 0x1, ...)
	/go/src/github.com/openshift/insights-operator/pkg/gather/gather.go:111 +0x493
github.com/openshift/insights-operator/pkg/controller/periodic.(*Controller).Gather.func2(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/insights-operator/pkg/controller/periodic/periodic.go:142 +0x2f6
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection(0xc000793b78, 0x2075c00, 0x0, 0x0)
	/go/src/github.com/openshift/insights-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:211 +0x69
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0xa7a3582000, 0x3ff599999999999a, 0x0, 0x5, 0x68c61714000, 0xc000793b78, 0x2, 0x4)
	/go/src/github.com/openshift/insights-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:399 +0x55
github.com/openshift/insights-operator/pkg/controller/periodic.(*Controller).Gather(0xc0005aa690)
	/go/src/github.com/openshift/insights-operator/pkg/controller/periodic/periodic.go:134 +0x398
github.com/openshift/insights-operator/pkg/controller/periodic.(*Controller).Run(0xc0005aa690, 0xc0005da720, 0xa6ec506f6e)
	/go/src/github.com/openshift/insights-operator/pkg/controller/periodic/periodic.go:79 +0x1da
created by github.com/openshift/insights-operator/pkg/controller.(*Operator).Run
	/go/src/github.com/openshift/insights-operator/pkg/controller/operator.go:141 +0xb3b
```

Version-Release number of selected component (if applicable):
4.9


How reproducible:
From https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-compact-serial/1424447141601873920. See https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-compact-serial/1424447141601873920/artifacts/e2e-gcp-compact-serial/gather-extra/artifacts/pods/openshift-insights_insights-operator-c88ccccdb-rjm6l_insights-operator.log

Steps to Reproduce:


Actual results:
- Operator panics


Expected results:
- Operator does not panic


Additional info:
From https://github.com/openshift/insights-operator/blob/b178f47f4a77420f13fe35e1535835090d7f3a8c/pkg/recorder/recorder.go#L147:

```
func (r *Recorder) has(re record.Record) bool {
	existing, ok := r.records[re.Filename()]
	if ok {
		if re.Fingerprint == existing.Fingerprint {
			return true
		}
	}
	return false
}
```
The function "has" does not check whether re.Filename() can be safely invoked.
Filename() is defined as follows:
```
// Filename with extension, if present
func (r *Record) Filename() string {
	extension := r.Item.GetExtension()
	if len(extension) > 0 {
		return fmt.Sprintf("%s.%s", r.Name, extension)
	}
	return r.Name
}
```
If re.Item is nil, r.Item.GetExtension() panics, because the GetExtension() method has a value receiver rather than a pointer receiver (see https://github.com/openshift/insights-operator/blob/b178f47f4a77420f13fe35e1535835090d7f3a8c/pkg/record/record.go#L46-L48):
```
// GetExtension return extension for json marshaller
func (m JSONMarshaller) GetExtension() string {
	return JSONExtension
}
```
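
The receiver distinction can be reproduced in isolation. The following is a minimal sketch, not code from insights-operator: the Marshalable interface and the valueMarshaller and pointerMarshaller types are hypothetical stand-ins. It also covers a third case, assuming Item is an interface-typed field: a nil interface value panics on any method call, whatever the receiver type. The zeroed arguments to has and Record in the trace above look consistent with that third case, though the trace alone does not prove it.

```go
package main

import "fmt"

// Marshalable is a hypothetical stand-in for the type of Record.Item.
type Marshalable interface {
	GetExtension() string
}

// valueMarshaller uses a value receiver, like JSONMarshaller in the report.
type valueMarshaller struct{ ext string }

func (m valueMarshaller) GetExtension() string { return m.ext }

// pointerMarshaller uses a pointer receiver and never touches its fields,
// so calling the method on a nil pointer is safe.
type pointerMarshaller struct{ ext string }

func (m *pointerMarshaller) GetExtension() string { return "json" }

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	var nilValue *valueMarshaller     // nil pointer, value-receiver method
	var nilPointer *pointerMarshaller // nil pointer, pointer-receiver method
	var nilInterface Marshalable      // nil interface, no concrete type at all

	// Calling a value-receiver method through a nil pointer must dereference it.
	fmt.Println(panics(func() { _ = Marshalable(nilValue).GetExtension() })) // true
	// A pointer-receiver method receives the nil pointer and may handle it.
	fmt.Println(panics(func() { _ = Marshalable(nilPointer).GetExtension() })) // false
	// A nil interface has no method to dispatch to, so the call always panics.
	fmt.Println(panics(func() { _ = nilInterface.GetExtension() })) // true
}
```

If the third case is what happened here, switching GetExtension() to a pointer receiver would not be enough on its own; the caller would also need to check Item before use.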

Compare also https://play.golang.org/p/IO0DzTx_fjR with https://play.golang.org/p/wfbTII_Ypkd ("func (t T) hello() string" vs. "func (t *T) hello() string").
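
One possible way to make Filename() tolerate an empty record is a nil guard. This is only a sketch under assumptions and makes no claim about how the upstream fix works: the Record and Marshalable definitions below are simplified stand-ins, and jsonItem is a hypothetical type used for the demonstration.

```go
package main

import "fmt"

// Marshalable is a simplified stand-in for the type of Record.Item.
type Marshalable interface {
	GetExtension() string
}

// Record is a simplified stand-in carrying only the fields used in this report.
type Record struct {
	Name        string
	Fingerprint string
	Item        Marshalable
}

// jsonItem is a hypothetical Marshalable used only for this demonstration.
type jsonItem struct{}

func (jsonItem) GetExtension() string { return "json" }

// Filename returns the name with extension, if present.
// It returns the bare name when Item is a nil interface value.
// Note: this guard does not catch an interface holding a typed nil pointer.
func (r *Record) Filename() string {
	if r == nil {
		return ""
	}
	if r.Item == nil {
		return r.Name
	}
	if extension := r.Item.GetExtension(); len(extension) > 0 {
		return fmt.Sprintf("%s.%s", r.Name, extension)
	}
	return r.Name
}

func main() {
	empty := Record{}
	named := Record{Name: "config/id", Item: jsonItem{}}
	fmt.Printf("%q\n", empty.Filename()) // ""
	fmt.Printf("%q\n", named.Filename()) // "config/id.json"
}
```

A guard like this only hides the symptom. Whether an empty record should reach Record() at all after the metrics request fails with "context canceled" is a separate question.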

Comment 1 Miloslav Trmač 2021-08-10 22:38:01 UTC
AFAICS https://github.com/openshift/insights-operator/pull/484 will fix it.

Comment 6 errata-xmlrpc 2021-10-18 17:45:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759