Bug 1875046 - Undiagnosed panic detected in pod: openshift-kube-apiserver_kube-apiserver: runtime.go:76: invalid memory address or nil pointer dereference [NEEDINFO]
Summary: Undiagnosed panic detected in pod: openshift-kube-apiserver_kube-apiserver: r...
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Lukasz Szaszkiewicz
QA Contact: Ke Wang
URL:
Whiteboard:
Duplicates: 1879208
Depends On:
Blocks:
Reported: 2020-09-02 20:32 UTC by W. Trevor King
Modified: 2020-09-23 02:45 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Undiagnosed panic detected in pod
Last Closed: 2020-09-02 20:37:26 UTC
Target Upstream Version:
alchan: needinfo? (lszaszki)


Links
System ID Priority Status Summary Last Updated
Github openshift kubernetes pull 338 None closed Bug 1875046: Undiagnosed panic detected in pod: openshift-kube-apiserver_kube-apiserver: runtime.go:76: invalid memory a... 2020-09-22 16:50:40 UTC

Description W. Trevor King 2020-09-02 20:32:34 UTC
test:
Undiagnosed panic detected in pod 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=Undiagnosed+panic+detected+in+pod

Example 4.6.0-0.ci-2020-09-01-180917 -> 4.6.0-0.ci-2020-09-02-112251 job [1] loops on:

pods/openshift-kube-apiserver_kube-apiserver-ci-op-x44nqxlf-2f611-bdgq5-master-1_kube-apiserver.log.gz:E0902 19:31:10.506239      17 runtime.go:76] Observed a panic: runtime error: invalid memory address or nil pointer dereference

Seems common:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=Undiagnosed+panic+detected+in+pod' | grep 'failures match'
...
release-openshift-okd-installer-e2e-aws-4.6 - 67 runs, 97% failed, 52% of failures match
release-openshift-okd-installer-e2e-aws-upgrade - 87 runs, 67% failed, 22% of failures match
release-openshift-origin-installer-e2e-aws-4.6 - 4 runs, 50% failed, 50% of failures match
release-openshift-origin-installer-e2e-aws-compact-4.6 - 4 runs, 75% failed, 67% of failures match
release-openshift-origin-installer-e2e-aws-disruptive-4.6 - 4 runs, 100% failed, 50% of failures match
release-openshift-origin-installer-e2e-aws-sdn-multitenant-4.6 - 4 runs, 50% failed, 50% of failures match
release-openshift-origin-installer-e2e-aws-serial-4.6 - 107 runs, 47% failed, 118% of failures match
release-openshift-origin-installer-e2e-aws-shared-vpc-4.6 - 7 runs, 43% failed, 133% of failures match
release-openshift-origin-installer-e2e-aws-upgrade-4.5-stable-to-4.6-ci - 79 runs, 25% failed, 90% of failures match
release-openshift-origin-installer-e2e-aws-upgrade - 671 runs, 31% failed, 18% of failures match
release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.6 - 8 runs, 38% failed, 33% of failures match
release-openshift-origin-installer-e2e-azure-4.6 - 27 runs, 85% failed, 9% of failures match
release-openshift-origin-installer-e2e-azure-shared-vpc-4.6 - 7 runs, 86% failed, 17% of failures match
release-openshift-origin-installer-e2e-azure-upgrade-4.5-stable-to-4.6-ci - 28 runs, 96% failed, 7% of failures match
release-openshift-origin-installer-e2e-azure-upgrade-4.6 - 28 runs, 96% failed, 7% of failures match
release-openshift-origin-installer-e2e-gcp-4.6 - 93 runs, 53% failed, 69% of failures match
release-openshift-origin-installer-e2e-gcp-4.7 - 11 runs, 100% failed, 36% of failures match
release-openshift-origin-installer-e2e-gcp-compact-4.6 - 4 runs, 75% failed, 33% of failures match
release-openshift-origin-installer-e2e-gcp-shared-vpc-4.6 - 7 runs, 14% failed, 200% of failures match
release-openshift-origin-installer-e2e-gcp-upgrade - 150 runs, 31% failed, 59% of failures match
release-openshift-origin-installer-e2e-gcp-upgrade-4.5-stable-to-4.6-ci - 27 runs, 15% failed, 100% of failures match
release-openshift-origin-installer-e2e-gcp-upgrade-4.6 - 28 runs, 36% failed, 60% of failures match
release-openshift-origin-installer-launch-aws - 153 runs, 52% failed, 5% of failures match
release-openshift-origin-installer-launch-gcp - 527 runs, 55% failed, 11% of failures match


[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.6/1301225743073677312
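
The per-job summary lines above can be turned into a rough count of affected runs. The following is a minimal sketch in Go, assuming each line has exactly the shape "<job> - N runs, X% failed, Y% of failures match"; the names jobStats, parseStatsLine, and estimatedMatchingRuns are hypothetical and are not part of any CI tooling. The listing contains match percentages above 100%, and the sketch passes them through without clamping or interpreting them.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// jobStats holds the numbers from one summary line (hypothetical type).
type jobStats struct {
	Job       string
	Runs      int
	FailedPct int
	MatchPct  int
}

// statsLine matches the assumed line shape:
// "<job> - N runs, X% failed, Y% of failures match".
var statsLine = regexp.MustCompile(`^(\S+) - (\d+) runs, (\d+)% failed, (\d+)% of failures match$`)

// parseStatsLine returns the parsed fields, or false if the line
// does not have the assumed shape.
func parseStatsLine(line string) (jobStats, bool) {
	m := statsLine.FindStringSubmatch(line)
	if m == nil {
		return jobStats{}, false
	}
	runs, err1 := strconv.Atoi(m[2])
	failed, err2 := strconv.Atoi(m[3])
	match, err3 := strconv.Atoi(m[4])
	if err1 != nil || err2 != nil || err3 != nil {
		return jobStats{}, false
	}
	return jobStats{Job: m[1], Runs: runs, FailedPct: failed, MatchPct: match}, true
}

// estimatedMatchingRuns approximates runs * failed% * match%.
// It is only an estimate because the percentages are already rounded.
func (s jobStats) estimatedMatchingRuns() float64 {
	return float64(s.Runs) * float64(s.FailedPct) / 100 * float64(s.MatchPct) / 100
}

func main() {
	line := "release-openshift-origin-installer-e2e-gcp-upgrade-4.6 - 28 runs, 36% failed, 60% of failures match"
	if s, ok := parseStatsLine(line); ok {
		fmt.Printf("%s: about %.1f of %d runs match\n", s.Job, s.estimatedMatchingRuns(), s.Runs)
	}
}
```

Because the percentages in the listing are rounded, the result is only useful for ranking jobs by impact, not as an exact count.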

Comment 1 W. Trevor King 2020-09-02 20:37:26 UTC
Apparently a dup of bug 1875038.

*** This bug has been marked as a duplicate of bug 1875038 ***

Comment 2 W. Trevor King 2020-09-08 21:53:50 UTC
Pulling this back out into its own bug at Michal's request [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1875038#c4

Comment 3 W. Trevor King 2020-09-08 23:17:25 UTC
That said, it is not clear to me why https://github.com/openshift/cluster-kube-apiserver-operator/pull/941 is not a fix for the panic that led me to open this bug, so I'm fine if this gets re-closed as a dup ;).

Comment 4 Michal Fojtik 2020-09-09 08:19:39 UTC
This will be fixed by https://github.com/kubernetes/kubernetes/pull/94589

The fix you referenced, Trevor, is for the KAS operator. During graceful shutdown, controllers that had not started yet because they were waiting for caches to sync received the context close, which closed the channel that WaitForCacheSync() used, and that resulted in a panic inside the controller.

The fix Lukasz is working on is in the operand and requires a backport from upstream.
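
For illustration only, the operator-side failure mode described in this comment (a shutdown signal arriving while a controller is still waiting for its caches) can be sketched with the Go standard library alone. This is not the code from either pull request. waitForSync is a hypothetical stand-in for a cache-sync helper, and controller, lister, and newCancelledContext are invented for the example. The assumption is that the wait reports failure when shutdown wins the race, and that the caller must return at that point instead of touching state that was never initialized.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitForSync is a hypothetical stand-in for a cache-sync wait helper.
// It returns true once synced() reports true, or false if ctx is done first.
func waitForSync(ctx context.Context, synced func() bool) bool {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		if synced() {
			return true
		}
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
		}
	}
}

// lister is a hypothetical cache-backed reader; it is nil until the cache syncs.
type lister struct {
	items map[string]string
}

// controller is a hypothetical controller that depends on a synced lister.
type controller struct {
	lister *lister
	synced func() bool
}

// run bails out if shutdown arrives before the cache has synced,
// so c.lister is never dereferenced while it may still be nil.
func (c *controller) run(ctx context.Context) error {
	if !waitForSync(ctx, c.synced) {
		return errors.New("shutdown requested before caches synced")
	}
	fmt.Println("cached items:", len(c.lister.items))
	return nil
}

// newCancelledContext returns a context that is already cancelled,
// simulating a graceful shutdown that starts before the caches sync.
func newCancelledContext() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	c := &controller{synced: func() bool { return false }}
	if err := c.run(newCancelledContext()); err != nil {
		fmt.Println("controller exited cleanly:", err)
	}
}
```

Ignoring the false return and continuing into the code that reads c.lister would dereference a nil pointer in this sketch, which is the general class of error reported in the log line quoted in the description.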

Comment 8 Venkata Siva Teja Areti 2020-09-15 21:41:18 UTC
*** Bug 1879208 has been marked as a duplicate of this bug. ***

Comment 10 Ke Wang 2020-09-18 11:16:11 UTC
See the search results: https://search.ci.openshift.org/?search=Undiagnosed+panic+detected+in+pod&maxAge=168h&context=2&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job
Matching on the keywords 'Observed a panic: runtime error: invalid memory address or nil pointer dereference', there are nine results in total, involving openshift-apiserver_apiserver and openshift-kube-apiserver_kube-apiserver. Some are related to 4.3/4.4/4.5; others are of indeterminate version. I will keep observing this for a couple of days.

Comment 11 Ke Wang 2020-09-23 02:45:56 UTC
In the past seven days, the panic no longer appeared in 4.6-related tests; it still exists on 4.3, 4.4, and 4.5. Here are the search results: https://search.ci.openshift.org/?search=kube-apiserver.log.*Observed+a+panic%3A+runtime+error%3A+invalid+memory+address+or+nil+pointer+dereference&maxAge=168h&context=2&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

