Bug 1546854

Summary: occasional nil pointer dereference panics in master-controllers while scaling up resources
Product: OpenShift Container Platform
Reporter: Mike Fiedler <mifiedle>
Component: Master
Assignee: Tomáš Nožička <tnozicka>
Status: CLOSED ERRATA
QA Contact: Mike Fiedler <mifiedle>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.9.0
CC: aos-bugs, haowang, jokerman, mifiedle, mmccomas
Target Milestone: ---
Target Release: 3.9.0
Hardware: x86_64
OS: Linux
Whiteboard: aos-scalability-39
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-28 14:29:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  master logs (flags: none)

Description Mike Fiedler 2018-02-19 19:52:47 UTC
Description of problem:

I've seen this in two different clusters so far, both times while running a script that populates the cluster with deployments, pods, builds, routes, imagestreams, etc. (the standard SVT "cluster horizontal" script).

In the scalability long-running configuration, two occurrences of this panic have happened in the past 3 days:

Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: E0219 10:17:54.882711  121755 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /usr/lib/golang/src/runtime/asm_amd64.s:509
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /usr/lib/golang/src/runtime/panic.go:491
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /usr/lib/golang/src/runtime/panic.go:63
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /usr/lib/golang/src/runtime/signal_unix.go:367
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/deploymentconfig_controller.go:192
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:214
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:187
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:89
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /builddir/build/BUILD/atomic-openshift-git-0.610e0bc/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
Feb 19 10:17:54 master-0.scale-ci.example.com atomic-openshift-master-controllers[121755]: /usr/lib/golang/src/runtime/asm_amd64.s:2337





In a smaller cluster (3 masters, 10 computes) on AWS it happened during my first attempt running the scale up script:

Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: E0219 19:08:28.110995   78344 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /usr/lib/golang/src/runtime/asm_amd64.s:509
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /usr/lib/golang/src/runtime/panic.go:491
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /usr/lib/golang/src/runtime/panic.go:63
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /usr/lib/golang/src/runtime/signal_unix.go:367
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/deploymentconfig_controller.go:192
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:214
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:187
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/pkg/apps/controller/deploymentconfig/factory.go:89
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /builddir/build/BUILD/atomic-openshift-git-0.b4a0c68/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
Feb 19 19:08:28 ip-172-31-4-100.us-west-2.compute.internal atomic-openshift-master-controllers[78344]: /usr/lib/golang/src/runtime/asm_amd64.s:2337
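
For anyone reading the traces above: the first non-runtime frame in both is deploymentconfig_controller.go:192, and the apimachinery util/runtime frames on top of it suggest the panic was caught and logged by a deferred recover handler. The Go sketch below only illustrates that general mechanism with the standard library. The names deploymentConfig, handle and safeHandle are hypothetical, and nothing here is taken from the origin source or from the eventual fix.

```go
package main

import "fmt"

// deploymentConfig is a hypothetical stand-in for an API object;
// it is NOT the real origin type.
type deploymentConfig struct {
	Name string
}

// handle reads a field through the pointer without a nil check,
// which panics at run time when dc is nil.
func handle(dc *deploymentConfig) string {
	return dc.Name
}

// safeHandle calls handle and uses a deferred recover to turn a
// panic into an ordinary error instead of crashing the process.
func safeHandle(dc *deploymentConfig) (name string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("observed a panic: %v", r)
		}
	}()
	return handle(dc), nil
}

func main() {
	// A nil pointer triggers the same runtime error text seen in the logs.
	_, err := safeHandle(nil)
	fmt.Println(err)

	// A non-nil pointer is handled normally.
	name, err := safeHandle(&deploymentConfig{Name: "example"})
	fmt.Println(name, err)
}
```

The recover handler only reports the failure. The underlying fix for this kind of panic is normally a nil check (or avoiding the nil value in the first place) at the point of the dereference.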
 


Version-Release number of selected component (if applicable): 3.9.0-0.24.0 (scale lab) and 3.9.0-0.45.0 (AWS cluster)


How reproducible: Unknown. 2 occurrences in 3 days in the scalability lab; hit on the first attempt at running cluster horizontal in AWS.

To reproduce:

1. AWS cluster on 3.9.0-0.45.0 with 3 master/etcd, 1 infra, 10 computes
2. Run the SVT cluster loader tool (https://github.com/openshift/svt/tree/master/openshift_scalability) with the configuration below

I will attach the logs from the 3 master/etcds on AWS - let me know what else is needed.

Comment 1 Mike Fiedler 2018-02-19 19:55:00 UTC
Created attachment 1398008 [details]
master logs

See master-0-logs.out for the panic

Comment 2 Tomáš Nožička 2018-02-20 13:57:41 UTC
https://github.com/openshift/origin/pull/18676

Comment 3 Mike Fiedler 2018-02-20 18:01:17 UTC
Moving to modified until a puddle is available.

Comment 7 Mike Fiedler 2018-02-26 17:02:16 UTC
Verified on 3.9.0-0.53.0. The panic was not seen while running the cluster horizontal scale-up.

Comment 10 errata-xmlrpc 2018-03-28 14:29:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489