Bug 1669933

Summary: ./openshift-install destroy cluster failed: "panic: runtime error: invalid memory address or nil pointer dereference"
Product: OpenShift Container Platform Reporter: Rutvik <rkshirsa>
Component: Installer Assignee: Matthew Staebler <mstaeble>
Installer sub component: openshift-installer QA Contact: Johnny Liu <jialiu>
Status: CLOSED DUPLICATE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: mstaeble, wking
Version: 4.1.0   
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2019-01-30 02:49:30 UTC Type: Bug
Bug Blocks: 1664187    

Description Rutvik 2019-01-28 05:20:24 UTC
Description of problem:

While destroying a cluster, the installer failed with error:

"panic: runtime error: invalid memory address or nil pointer dereference"

Here is the stack trace of the goroutine in which the installer panicked:

-------------
[root@node-0 ocp4]# ./openshift-install destroy cluster
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x3983d16]
goroutine 49 [running]:
github.com/openshift/installer/pkg/destroy/aws.filterLBsByVPC(0xc420d9b500, 0x8, 0x8, 0xc420468280, 0x5181ca0, 0xc4200960c0, 0x1, 0x1, 0xc4209b9d80)
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:215 +0x76
github.com/openshift/installer/pkg/destroy/aws.deleteLBs(0xc420468280, 0xc4209441c0, 0x5181ca0, 0xc4200960c0, 0x0)
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:237 +0x346
github.com/openshift/installer/pkg/destroy/aws.deleteVPCs(0xc4209441c0, 0xc420407aa0, 0xc42040dff8, 0x7, 0x5181ca0, 0xc4200960c0, 0x0, 0x0, 0x0)
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:463 +0x42a
github.com/openshift/installer/pkg/destroy/aws.deleteRunner.func1(0xc42006df18, 0x4106b8, 0x40)
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:141 +0x5e
github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x2540be400, 0x3ff4cccccccccccd, 0x0, 0x64, 0xc420092ac0, 0xc42006dfa8, 0x403e4c)
/home/wking/.local/lib/go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203 +0x9c
github.com/openshift/installer/pkg/destroy/aws.deleteRunner(0x4d0bc09, 0xa, 0x4e6cdb8, 0xc4209441c0, 0xc420407aa0, 0xc42040dff8, 0x7, 0x5181ca0, 0xc4200960c0, 0xc420430480)
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:140 +0x107
created by github.com/openshift/installer/pkg/destroy/aws.(*ClusterUninstaller).Run
/home/wking/.local/lib/go/src/github.com/openshift/installer/pkg/destroy/aws/aws.go:116 +0x5a5
-------------

This issue looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1669925. However, the issue reported in this BZ occurs only during destroy cluster.

Release: 0.10.0

Comment 1 W. Trevor King 2019-01-28 20:35:49 UTC
> Release: 0.10.0

Deletion got a major rewrite in 0.10.1.  Can you still reproduce?

There's also a similar bug 1669925, but that panics via a different code path.

Comment 2 Matthew Staebler 2019-01-28 23:01:20 UTC
The cause of the panic here and in bug 1669925 is that the load balancer fetched from AWS does not have a VPC ID.

Comment 3 W. Trevor King 2019-01-28 23:07:52 UTC
> The cause of the panic here and in bug 1669925 is that the load balancer fetched from aws does not have a VPC ID.

Do we still break in that case since dropping [1]?  I don't see how it would cause a problem in 0.10.1 and later.

[1]: https://github.com/openshift/installer/pull/1039/files#diff-ab65f0e2ba0237c5cc1429673c22cc08L216

Comment 4 Matthew Staebler 2019-01-28 23:16:34 UTC
(In reply to W. Trevor King from comment #3)
> > The cause of the panic here and in bug 1669925 is that the load balancer fetched from aws does not have a VPC ID.
> 
> Do we still break in that case since dropping [1]?  I don't see how it would
> cause a problem in 0.10.1 and later.
> 
> [1]: https://github.com/openshift/installer/pull/1039/files#diff-ab65f0e2ba0237c5cc1429673c22cc08L216

We no longer break in the specific code that caused this panic, but we still have code that dereferences the load balancer's VPC ID before checking that it is non-nil. I can close this one as a duplicate of bug 1669925.
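
For illustration only, here is a minimal sketch of the kind of nil guard described above. It is not the installer's actual code: the loadBalancer type and the lbsInVPC helper are hypothetical stand-ins, and the only assumption carried over from this bug is that the VPC ID is an optional pointer field that can be nil.

```go
package main

import "fmt"

// loadBalancer is a hypothetical stand-in for a load balancer description.
// The VPC ID is a pointer because the field is optional and may be nil.
type loadBalancer struct {
	Name  string
	VPCID *string
}

// lbsInVPC (hypothetical helper) returns the load balancers that belong to
// vpcID. Entries with a nil VPC ID are skipped instead of being dereferenced,
// which avoids a nil pointer dereference panic.
func lbsInVPC(lbs []loadBalancer, vpcID string) []loadBalancer {
	var matched []loadBalancer
	for _, lb := range lbs {
		if lb.VPCID == nil {
			// No VPC ID on this load balancer: it cannot match, so skip it.
			continue
		}
		if *lb.VPCID == vpcID {
			matched = append(matched, lb)
		}
	}
	return matched
}

func main() {
	vpc := "vpc-123"
	other := "vpc-456"
	lbs := []loadBalancer{
		{Name: "a", VPCID: &vpc},
		{Name: "b", VPCID: nil}, // would panic if dereferenced unchecked
		{Name: "c", VPCID: &other},
	}
	for _, lb := range lbsInVPC(lbs, "vpc-123") {
		fmt.Println(lb.Name) // prints "a"
	}
}
```

Whether an entry without a VPC ID should be skipped, as in this sketch, or handled some other way is a design decision for the actual fix.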

Comment 5 Matthew Staebler 2019-01-30 02:48:44 UTC
Fixed by https://github.com/openshift/installer/pull/1151

Comment 6 Matthew Staebler 2019-01-30 02:49:30 UTC

*** This bug has been marked as a duplicate of bug 1669925 ***