Bug 1793627 - gcp destroy cluster panics "nil pointer dereference" when deleting dns records
Summary: gcp destroy cluster panics "nil pointer dereference" when deleting dns records
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.4.0
Assignee: Jeremiah Stuever
QA Contact: Yang Yang
URL:
Whiteboard:
Depends On:
Blocks: 1788708
Reported: 2020-01-21 17:24 UTC by Jeremiah Stuever
Modified: 2020-05-04 11:26 UTC (History)
0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 11:25:48 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 2962 0 None closed Bug 1793627: gcp destroy: handle nil when evaluating dns response 2021-01-17 08:50:25 UTC
Red Hat Product Errata RHBA-2020:0581 0 None None None 2020-05-04 11:26:26 UTC

Description Jeremiah Stuever 2020-01-21 17:24:23 UTC
Description of problem:

Occasionally, running openshift-install destroy cluster against a GCP cluster panics while deleting DNS records.

Version-Release number of the following components:
4.4

How reproducible:
Occasionally, found once in CI

Steps to Reproduce:
1. Create a cluster on GCP
2. Destroy the cluster with openshift-install destroy cluster
3. The installer occasionally panics while deleting DNS records

Actual results:

panic: runtime error: invalid memory address or nil pointer dereference

Expected results:

The DNS records should be deleted and the destroy should complete without a panic.

Additional info:

...
level=debug msg="Listing DNS Zones"
level=debug msg="Found cluster private dns zone: ci-op-sbqhj-private-zone\n"
level=debug msg="Found parent dns zone: origin-ci-int-gce-new"
level=debug msg="Deleting 2 recordset(s) in zone origin-ci-int-gce-new"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x95fd7bf]
goroutine 1 [running]:
github.com/openshift/installer/pkg/destroy/gcp.(*ClusterUninstaller).deleteDNSZoneRecordSets(0xc000b61040, 0xc0004fc520, 0x15, 0xc000f4d110, 0x24, 0xc000a03570, 0x2, 0x2, 0x0, 0x0)
	/go/src/github.com/openshift/installer/pkg/destroy/gcp/dns.go:90 +0x67f
github.com/openshift/installer/pkg/destroy/gcp.(*ClusterUninstaller).destroyDNS(0xc000b61040, 0x2, 0xc0000da480)
	/go/src/github.com/openshift/installer/pkg/destroy/gcp/dns.go:170 +0x31e
github.com/openshift/installer/pkg/destroy/gcp.(*ClusterUninstaller).destroyCluster(0xc000b61040, 0x0, 0x0, 0x0)
	/go/src/github.com/openshift/installer/pkg/destroy/gcp/gcp.go:153 +0x749
github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait.WaitFor(0xc001572e00, 0xc000dc7b10, 0xc00158c900, 0x0, 0x0)
	/go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:434 +0x137
github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait.PollUntil(0x2540be400, 0xc00114db10, 0xc00158c8a0, 0x0, 0x0)
	/go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:385 +0xbc
github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait.PollInfinite(0x2540be400, 0xc00114db10, 0x0, 0x0)
	/go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:357 +0x8a
github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait.PollImmediateInfinite(0x2540be400, 0xc00114db10, 0xc00114db30, 0x2)
	/go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:374 +0x70
github.com/openshift/installer/pkg/destroy/gcp.(*ClusterUninstaller).Run(0xc000b61040, 0x0, 0x0)
	/go/src/github.com/openshift/installer/pkg/destroy/gcp/gcp.go:112 +0x41b
main.runDestroyCmd(0x7ffda7833f34, 0x1c, 0xc0005c5ee0, 0xc00114dbf0)
	/go/src/github.com/openshift/installer/cmd/openshift-install/destroy.go:56 +0x8d
main.newDestroyClusterCmd.func1(0xc000d15680, 0xc000608cc0, 0x0, 0x4)
	/go/src/github.com/openshift/installer/cmd/openshift-install/destroy.go:43 +0x77
github.com/openshift/installer/vendor/github.com/spf13/cobra.(*Command).execute(0xc000d15680, 0xc000608c80, 0x4, 0x4, 0xc000d15680, 0xc000608c80)
	/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
github.com/openshift/installer/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000d14c80, 0xc00114de78, 0x1, 0x1)
	/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
github.com/openshift/installer/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:800
main.installerMain()
	/go/src/github.com/openshift/installer/cmd/openshift-install/main.go:61 +0x21a
main.main()
	/go/src/github.com/openshift/installer/cmd/openshift-install/main.go:43 +0xc6

Comment 1 Jeremiah Stuever 2020-01-22 22:05:46 UTC
This appears to happen when destroy identifies DNS records to delete from a zone, but something else deletes those records before destroy gets to them.
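
A minimal sketch of the kind of nil handling this calls for, in Go. It is not the installer's actual code: changeResponse, deleteFunc, deleteRecordSets and errNotFound are hypothetical names used only for illustration, and the real code evaluates Google Cloud DNS API responses and errors instead. The assumption is that when the records are already gone, the delete call returns an error and a nil response, so the response must not be dereferenced before the error has been evaluated.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for the API error
// returned when a record set no longer exists.
var errNotFound = errors.New("record set not found")

// changeResponse is a hypothetical stand-in for the DNS change response.
type changeResponse struct {
	Status string
}

// deleteFunc is a hypothetical abstraction of the API call that deletes
// the named record sets from a zone.
type deleteFunc func(names []string) (*changeResponse, error)

// deleteRecordSets returns nil when the record sets were deleted or were
// already gone, and an error otherwise. It never dereferences a nil response.
func deleteRecordSets(del deleteFunc, names []string) error {
	resp, err := del(names)
	if errors.Is(err, errNotFound) {
		// Something else removed the records first. resp is nil in this
		// case, so return before touching it.
		return nil
	}
	if err != nil {
		return fmt.Errorf("failed to delete record sets: %w", err)
	}
	if resp == nil {
		// Defensive check: no error, but no response either.
		return errors.New("delete returned neither a response nor an error")
	}
	if resp.Status != "done" {
		return fmt.Errorf("delete not finished: status %q", resp.Status)
	}
	return nil
}

func main() {
	// Simulate another actor deleting the records before destroy does.
	gone := func([]string) (*changeResponse, error) { return nil, errNotFound }
	fmt.Println(deleteRecordSets(gone, []string{"api.example.com."}))
}
```

Treating "already gone" as success fits a destroy flow, where the goal is only that the records no longer exist; whether the installer's fix takes exactly this approach is not stated in this report.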

Comment 3 Yang Yang 2020-02-12 10:27:10 UTC
Verified with 4.4.0-0.nightly-2020-02-12-004057

Steps for verification are as below:
1. Create an IPI cluster on GCP
2. Delete 1 record set manually
3. Destroy the cluster
DEBUG Listing DNS Zones                            
DEBUG Found cluster private dns zone: yybz1-zfzvk-private-zone 
DEBUG Found parent dns zone: qe                    
DEBUG Deleting 1 recordset(s) in zone qe           
INFO Deleted 1 recordset(s) in zone qe            
DEBUG Deleting 6 recordset(s) in zone yybz1-zfzvk-private-zone 
INFO Deleted 6 recordset(s) in zone yybz1-zfzvk-private-zone 
DEBUG Deleting DNS zones yybz1-zfzvk-private-zone  
INFO Deleted DNS zone yybz1-zfzvk-private-zone    

The installer does not panic, so moving this bug to the verified state.

Comment 5 errata-xmlrpc 2020-05-04 11:25:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

