Version: quay.io/openshift-release-dev/ocp-release:4.10.0-x86_64

Platform: IBM Cloud

What happened?

My install failed early during installation (due to a gateway timeout) before any resources were created in IBM Cloud.

time="2022-03-01T18:29:22Z" level=fatal msg="failed to fetch Terraform Variables: failed to fetch dependency of \"Terraform Variables\": failed to generate asset \"Platform Provisioning Check\": baseDomain: Internal error: failed to get cis instance: Gateway Timeout"

Hive attempts to clean up by running uninstall with the ClusterID/Metadata of the failed install, including the default ResourceGroupName, which is the ClusterID for the cluster (e.g. abutcher-lj7bf). Uninstall crashes because the ResourceGroup cannot be found and does not exist.

time="2022-03-01T18:38:24Z" level=debug msg="Listing virtual service instances"
time="2022-03-01T18:38:24Z" level=debug msg="Listing virtual service instances"
time="2022-03-01T18:38:24Z" level=debug msg="Listing load balancers"
time="2022-03-01T18:38:34Z" level=debug msg="Listing subnets"
time="2022-03-01T18:38:34Z" level=debug msg="Listing images"
time="2022-03-01T18:38:34Z" level=debug msg="Listing public gateways"
time="2022-03-01T18:38:34Z" level=info msg="Skipping deletion of security groups with generated VPC"
time="2022-03-01T18:38:35Z" level=debug msg="Listing floating IPs"
time="2022-03-01T18:38:36Z" level=debug msg="Listing dedicated hosts"
time="2022-03-01T18:38:36Z" level=debug msg="Listing VPCs"
E0301 18:38:37.741094 1 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 63 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x537e2e0, 0xc000140168})
    k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:74 +0x7d
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00007a240})
    k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:48 +0x75
panic({0x537e2e0, 0xc000140168})
    runtime/panic.go:1038 +0x215
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).ResourceGroupID(0xc0005986c0)
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/ibmcloud.go:298 +0x388
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).listDedicatedHosts(0xc0005986c0)
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/dedicatedhost.go:24 +0xc5
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyDedicatedHosts(0xc0005986c0)
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/dedicatedhost.go:174 +0x36
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction.func1()
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/ibmcloud.go:159 +0x3f
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x7f44b41a5980, 0x0})
    k8s.io/apimachinery.3/pkg/util/wait/wait.go:220 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x5cdd610, 0xc000052040}, 0xc00006fe90)
    k8s.io/apimachinery.3/pkg/util/wait/wait.go:233 +0x7c
k8s.io/apimachinery/pkg/util/wait.poll({0x5cdd610, 0xc000052040}, 0xd0, 0x20e9225, 0x30)
    k8s.io/apimachinery.3/pkg/util/wait/wait.go:580 +0x38
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfiniteWithContext({0x5cdd610, 0xc000052040}, 0x1aee987, 0x28)
    k8s.io/apimachinery.3/pkg/util/wait/wait.go:566 +0x49
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfinite(0x57c63e0, 0x0)
    k8s.io/apimachinery.3/pkg/util/wait/wait.go:555 +0x46
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction(0xc0005986c0, {{0x56365c4, 0xc00006ffd0}, 0xc000cfaf60}, 0xc000e314a0, 0xc000e314a0)
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/ibmcloud.go:156 +0x108
created by github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyCluster
    github.com/openshift/installer.0-master.0.20220118155007-ad535d3fdbf4/pkg/destroy/ibmcloud/ibmcloud.go:130 +0xae5
panic: runtime error: index out of range [0] with length 0 [recovered]
    panic: runtime error: index out of range [0] with length 0

What did you expect to happen?

Uninstall succeeds because there is nothing to clean up.

How to reproduce it (as minimally and precisely as possible)?

I hit this running uninstall normally with generated metadata, but since the original failure was a gateway timeout, we have to simulate the uninstall failure by providing a metadata.json to openshift-install destroy.

Create a metadata.json file with a "resourceGroupName" that doesn't correspond to an existing ResourceGroup, place it within a directory, and run:

openshift-install destroy cluster --dir=<dir with metadata.json>

I was able to reproduce the crash by providing the following metadata.json file with a valid and existing accountID, cisInstanceCRN, and baseDomain:

{"clusterName":"abutcher-test","clusterID":"12345","infraID":"abutcher-test-czpjs","ibmcloud":{"accountID":"VALID_ACCOUNT_ID","baseDomain":"VALID_BASE_DOMAIN","cisInstanceCRN":"VALID_CIS_INSTANCE_CRN_FOR_BASEDOMAIN","region":"us-south","resourceGroupName":"abutcher-test-czpjs"}}

Anything else we need to know?

I'm a developer for the Hive team and encountered this issue testing Hive.
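For anyone triaging, here is a minimal, self-contained sketch of the failure mode, assuming ResourceGroupID looks the resource group up by name and indexes the first result without checking whether any were returned. The helper below is hypothetical and only illustrates the pattern; it is not the installer's actual code.

package main

import "fmt"

// listGroupsByName is a hypothetical stand-in for the resource-group lookup
// used during destroy; for a name that does not exist in the account it
// returns an empty slice rather than an error.
func listGroupsByName(name string) []string {
	return nil
}

func main() {
	// Indexing the first match without a length check reproduces the
	// "index out of range [0] with length 0" panic seen in the trace above.
	groups := listGroupsByName("abutcher-lj7bf")
	id := groups[0] // panics: runtime error: index out of range [0] with length 0
	fmt.Println(id)
}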
Thanks Andrew, I'll notify IBM devs to review this BZ ASAP.
Checked with /home/fedora/n410/openshift-install:

openshift-install 4.12.0-0.nightly-2022-07-13-062839
built from commit 09e92dc201d741615420eb004cd8021b44a25f67
release image registry.ci.openshift.org/ocp/release@sha256:63bc1950bb6e14a817d7b9415dde32dfd1a995a8aa07f5e5b6e7bbff2aae5bcf
release architecture amd64

Copied the metadata.json into a folder and ran 'openshift-install destroy cluster --dir ${1} --log-level debug'. The resource group does not exist.

cat metadata.json:
{"clusterName":"logci4123","clusterID":"44932a36-054d-41a4-8d5a-857a424a00c6","infraID":"logci4123-5z7j9","ibmcloud":{"accountID":"fdc2e14cf8bc4d53a67f972dc2e2c861","baseDomain":"ibmcloud.qe.devcluster.openshift.com","cisInstanceCRN":"crn:v1:bluemix:public:internet-svcs:global:a/fdc2e14cf8bc4d53a67f972dc2e2c861:e8ee6ca1-4b31-4307-8190-e67f6925f83b::","region":"us-east","resourceGroupName":"logci4123-5z7j9"}}

destroy cluster failed with the following output:

DEBUG OpenShift Installer 4.12.0-0.nightly-2022-07-13-062839
DEBUG Built from commit 09e92dc201d741615420eb004cd8021b44a25f67
DEBUG Listing virtual service instances
DEBUG Listing virtual service instances
INFO Listing disks
DEBUG All disks fetched
DEBUG Listing load balancers
DEBUG Listing subnets
DEBUG Listing public gateways
INFO Skipping deletion of security groups with generated VPC
DEBUG Listing images
DEBUG Listing floating IPs
DEBUG Listing dedicated hosts
DEBUG Listing VPCs
DEBUG Listing VPCs
E0713 07:10:19.870835  341712 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 115 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x41d8f80?, 0xc000ec27b0})
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x86
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00010c240?})
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x41d8f80, 0xc000ec27b0})
    /usr/lib/golang/src/runtime/panic.go:838 +0x207
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).ResourceGroupID(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:343 +0x388
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).listDedicatedHosts(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/dedicatedhost.go:24 +0xc5
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyDedicatedHosts(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/dedicatedhost.go:174 +0x36
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction.func1()
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:186 +0x3f
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x18, 0xc000484800})
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:220 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x19f0ba28?, 0xc00012a000?}, 0xc00069f690?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:233 +0x57
k8s.io/apimachinery/pkg/util/wait.poll({0x19f0ba28, 0xc00012a000}, 0xd0?, 0x1108625?, 0x30?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:580 +0x38
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfiniteWithContext({0x19f0ba28, 0xc00012a000}, 0x40d687?, 0x28?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:566 +0x49
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfinite(0x0?, 0x0?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:555 +0x46
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction(0xc000bc2360, {{0x46ac765?, 0xc00069f7d0?}, 0xc00090af10?}, 0xc0008fe660?, 0xc0008fe660?)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:183 +0x108
created by github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyCluster
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:157 +0xb3b
panic: runtime error: index out of range [0] with length 0 [recovered]
    panic: runtime error: index out of range [0] with length 0

goroutine 115 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00010c240?})
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0xd8
panic({0x41d8f80, 0xc000ec27b0})
    /usr/lib/golang/src/runtime/panic.go:838 +0x207
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).ResourceGroupID(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:343 +0x388
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).listDedicatedHosts(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/dedicatedhost.go:24 +0xc5
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyDedicatedHosts(0xc000bc2360)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/dedicatedhost.go:174 +0x36
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction.func1()
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:186 +0x3f
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1({0x18, 0xc000484800})
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:220 +0x1b
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x19f0ba28?, 0xc00012a000?}, 0xc00069f690?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:233 +0x57
k8s.io/apimachinery/pkg/util/wait.poll({0x19f0ba28, 0xc00012a000}, 0xd0?, 0x1108625?, 0x30?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:580 +0x38
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfiniteWithContext({0x19f0ba28, 0xc00012a000}, 0x40d687?, 0x28?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:566 +0x49
k8s.io/apimachinery/pkg/util/wait.PollImmediateInfinite(0x0?, 0x0?)
    /go/src/github.com/openshift/installer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:555 +0x46
github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).executeStageFunction(0xc000bc2360, {{0x46ac765?, 0xc00069f7d0?}, 0xc00090af10?}, 0xc0008fe660?, 0xc0008fe660?)
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:183 +0x108
created by github.com/openshift/installer/pkg/destroy/ibmcloud.(*ClusterUninstaller).destroyCluster
    /go/src/github.com/openshift/installer/pkg/destroy/ibmcloud/ibmcloud.go:157 +0xb3b
I performed some additional investigation and testing and came up with the following PR: https://github.com/openshift/installer/pull/6152

In cases where the metadata.json contains an incorrect/non-existing ResourceGroupName, or an empty string, the installer will now exit safely with a failure, for example:

# bin/openshift-install destroy cluster --dir bz_2061947/bz2061947-no-rg-2
FATAL Failed to destroy cluster: No ResourceGroupName provided

# bin/openshift-install destroy cluster --dir bz_2061947/bz2061947-no-rg-2
FATAL Failed to destroy cluster: ResourceGroup '"bz2061947-no-rg-not-rg-tqcmc"' not found

Rather than returning successfully, which would end up removing the metadata.json, a failure was preferred. This protects users who have, say, the wrong IC_API_KEY or account setup: if the installer cannot find the ResourceGroup (because it exists in another account or is only accessible via another IC_API_KEY), returning success without destroying anything would remove the metadata.json file, making it harder to reconstruct it and perform the destroy against the proper account. A sketch of the guard follows below.
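For illustration only, here is a minimal sketch (not the code from the PR) of the kind of guard that turns the out-of-range panic into the clean failures shown above; the lookup helper and its signature are assumptions made for this example:

package main

import (
	"errors"
	"fmt"
)

// lookupResourceGroups is a hypothetical stand-in for the IBM Cloud
// resource-group lookup by name; it returns zero or more matching IDs.
func lookupResourceGroups(name string) ([]string, error) {
	return nil, nil // pretend the account has no group with this name
}

// resourceGroupID mirrors the guarded behaviour described above: fail with a
// clear error instead of indexing an empty result slice.
func resourceGroupID(resourceGroupName string) (string, error) {
	if resourceGroupName == "" {
		return "", errors.New("No ResourceGroupName provided")
	}
	groups, err := lookupResourceGroups(resourceGroupName)
	if err != nil {
		return "", err
	}
	if len(groups) == 0 {
		// Without this check, groups[0] is exactly the
		// "index out of range [0] with length 0" panic from the stack traces.
		return "", fmt.Errorf("ResourceGroup %q not found", resourceGroupName)
	}
	return groups[0], nil
}

func main() {
	if _, err := resourceGroupID("abutcher-test-czpjs"); err != nil {
		fmt.Println("FATAL Failed to destroy cluster:", err)
	}
}

The point of the guard is that a missing or unknown resource group surfaces as an error the user can act on, rather than a panic or a silent success that deletes the metadata.json.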
Tested with ./openshift-install 4.12.0-0.ci-2022-08-23-112842 (built from commit f84ce649a1e8cba455fb2411ca9abc00050a1e01).

1. Tried to destroy with a metadata.json whose resource group does not exist; the installer exits with a failure:

$ timeout 20m ./openshift-install destroy cluster --dir ${1} --log-level debug
DEBUG OpenShift Installer 4.12.0-0.ci-2022-08-23-112842
DEBUG Built from commit f84ce649a1e8cba455fb2411ca9abc00050a1e01
FATAL Failed to destroy cluster: ResourceGroup '"rioliu-20423-khqx8"' not found

2. With a metadata.json whose resource group exists but contains no resources, destroy succeeds and the resource group is deleted.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399