Description of the problem:
ACM 2.4 install attempt to an OCP 4.9 IPv6 disconnected hub (IPI BM) fails because the multicluster-operators-standalone-subscription pod is in CrashLoopBackOff.

Operator snapshot version: 2.4.0-DOWNSTREAM-2021-08-31-23-32-56
OCP version: 4.9.0-0.nightly-2021-08-31-123131

Steps to reproduce:
1. Deploy an OCP 4.9 BM IPI hub in an IPv6 disconnected env
2. Mirror the ACM ds snapshot images and create a CatalogSource from the mirrored index image
3. Create the subscription + multicluster object for ACM

Actual results:
multicluster-operators-hub-subscription-76d697f8ff-p4r9w 1/1 Running 0 85m
multicluster-operators-standalone-subscription-6fffb5758-q2cvf 0/1 CrashLoopBackOff 19 (3m29s ago) 85m
multiclusterhub-operator-6dbdcd8f8d-8gbnh 1/1 Running 0 85m

Expected results:
All pods running and ACM installs successfully

Additional info:
Attached more detailed logs, but here are excerpts:

## oc get pods
multicluster-operators-channel-7966cc67dc-bbkkw 1/1 Running 1 (87m ago) 89m
multicluster-operators-hub-subscription-76d697f8ff-p4r9w 1/1 Running 0 89m
multicluster-operators-standalone-subscription-6fffb5758-q2cvf 0/1 CrashLoopBackOff 20 (100s ago) 89m
multiclusterhub-operator-6dbdcd8f8d-8gbnh 1/1 Running 0 89m

## oc logs from multicluster-operators-standalone-subscription-6fffb5758-q2cvf
I0901 18:08:04.709040 1 subscription.go:1074] setting auto-reconcile rate to low
E0901 18:08:04.714746 1 gitrepo.go:263] Get "https://github.com/open-cluster-management/acm-hive-openshift-releases.git/info/refs?service=git-upload-pack": dial tcp 140.82.112.4:443: connect: network is unreachable
Failed to git clone with the primary channel: Get "https://github.com/open-cluster-management/acm-hive-openshift-releases.git/info/refs?service=git-upload-pack": dial tcp 140.82.112.4:443: connect: network is unreachable
I0901 18:08:04.714838 1 panic.go:965] exit doSubscription: rhacm/hive-clusterimagesets-subscription-fast-0
E0901 18:08:04.715501 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 5810 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x228c2e0, 0x390a570)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:48 +0x86
panic(0x228c2e0, 0x390a570)
	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/open-cluster-management/multicloud-operators-subscription/pkg/utils.getConnectionOptions(0xc00241b888, 0x0, 0x1, 0xc00241b6e8, 0x3)
	/remote-source/app/pkg/utils/gitrepo.go:169 +0x73
github.com/open-cluster-management/multicloud-operators-subscription/pkg/utils.CloneGitRepo(0xc00241b888, 0x0, 0xc0015d8500, 0x0, 0x0)
	/remote-source/app/pkg/utils/gitrepo.go:266 +0x13c5
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).cloneGitRepo(0xc0016d2480, 0x29, 0xc00286e1f0, 0x28933f0, 0x1)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:764 +0x2b4
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).doSubscription(0xc0016d2480, 0x0, 0x0)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:204 +0x30c
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).doSubscriptionWithRetries(0xc0016d2480, 0x29e8d60800, 0x3)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:157 +0x45
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).Start.func1()
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:146 +0x18f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc001b33140)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001b33140, 0x28c8c80, 0xc001b2a8a0, 0x1, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc001b33140, 0x34630b8a000, 0x0, 0xc001d55e01, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc001b33140, 0x34630b8a000, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:90 +0x4d
created by github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).Start
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:128 +0x254
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1c60693]

goroutine 5810 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:55 +0x109
panic(0x228c2e0, 0x390a570)
	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/open-cluster-management/multicloud-operators-subscription/pkg/utils.getConnectionOptions(0xc00241b888, 0x0, 0x1, 0xc00241b6e8, 0x3)
	/remote-source/app/pkg/utils/gitrepo.go:169 +0x73
github.com/open-cluster-management/multicloud-operators-subscription/pkg/utils.CloneGitRepo(0xc00241b888, 0x0, 0xc0015d8500, 0x0, 0x0)
	/remote-source/app/pkg/utils/gitrepo.go:266 +0x13c5
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).cloneGitRepo(0xc0016d2480, 0x29, 0xc00286e1f0, 0x28933f0, 0x1)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:764 +0x2b4
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).doSubscription(0xc0016d2480, 0x0, 0x0)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:204 +0x30c
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).doSubscriptionWithRetries(0xc0016d2480, 0x29e8d60800, 0x3)
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:157 +0x45
github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).Start.func1()
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:146 +0x18f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc001b33140)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001b33140, 0x28c8c80, 0xc001b2a8a0, 0x1, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc001b33140, 0x34630b8a000, 0x0, 0xc001d55e01, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc001b33140, 0x34630b8a000, 0xc001a10360)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:90 +0x4d
created by github.com/open-cluster-management/multicloud-operators-subscription/pkg/subscriber/git.(*SubscriberItem).Start
	/remote-source/app/pkg/subscriber/git/git_subscriber_item.go:128 +0x254
It seems the issue is that the pod can't reach GitHub, which is expected as this is an IPv6 disconnected env. We do not run into this issue with ACM 2.3.

E0901 18:08:04.714746 1 gitrepo.go:263] Get "https://github.com/open-cluster-management/acm-hive-openshift-releases.git/info/refs?service=git-upload-pack": dial tcp 140.82.112.4:443: connect: network is unreachable
Failed to git clone with the primary channel: Get "https://github.com/open-cluster-management/acm-hive-openshift-releases.git/info/refs?service=git-upload-pack": dial tcp 140.82.112.4:443: connect: network is unreachable
As discussed with Chad, there seems to be a local git subscription that is causing this nil pointer dereference. The workaround is to remove that git subscription for now while we fix the nil pointer dereference.
The "connect: network is unreachable" error is not the reason why the standalone subscription pod has been crashing. The root cause is a nil pointer dereference that was introduced when we added the secondary channel. This has already been fixed in https://github.com/open-cluster-management/multicloud-operators-subscription/pull/564/commits/cc19997dbf5af010d73af08e7c42f64bfb77cf6f. I think you can try again using a more recent 2.4 development build, and the standalone subscription pod should not crash anymore.
Validating with latest 2.4 today...
Verified no more nil pointer crash on 2.4.0-DOWNSTREAM-2021-09-03-01-00-25:

multicluster-operators-standalone-subscription-78f8d9bb48-tqgvc 1/1 Running 0 124m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Advanced Cluster Management 2.4 images and security updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4618
*** Bug 2040500 has been marked as a duplicate of this bug. ***