Bug 1918799

Summary: Missing git repo secret causes multicluster-operators-hub-subscription to crash
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Ryan Cook <rcook>
Component: App Lifecycle Assignee: ian zhang <izhang>
Status: CLOSED ERRATA QA Contact: Rafat Islam <rislam>
Severity: high Docs Contact: bswope <bswope>
Priority: unspecified    
Version: rhacm-2.1 CC: gghezzo, rjung, xiangli
Target Milestone: --- Flags: rislam: qe_test_coverage+, gghezzo: rhacm-2.1.z+
Target Release: rhacm-2.1.3   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: rhacm-2.1.3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-17 18:19:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ryan Cook 2021-01-21 15:10:21 UTC
Description of problem: If you create an application and subscription within ACM and the secret for the git repository is not available, the hub subscription pod (multicluster-operators-hub-subscription) goes into a crash loop. This can cause cascading issues because it breaks application deployment and updates.


Version-Release number of selected component (if applicable): 2.1


How reproducible:
Deploy all application components (channel, app, placement, subscription) using a private repository and don't provide a secret.

Steps to Reproduce:
1. Create a private repo
2. Create an ACM application that uses the repo, without providing a secret
3. Get the logs of the hub subscription pod
oc logs -f multicluster-operators-hub-subscription-769d89fb95-mfhv2 -n open-cluster-management

Actual results:
E0121 15:04:49.018617       1 gitrepo.go:198] Secret "github-gitops-clusters" not foundUnable to get secret from local cluster.
E0121 15:04:49.018630       1 hub_git.go:323] subscription-hub-reconciler "msg"="failed to register subscription to git watcher register" "error"="Secret \"github-gitops-clusters\" not found"  
I0121 15:04:49.018648       1 mcmhub_controller.go:705] subscription-hub-reconciler "msg"="Enter finalCommit..."  
I0121 15:04:49.018684       1 mcmhub_controller.go:733] subscription-hub-reconciler "msg"="spec or metadata of scribe-system/scribe is updated"  
I0121 15:04:49.018703       1 mcmhub_controller.go:769] subscription-hub-reconciler "msg"="no post hooks, exit the reconcile."  
I0121 15:04:49.018722       1 mcmhub_controller.go:770] subscription-hub-reconciler "msg"="Eixt finalCommit..."  
I0121 15:04:49.018738       1 panic.go:679] subscription-hub-reconciler/scribe-system/scribe "msg"="exist Hub Reconciling subscription: scribe-system/scribe"  
E0121 15:04:49.018800       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 3621 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x2dcc520, 0x4a6bb70)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/runtime/runtime.go:48 +0x82
panic(0x2dcc520, 0x4a6bb70)
	/usr/lib/golang/src/runtime/panic.go:679 +0x1b2
github.com/open-cluster-management/multicloud-operators-subscription/pkg/controller/mcmhub.(*HubGitOps).GetLatestCommitID(0xc000aa9500, 0xc000315440, 0x3166707, 0x25, 0xc000fbb2d8, 0x1)
	/remote-source/app/pkg/controller/mcmhub/hub_git.go:457 +0x176
github.com/open-cluster-management/multicloud-operators-subscription/pkg/controller/mcmhub.(*ReconcileSubscription).UpdateGitDeployablesAnnotation(0xc0008a8000, 0xc000315440, 0xc0005791e0, 0x6, 0x0)
	/remote-source/app/pkg/controller/mcmhub/gitrepo_sync.go:75 +0x1f4
github.com/open-cluster-management/multicloud-operators-subscription/pkg/controller/mcmhub.(*ReconcileSubscription).doMCMHubReconcile(0xc0008a8000, 0xc000315440, 0x0, 0x0)
	/remote-source/app/pkg/controller/mcmhub/hub.go:72 +0x37c
github.com/open-cluster-management/multicloud-operators-subscription/pkg/controller/mcmhub.(*ReconcileSubscription).Reconcile(0xc0008a8000, 0xc00116afd0, 0xd, 0xc00116afb4, 0x6, 0xc000a3e400, 0x0, 0x0, 0x0)
	/remote-source/app/pkg/controller/mcmhub/mcmhub_controller.go:565 +0xec0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0006e0990, 0x2e80c20, 0xc0009d9460, 0x10e7c00)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:235 +0x27d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0006e0990, 0x0)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:209 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0006e0990)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00125a2a0)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:155 +0x5e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00125a2a0, 0x35ad420, 0xc000a51770, 0x1, 0xc0000d4600)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00125a2a0, 0x3b9aca00, 0x0, 0x1, 0xc0000d4600)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:133 +0xaa
k8s.io/apimachinery/pkg/util/wait.Until(0xc00125a2a0, 0x3b9aca00, 0xc0000d4600)
	/remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery.3/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:170 +0x431
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x28ab226]
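
The trace above can be read as: registration with the git watcher failed, a later lookup came back nil, and the nil value was dereferenced in GetLatestCommitID. The panic was then logged by HandleCrash and raised again, which is why the pod dies instead of only logging. The following is a minimal Go sketch of that failure pattern, not the operator's code; repoInfo, latestCommit, logAndRepanic and run are hypothetical names chosen for illustration, and the registry map is an assumption about how per-subscription state might be kept.

```go
package main

import "fmt"

// repoInfo is a hypothetical stand-in for per-subscription git state.
type repoInfo struct {
	url    string
	commit string
}

// logAndRepanic mimics the log-then-panic-again behaviour seen in the trace:
// the panic is reported, then raised again so the process still crashes.
func logAndRepanic() {
	if r := recover(); r != nil {
		fmt.Printf("Observed a panic: %v\n", r)
		panic(r)
	}
}

// latestCommit reads the registry entry without checking that it exists.
func latestCommit(registry map[string]*repoInfo, key string) string {
	defer logAndRepanic()
	info := registry[key] // nil when registration failed earlier
	return info.commit    // nil pointer dereference happens here
}

// run reports whether the lookup ended in a panic. The outer recover exists
// only so this demo can print a result instead of exiting.
func run(registry map[string]*repoInfo, key string) (crashed bool) {
	defer func() {
		if r := recover(); r != nil {
			crashed = true
		}
	}()
	latestCommit(registry, key)
	return false
}

func main() {
	// Assumption: nothing was registered because the secret was not found.
	registry := map[string]*repoInfo{}
	fmt.Println("crashed:", run(registry, "scribe-system/scribe"))
}
```

Without the outer recover in run, the re-raised panic would terminate the process, which matches the crash loop reported here.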


Expected results:
The application is skipped and the error is written to the logs and, potentially, the console.

Additional info:

Comment 2 Roke Jung 2021-01-27 22:53:41 UTC
The fix will be available in 2.1.3. With the fix, the subscription status contains an accurate error message.

Comment 3 Roke Jung 2021-01-27 22:59:22 UTC
You can wait for ACM 2.1.3 to verify it. If you need the fix earlier, follow this doc https://github.com/open-cluster-management/multicloud-operators-subscription/blob/master/docs/patching_subscription_image.md and use the community-2.1 tag instead of community-latest.

Comment 9 errata-xmlrpc 2021-02-17 18:19:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Advanced Cluster Management 2.1.3 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0607