Description of problem:
ACM multicluster-operators-hub pods are failing with a "panic" error message while running a subscription update.

Version-Release number of selected component (if applicable):
ACM Version: 2.0.4

How reproducible:
The following error message was found in the multicluster-operators-hub pod log multicluster-operators-hub-subscription-84cd7cd4bb-nznx6-multicluster-operators-hub-subscription.log:

"E0125 15:15:32.899139 1 runtime.go:78] Observed a panic: &runtime.TypeAssertionError{_interface:(*runtime._type)(0x2ae4ba0), concrete:(*runtime._type)(nil), asserted:(*runtime._type)(0x2a58a40), missingMethod:""} (interface conversion: interface {} is nil, not int64)"

It seems that the code encounters a nil value where an int64 is expected. Not sure what would cause that. Could you help take a look? Thanks in advance.
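For context, the message in the log is what the Go runtime reports when an unchecked type assertion is applied to a nil interface value. The sketch below reproduces that class of panic in isolation. It is only an illustration: `replicasUnchecked`, `describePanic`, and the `replicas` map key are hypothetical names chosen for the example, not code taken from the operator.

```go
package main

import "fmt"

// replicasUnchecked mimics the failing pattern: a bare type assertion on a
// value that may be nil. The "replicas" key is a hypothetical stand-in.
func replicasUnchecked(spec map[string]interface{}) int64 {
	// A missing key yields a nil interface{}, so this assertion panics.
	return spec["replicas"].(int64)
}

// describePanic runs f and returns the recovered panic message,
// or "" if f returned normally.
func describePanic(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	empty := map[string]interface{}{}
	// Prints the runtime's interface-conversion error for a nil value.
	fmt.Println(describePanic(func() { replicasUnchecked(empty) }))
}
```

Running this prints an interface-conversion error of the same form as the one in the pod log, which suggests the operator read a field that was absent from the object it was processing.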
It looks like the `multicluster-operators-hub-subscription` pod(s) are failing. Assigning App Lifecycle to triage.
G2Bsync 769175729 comment mikeshng Thu, 28 Jan 2021 15:45:25 UTC G2Bsync could you please provide more log entries around the panic error?
@pengbo Could you provide the full log multicluster-operators-hub-subscription-84cd7cd4bb-nznx6-multicluster-operators-hub-subscription.log? It should include the panic stack trace, which indicates the exact code line number. Also, I noticed the panic happened in 2.0.Z. Could you upgrade to 2.1 to see whether the issue goes away?
Created attachment 1751930 multicluster-operators-hub-subscription-84cd7cd4bb-nznx6-multicluster-operators-hub-subscription.log The attachment is the multicluster-operators-hub-subscription-84cd7cd4bb-nznx6-multicluster-operators-hub-subscription.log file. I will also ask the customer whether they can upgrade to ACM 2.1.x. Thanks pb
G2Bsync 769357729 comment mikeshng Thu, 28 Jan 2021 20:19:16 UTC G2Bsync There are not enough log entries to know for sure the exact problematic spot but there seems to be only one `int64` reference in the entire repo. So given that info, fixes have been made to 2.0, 2.1, 2.2 and master branches.
Thanks Peng, code fix has been merged to 2.0, 2.1, 2.2.
Just to clarify, the fix has been merged in all three 2.x branches, but until an actual release is out, this problem might still happen. To avoid the issue in the meantime, make sure the spec replicas value is populated with an integer value.
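As an illustration of why a populated integer `spec.replicas` avoids the panic, here is a minimal sketch of a guarded read that returns an error instead of panicking. It assumes the object is held as nested `map[string]interface{}` values with an `int64` replicas field, as the panic message suggests. `replicasFrom` is a hypothetical helper and is not the merged fix.

```go
package main

import (
	"errors"
	"fmt"
)

// replicasFrom is a hypothetical helper, not the operator's actual code.
// It reads spec.replicas without ever using an unchecked type assertion.
func replicasFrom(obj map[string]interface{}) (int64, error) {
	// The comma-ok form reports failure instead of panicking.
	spec, ok := obj["spec"].(map[string]interface{})
	if !ok {
		return 0, errors.New("spec is missing or not an object")
	}
	switch v := spec["replicas"].(type) {
	case int64:
		return v, nil
	case nil:
		return 0, errors.New("spec.replicas is not set")
	default:
		return 0, fmt.Errorf("spec.replicas has type %T, want int64", v)
	}
}

func main() {
	withReplicas := map[string]interface{}{
		"spec": map[string]interface{}{"replicas": int64(2)},
	}
	withoutReplicas := map[string]interface{}{
		"spec": map[string]interface{}{},
	}
	fmt.Println(replicasFrom(withReplicas))
	fmt.Println(replicasFrom(withoutReplicas))
}
```

The second call corresponds to the situation the workaround avoids: the field is absent, so a bare assertion would panic, while the guarded version surfaces an error the caller can log.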
G2Bsync 769971367 comment xiangjingli Fri, 29 Jan 2021 18:25:19 UTC G2Bsync @ pengbo.com. Thanks Peng for the log. It turns out the fix does address the panic. The fix has been merged to 2.0 and 2.1 for the Z release. We currently plan to have 2.0.8 GA on Mar 11.
OK, I will pass your message "We plan to have 2.0.8 GA on Mar 11 right now" on to the customer. One more question: is the fix already in ACM 2.1.x? If the customer has already upgraded to 2.1.x as we suggested last time, the problem should be gone, right? Thanks
Hi Peng, just to clarify: the fix is not in the current 2.1 release. The customer will need to wait for a new 2.1 release, just as they would for the new 2.0.8 release.
Peng, FYI, the 2.1.3 release is planned for Feb 17. Perhaps moving up to 2.1.3 is the best course of action. Thanks.
yes, this is another panic. I noticed you have created a new bugzilla #1925281
(In reply to Xiangjing Li from comment #13)
> yes, this is another panic. I noticed you have created a new bugzilla
> #1925281

I did, thanks for the follow up! After chatting in the forum, it seemed like the similarities between the two were only very loose, so a separate bug made sense.
Ezequiel, the fix will be available in ACM 2.0.8. If you want to patch your ACM cluster before ACM 2.0.8 is released, here are the instructions.

In the open-cluster-management namespace on the ACM hub cluster, edit the advanced-cluster-management CSV for your installed version (the command below uses v2.0.4):

oc edit csv advanced-cluster-management.v2.0.4 -n open-cluster-management

Look for the containers multicluster-operators-standalone-subscription and multicluster-operators-hub-subscription and update their images to quay.io/open-cluster-management/multicluster-operators-subscription:TAG (it is recommended that you note the current SHA tag in case you want to revert the change). Replace TAG with 2.0.8-SNAPSHOT-2021-02-03-19-04-48, so the whole image URL is quay.io/open-cluster-management/multicluster-operators-subscription:2.0.8-SNAPSHOT-2021-02-03-19-04-48.

This will recreate the multicluster-operators-standalone-subscription-xxxxxxx and multicluster-operators-hub-subscription-xxxxxxx pods in the open-cluster-management namespace. Check that the new pods are running with the new container image.

After this, please let us know whether it fixes your problem. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHACM 2.0.Z multicluster-operators-subscription hotfix), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0514
@ebrizuel @gekis The hotfix should have been delivered. I am adding this comment to stop the daily reminder email from BZ :-)