Bug 2055359
| Summary: | [DR] ramen-dr-cluster-operator pod go into CrashLoopBackOff state during failover or relocate operation | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Sidhant Agrawal <sagrawal> |
| Component: | odf-dr | Assignee: | Shyamsundar <srangana> |
| odf-dr sub component: | ramen | QA Contact: | Sidhant Agrawal <sagrawal> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | unspecified | CC: | kramdoss, madam, mmuench, muagarwa, ocs-bugs, odf-bz-bot, prsurve, rperiyas, srangana |
| Version: | 4.10 | Keywords: | AutomationBackLog |
| Target Milestone: | ODF Blockers Only (Development Freeze) | ||
| Target Release: | ODF 4.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.10.0-167 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-04-21 09:12:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Please backport to 4.10 |
Description of problem (please be detailed as possible and provide log snippests): After initiating failover or relocate operation, ramen-dr-cluster-operator pod goes into CrashLoopBackOff state and restarts ~6 times before coming to Running state. Also observed the panic "panic: odd number of arguments passed as key-value pairs for logging" in the pod logs. Output from one of the managed cluster: ``` 2022-02-16T18:22:18.943Z DPANIC controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:1671 odd number of arguments passed as key-value pairs for logging {"VolumeReplicationGroup": "busybox-sagrawal-c1/busybox-drpc", "State": "secondary", "pvc": "busybox-sagrawal-c1/busybox-pvc-2", "ignored key": "secondary"} github.com/ramendr/ramen/controllers.(*VRGInstance).updateVR /remote-source/app/controllers/volumereplicationgroup_controller.go:1671 github.com/ramendr/ramen/controllers.(*VRGInstance).createOrUpdateVR /remote-source/app/controllers/volumereplicationgroup_controller.go:1640 github.com/ramendr/ramen/controllers.(*VRGInstance).processVRAsSecondary /remote-source/app/controllers/volumereplicationgroup_controller.go:1586 github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVRAsSecondary /remote-source/app/controllers/volumereplicationgroup_controller.go:1162 github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVRsAsSecondary /remote-source/app/controllers/volumereplicationgroup_controller.go:1128 github.com/ramendr/ramen/controllers.(*VRGInstance).handleVRGMode /remote-source/app/controllers/volumereplicationgroup_controller.go:983 github.com/ramendr/ramen/controllers.(*VRGInstance).processAsSecondary /remote-source/app/controllers/volumereplicationgroup_controller.go:1082 github.com/ramendr/ramen/controllers.(*VRGInstance).processVRGActions /remote-source/app/controllers/volumereplicationgroup_controller.go:412 github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG /remote-source/app/controllers/volumereplicationgroup_controller.go:398 github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile /remote-source/app/controllers/volumereplicationgroup_controller.go:312 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:298 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:214 2022-02-16T18:22:19.032Z INFO controllers.VolumeReplicationGroup.vrginstance zapcore/entry.go:234 Exiting processing VolumeReplicationGroup {"VolumeReplicationGroup": "busybox-sagrawal-c1/busybox-drpc", "State": "secondary"} 2022-02-16T18:22:19.032Z INFO controllers.VolumeReplicationGroup zapcore/entry.go:234 Exiting reconcile loop {"VolumeReplicationGroup": "busybox-sagrawal-c1/busybox-drpc"} panic: odd number of arguments passed as key-value pairs for logging goroutine 565 [running]: go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc000af8480, 0xc00099c540, 0x1, 0x1) /remote-source/deps/gomod/pkg/mod/go.uber.org/zap.1/zapcore/entry.go:234 +0x532 go.uber.org/zap.(*Logger).DPanic(0xc000c9aae0, 0x1dc24e8, 0x3d, 0xc00099c540, 0x1, 0x1) /remote-source/deps/gomod/pkg/mod/go.uber.org/zap.1/logger.go:220 +0x85 github.com/go-logr/zapr.handleFields(0xc000c9aae0, 0xc000d17830, 0x3, 0x3, 0x0, 0x0, 0x0, 0x10, 0x1a5d760, 0x1) /remote-source/deps/gomod/pkg/mod/github.com/go-logr/zapr.0/zapr.go:100 +0x5e5 github.com/go-logr/zapr.(*zapLogger).Info(0xc00014aab0, 0x1d940f6, 0x29, 0xc000d17830, 0x3, 0x3) /remote-source/deps/gomod/pkg/mod/github.com/go-logr/zapr.0/zapr.go:127 +0xb0 github.com/ramendr/ramen/controllers.(*VRGInstance).updateVR(0xc000af8cc0, 0xc000db3180, 0x1d5da48, 0x9, 0x207dfc8, 0xc00014aab0, 0xd, 0x20989e0, 0xc000db3180) /remote-source/app/controllers/volumereplicationgroup_controller.go:1671 +0x6e3 github.com/ramendr/ramen/controllers.(*VRGInstance).createOrUpdateVR(0xc000af8cc0, 0xc0005d53c8, 0x13, 0xc0002f1550, 0xd, 0x1d5da48, 0x9, 0x207dfc8, 0xc00014aab0, 0xc0009f9518, ...) /remote-source/app/controllers/volumereplicationgroup_controller.go:1640 +0x957 github.com/ramendr/ramen/controllers.(*VRGInstance).processVRAsSecondary(...) /remote-source/app/controllers/volumereplicationgroup_controller.go:1586 github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVRAsSecondary(0xc000af8cc0, 0xc000517d98, 0x207dfc8, 0xc00014aab0, 0x2070000) /remote-source/app/controllers/volumereplicationgroup_controller.go:1162 +0xd2 github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVRsAsSecondary(0xc000af8cc0, 0x0) /remote-source/app/controllers/volumereplicationgroup_controller.go:1128 +0x26b github.com/ramendr/ramen/controllers.(*VRGInstance).handleVRGMode(0xc000af8cc0, 0x1d5da48, 0x9, 0x0) /remote-source/app/controllers/volumereplicationgroup_controller.go:983 +0x115 github.com/ramendr/ramen/controllers.(*VRGInstance).processAsSecondary(0xc000af8cc0, 0xc0005b4d00, 0x0, 0x0, 0x0) /remote-source/app/controllers/volumereplicationgroup_controller.go:1082 +0x3fe github.com/ramendr/ramen/controllers.(*VRGInstance).processVRGActions(0xc000af8cc0, 0x0, 0x0, 0x431ae7, 0x585585585585) /remote-source/app/controllers/volumereplicationgroup_controller.go:412 +0x1c5 github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG(0xc000af8cc0, 0xc000af8cf0, 0xc000c82d80, 0xc00061abe8, 0x13) /remote-source/app/controllers/volumereplicationgroup_controller.go:398 +0x74e github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile(0xc000bb8300, 0x2073c78, 0xc000c82d80, 0xc00061abe8, 0x13, 0xc000b872d0, 0xc, 0xc000c82d00, 0x0, 0x0, ...) /remote-source/app/controllers/volumereplicationgroup_controller.go:312 +0x710 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000b43ea0, 0x2073bd0, 0xc000d30000, 0x1b72b60, 0xc000d2a440) /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:298 +0x30d sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000b43ea0, 0x2073bd0, 0xc000d30000, 0xc000594d00) /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:253 +0x205 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2(0xc0002f04b0, 0xc000b43ea0, 0x2073bd0, 0xc000d30000) /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:214 +0x6b created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.0-beta.4/pkg/internal/controller/controller.go:210 +0x425 ``` Version of all relevant components (if applicable): OCP: 4.10.0-0.nightly-2022-02-11-123954 ODF: 4.10.0-156 Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)? Is there any workaround available to the best of your knowledge? NA Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)? 3 Can this issue reproducible? Yes Can this issue reproduce from the UI? Yes If this is a regression, please provide more details to justify this: Steps to Reproduce: 1. Configure RDR setup with 1 ACM hub and 2 managed clusters 2. Deploy workload 3. Initiate failover or relocate action and observe the ramen-dr-cluster-operator pod where the workload was running Actual results: ramen-dr-cluster-operator pod goes into CrashLoopBackOff state due to panic during failover or relocate operations Expected results: ramen-dr-cluster-operator pod should not go into CrashLoopBackOff state during failover or relocate operations Additional info: