Bug 1737878
Summary: | Failed to install app-migration component by mig-operator on ocp 3.7-3.10 | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Zihan Tang <zitang> |
Component: | Migration Tooling | Assignee: | Derek Whatley <dwhatley> |
Status: | CLOSED ERRATA | QA Contact: | Zhang Cheng <chezhang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4.2.0 | CC: | chezhang, dymurray, jmontleo, rpattath, sregidor, xjiang |
Target Milestone: | --- | ||
Target Release: | 4.2.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-10-16 06:34:52 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description (Zihan Tang, 2019-08-06 10:17:46 UTC)
The error message "Unable to update the status to mark cr as running" shows up in an operator-sdk issue matching our scenario of attempting to run an ansible-operator on OpenShift 3.10: https://github.com/operator-framework/operator-sdk/issues/1124

After speaking with some operator-sdk maintainers (Shawn and Fabian), we determined that operator-sdk does not support 3.7-3.10; support starts at 3.11. This means we will need a different way to deploy the app migration solution onto earlier versions of OpenShift.

Tests we performed showed that this error is not related to recent changes in mig-operator. We tested with the 'htb1' image tag, a known-good tag of the operator, and saw the same issue in the operator logs. Interestingly, we were able to run mig-operator successfully on Origin 3.10, but hit this issue on OCP 3.10. The official word from the operator-sdk devs is that ansible-operator on 3.10 _may_ work in some cases, but we should not rely on it, and it definitely will not work on earlier versions of OpenShift.

OK, we can use OCP 3.11 to continue testing while waiting for the docs on installing the mig-controller components on OCP 3.7-3.10.

Hi Derek, what's the status of this bug? We have tested migration on OCP 3.10 and 3.11, and we are planning to test on OCP 3.9 and 3.7 this week. Does mig-operator work for OCP 3.9 or OCP 3.7 now? If the operator does not work, can you provide another workaround to install the migration components on OCP 3.7 or 3.9?

I used the latest operator.yml (https://raw.githubusercontent.com/fusor/mig-operator/master/operator.yml) on OCP 3.10, and it also failed to create the operator pod.
```
# oc get pod -n openshift-migration-operator
NAME                                 READY   STATUS             RESTARTS   AGE
migration-operator-978959bfc-r4m5m   1/2     CrashLoopBackOff   7          12m

# oc get pod -o yaml | grep image
image: quay.io/ocpmigrate/mig-operator:latest
imagePullPolicy: Always
image: quay.io/ocpmigrate/mig-operator:latest
imagePullPolicy: Always
imagePullSecrets:
image: quay.io/ocpmigrate/mig-operator:latest
imageID: docker-pullable://quay.io/ocpmigrate/mig-operator@sha256:e81b8ee3ae8572b6562e308ee7b7b74e86a6914f1468830cd1f5d8e0372f1888
image: quay.io/ocpmigrate/mig-operator:latest
imageID: docker-pullable://quay.io/ocpmigrate/mig-operator@sha256:e81b8ee3ae8572b6562e308ee7b7b74e86a6914f1468830cd1f5d8e0372f1888

# oc logs -f migration-operator-978959bfc-r4m5m -c operator
{"level":"info","ts":1566368029.972873,"logger":"cmd","msg":"Go Version: go1.12.7"}
{"level":"info","ts":1566368029.9729533,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1566368029.9729784,"logger":"cmd","msg":"Version of operator-sdk: v0.9.0"}
{"level":"info","ts":1566368029.973028,"logger":"cmd","msg":"Watching namespace.","Namespace":"openshift-migration-operator"}
{"level":"info","ts":1566368030.0616763,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1566368030.1370168,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1566368030.1370509,"logger":"leader","msg":"Continuing as the leader."}
{"level":"error","ts":1566368030.2092516,"logger":"cmd","msg":"Exposing metrics port failed.","Namespace":"openshift-migration-operator","error":"failed to initialize service object for metrics: replicasets.extensions \"migration-operator-978959bfc\" is forbidden: User \"system:serviceaccount:openshift-migration-operator:migration-operator\" cannot get replicasets.extensions in the namespace \"openshift-migration-operator\": User \"system:serviceaccount:openshift-migration-operator:migration-operator\" cannot get replicasets.extensions in project \"openshift-migration-operator\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr.1/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible.Run\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/run.go:103\ngithub.com/operator-framework/operator-sdk/cmd/operator-sdk/run.newRunAnsibleCmd.func1\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/run/ansible.go:38\ngithub.com/spf13/cobra.(*Command).execute\n\tpkg/mod/github.com/spf13/cobra.3/command.go:762\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tpkg/mod/github.com/spf13/cobra.3/command.go:852\ngithub.com/spf13/cobra.(*Command).Execute\n\tpkg/mod/github.com/spf13/cobra.3/command.go:800\nmain.main\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/main.go:85\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/proc.go:200"}
Error: failed to initialize service object for metrics: replicasets.extensions "migration-operator-978959bfc" is forbidden: User "system:serviceaccount:openshift-migration-operator:migration-operator" cannot get replicasets.extensions in the namespace "openshift-migration-operator": User "system:serviceaccount:openshift-migration-operator:migration-operator" cannot get replicasets.extensions in project "openshift-migration-operator"
Usage:
  operator-sdk run ansible [flags]

Flags:
  -h, --help                        help for ansible
      --inject-owner-ref            The ansible operator will inject owner references unless this flag is false (default true)
      --max-workers int             Maximum number of workers to use. Overridden by environment variable. (default 1)
      --reconcile-period duration   Default reconcile period for controllers (default 1m0s)
      --watches-file string         Path to the watches file to use (default "./watches.yaml")
      --zap-devel                   Enable zap development mode (changes defaults to console encoder, debug log level, and disables sampling)
      --zap-encoder encoder         Zap log encoding ('json' or 'console')
      --zap-level level             Zap log level (one of 'debug', 'info', 'error' or any integer value > 0) (default info)
      --zap-sample sample           Enable zap log sampling. Sampling will be disabled for integer log levels > 1

Global Flags:
      --verbose   Enable verbose logging
```

Make sure you delete the previous migration-operator clusterrolebinding.

I deleted the mig namespace, did an `oc create -f operator.yml`, and saw the following two messages:

```
Error from server (AlreadyExists): error when creating "operator.yml": customresourcedefinitions.apiextensions.k8s.io "migrationcontrollers.migration.openshift.io" already exists
Error from server (AlreadyExists): error when creating "operator.yml": clusterrolebindings.rbac.authorization.k8s.io "migration-operator" already exists
```

Of course, the old version of the clusterrolebinding references the service account from the old namespace, so it does not grant the service account in the new namespace the appropriate permissions, and I saw the same error. I deleted the clusterrolebinding and namespace, tried again, and this time everything came up properly.

Thanks, the operator now works for OCP 3.9 and OCP 3.10.

Hey Zihan, this BZ should have been resolved by https://github.com/fusor/mig-operator/pull/34. Could you please confirm that you're able to deploy to 3.7-3.11, and also 4.x, successfully now?

Verified.
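The cleanup described above can be sketched as a short shell sequence. This is a hedged sketch, not an official procedure: the resource names are taken from the AlreadyExists errors in this report, and the operator.yml URL is the one quoted earlier in this bug. Adjust namespace and manifest paths to your environment before running.

```shell
#!/bin/sh
# Sketch: remove leftover cluster-scoped resources from a previous
# mig-operator install before re-creating it.
# Cluster-scoped objects (CRDs, clusterrolebindings) survive a namespace
# delete, and the stale clusterrolebinding points at the old namespace's
# service account, so the new service account lacks permissions.
set -eu

oc delete clusterrolebinding migration-operator --ignore-not-found
oc delete crd migrationcontrollers.migration.openshift.io --ignore-not-found
oc delete namespace openshift-migration-operator --ignore-not-found

# Re-create everything from the operator manifest.
oc create -f https://raw.githubusercontent.com/fusor/mig-operator/master/operator.yml

# Confirm the operator pod comes up (2/2 Running once healthy).
oc get pods -n openshift-migration-operator
```

Because `oc create` fails on objects that already exist, deleting the stale cluster-scoped objects first is what lets the manifest apply cleanly.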
The Migration Tooling can now be installed on OCP 3.7, 3.9, 3.10, 3.11, and 4.2 using https://raw.githubusercontent.com/fusor/mig-operator/master/operator.yml

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922