Bug 1921559 - "directvolumemigration-stunnel-transfer" pod is forbidden when migrate app from ocp3.9 to ocp 4.7
Summary: "directvolumemigration-stunnel-transfer" pod is forbidden when migrate app fr...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Migration Toolkit for Containers
Classification: Red Hat
Component: Documentation
Version: 1.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 1.4.0
Assignee: Avital Pinnick
QA Contact: Xin jiang
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-28 07:54 UTC by whu
Modified: 2021-02-01 15:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-01 15:30:50 UTC
Target Upstream Version:
Embargoed:



Description whu 2021-01-28 07:54:06 UTC
Description of problem:
When migrating a basic nginx application from OCP 3.9 to OCP 4.7 in direct volume migration mode, the pod "directvolumemigration-stunnel-transfer" is forbidden because it cannot be validated against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000160000, 1000169999]]. The migration then hangs.

Version-Release number of selected component (if applicable):
MTC 1.4.0
registry.redhat.io/rhmtc/openshift-migration-rhel7-operator@sha256:79f524931e7188bfbfddf1e3d23f491b627d691ef7849a42432c7aec2d5f8a54
source cluster: AWS OCP 3.9
target cluster: AWS OCP 4.7  (controller)

How reproducible:
always


Steps to Reproduce:
1. Deploy the nginx application in the source cluster:
oc process -p LOGS_ACCESSMODE=ReadWriteOnce  -p LOGS_STORAGECLASS=gp2  -p HTML_ACCESSMODE=ReadWriteOnce  -p HTML_STORAGECLASS=gp2 -p namespace=ocp-22222-nginx -f https://gitlab.cee.redhat.com/app-mig/cam-helper/raw/master/ocp-24706/nginx_with_pv_template.yml  | oc create -f -

2. Create a migration plan, choose direct migration mode, and trigger the migration.

Actual results:
The pod "directvolumemigration-stunnel-transfer" is forbidden because it cannot be validated against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000160000, 1000169999]]. The migration then hangs.
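
To make the rejection easier to read, the following is a minimal sketch that pulls the requested UID and the allowed range out of the error text and compares them. The function name `check_scc_uid` is hypothetical and not part of MTC or `oc`; it only parses the message format shown in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MTC or oc): parse an SCC rejection message
# and report whether the requested runAsUser falls inside the allowed range.
check_scc_uid() {
  local msg="$1" uid lo hi
  # Requested UID, e.g. "0" in "runAsUser: Invalid value: 0"
  uid=$(printf '%s\n' "$msg" | sed -n 's/.*runAsUser: Invalid value: \([0-9][0-9]*\).*/\1/p')
  # Lower and upper bounds from "must be in the ranges: [lo, hi]"
  lo=$(printf '%s\n' "$msg" | sed -n 's/.*must be in the ranges: \[\([0-9][0-9]*\), *\([0-9][0-9]*\)].*/\1/p')
  hi=$(printf '%s\n' "$msg" | sed -n 's/.*must be in the ranges: \[\([0-9][0-9]*\), *\([0-9][0-9]*\)].*/\2/p')
  if [ -z "$uid" ] || [ -z "$lo" ] || [ -z "$hi" ]; then
    echo "unparsed"
  elif [ "$uid" -ge "$lo" ] && [ "$uid" -le "$hi" ]; then
    echo "uid $uid is inside [$lo, $hi]"
  else
    echo "uid $uid is outside [$lo, $hi]"
  fi
}

# Error text copied from this report.
sample='pods "directvolumemigration-stunnel-transfer" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000160000, 1000169999]]'
check_scc_uid "$sample"   # prints: uid 0 is outside [1000160000, 1000169999]
```

The pod asks to run as UID 0 (root), while the only SCC that matched allows UIDs in the namespace's assigned range, so the pod is never admitted.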

Expected results:
Migration should complete successfully.

Additional info:
[In source cluster]
$ oc get all -n ocp-22222-nginx
NAME                                               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/directvolumemigration-rsync-transfer-svc   ClusterIP   172.30.32.228    <none>        2222/TCP   15m
service/my-nginx                                   ClusterIP   172.30.117.113   <none>        8081/TCP   30m

NAME                               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-deployment   0         0         0            0           30m

NAME                                          DESIRED   CURRENT   READY     AGE
replicaset.apps/nginx-deployment-5cbc6548b7   0         0         0         30m

NAME                                HOST/PORT                                               PATH      SERVICES   PORT      TERMINATION   WILDCARD
route.route.openshift.io/my-nginx   my-nginx-ocp-22222-nginx.apps.0128-vqc.qe.rhcloud.com             my-nginx   8081                    None


[In target cluster]
$ oc get all -n ocp-22222-nginx
NAME                                       READY   STATUS    RESTARTS   AGE
pod/directvolumemigration-rsync-transfer   2/2     Running   0          17m

NAME                                               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/directvolumemigration-rsync-transfer-svc   ClusterIP   172.30.226.161   <none>        2222/TCP   17m

NAME                           HOST/PORT                                                            PATH   SERVICES                                   PORT   TERMINATION   WILDCARD
route.route.openshift.io/dvm   dvm-ocp-22222-nginx.apps.cam-tgt-14730.qe.devcluster.openshift.com          directvolumemigration-rsync-transfer-svc   2222   passthrough   None


$ oc get pvc -n ocp-22222-nginx
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nginx-html   Bound    pvc-033a0055-f785-4aee-acff-c4a3360d291b   1Gi        RWO            gp2            17m
nginx-logs   Bound    pvc-526ef3a4-964b-4e48-a48d-ea9f7570d4b6   1Gi        RWO            gp2            17m


$ oc get migplan mig-plan-22222-ocp-22222-nginx -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigPlan
metadata:
......
  name: mig-plan-22222-ocp-22222-nginx
  namespace: openshift-migration
  resourceVersion: "92095"
spec:
  destMigClusterRef:
    name: host
    namespace: openshift-migration
  migStorageRef:
    name: automatic
    namespace: openshift-migration
  namespaces:
  - ocp-22222-nginx
  persistentVolumes:
  - capacity: 1Gi
    name: pvc-ce95d048-6134-11eb-a710-0ed79d76533b
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: nginx-logs
      namespace: ocp-22222-nginx
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: gp2
    storageClass: gp2
    supported:
      actions:
      - copy
      - move
      copyMethods:
      - filesystem
      - snapshot
  - capacity: 1Gi
    name: pvc-ce9abcbb-6134-11eb-a710-0ed79d76533b
    pvc:
      accessModes:
      - ReadWriteOnce
      hasReference: true
      name: nginx-html
      namespace: ocp-22222-nginx
    selection:
      action: copy
      copyMethod: filesystem
      storageClass: gp2
    storageClass: gp2
    supported:
      actions:
      - copy
      - move
      copyMethods:
      - filesystem
      - snapshot
  srcMigClusterRef:
    name: source-cluster
    namespace: openshift-migration
status:
  conditions:
  .....
  - category: Required
    lastTransitionTime: "2021-01-28T06:50:22Z"
    message: The migration plan is ready.
    status: "True"
    type: Ready
  .....



$ oc get migmigration  mig-migration-22222-ocp-22222-nginx  -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: MigMigration
metadata:
......
  name: mig-migration-22222-ocp-22222-nginx
  namespace: openshift-migration
spec:
  migPlanRef:
    name: mig-plan-22222-ocp-22222-nginx
    namespace: openshift-migration
  quiescePods: true
  stage: false
status:
  conditions:
  - category: Advisory
    lastTransitionTime: "2021-01-28T07:02:17Z"
    message: 'Step: 36/47'
    reason: WaitForDirectVolumeMigrationToComplete
    status: "True"
    type: Running
  - category: Required
    lastTransitionTime: "2021-01-28T07:00:51Z"
    message: The migration is ready.
    status: "True"
    type: Ready
  itinerary: Final
  observedDigest: e2965b4d15f19f020e4671298cfb1fe797fa1804aaea52ef2b77ecda4292e407
  phase: WaitForDirectVolumeMigrationToComplete
  pipeline:
  - completed: "2021-01-28T07:01:29Z"
    message: Completed
    name: Prepare
    started: "2021-01-28T07:00:49Z"
  - completed: "2021-01-28T07:01:53Z"
    message: Completed
    name: Backup
    progress:
    - 'Backup openshift-migration/mig-migration-22222-ocp-22222-nginx-kc2v6: 34 out of estimated total of 34 objects backed up (6s)'
    started: "2021-01-28T07:01:29Z"
  - completed: "2021-01-28T07:02:12Z"
    message: Completed
    name: StageBackup
    started: "2021-01-28T07:01:53Z"
  - message: Skipped
    name: StageRestore
    skipped: true
  - completed: "2021-01-28T07:02:17Z"
    message: Waiting for Direct Image Migration to complete.
    name: DirectImage
    phase: WaitForDirectImageMigrationToComplete
    progress:
    - 0 total ImageStreams; 0 running; 0 successful; 0 failed
    started: "2021-01-28T07:02:12Z"
  - message: Creating Stunnel client pods on the source cluster
    name: DirectVolume
    phase: WaitForDirectVolumeMigrationToComplete
    started: "2021-01-28T07:02:17Z"
  - message: Not started
    name: Restore
  - message: Not started
    name: Cleanup
  startTimestamp: "2021-01-28T07:00:49Z"




$ oc get dvm  mig-migration-22222-ocp-22222-nginx-pj9ns -o yaml
apiVersion: migration.openshift.io/v1alpha1
kind: DirectVolumeMigration
metadata:
.....
  name: mig-migration-22222-ocp-22222-nginx-pj9ns
  namespace: openshift-migration
.....
spec:
  createDestinationNamespaces: true
  destMigClusterRef:
    name: host
    namespace: openshift-migration
  persistentVolumeClaims:
  - name: nginx-logs
    namespace: ocp-22222-nginx
    targetAccessModes:
    - ReadWriteOnce
    targetStorageClass: gp2
    verify: false
  - name: nginx-html
    namespace: ocp-22222-nginx
    targetAccessModes:
    - ReadWriteOnce
    targetStorageClass: gp2
    verify: false
  srcMigClusterRef:
    name: source-cluster
    namespace: openshift-migration
status:
  conditions:
  - category: Required
    lastTransitionTime: "2021-01-28T07:02:09Z"
    message: Direct migration is ready
    status: "True"
    type: Ready
  - category: Advisory
    durable: true
    lastTransitionTime: "2021-01-28T07:02:51Z"
    message: 'The migration has failed.  See: Errors.'
    reason: CreateStunnelClientPods
    status: "True"
    type: Failed
  errors:
  - 'pods "directvolumemigration-stunnel-transfer" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000160000, 1000169999]]'
  itinerary: VolumeMigration
  observedDigest: e20b882451fc7555400f4984dc51ef70c9f68bda422d4b960429ab048b26fca6
  phase: CreateStunnelClientPods
  phaseDescription: Creating Stunnel client pods on the source cluster
  startTimestamp: "2021-01-28T07:02:09Z"


Filtered error logs from migration-log-reader:

In source cluster
velero-67c9bbfcd7-jd2m9 velero time="2021-01-28T07:00:51Z" level=error msg="Error updating backup location phase" backupstoragelocation=automatic-ctcd6 controller=backupstoragelocation error="the server could not find the requested resource (patch backupstoragelocations.velero.io automatic-ctcd6)" logSource="pkg/controller/backupstoragelocation_controller.go:115"
velero-67c9bbfcd7-jd2m9 velero time="2021-01-28T07:00:55Z" level=error msg="Error patching backup location's last-synced time" backupLocation=automatic-gbs62 controller=backup-sync error="the server could not find the requested resource (patch backupstoragelocations.velero.io automatic-gbs62)" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/backup_sync_controller.go:311" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*backupSyncController).run" logSource="pkg/controller/backup_sync_controller.go:311"


In target cluster
migration-controller-f864d7cc5-2m9cp mtc {"level":"info","ts":1611817371.3278346,"logger":"direct|pwzfh","msg":"","direct":"openshift-migration/mig-migration-22222-ocp-22222-nginx-pj9ns","error":"pods \"directvolumemigration-stunnel-transfer\" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000160000, 1000169999]]","stacktrace":"\ngithub.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*Task).Run()\n\t/remote-source/app/pkg/controller/directvolumemigration/task.go:359\ngithub.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).migrate()\n\t/remote-source/app/pkg/controller/directvolumemigration/migrate.go:40\ngithub.com/konveyor/mig-controller/pkg/controller/directvolumemigration.(*ReconcileDirectVolumeMigration).Reconcile()\n\t/remote-source/app/pkg/controller/directvolumemigration/directvolumemigration_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1()\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil()\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until()\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit()\n\t/opt/rh/go-toolset-1.15/root/usr/lib/go-toolset-1.15-golang/src/runtime/asm_amd64.s:1374"}

Comment 3 whu 2021-02-01 12:50:07 UTC
After running the command `oc adm policy add-scc-to-user anyuid -z migration-controller -n openshift-migration` on the OCP 3.9 cluster and then rerunning the DVM migration, the migration was successful.
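
The workaround above can be wrapped in a small script so the command can be previewed before it is applied. This is only a sketch: the function `grant_anyuid` and the `DRY_RUN` switch are hypothetical names introduced for illustration, and the only real command used is the `oc adm policy add-scc-to-user` call quoted in this comment.

```shell
#!/usr/bin/env bash
# Sketch of the workaround from this comment. grant_anyuid and DRY_RUN are
# hypothetical names; when DRY_RUN is non-empty the command is printed
# instead of executed.
grant_anyuid() {
  local sa="$1" ns="$2"
  ${DRY_RUN:+echo} oc adm policy add-scc-to-user anyuid -z "$sa" -n "$ns"
}

# Preview the command. Remove DRY_RUN=1 to run it while logged in to the
# source (OCP 3.9) cluster with cluster-admin rights.
DRY_RUN=1 grant_anyuid migration-controller openshift-migration
```

Note that the `anyuid` SCC lets the service account's pods run as any UID, including root, so it should be granted only to the service account that needs it.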

Comment 4 Avital Pinnick 2021-02-01 15:30:50 UTC
Doc changes merged.

