Bug 1908721
| Summary: | directvolumemigration-rsync-transfer pods completed correctly but DVM status threw out error against pods | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Migration Toolkit for Containers | Reporter: | whu | ||||
| Component: | General | Assignee: | John Matthews <jmatthew> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Xin jiang <xjiang> | ||||
| Severity: | unspecified | Docs Contact: | Avital Pinnick <apinnick> | ||||
| Priority: | unspecified | ||||||
| Version: | 1.4.0 | CC: | ernelson, sregidor, whu, xjiang | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 1.3.z | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2020-12-18 14:06:33 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
In the latest MTC 1.4.0 build, this bug has gone. so close this bug. image: registry.redhat.io/rhmtc/openshift-migration-rhel7-operator@sha256:236d5f964061938f4e845e6d9e62926da68435c0f0304329a0473e060cdb53f1 |
Created attachment 1739975 [details] the log-reader dump from souce and target cluster during the stage running Description of problem: When run DVM stage migration for a nginx application, the stage process is complete with warning message "DirectImageMigration (dvm): openshift-migration/34688070-404f-11eb-b9a9-d3558b12b711-hb7r2 failed". Check the DVM status, it showed directvolumemigration-rsync-transfer pods failed. but check these pods, they are completed without any error. *Version-Release number of selected components (if applicable):* MTC 1.4.0 source cluster : AWS OCP 4.4 target cluster : AWS OCP 4.7 How reproducible: Always Steps to Reproduce: 1. deploy a nginx with 2 pvs application in source cluster $ oc create namespace ocp-00000 $ oc process -p NAMESPACE= ocp-00000 -f nginx_with_pv_defaultsc_template.yml | oc create -f - $ oc get pvc -n ocp-00000 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE nginx-html Bound pvc-21bfa21a-d89e-4387-bf53-c72bdf79aa76 1Gi RWO gp2 149m nginx-logs Bound pvc-dcb0476b-036f-43d0-b05c-ee6d4058e5c9 1Gi RWO gp2 149m 2. create migration plan for ocp-00000 Just follow common migplan creation process, when enter into "Migration option" phase, checked "Use direct PV migration for filesystem copies" option. 3. trigger stage migration Actual results: Stage migration was complete with warning message "Staging succeeded e with warnings,DirectImageMigration (dvm): openshift-migration/34688070-404f-11eb-b9a9-d3558b12b711-hb7r2 failed. See in dvm status.Errors" The DVM status shows status: ....... errors: - One or more pods are in error state failedPods: - name: directvolumemigration-rsync-transfer-nginx-html namespace: ocp-00000 - name: directvolumemigration-rsync-transfer-nginx-logs namespace: ocp-00000 But the pods directvolumemigration-rsync-transfer-nginx-html and directvolumemigration-rsync-transfer-nginx-logs completed without any error. Expected results: If the related pods completed without any error. the stage migration should finished without any error warning message. Additional info: In target cluster $ oc get migmigration 34688070-404f-11eb-b9a9-d3558b12b711 -o yaml apiVersion: migration.openshift.io/v1alpha1 kind: MigMigration ....... spec: migPlanRef: name: hwh2 namespace: openshift-migration stage: true status: conditions: - category: Warn durable: true lastTransitionTime: "2020-12-17T10:05:25Z" message: 'DirectImageMigration (dvm): openshift-migration/34688070-404f-11eb-b9a9-d3558b12b711-hb7r2 failed. See in dvm status.Errors' status: "True" type: DirectVolumeMigrationFailed - category: Advisory durable: true lastTransitionTime: "2020-12-17T10:05:27Z" message: The migration has completed with warnings, please look at `Warn` conditions. reason: Completed status: "True" type: SucceededWithWarnings itinerary: Stage observedDigest: 3d65757a3919d0ae0518294eb16229c6f2d8782fa8c412589e1bda7da134b9af phase: Completed pipeline: - completed: "2020-12-17T10:04:38Z" message: Completed name: Prepare started: "2020-12-17T10:04:20Z" - completed: "2020-12-17T10:05:19Z" message: Completed name: StageBackup started: "2020-12-17T10:04:38Z" - message: Skipped name: StageRestore skipped: true - message: Skipped name: DirectImage skipped: true - completed: "2020-12-17T10:05:25Z" message: Completed name: DirectVolume progress: - 'Rsync Client Pod ocp-00000/directvolumemigration-rsync-transfer-nginx-html: Failed' - 'Rsync Client Pod ocp-00000/directvolumemigration-rsync-transfer-nginx-logs: Failed' started: "2020-12-17T10:05:19Z" - completed: "2020-12-17T10:05:27Z" message: Completed name: Cleanup started: "2020-12-17T10:05:25Z" startTimestamp: "2020-12-17T10:04:20Z" $ oc get dvm 34688070-404f-11eb-b9a9-d3558b12b711-hb7r2 -o yaml apiVersion: migration.openshift.io/v1alpha1 kind: DirectVolumeMigration ...... spec: createDestinationNamespaces: true destMigClusterRef: name: host namespace: openshift-migration persistentVolumeClaims: - name: nginx-html namespace: ocp-00000 targetAccessModes: - ReadWriteOnce targetStorageClass: gp2 - name: nginx-logs namespace: ocp-00000 targetAccessModes: - ReadWriteOnce targetStorageClass: gp2 srcMigClusterRef: name: source-cluster namespace: openshift-migration status: conditions: - category: Advisory durable: true lastTransitionTime: "2020-12-17T10:05:24Z" message: 'The migration has failed. See: Errors.' reason: WaitForRsyncClientPodsCompleted status: "True" type: Failed errors: - One or more pods are in error state failedPods: - name: directvolumemigration-rsync-transfer-nginx-html namespace: ocp-00000 - name: directvolumemigration-rsync-transfer-nginx-logs namespace: ocp-00000 itinerary: VolumeMigration observedDigest: 28bf54b9b6e3c72a6e029b99f89d52e56036f66628d36f15656737f46f9597e7 phase: Completed phaseDescription: Complete startTimestamp: "2020-12-17T10:04:39Z" In source cluster $ oc get pod -n ocp-00000 NAME READY STATUS RESTARTS AGE directvolumemigration-rsync-transfer-nginx-html 0/1 Completed 0 65m directvolumemigration-rsync-transfer-nginx-logs 0/1 Completed 0 65m directvolumemigration-stunnel-transfer 1/1 Running 0 65m nginx-deployment-6578f8dd49-bqmnx 1/1 Running 0 72m $ oc logs directvolumemigration-rsync-transfer-nginx-html -n ocp-00000 2020/12/17 10:05:27 [1] building file list sending incremental file list 2020/12/17 10:05:27 [1] .d..tp.g... ./ 21 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/3)2020/12/17 10:05:27 [1] <f+++++++++ index.html 2020/12/17 10:05:27 [1] .d..tp.g... lost+found/ 21 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/3) 2020/12/17 10:05:27 [1] Number of files: 3 (reg: 1, dir: 2) Number of files: 3 (reg: 1, dir: 2) 2020/12/17 10:05:27 [1] Number of created files: 1 (reg: 1) Number of created files: 1 (reg: 1) 2020/12/17 10:05:27 [1] Number of deleted files: 0 Number of deleted files: 0 2020/12/17 10:05:27 [1] Number of regular files transferred: 1 Number of regular files transferred: 1 2020/12/17 10:05:27 [1] Total file size: 21 bytes Total file size: 21 bytes 2020/12/17 10:05:27 [1] Total transferred file size: 21 bytes Total transferred file size: 21 bytes 2020/12/17 10:05:27 [1] Literal data: 21 bytes Literal data: 21 bytes 2020/12/17 10:05:27 [1] Matched data: 0 bytes Matched data: 0 bytes 2020/12/17 10:05:27 [1] File list size: 0 File list size: 0 2020/12/17 10:05:27 [1] File list generation time: 0.001 seconds File list generation time: 0.001 seconds 2020/12/17 10:05:27 [1] File list transfer time: 0.000 seconds File list transfer time: 0.000 seconds 2020/12/17 10:05:27 [1] Total bytes sent: 195 Total bytes sent: 195 2020/12/17 10:05:27 [1] Total bytes received: 46 Total bytes received: 46 2020/12/17 10:05:27 [1] sent 195 bytes received 46 bytes 482.00 bytes/sec sent 195 bytes received 46 bytes 482.00 bytes/sec 2020/12/17 10:05:27 [1] total size is 21 speedup is 0.09 total size is 21 speedup is 0.09 $ oc get pod directvolumemigration-rsync-transfer-nginx-html -n ocp-00000 -o yaml apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/networks-status: "" openshift.io/scc: privileged creationTimestamp: "2020-12-17T10:05:23Z" labels: app: directvolumemigration-rsync-transfer directvolumemigration: rsync-client name: directvolumemigration-rsync-transfer-nginx-html namespace: ocp-00000 resourceVersion: "172587" selfLink: /api/v1/namespaces/ocp-00000/pods/directvolumemigration-rsync-transfer-nginx-html uid: 4796e8aa-87c6-4824-808e-e4eaa674733c spec: containers: - command: - rsync - --archive - --hard-links - --human-readable - --partial - --delete - --port - "2222" - --log-file - /dev/stdout - --info=COPY2,DEL2,REMOVE2,SKIP2,FLIST2,PROGRESS2,STATS2 - /mnt/ocp-00000/nginx-html/ - rsync://root.54.74/nginx-html env: - name: RSYNC_PASSWORD value: hxyNIq image: quay.io/konveyor/rsync-transfer:latest imagePullPolicy: Always name: rsync-client ports: - containerPort: 22 name: rsync-client protocol: TCP resources: {} securityContext: privileged: true readOnlyRootFilesystem: true runAsUser: 0 terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /mnt/ocp-00000/nginx-html name: nginx-html - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: default-token-ld5c7 readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true imagePullSecrets: - name: default-dockercfg-qlczm nodeName: ip-10-0-144-174.us-east-2.compute.internal priority: 0 restartPolicy: Never schedulerName: default-scheduler securityContext: {} serviceAccount: default serviceAccountName: default terminationGracePeriodSeconds: 30 tolerations: - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 volumes: - name: nginx-html persistentVolumeClaim: claimName: nginx-html - name: default-token-ld5c7 secret: defaultMode: 420 secretName: default-token-ld5c7 status: conditions: - lastProbeTime: null lastTransitionTime: "2020-12-17T10:05:23Z" reason: PodCompleted status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2020-12-17T10:05:23Z" reason: PodCompleted status: "False" type: Ready - lastProbeTime: null lastTransitionTime: "2020-12-17T10:05:23Z" reason: PodCompleted status: "False" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2020-12-17T10:05:23Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://4e388df2a621e67b5010464c1427db17c28bf8d7d0fc6e6f830d21b0c8ad1afc image: quay.io/konveyor/rsync-transfer:latest imageID: quay.io/konveyor/rsync-transfer@sha256:c38f5aad71e9ea45ed775727114f6b0931c0a2ee0acd3bbbd6792f0312bcaa69 lastState: {} name: rsync-client ready: false restartCount: 0 started: false state: terminated: containerID: cri-o://4e388df2a621e67b5010464c1427db17c28bf8d7d0fc6e6f830d21b0c8ad1afc exitCode: 0 finishedAt: "2020-12-17T10:05:27Z" reason: Completed startedAt: "2020-12-17T10:05:27Z" hostIP: 10.0.144.174 phase: Succeeded podIP: 10.129.2.34 podIPs: - ip: 10.129.2.34 qosClass: BestEffort startTime: "2020-12-17T10:05:23Z" In target cluster, after stage migration $ oc get pvc -n ocp-00000 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE nginx-html Bound pvc-585bee25-6206-4e7c-b775-29df7e8522a2 1Gi RWO gp2 62m nginx-logs Bound pvc-060a3710-5548-423b-b659-60e850e9ec9d 1Gi RWO gp2 62m $ oc get all -n ocp-00000 NAME READY STATUS RESTARTS AGE pod/directvolumemigration-rsync-transfer 2/2 Running 0 68m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/directvolumemigration-rsync-transfer-svc ClusterIP 172.30.242.227 <none> 2222/TCP 68m NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD route.route.openshift.io/directvolumemigration-rsync-transfer-route directvolumemigration-rsync-transfer-route-ocp-00000.apps.cam-tgt-12870.qe.devcluster.openshift.com directvolumemigration-rsync-transfer-svc 2222 passthrough None