Description of problem:
When we execute several direct migrations in parallel, some rsync pods report this error:

$ oc logs rsync-vc492 -c rsync
2021/09/16 14:09:18 [26] @ERROR: auth failed on module 81c3b080dad537de7e10e0987a4bf52e
2021/09/16 14:09:18 [26] rsync error: error starting client-server protocol (code 5) at main.c(1661) [sender=3.1.3]
@ERROR: auth failed on module 81c3b080dad537de7e10e0987a4bf52e
rsync error: error starting client-server protocol (code 5) at main.c(1661) [sender=3.1.3]

Version-Release number of selected component (if applicable):
SOURCE CLUSTER: AWS OCP 3.11 (MTC 1.5.1)
TARGET CLUSTER: AWS OCP 4.9 (MTC 1.6.0) (CONTROLLER + UI)
REPLICATION REPOSITORY: AWS S3

How reproducible:
Intermittent

Steps to Reproduce:
1. Execute several direct migrations in parallel

Actual results:
Some rsync pods fail. The logs of the rsync pod in the source cluster show this error:

$ oc logs rsync-vc492 -c rsync
2021/09/16 14:09:18 [26] @ERROR: auth failed on module 81c3b080dad537de7e10e0987a4bf52e
2021/09/16 14:09:18 [26] rsync error: error starting client-server protocol (code 5) at main.c(1661) [sender=3.1.3]
@ERROR: auth failed on module 81c3b080dad537de7e10e0987a4bf52e
rsync error: error starting client-server protocol (code 5) at main.c(1661) [sender=3.1.3]

The YAML of the source cluster's rsync pod contains this:

xinjiang: which pod?

    env:
    - name: RSYNC_PASSWORD
    image: registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8:v1.5.1-3
    imagePullPolicy: IfNotPresent

The RSYNC_PASSWORD entry has no value, so the password appears to be missing.

Expected results:
Rsync pods should not fail and the migrations should complete successfully.
Additional info:
This is the full YAML of the rsync pod:

$ oc get pods -o yaml rsync-vc492
apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: rsync-anyuid
  creationTimestamp: 2021-09-16T14:09:15Z
  generateName: rsync-
  labels:
    app: directvolumemigration-rsync-transfer
    app.kubernetes.io/part-of: openshift-migration
    directvolumemigration: ca619596-b358-4f3f-ba44-645b6958d501
    migration.openshift.io/created-for-pvc: 81c3b080dad537de7e10e0987a4bf52e
    migration.openshift.io/dvmp-done: "True"
    migration.openshift.io/migrated-by-migplan: 8686365f-71b3-46b0-9988-6fafdb5ff3f5
    migration.openshift.io/rsync-attempt: "20"
    owner: directvolumemigration
  name: rsync-vc492
  namespace: ocp-24659-mysql
  resourceVersion: "53860"
  selfLink: /api/v1/namespaces/ocp-24659-mysql/pods/rsync-vc492
  uid: ac97e99f-16f7-11ec-8bf3-0e062dacb45f
spec:
  containers:
  - command:
    - /bin/bash
    - -c
    - trap "touch /usr/share/rsync/rsync-client-container-done" EXIT SIGINT SIGTERM;
      timeout=120; SECONDS=0; while [ $SECONDS -lt $timeout ]; do nc -z localhost
      6443; rc=$?; if [ $rc -eq 0 ]; then /usr/bin/rsync --recursive --links --perms
      --devices --specials --times --owner --group --hard-links --delete --partial
      --human-readable --log-file=/dev/stdout --info=COPY2,DEL2,REMOVE2,SKIP2,FLIST2,PROGRESS2,STATS2
      /mnt/ocp-24659-mysql/81c3b080dad537de7e10e0987a4bf52e/ rsync://root@localhost/81c3b080dad537de7e10e0987a4bf52e
      --port 6443; rc=$?; break; fi; done; exit $rc;
    env:
    - name: RSYNC_PASSWORD
    image: registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8:v1.5.1-3
    imagePullPolicy: IfNotPresent
    name: rsync
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 1Gi
    securityContext:
      capabilities:
        drop:
        - MKNOD
        - SETPCAP
      privileged: false
      readOnlyRootFilesystem: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /mnt/ocp-24659-mysql/81c3b080dad537de7e10e0987a4bf52e
      name: mnt
    - mountPath: /usr/share/rsync
      name: rsync-communication
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-n7qg9
      readOnly: true
  - command:
    - /bin/bash
    - -c
    - |-
      /bin/stunnel /etc/stunnel/stunnel.conf
      while true
      do
        test -f /usr/share/rsync/rsync-client-container-done
        if [ $? -eq 0 ]
        then
          break
        fi
      done
      exit 0
    image: registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8:v1.5.1-3
    imagePullPolicy: IfNotPresent
    name: stunnel
    ports:
    - containerPort: 6443
      name: stunnel
      protocol: TCP
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 1Gi
    securityContext:
      capabilities:
        drop:
        - MKNOD
        - SETPCAP
      privileged: false
      readOnlyRootFilesystem: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/stunnel/stunnel.conf
      name: crane2-stunnel-client-config
      subPath: stunnel.conf
    - mountPath: /etc/stunnel/certs
      name: crane2-stunnel-client-secret
    - mountPath: /usr/share/rsync
      name: rsync-communication
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-n7qg9
      readOnly: true
  dnsPolicy: ClusterFirst
  imagePullSecrets:
  - name: default-dockercfg-9hhr4
  nodeName: ip-172-18-6-255.ec2.internal
  nodeSelector:
    node-role.kubernetes.io/compute: "true"
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext:
    seLinuxOptions:
      level: s0:c26,c10
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: mnt
    persistentVolumeClaim:
      claimName: mysql
  - emptyDir: {}
    name: rsync-communication
  - configMap:
      defaultMode: 420
      name: crane2-stunnel-client-config
    name: crane2-stunnel-client-config
  - name: crane2-stunnel-client-secret
    secret:
      defaultMode: 420
      items:
      - key: tls.crt
        path: tls.crt
      - key: tls.key
        path: tls.key
      secretName: crane2-stunnel-client-secret
  - name: default-token-n7qg9
    secret:
      defaultMode: 420
      secretName: default-token-n7qg9
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2021-09-16T14:09:15Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2021-09-16T14:09:15Z
    message: 'containers with unready status: [rsync stunnel]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    message: 'containers with unready status: [rsync stunnel]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: 2021-09-16T14:09:15Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://517d80bf430ff242ea4217ba08a4d760f90a001a0b5fd14bc7f189b203efad44
    image: registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8:v1.5.1-3
    imageID: docker-pullable://registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8@sha256:d08650fb7ee7ce1b48e44515d794285dd5f9b9effec984aa034e329845bbe802
    lastState: {}
    name: rsync
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://517d80bf430ff242ea4217ba08a4d760f90a001a0b5fd14bc7f189b203efad44
        exitCode: 5
        finishedAt: 2021-09-16T14:09:18Z
        reason: Error
        startedAt: 2021-09-16T14:09:18Z
  - containerID: docker://807c0e1f4449af089efc7dc6d7ad38d4474ff98b4220612659595cc6b2e3614c
    image: registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8:v1.5.1-3
    imageID: docker-pullable://registry.redhat.io/rhmtc/openshift-migration-rsync-transfer-rhel8@sha256:d08650fb7ee7ce1b48e44515d794285dd5f9b9effec984aa034e329845bbe802
    lastState: {}
    name: stunnel
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://807c0e1f4449af089efc7dc6d7ad38d4474ff98b4220612659595cc6b2e3614c
        exitCode: 0
        finishedAt: 2021-09-16T14:09:18Z
        reason: Completed
        startedAt: 2021-09-16T14:09:18Z
  hostIP: 172.18.6.255
  phase: Failed
  podIP: 10.130.2.95
  qosClass: Burstable
  startTime: 2021-09-16T14:09:15Z
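The rsync client takes its daemon password from the RSYNC_PASSWORD environment variable, and the pod above defines that variable without a value. The following is a minimal sketch, not taken from the MTC image, of a check that makes this condition visible before the transfer starts. The function name check_rsync_password is hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the MTC rsync image): report whether
# RSYNC_PASSWORD carries a usable value before rsync is invoked.
check_rsync_password() {
  # An env entry declared without a value yields an empty string,
  # so unset and empty are treated the same way here.
  if [ -z "${RSYNC_PASSWORD:-}" ]; then
    echo "RSYNC_PASSWORD is empty or unset; rsync daemon auth will likely fail" >&2
    return 1
  fi
  return 0
}
```

Calling check_rsync_password at the top of the container command would turn the "auth failed on module" error into an earlier, more explicit message. It would not address why the value is missing in the first place.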
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Migration Toolkit for Containers (MTC) 1.7.0 release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1043