Created attachment 1821605
Logs showing the controller "creating" the stage pod and then waiting

Description of problem:
When I create the test app described in the steps to reproduce below, the result is:
1. Logs indicate that stage pods are being launched for the completed 'validator' app pod.
2. The stage pod is not actually launched.
3. mig-controller stalls waiting for the stage pod to enter the Running state, which will never happen.

I've been able to trace this issue to the PR below (it doesn't happen before this commit and does happen after it), but I'm not sure of the complete intent and reasoning behind the changes made there.
https://github.com/konveyor/mig-controller/pull/1164

Version-Release number of selected component (if applicable):
MTC 1.6.0
Clusters: OCP 4.8 AWS (control) / OCP 3.11 AWS (remote)

How reproducible:
Always

Steps to Reproduce:
1. Create a namespace and a quota

$ oc new-project ocp-31309-quotanoattach

Create this quota

$ cat <<EOF | oc create -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: ocp-31309-quotanoattach
spec:
  hard:
    persistentvolumeclaims: "2"
    services.loadbalancers: "0"
    services.nodeports: "0"
    pods: "1"
    replicationcontrollers: "1"
    secrets: "6"
    configmaps: "4"
    services: "10"
    limits.cpu: "20"
    limits.memory: 20Gi
    requests.cpu: "10"
    requests.memory: 10Gi
EOF

2. Create a PVC

$ cat <<EOF | oc create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: quoatdev-test
  namespace: ocp-31309-quotanoattach
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Mi
EOF

3. Provision the PVC

$ cat <<EOF | oc create -f -
apiVersion: v1
kind: Pod
metadata:
  name: provisioner-pod
  namespace: ocp-31309-quotanoattach
  labels:
    app: provision
spec:
  restartPolicy: OnFailure
  containers:
  - name: provisioner
    resources:
      limits:
        cpu: "0.01"
        memory: 128Mi
    image: alpine
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "echo 'data inserted' > /data/vol/data.txt ; dd if=/dev/urandom of=/data/vol/binary.rnd bs=1000000 count=1" ]
    volumeMounts:
    - name: testvolume
      mountPath: /data/vol
  volumes:
  - name: testvolume
    persistentVolumeClaim:
      claimName: quoatdev-test
EOF

4. Remove the provisioner pod once it has completed

$ oc delete pod provisioner-pod -n ocp-31309-quotanoattach

5. Create a validation pod job

$ cat <<EOF | oc create -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: validator-job
  namespace: ocp-31309-quotanoattach
  labels:
    app: validation
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: validator
        image: alpine
        resources:
          limits:
            cpu: "0.01"
            memory: 128Mi
        command: [ "/bin/sh", "-c", "--" ]
        args:
        - set -e; echo 'Validating'; cd /data/vol; ls data.txt; ls binary.rnd; export CONTENT=\$(cat data.txt); [[ "\$CONTENT" == 'data inserted' ]] || { echo 'Wrong data content' && exit 1; } ; export SIZE=\$( wc -c binary.rnd | cut -d ' ' -f 1 ); [[ \$SIZE == '1000000' ]] || { echo 'Wrong binary file size' && exit 1; };
        volumeMounts:
        - name: testvolume
          mountPath: /data/vol
      volumes:
      - name: testvolume
        persistentVolumeClaim:
          claimName: quoatdev-test
  backoffLimit: 4
EOF

6. Migrate the namespace once the validator pod has completed (do not delete the validator pod)

Actual results:
Migration gets stuck waiting for stage pods to come online, but it never created the pods.

Expected results:
Migration will either:
1) create stage pods and then wait for them
2) not create stage pods and not wait for them

Additional info:
This PR cherry-picks the change into the release branch:
https://github.com/konveyor/mig-controller/pull/1199

Changing the status to MODIFIED.
Verified with MTC 1.6.0:
registry.redhat.io/rhmtc/openshift-migration-controller-rhel8@sha256:3b5efa9c8197fe0313a2ab7eb184d135ba9749c9a4f0d15a6abb11c0d18b9194
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.6.0 security & bugfix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3694