Description of problem:
Migration of app created from rails-pgsql-persistent openshift template fails to migrate postgresql pod

Version-Release number of selected component (if applicable):

# oc describe pod/controller-manager-78d9589445-f44hp | grep Image
    Image:      quay.io/ocpmigrate/mig-controller:release-1.0
    Image ID:   quay.io/ocpmigrate/mig-controller@sha256:b9e78beef9f9c9d36dacb84d552ec0c7ce09fea556293d6fbec8c90c11f70cb7

# oc describe pod/migration-operator-5cb94b46fb-kc77d | grep Image
    Image:      quay.io/ocpmigrate/mig-operator:release-1.0
    Image ID:   quay.io/ocpmigrate/mig-operator@sha256:0c6ae48dc51277924a9496ffd4c986553d68fcdb6a47cc5e10bcc3894d44cfbb
    Image:      quay.io/ocpmigrate/mig-operator:release-1.0
    Image ID:   quay.io/ocpmigrate/mig-operator@sha256:0c6ae48dc51277924a9496ffd4c986553d68fcdb6a47cc5e10bcc3894d44cfbb

# oc describe pod/velero-58f7447985-mpcfj | grep Image
    Image:      quay.io/ocpmigrate/migration-plugin:release-1.0
    Image ID:   quay.io/ocpmigrate/migration-plugin@sha256:eb9b82c3f26bcd876bc501e18dde7cffe7e451c8c8a231959ed4d9f1127b91a6
    Image:      quay.io/ocpmigrate/velero:fusor-1.1
    Image ID:   quay.io/ocpmigrate/velero@sha256:6c16a1288bf6aca74afbb0184fa987506839c5193ae8bb2be05cb6aa0a9f3dc5

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-09-18-114152   True        False         2d1h    Cluster version is 4.2.0-0.nightly-2019-09-18-114152

How reproducible:
always

Steps to Reproduce:
1. On OCP 3.11 create a new project and run oc new-app --template rails-pgsql-persistent

   # oc get pods
   NAME                             READY   STATUS      RESTARTS   AGE
   postgresql-1-h2qxf               1/1     Running     0          1h
   rails-pgsql-persistent-1-build   0/1     Completed   0          1h
   rails-pgsql-persistent-1-qwh6k   1/1     Running     0          1h

   oc rsh postgresql-1-h2qxf mkdir -p /var/lib/pgsql/data/xtra
   Create a sample file under that dir.
2. Install migration-operator on OCP 3.11
3. Install migration operator and controller on OCP 4.
4. Configure migration CR yamls with the required parameters.
Actual results:
Migration is successful but postgresql pod is not migrated.

# oc get all
NAME                                    READY   STATUS      RESTARTS   AGE
pod/rails-pgsql-persistent-1-build      0/1     Completed   0          25m
pod/rails-pgsql-persistent-2-deploy     0/1     Error       0          23m
pod/rails-pgsql-persistent-2-hook-pre   0/1     Error       0          23m

NAME                                             DESIRED   CURRENT   READY   AGE
replicationcontroller/rails-pgsql-persistent-1   0         0         0       25m
replicationcontroller/rails-pgsql-persistent-2   0         0         0       23m

NAME                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/postgresql               ClusterIP   172.30.211.190   <none>        5432/TCP   25m
service/rails-pgsql-persistent   ClusterIP   172.30.62.142    <none>        8080/TCP   25m

NAME                                                        REVISION   DESIRED   CURRENT   TRIGGERED BY
deploymentconfig.apps.openshift.io/postgresql               0          1         0         config,image(postgresql:9.5)
deploymentconfig.apps.openshift.io/rails-pgsql-persistent   2          1         0         config,image(rails-pgsql-persistent:latest)

NAME                                                    TYPE     FROM   LATEST
buildconfig.build.openshift.io/rails-pgsql-persistent   Source   Git    1

NAME                                                TYPE     FROM          STATUS     STARTED          DURATION
build.build.openshift.io/rails-pgsql-persistent-1   Source   Git@67d882b   Complete   25 minutes ago   2m12s

NAME                                                    IMAGE REPOSITORY                                                                       TAGS     UPDATED
imagestream.image.openshift.io/rails-pgsql-persistent   image-registry.openshift-image-registry.svc:5000/rails-psql/rails-pgsql-persistent   latest   23 minutes ago

NAME                                              HOST/PORT                                                                                          PATH   SERVICES                 PORT    TERMINATION   WILDCARD
route.route.openshift.io/rails-pgsql-persistent   rails-pgsql-persistent-rails-psql.apps.rpattath-4-perfmig.perf-testing.devcluster.openshift.com          rails-pgsql-persistent   <all>                 None

Expected results:
Migration should be successful and app should be working as expected

Additional info:
Noticed these error messages on OCP 4 under velero pod

time="2019-09-20T19:04:54Z" level=error msg="Using default resource values, couldn't parse resource requirements: couldn't parse CPU request \"\": quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'." cmd=/velero logSource="pkg/restore/restic_restore_action.go:112" pluginName=velero pod=rails-psql/postgresql-1-h2qxf-stage restore=openshift-migration/migmigration-sample-st8dt

time="2019-09-20T19:05:12Z" level=warning msg="unable to restore additional item" additionalResource=persistentvolumeclaims additionalResourceName=postgresql additionalResourceNamespace=rails-psql error="stat /tmp/135518595/resources/persistentvolumeclaims/namespaces/rails-psql/postgresql.json: no such file or directory" logSource="pkg/restore/restore.go:965" restore=openshift-migration/migmigration-sample-nhplk
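As a side note on the first message: the regular expression it quotes can be exercised directly to see why an empty CPU request is rejected. This is a minimal sketch that assumes only the pattern as printed in the log line. The helper name is made up for illustration, and this is not velero's or Kubernetes' actual quantity parsing, which does more than a pattern match.

```python
import re

# Pattern copied verbatim from the velero error message above.
QUANTITY_PATTERN = re.compile(r"^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$")


def matches_quantity_pattern(value: str) -> bool:
    """Return True if value matches the pattern quoted in the log line.

    Hypothetical helper for illustration only.
    """
    return QUANTITY_PATTERN.match(value) is not None


# The CPU request in the log was the empty string. It cannot match because
# the first group requires at least one digit or dot.
for candidate in ["", "100m", "0.5", "1Gi"]:
    print(repr(candidate), matches_quantity_pattern(candidate))
```

Only the empty string fails the check here, which is consistent with the message saying that default resource values were used instead.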
Aligning to 4.3.0.
Related to issue https://github.com/fusor/openshift-velero-plugin/issues/7
First, commenting on the velero error messages: I don't believe either of those messages has anything to do with the problem.

The first message appears because velero sees an empty string and can't parse it as a CPU request, so it uses the default instead. This is for a stage pod, where we don't include any explicit CPU resource requirements, so using the default is the expected result.

Regarding the second, assuming this is from the final restore rather than the stage restore, it just means that velero was unable to restore the postgresql PVC, since we exclude PVCs from the final restore. Running `oc get pvc` in the destination cluster would show for sure whether the PVC is being restored properly. It's probably worth investigating which resource is attempting to pull in the PVC on final restore so we can suppress the message, though.

As for the bug itself, this looks suspiciously similar to an issue I was debugging earlier today. deploymentconfig.apps.openshift.io/postgresql is present, but replicationcontroller/postgresql-1 is missing. I see from the "oc get all" listing that your deploymentconfig has an imagechange trigger on postgresql:9.5. It looks like in the latest 4.2 clusters, postgresql versions earlier than 9.6 are no longer installed. To confirm this, run "oc get imagestreamtag -n openshift postgresql:9.5" on the destination cluster; you will probably see that it's missing. Also look at the TAGS output of "oc get imagestream -n openshift postgresql" to see which versions are available.

Basically, if the referenced version is missing, the postgresql image used by the src deploymentconfig is not available in the destination cluster. We don't migrate images in the openshift namespace since 1) most of them are actually located outside of the cluster, as referenced in the imagestreamtag resources, and 2) they're managed by the cluster.
I suspect the resolution here is, more generally, to make sure that when we install clusters that will be used as migration targets, we install the full set of openshift-namespace images that are likely to be available on the src cluster. If we can't do this, then we need to document which ones are unavailable, so that user applications that depend on them can either upgrade to later versions or migrate to custom imagestreams installed in the application namespaces.
On OCP 4:

# oc get imagestream -n openshift postgresql
NAME         IMAGE REPOSITORY                                                         TAGS            UPDATED
postgresql   image-registry.openshift-image-registry.svc:5000/openshift/postgresql   10,9.6,latest   27 hours ago
OK, this confirms it -- the problem here is exactly as I described above. The required postgresql:9.5 image is not available in the destination cluster, so applications that depend on it cannot be migrated without either upgrading the dependencies prior to migration or installing the required images in the openshift namespace on the destination cluster. I'm still not sure whether this is ultimately a CPMA issue (making sure the dest cluster is configured with the right images available) or a documentation issue (so that users know that they have to have them available).
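The manual check above (compare the deploymentconfig's imagechange trigger against the tags present on the destination) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the sample deploymentconfig is a trimmed, hand-written stand-in for what `oc get dc postgresql -o json` might return on the source cluster, and the available tags are the ones from the `oc get imagestream` output above.

```python
def openshift_trigger_tags(dc):
    """Collect 'stream:tag' references from ImageChange triggers that
    point at ImageStreamTags in the openshift namespace."""
    refs = []
    for trigger in dc.get("spec", {}).get("triggers", []):
        if trigger.get("type") != "ImageChange":
            continue
        source = trigger.get("imageChangeParams", {}).get("from", {})
        if (source.get("kind") == "ImageStreamTag"
                and source.get("namespace") == "openshift"):
            refs.append(source["name"])
    return refs


def missing_on_destination(refs, available):
    """Return the references whose tag is absent on the destination.

    available maps an imagestream name to the set of tags it carries.
    """
    missing = []
    for ref in refs:
        stream, _, tag = ref.partition(":")
        if tag not in available.get(stream, set()):
            missing.append(ref)
    return missing


# Hand-written sample, not real cluster output (assumption for illustration).
sample_dc = {
    "spec": {
        "triggers": [
            {"type": "ConfigChange"},
            {
                "type": "ImageChange",
                "imageChangeParams": {
                    "from": {
                        "kind": "ImageStreamTag",
                        "name": "postgresql:9.5",
                        "namespace": "openshift",
                    }
                },
            },
        ]
    }
}

# Tags reported for openshift/postgresql on the OCP 4 destination.
destination_tags = {"postgresql": {"10", "9.6", "latest"}}

refs = openshift_trigger_tags(sample_dc)
print(missing_on_destination(refs, destination_tags))
```

With the values from this report, the only reference flagged is postgresql:9.5, matching the diagnosis.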
*** Bug 1749906 has been marked as a duplicate of this bug. ***
Moving to our 1.2.0 release to allow more time to decide how we want to handle this. As of now we have a note in the official docs about the potential for some images to be deprecated between migrations.

https://docs.openshift.com/container-platform/4.2/migration/migrating-3-4/migrating-openshift-3-to-4.html

If your application uses images from the openshift namespace, the required versions of the images must be present on the target cluster. If not, you must update the imagestreamtags references to use an available version that is compatible with your application. If the imagestreamtags cannot be updated, you can manually upload equivalent images to the application namespaces and update the applications to reference them.

The following imagestreamtags have been removed from OpenShift Container Platform 4.2:

    dotnet:1.0, dotnet:1.1, dotnet:2.0
    dotnet-runtime:2.0
    mariadb:10.1
    mongodb:2.4, mongodb:2.6
    mysql:5.5, mysql:5.6
    nginx:1.8
    nodejs:0.10, nodejs:4, nodejs:6
    perl:5.16, perl:5.20
    php:5.5, php:5.6
    postgresql:9.2, postgresql:9.4, postgresql:9.5
    python:3.3, python:3.4
    ruby:2.0, ruby:2.2
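Until a validation exists in the tooling, the list above can be used for a quick pre-migration check of the imagestreamtags an application references. The following is a rough sketch, not part of any migration tool: the table is transcribed from the docs note quoted above, and the function name and sample references are made up for illustration.

```python
# Imagestreamtags removed from OpenShift Container Platform 4.2,
# transcribed from the documentation note quoted above.
REMOVED_IN_4_2 = {
    "dotnet": {"1.0", "1.1", "2.0"},
    "dotnet-runtime": {"2.0"},
    "mariadb": {"10.1"},
    "mongodb": {"2.4", "2.6"},
    "mysql": {"5.5", "5.6"},
    "nginx": {"1.8"},
    "nodejs": {"0.10", "4", "6"},
    "perl": {"5.16", "5.20"},
    "php": {"5.5", "5.6"},
    "postgresql": {"9.2", "9.4", "9.5"},
    "python": {"3.3", "3.4"},
    "ruby": {"2.0", "2.2"},
}


def removed_refs(refs):
    """Return the 'stream:tag' references that appear in the removed list.

    Hypothetical helper; order of the input is preserved.
    """
    flagged = []
    for ref in refs:
        stream, _, tag = ref.partition(":")
        if tag in REMOVED_IN_4_2.get(stream, set()):
            flagged.append(ref)
    return flagged


# Sample references: the one from this bug plus one that is still available.
print(removed_refs(["postgresql:9.5", "postgresql:9.6"]))
```

A static table like this only covers the tags named in the docs for 4.2; checking the destination cluster's imagestreams directly remains the more reliable test.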
Tracking this as a larger RFE for adding more validations in https://issues.redhat.com/browse/MIG-169
Closing as stale, please reopen if issue persists as of the current release.