Bug 1754112 - Migration of app created from rails-pgsql-persistent openshift template fails to migrate postgresql pod
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Migration Toolkit for Containers
Classification: Red Hat
Component: General
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 1.5.0
Assignee: Scott Seago
QA Contact: Roshni
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Duplicates: 1749906
Depends On:
Blocks:
Reported: 2019-09-20 20:09 UTC by Roshni
Modified: 2021-04-07 20:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-07 20:41:44 UTC
Target Upstream Version:
Embargoed:



Description Roshni 2019-09-20 20:09:51 UTC
Description of problem:
Migration of app created from rails-pgsql-persistent openshift template fails to migrate postgresql pod

Version-Release number of selected component (if applicable):
# oc describe pod/controller-manager-78d9589445-f44hp | grep Image
    Image:         quay.io/ocpmigrate/mig-controller:release-1.0
    Image ID:      quay.io/ocpmigrate/mig-controller@sha256:b9e78beef9f9c9d36dacb84d552ec0c7ce09fea556293d6fbec8c90c11f70cb7
# oc describe pod/migration-operator-5cb94b46fb-kc77d | grep Image
    Image:         quay.io/ocpmigrate/mig-operator:release-1.0
    Image ID:      quay.io/ocpmigrate/mig-operator@sha256:0c6ae48dc51277924a9496ffd4c986553d68fcdb6a47cc5e10bcc3894d44cfbb
    Image:          quay.io/ocpmigrate/mig-operator:release-1.0
    Image ID:       quay.io/ocpmigrate/mig-operator@sha256:0c6ae48dc51277924a9496ffd4c986553d68fcdb6a47cc5e10bcc3894d44cfbb
# oc describe pod/velero-58f7447985-mpcfj | grep Image
    Image:          quay.io/ocpmigrate/migration-plugin:release-1.0
    Image ID:       quay.io/ocpmigrate/migration-plugin@sha256:eb9b82c3f26bcd876bc501e18dde7cffe7e451c8c8a231959ed4d9f1127b91a6
    Image:         quay.io/ocpmigrate/velero:fusor-1.1
    Image ID:      quay.io/ocpmigrate/velero@sha256:6c16a1288bf6aca74afbb0184fa987506839c5193ae8bb2be05cb6aa0a9f3dc5

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-09-18-114152   True        False         2d1h    Cluster version is 4.2.0-0.nightly-2019-09-18-114152


How reproducible:
always

Steps to Reproduce:
1. On OCP 3.11 create a new project and run oc new-app --template rails-pgsql-persistent
# oc get pods
NAME                             READY     STATUS      RESTARTS   AGE
postgresql-1-h2qxf               1/1       Running     0          1h
rails-pgsql-persistent-1-build   0/1       Completed   0          1h
rails-pgsql-persistent-1-qwh6k   1/1       Running     0          1h

oc rsh postgresql-1-h2qxf
mkdir -p /var/lib/pgsql/data/xtra
Create a sample file in that directory.
2. Install the migration operator on OCP 3.11.
3. Install the migration operator and controller on OCP 4.
4. Configure the migration CR YAML files with the required parameters.

Actual results:
Migration is successful, but the postgresql pod is not migrated.

# oc get all
NAME                                    READY   STATUS      RESTARTS   AGE
pod/rails-pgsql-persistent-1-build      0/1     Completed   0          25m
pod/rails-pgsql-persistent-2-deploy     0/1     Error       0          23m
pod/rails-pgsql-persistent-2-hook-pre   0/1     Error       0          23m

NAME                                             DESIRED   CURRENT   READY   AGE
replicationcontroller/rails-pgsql-persistent-1   0         0         0       25m
replicationcontroller/rails-pgsql-persistent-2   0         0         0       23m

NAME                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/postgresql               ClusterIP   172.30.211.190   <none>        5432/TCP   25m
service/rails-pgsql-persistent   ClusterIP   172.30.62.142    <none>        8080/TCP   25m

NAME                                                        REVISION   DESIRED   CURRENT   TRIGGERED BY
deploymentconfig.apps.openshift.io/postgresql               0          1         0         config,image(postgresql:9.5)
deploymentconfig.apps.openshift.io/rails-pgsql-persistent   2          1         0         config,image(rails-pgsql-persistent:latest)

NAME                                                    TYPE     FROM   LATEST
buildconfig.build.openshift.io/rails-pgsql-persistent   Source   Git    1

NAME                                                TYPE     FROM          STATUS     STARTED          DURATION
build.build.openshift.io/rails-pgsql-persistent-1   Source   Git@67d882b   Complete   25 minutes ago   2m12s

NAME                                                    IMAGE REPOSITORY                                                                     TAGS     UPDATED
imagestream.image.openshift.io/rails-pgsql-persistent   image-registry.openshift-image-registry.svc:5000/rails-psql/rails-pgsql-persistent   latest   23 minutes ago

NAME                                              HOST/PORT                                                                                         PATH   SERVICES                 PORT    TERMINATION   WILDCARD
route.route.openshift.io/rails-pgsql-persistent   rails-pgsql-persistent-rails-psql.apps.rpattath-4-perfmig.perf-testing.devcluster.openshift.com          rails-pgsql-persistent   <all>                 None


Expected results:
Migration should be successful and the app should work as expected.

Additional info:
Noticed these error messages in the velero pod logs on OCP 4:


time="2019-09-20T19:04:54Z" level=error msg="Using default resource values, couldn't parse resource requirements: couldn't parse CPU request \"\": quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'." cmd=/velero logSource="pkg/restore/restic_restore_action.go:112" pluginName=velero pod=rails-psql/postgresql-1-h2qxf-stage restore=openshift-migration/migmigration-sample-st8dt
time="2019-09-20T19:05:12Z" level=warning msg="unable to restore additional item" additionalResource=persistentvolumeclaims additionalResourceName=postgresql additionalResourceNamespace=rails-psql error="stat /tmp/135518595/resources/persistentvolumeclaims/namespaces/rails-psql/postgresql.json: no such file or directory" logSource="pkg/restore/restore.go:965" restore=openshift-migration/migmigration-sample-nhplk

Comment 1 John Matthews 2019-09-23 18:31:46 UTC
Aligning to 4.3.0

Related to issue https://github.com/fusor/openshift-velero-plugin/issues/7

Comment 2 Scott Seago 2019-09-24 20:04:19 UTC
First commenting on the velero error messages. I don't believe either of those messages has anything to do with the problem. The first message appears because velero sees an empty string and can't parse that as a CPU request, so it uses the default instead. This is for a stage pod where we don't include any explicit CPU resource requirements, so using the default is the expected result.
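To illustrate why the empty CPU request is rejected, the regular expression quoted in the log message can be tried against a few values with grep -E. This is only a sketch of the pattern match, not velero's actual parsing code; the helper name looks_like_quantity and the sample values are made up for the example.

```shell
#!/bin/sh
# Pattern copied from the velero log message in the description.
quantity_re='^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

# Hypothetical helper: succeeds if $1 matches the quantity pattern.
looks_like_quantity() {
    printf '%s\n' "$1" | grep -Eq "$quantity_re"
}

# The empty string is what the stage pod supplies; the others are sample values.
for value in "" "100m" "0.5"; do
    if looks_like_quantity "$value"; then
        echo "'$value' matches"
    else
        echo "'$value' does not match"
    fi
done
```

The empty string fails because the pattern requires at least one digit or dot, which is consistent with the message being harmless for a pod that sets no CPU request.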

Regarding the second, assuming this is from the final restore rather than the stage restore, this just means that velero was unable to restore the postgresql PVC since we exclude PVCs from final restore. Running `oc get pvc` in the destination cluster would show for sure whether the PVC is being restored properly. It's probably worth investigating which resource is attempting to pull in the PVC on final restore so we can suppress the message, though.

As for the bug itself, this looks suspiciously similar to an issue I was debugging earlier today. deploymentconfig.apps.openshift.io/postgresql is present, but replicationcontroller/postgresql-1 is missing.

I see from the "oc get all" listing that your deploymentconfig has an imagechange trigger on postgresql:9.5. It looks like in the latest 4.2 clusters, postgresql versions earlier than 9.6 are no longer installed.
To confirm this, run "oc get imagestreamtag -n openshift postgresql:9.5" on the destination cluster; you will probably see that it's missing. Also look at the TAGS output of "oc get imagestream -n openshift postgresql" to see which versions are available.

Basically, if the version referenced is missing, this means that the postgresql image used by the src deploymentconfig is not available in the destination cluster. We don't migrate images in the openshift namespace since 1) most of them are actually located outside of the cluster, as referenced in the imagestreamtag resources, and 2) they're managed by the cluster.

I suspect the resolution here is, more generally, to make sure that when we install clusters that will be used as migration targets, we install the full set of openshift-namespace images that are likely to be available on the src cluster. If we can't do this, then we need to document which ones are unavailable, so that user applications which depend on them can either upgrade to later versions or migrate to using custom imagestreams installed in the application namespaces.

Comment 3 Roshni 2019-09-25 18:12:58 UTC
On OCP 4

# oc get imagestream -n openshift postgresql
NAME         IMAGE REPOSITORY                                                        TAGS            UPDATED
postgresql   image-registry.openshift-image-registry.svc:5000/openshift/postgresql   10,9.6,latest   27 hours ago
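
The TAGS column above can also be checked mechanically. The following is a minimal sketch, assuming the TAGS value has been copied into a variable by hand and that the column lists every tag (it may be abbreviated for imagestreams with many tags); has_tag is a hypothetical helper, not part of oc.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if tag $1 appears in the comma-separated list $2.
has_tag() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# TAGS value copied from the "oc get imagestream" output above.
tags="10,9.6,latest"

# 9.5 is the version the source deploymentconfig triggers on.
for wanted in 9.5 9.6; do
    if has_tag "$wanted" "$tags"; then
        echo "postgresql:$wanted available"
    else
        echo "postgresql:$wanted MISSING"
    fi
done
```

Wrapping both the tag and the list in commas keeps "9" from matching "9.6" by accident.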

Comment 4 Scott Seago 2019-09-25 19:18:08 UTC
OK, this confirms it -- the problem here is exactly as I described above. The required postgresql:9.5 image is not available in the destination cluster, so applications that depend on it cannot be migrated without either upgrading the dependencies prior to migration or installing the required images in the openshift namespace on the destination cluster. I'm still not sure whether this is ultimately a CPMA issue (making sure the dest cluster is configured with the right images available) or a documentation issue (so that users know that they have to have them available).

Comment 5 Roshni 2019-10-22 13:59:19 UTC
*** Bug 1749906 has been marked as a duplicate of this bug. ***

Comment 6 John Matthews 2019-12-15 18:15:02 UTC
Moving to our 1.2.0 release to allow more time to decide how we want to handle this.


As of now we have a note in the official docs about the potential for some images to be deprecated between migrations.


https://docs.openshift.com/container-platform/4.2/migration/migrating-3-4/migrating-openshift-3-to-4.html

If your application uses images from the openshift namespace, the required versions of the images must be present on the target cluster. If not, you must update the imagestreamtags references to use an available version that is compatible with your application.

If the imagestreamtags cannot be updated, you can manually upload equivalent images to the application namespaces and update the applications to reference them.

The following imagestreamtags have been removed from OpenShift Container Platform 4.2:

dotnet:1.0, dotnet:1.1, dotnet:2.0

dotnet-runtime:2.0

mariadb:10.1

mongodb:2.4, mongodb:2.6

mysql:5.5, mysql:5.6

nginx:1.8

nodejs:0.10, nodejs:4, nodejs:6

perl:5.16, perl:5.20

php:5.5, php:5.6

postgresql:9.2, postgresql:9.4, postgresql:9.5

python:3.3, python:3.4

ruby:2.0, ruby:2.2
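
As a pre-migration check, an image reference taken from a deploymentconfig's image change trigger could be compared against the list above. This is a rough sketch, assuming the reference has already been extracted by hand (for example from "oc get dc -o yaml"); is_removed is a hypothetical helper and the tag list is simply transcribed from the note quoted above.

```shell
#!/bin/sh
# imagestreamtags removed from OpenShift Container Platform 4.2,
# transcribed from the documentation note quoted above.
removed_tags="dotnet:1.0 dotnet:1.1 dotnet:2.0 dotnet-runtime:2.0 mariadb:10.1
mongodb:2.4 mongodb:2.6 mysql:5.5 mysql:5.6 nginx:1.8 nodejs:0.10 nodejs:4
nodejs:6 perl:5.16 perl:5.20 php:5.5 php:5.6 postgresql:9.2 postgresql:9.4
postgresql:9.5 python:3.3 python:3.4 ruby:2.0 ruby:2.2"

# Hypothetical helper: succeeds if the name:tag reference $1 is on the removed list.
is_removed() {
    for t in $removed_tags; do
        if [ "$t" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# postgresql:9.5 is the trigger shown in the "oc get all" output in the description.
for ref in postgresql:9.5 postgresql:9.6; do
    if is_removed "$ref"; then
        echo "$ref was removed in 4.2; update the reference before migrating"
    else
        echo "$ref is not on the removed list"
    fi
done
```

A reference that is not on the list may still be absent from a particular target cluster, so the imagestream on the destination remains the authoritative check.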

Comment 7 John Matthews 2020-03-28 16:08:23 UTC
Tracking this as a larger RFE for adding more validations in https://issues.redhat.com/browse/MIG-169

Comment 8 Erik Nelson 2021-04-07 20:41:44 UTC
Closing as stale; please reopen if the issue persists as of the current release.

