Bug 1768535 - Builds stalled after migration
Summary: Builds stalled after migration
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Migration Toolkit for Containers
Classification: Red Hat
Component: General
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Dylan Murray
QA Contact: Sergio
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Duplicates: 1779690 1831615
Depends On: 1779690 1831615
Blocks:
 
Reported: 2019-11-04 16:31 UTC by Sergio
Modified: 2023-09-18 00:18 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-14 19:34:59 UTC
Target Upstream Version:
Embargoed:



Description Sergio 2019-11-04 16:31:05 UTC
Description of problem:
When an application that was built using an OpenShift BuildConfig is migrated, a build is triggered after the migration and gets stuck in the target cluster.

Version-Release number of selected component (if applicable):
Target cluster:
OCP 4.2
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-11-01-115323   True        False         12h     Cluster version is 4.2.0-0.nightly-2019-11-01-115323

Source cluster:
OCP 4.1
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-11-01-212340   True        False         12h     Cluster version is 4.1.0-0.nightly-2019-11-01-212340



Controller:
      image: registry.redhat.io/rhcam-1-0/openshift-migration-controller-rhel8:v1.0
      imageID: registry.redhat.io/rhcam-1-0/openshift-migration-controller-rhel8@sha256:fe81b226dd2f79541fac9f2ba8766086d2d93fcd2f8ca0f7efbb86e3ff1f1f42

Velero:
      image: registry.redhat.io/rhcam-1-0/openshift-migration-velero-rhel8:v1.0
      imageID: registry.redhat.io/rhcam-1-0/openshift-migration-velero-rhel8@sha256:e4afec9bf56e75fc7fda793a31a7ed21fa87babf1abd779ef2865085c6cc3449
      image: registry.redhat.io/rhcam-1-0/openshift-migration-plugin-rhel8:v1.0
      imageID: registry.redhat.io/rhcam-1-0/openshift-migration-plugin-rhel8@sha256:f8c18177972624a209cd20277f844d885abd243161b50f96f7a37636b7d2f042


How reproducible:
Always

Steps to Reproduce:
1. $ oc new-project cakephp
2. $ oc new-app cakephp-mysql-persistent
3. Migrate the app

Actual results:
$ oc get pods
NAME                                  READY   STATUS      RESTARTS   AGE
cakephp-mysql-persistent-1-build      0/1     Init:0/2    0          2m16s
cakephp-mysql-persistent-1-deploy     0/1     Completed   0          2m16s
cakephp-mysql-persistent-1-hook-pre   0/1     Completed   0          2m7s
cakephp-mysql-persistent-1-lqp8p      1/1     Running     0          88s
mysql-1-88bl9                         1/1     Running     0          2m11s
mysql-1-deploy                        0/1     Completed   0          2m28s

$ oc describe pod cakephp-mysql-persistent-1-build | grep Warning
  Warning  FailedMount  98s                  kubelet, ip-10-0-62-108.us-east-2.compute.internal  Unable to mount volumes for pod "cakephp-mysql-persistent-1-build_cakephp(fa364038-ff18-11e9-a1d6-020df951b8fc)": timeout expired waiting for volumes to attach or mount for pod "cakephp"/"cakephp-mysql-persistent-1-build". list of unmounted volumes=[build-proxy-ca-bundles]. list of unattached volumes=[buildcachedir buildworkdir builder-dockercfg-hg79t-push builder-dockercfg-hg79t-pull build-system-configs build-ca-bundles build-proxy-ca-bundles container-storage-root build-blob-cache builder-token-qjxhz]
  Warning  FailedMount  93s (x9 over 3m41s)  kubelet, ip-10-0-62-108.us-east-2.compute.internal  MountVolume.SetUp failed for volume "build-proxy-ca-bundles" : configmaps "cakephp-mysql-persistent-1-global-ca" not found

$ oc get cm
NAME                                    DATA   AGE
cakephp-mysql-persistent-1-ca           1      4m
cakephp-mysql-persistent-1-sys-config   0      4m

There is no "cakephp-mysql-persistent-1-global-ca" config map.
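
To pull the missing config map names out of the kubelet warnings, a small filter along these lines may help. This is only a sketch: `missing_configmaps` is a hypothetical helper name, and the pattern assumes the exact message wording shown in the FailedMount warning above.

```shell
# Hypothetical helper: read "oc describe pod" output on stdin and print the
# name of every config map that the kubelet reports as not found.
# Assumes the message format: configmaps "<name>" not found
missing_configmaps() {
  sed -n 's/.*configmaps "\([^"]*\)" not found.*/\1/p' | sort -u
}

# Possible usage against the stuck build pod from this report:
#   oc describe pod cakephp-mysql-persistent-1-build | missing_configmaps
```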


Expected results:
The application should be deployed normally, and no build should be stuck.


Additional info:

It seems that in 4.2 a "global-ca" config map is now created when a build is executed. Previous versions do not create it, so the config map is missing after the migration and the build gets stuck.
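
A quick way to check a migrated namespace for this condition could look like the sketch below. `absent_build_configmaps` is a hypothetical helper, and the three suffixes are an assumption based only on the config map names that appear in this report ("-ca", "-sys-config", "-global-ca"); other versions may use a different set.

```shell
# Hypothetical helper: given a build name as $1 and a list of existing
# config map names on stdin (one per line), print the build config maps
# that a 4.2 build pod would look for but that are absent from the list.
# Assumption: the suffix list is inferred from the names seen in this report.
absent_build_configmaps() {
  build="$1"
  present="$(cat)"
  for suffix in ca sys-config global-ca; do
    name="${build}-${suffix}"
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$present" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Possible usage in the migrated namespace:
#   oc get cm --no-headers -o custom-columns=NAME:.metadata.name \
#     | absent_build_configmaps cakephp-mysql-persistent-1
```

With the two config maps listed in the `oc get cm` output above as input, this would report only the "global-ca" one as absent.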

It seems related to: https://bugzilla.redhat.com/show_bug.cgi?id=1745192

Comment 1 Sergio 2019-11-19 15:13:01 UTC
When the migration is done from a 4.2 cluster to a 4.2 cluster, this bug instead causes the build to fail because of wrong certificates. It seems that we are migrating the config map that holds the source cluster's certificates.

$ oc get pods
NAME                             READY   STATUS      RESTARTS   AGE
jenkins-1-5pzgd                  1/1     Running     0          35m
jenkins-1-deploy                 0/1     Completed   0          35m
mongodb-1-deploy                 0/1     Completed   0          35m
mongodb-1-s74jb                  1/1     Running     0          34m
nodejs-mongodb-example-1-build   0/1     Error       0          34m
$ oc logs nodejs-mongodb-example-1-build
Caching blobs under "/var/cache/blobs".
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
Warning: Pull failed, retrying in 5s ...
error: build error: After retrying 2 times, Pull image still failed due to error: while pulling "docker://image-registry.openshift-image-registry.svc:5000/openshift/nodejs" as "image-registry.openshift-image-registry.svc:5000/openshift/nodejs": Error initializing source docker://image-registry.openshift-image-registry.svc:5000/openshift/nodejs:latest: pinging docker registry returned: Get https://image-registry.openshift-image-registry.svc:5000/v2/: x509: certificate signed by unknown authority
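
One way to test the suspicion that the source cluster's certificates were carried over is to compare a digest of the config map data on both clusters. The sketch below is only an illustration: `bundle_digest` is a hypothetical helper, and the `$SRC`, `$DST`, `$NS`, and `$CM` variables in the usage comments are placeholders, not names from this report.

```shell
# Hypothetical helper: print a SHA-256 digest of whatever arrives on stdin,
# so that two copies of a config map's data can be compared at a glance.
bundle_digest() {
  sha256sum | cut -d' ' -f1
}

# Possible usage (placeholders: SRC and DST are kubeconfig context names,
# NS is the migrated namespace, CM is the build's CA config map):
#   oc --context "$SRC" -n "$NS" get cm "$CM" -o jsonpath='{.data}' | bundle_digest
#   oc --context "$DST" -n "$NS" get cm "$CM" -o jsonpath='{.data}' | bundle_digest
```

Identical digests would mean the target holds a byte-for-byte copy of the source's data, which would be consistent with the x509 error above, though it would not prove the cause on its own.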

Comment 3 Dylan Murray 2019-12-16 17:07:34 UTC
I can confirm that I have reproduced this bug. After migration, the build fails with:

error: build error: After retrying 2 times, Pull image still failed due to error: while pulling "docker://image-registry.openshift-image-registry.svc:5000/openshift/php" as "image-registry.openshift-image-registry.svc:5000/openshift/php": Error initializing source docker://image-registry.openshift-image-registry.svc:5000/openshift/php:latest: pinging docker registry returned: Get https://image-registry.openshift-image-registry.svc:5000/v2/: x509: certificate signed by unknown authority

Comment 4 John Matthews 2019-12-16 17:26:50 UTC
This didn't make it in time for our 1.0.1 z-stream release. Aligning to 4.3.0 so that it goes out in our next release, 1.1.0, around the end of January.

Comment 5 John Matthews 2019-12-16 17:27:36 UTC
*** Bug 1779690 has been marked as a duplicate of this bug. ***

Comment 6 John Matthews 2020-07-27 16:05:43 UTC
*** Bug 1831615 has been marked as a duplicate of this bug. ***

Comment 7 Erik Nelson 2021-04-07 20:58:59 UTC
Sergio, could you confirm this remains an issue with our current release?

Comment 9 John Matthews 2022-10-14 19:34:59 UTC
Please feel free to reopen this bug if you believe it is still relevant.

Comment 10 Red Hat Bugzilla 2023-09-18 00:18:10 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

