Bug 1846263 - Bootstrap should not fail if we can't reach swift storage on openstack
Summary: Bootstrap should not fail if we can't reach swift storage on openstack
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Ricardo Maraschini
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Depends On:
Blocks: 1894621
Reported: 2020-06-11 08:48 UTC by Ricardo Maraschini
Modified: 2020-11-04 16:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The image registry operator could not access Swift storage during bootstrap. Consequence: Bootstrap never finished in this scenario, so the image registry config resource was not created and the user could not fix or change the configuration. Fix: The operator no longer fails bootstrap when it cannot access Swift storage. What used to be a fatal error is now only logged, which allows bootstrap to finish and the config resource to be created. Result: The image registry operator is now more flexible: if it cannot access Swift, it bootstraps the internal image registry using a PVC.
Clone Of:
Environment:
Last Closed: 2020-10-27 16:06:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift cluster-image-registry-operator pull 569 | 0 | None | closed | Bug 1846263: Avoiding bootstrap failure on unavailable swift | 2020-11-16 07:25:32 UTC
Red Hat Product Errata RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:06:55 UTC

Description Ricardo Maraschini 2020-06-11 08:48:40 UTC
Description of problem:

The image registry controller fails its bootstrap if the remote Swift storage can't be reached.

If bootstrap fails, no imageregistry config is created. If Swift is not accessible, bootstrap should still succeed so that users can configure their storage as a day 2 activity.


Discussions around the problem can be found on the following slack thread:

https://coreos.slack.com/archives/C013VBYBJQH/p1591795133048700
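
The requested behavior can be illustrated with a minimal sketch. It is written in Python for brevity and is not the operator's actual code. `StorageConfig`, `bootstrap_storage`, and the `probe_swift` callable are hypothetical names introduced only for this illustration. The idea is that a failure to reach Swift is logged rather than propagated, and bootstrap continues with a PVC-backed configuration.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("bootstrap-sketch")


@dataclass
class StorageConfig:
    """Hypothetical stand-in for the storage section of the registry config."""
    kind: str        # "swift" or "pvc"
    claim: str = ""  # PVC claim name, only meaningful when kind == "pvc"


def bootstrap_storage(probe_swift: Callable[[], None]) -> StorageConfig:
    """Choose a storage backend without letting a Swift failure abort bootstrap.

    probe_swift is an assumed callable that raises an exception whenever
    Swift cannot be reached.
    """
    try:
        probe_swift()
    except Exception as err:
        # The failure is only logged, so bootstrap can finish and the
        # config resource can still be created.
        logger.error("unable to reach Swift, falling back to PVC: %s", err)
        return StorageConfig(kind="pvc", claim="image-registry-storage")
    return StorageConfig(kind="swift")


def unreachable() -> None:
    """Simulates a Swift endpoint that cannot be contacted."""
    raise ConnectionError("swift endpoint not reachable")


def reachable() -> None:
    """Simulates a Swift endpoint that responds normally."""
    return None


if __name__ == "__main__":
    print(bootstrap_storage(unreachable))
    print(bootstrap_storage(reachable))
```

With this shape, the config resource always exists after bootstrap, so users can later point it at Swift (or any other backend) as a day 2 activity.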

Comment 5 Wenjing Zheng 2020-07-17 07:28:33 UTC
Verified on 4.6.0-0.nightly-2020-07-16-211200:
1. $ oc -n openshift-image-registry scale deploy cluster-image-registry-operator --replicas=0
2. $ oc delete crd configs.imageregistry.operator.openshift.io
3. $ oc -n openshift-image-registry create secret generic image-registry-private-configuration-user --from-literal=REGISTRY_STORAGE_SWIFT_USERNAME=admin --from-literal=REGISTRY_STORAGE_SWIFT_PASSWORD=password
4. $ oc -n openshift-image-registry scale deploy cluster-image-registry-operator --replicas=1
5. Wait for the image registry config to come back. The image registry is now using a PVC:
spec:
  httpSecret: 56a67d73c48c35d6de2860388e4131c0b338ab039fb11b7cf84e602e705a164cfc0c97c05787a2024b700e97d8a5d09512acbf59f7d587ad7c34dfb899429b49
  logging: 2
  managementState: Managed
  proxy: {}
  replicas: 1
  requests:
    read:
      maxWaitInQueue: 0s
    write:
      maxWaitInQueue: 0s
  rolloutStrategy: Recreate
  storage:
    pvc:
      claim: image-registry-storage

$ oc get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster-image-registry-operator-565755f494-8fzrr   1/1     Running   0          5m9s
image-registry-5865668cb9-vh9c6                    1/1     Running   0          44s
node-ca-5dfw8                                      1/1     Running   0          4h37m
node-ca-9ktzq                                      1/1     Running   0          4h37m
node-ca-d4s85                                      1/1     Running   0          4h25m
node-ca-s95zv                                      1/1     Running   0          4h37m
node-ca-xbmcf                                      1/1     Running   0          4h30m
node-ca-zfldq                                      1/1     Running   0          4h32m

Comment 7 errata-xmlrpc 2020-10-27 16:06:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

