Bug 1846263

Summary: Bootstrap should not fail if we can't reach swift storage on openstack
Product: OpenShift Container Platform Reporter: Ricardo Maraschini <rmarasch>
Component: Image Registry Assignee: Ricardo Maraschini <rmarasch>
Status: CLOSED ERRATA QA Contact: Wenjing Zheng <wzheng>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.5 CC: aos-bugs, pasik
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The image registry operator could not access Swift storage during bootstrap.
Consequence: Bootstrap never finished in this scenario, so the image registry config resource was not created and the user could not fix or change its configuration.
Fix: The operator no longer fails bootstrap when it has a problem accessing Swift storage. What was previously an error is now only logged, which allows bootstrap to finish and the config resource to be created.
Result: The image registry operator is now more flexible: if it can't access Swift, it bootstraps the internal image registry using a PVC.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:06:32 UTC Type: Bug
Bug Blocks: 1894621    

Description Ricardo Maraschini 2020-06-11 08:48:40 UTC
Description of problem:

The image registry controller fails its bootstrap if remote Swift storage can't be reached.

If bootstrap fails, no imageregistry config is created. If Swift is not accessible, bootstrap should still succeed so users can configure their storage as a day 2 activity.


Discussion of the problem can be found in the following Slack thread:

https://coreos.slack.com/archives/C013VBYBJQH/p1591795133048700

Comment 5 Wenjing Zheng 2020-07-17 07:28:33 UTC
Verified on 4.6.0-0.nightly-2020-07-16-211200:
1. $ oc -n openshift-image-registry scale deploy cluster-image-registry-operator --replicas=0
2. $ oc delete crd configs.imageregistry.operator.openshift.io
3. $ oc -n openshift-image-registry create secret generic image-registry-private-configuration-user --from-literal=REGISTRY_STORAGE_SWIFT_USERNAME=admin --from-literal=REGISTRY_STORAGE_SWIFT_PASSWORD=password
4. $ oc -n openshift-image-registry scale deploy cluster-image-registry-operator --replicas=1
5. Then wait for the image registry config to come back; the image registry is now using a PVC:
spec:
  httpSecret: 56a67d73c48c35d6de2860388e4131c0b338ab039fb11b7cf84e602e705a164cfc0c97c05787a2024b700e97d8a5d09512acbf59f7d587ad7c34dfb899429b49
  logging: 2
  managementState: Managed
  proxy: {}
  replicas: 1
  requests:
    read:
      maxWaitInQueue: 0s
    write:
      maxWaitInQueue: 0s
  rolloutStrategy: Recreate
  storage:
    pvc:
      claim: image-registry-storage

$ oc get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster-image-registry-operator-565755f494-8fzrr   1/1     Running   0          5m9s
image-registry-5865668cb9-vh9c6                    1/1     Running   0          44s
node-ca-5dfw8                                      1/1     Running   0          4h37m
node-ca-9ktzq                                      1/1     Running   0          4h37m
node-ca-d4s85                                      1/1     Running   0          4h25m
node-ca-s95zv                                      1/1     Running   0          4h37m
node-ca-xbmcf                                      1/1     Running   0          4h30m
node-ca-zfldq                                      1/1     Running   0          4h32m
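
The wait in step 5 could be scripted roughly as follows. This is only a sketch, not part of the original verification: it assumes the config resource is named "cluster", and the function name, timeout, and polling interval are illustrative. It prints whatever is set under .spec.storage once the resource exists (the exact output format depends on the oc version).

```shell
#!/bin/sh
# Poll until the image registry config resource exists, then print the
# contents of .spec.storage. Assumptions (not from the bug report): the
# resource is named "cluster"; the default timeout/interval are arbitrary.
wait_for_registry_config() {
    timeout="${1:-300}"    # total seconds to wait
    interval="${2:-10}"    # seconds between polls (must be > 0)
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # oc fails while the CRD or the config resource is still missing.
        if storage=$(oc get configs.imageregistry.operator.openshift.io cluster \
                -o jsonpath='{.spec.storage}' 2>/dev/null) && [ -n "$storage" ]; then
            printf '%s\n' "$storage"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "image registry config did not appear within ${timeout}s" >&2
    return 1
}
```

Calling `wait_for_registry_config 600 15` after step 4 would block until the operator has recreated the config, or return non-zero if it has not appeared after 10 minutes.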

Comment 7 errata-xmlrpc 2020-10-27 16:06:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196