Bug 1810490

Summary: [IPI][OSP] 403 error when creating Swift Image Registry backend without swiftoperator role
Product: OpenShift Container Platform
Reporter: Mike Fedosin <mfedosin>
Component: Image Registry
Assignee: Mike Fedosin <mfedosin>
Status: CLOSED ERRATA
QA Contact: XiuJuan Wang <xiuwang>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.4
CC: adam.kaplan, aos-bugs, jiazha, wzheng, xiuwang
Target Milestone: ---
Keywords: Reopened
Target Release: 4.4.0
Flags: xiuwang: needinfo-
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1806158
Environment:
Last Closed: 2020-05-13 22:00:53 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1806158, 1811530, 1812071    
Bug Blocks:    

Description Mike Fedosin 2020-03-05 11:17:18 UTC
+++ This bug was initially created as a clone of Bug #1806158 +++

When I deploy a cluster on OpenStack with Swift enabled, but without the swiftoperator role, the cluster-image-registry-operator fails to create a container and I get a 403 error during the installation.
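
For context, the documented prerequisite is to grant the swiftoperator role to the OpenStack user that runs the installer; the user and project names below are placeholders:

$ openstack role add --user <user> --project <project> swiftoperator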

--- Additional comment from Adam Kaplan on 2020-02-27 21:28:08 UTC ---

The `swiftoperator` role is a prerequisite for IPI installs on OpenStack [1]. Customers who don't want to grant it should use the UPI flow instead [2]. In 4.4 we will need to update this document to allow customers to use RWO storage via a PVC.

[1] https://docs.openshift.com/container-platform/4.3/installing/installing_openstack/installing-openstack-installer-custom.html#installation-osp-enabling-swift_installing-openstack-installer-custom
[2] https://docs.openshift.com/container-platform/4.3/installing/installing_openstack/installing-openstack-installer-custom.html
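
A minimal sketch of what the PVC-backed registry configuration could look like once RWO support is documented; leaving the claim name empty is an assumption under which the operator provisions its own "image-registry-storage" PVC in openshift-image-registry:

$ oc patch configs.imageregistry.operator.openshift.io cluster \
    --type merge -p '{"spec":{"storage":{"pvc":{"claim":""}}}}'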

Comment 3 XiuJuan Wang 2020-03-11 07:08:07 UTC
Tried installing an IPI-on-OSP cluster with a user that does not have the swiftoperator role, but the installation failed and is blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1811530.

The image-registry clusteroperator was not even deployed.

$oc get pods -n openshift-kube-apiserver
NAME                                            READY   STATUS             RESTARTS   AGE
installer-2-xiuwang-osp1-d6djd-master-2         0/1     Completed          0          55m
installer-3-xiuwang-osp1-d6djd-master-0         0/1     Completed          0          54m
kube-apiserver-xiuwang-osp1-d6djd-master-0      2/4     CrashLoopBackOff   27         54m
kube-apiserver-xiuwang-osp1-d6djd-master-2      2/4     CrashLoopBackOff   28         55m
revision-pruner-2-xiuwang-osp1-d6djd-master-2   0/1     Completed          0          54m

$ oc get co  
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                                                                 Unknown     Unknown       True       62m
cloud-credential                           4.4.0-0.nightly-2020-03-10-194324   True        False         False      69m
dns                                        4.4.0-0.nightly-2020-03-10-194324   True        False         False      61m
etcd                                       4.4.0-0.nightly-2020-03-10-194324   True        False         False      61m
kube-apiserver                             4.4.0-0.nightly-2020-03-10-194324   False       True          True       62m
kube-controller-manager                    4.4.0-0.nightly-2020-03-10-194324   True        False         True       60m
kube-scheduler                             4.4.0-0.nightly-2020-03-10-194324   True        False         False      60m
kube-storage-version-migrator              4.4.0-0.nightly-2020-03-10-194324   False       False         False      62m
machine-api                                4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m
machine-config                             4.4.0-0.nightly-2020-03-10-194324   True        False         False      61m
network                                    4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m
node-tuning                                4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m
openshift-apiserver                        4.4.0-0.nightly-2020-03-10-194324   False       False         False      62m
openshift-controller-manager                                                   False       True          False      62m
operator-lifecycle-manager                 4.4.0-0.nightly-2020-03-10-194324   True        False         False      61m
operator-lifecycle-manager-catalog         4.4.0-0.nightly-2020-03-10-194324   True        False         False      61m
operator-lifecycle-manager-packageserver                                       False       True          False      61m
service-ca                                 4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m
service-catalog-apiserver                  4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m
service-catalog-controller-manager         4.4.0-0.nightly-2020-03-10-194324   True        False         False      62m

Comment 4 XiuJuan Wang 2020-03-13 10:36:15 UTC
Installed IPI on OSP without the swiftoperator role; the image registry backend now defaults to a PVC.
But the image registry pod is stuck in Pending:
$oc get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster-image-registry-operator-65d4569d77-xk2wj   2/2     Running   0          57m
image-registry-746df5766c-h4xnc                    0/1     Pending   0          57m
node-ca-9zkfp                                      1/1     Running   0          55m
node-ca-gzrdg                                      1/1     Running   0          57m
node-ca-rvkdz                                      1/1     Running   0          57m
node-ca-vx5xp                                      1/1     Running   0          57m

$ oc get pvc  -o yaml 
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      imageregistry.openshift.io: "true"
      volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/cinder
    creationTimestamp: "2020-03-13T09:29:29Z"
    finalizers:
    - kubernetes.io/pvc-protection
    name: image-registry-storage
    namespace: openshift-image-registry
    resourceVersion: "33656"
    selfLink: /api/v1/namespaces/openshift-image-registry/persistentvolumeclaims/image-registry-storage
    uid: b73541bf-e556-4b8c-a976-77a0a9c690d6
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 100Gi
    storageClassName: standard
    volumeMode: Filesystem
  status:
    phase: Pending
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
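
The PVC above requests the "standard" StorageClass and the in-tree Cinder provisioner (see the volume.beta.kubernetes.io/storage-provisioner annotation). For provisioning to succeed, a matching StorageClass must exist on the cluster; this is a minimal sketch assuming the kubernetes.io/cinder provisioner, not output taken from this cluster:

$ cat <<EOF | oc apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
# in-tree OpenStack Cinder provisioner
provisioner: kubernetes.io/cinder
reclaimPolicy: Delete
EOF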

$oc describe pod image-registry-746df5766c-h4xnc
========
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  <unknown>          default-scheduler  Failed to bind volumes: provisioning failed for PVC "image-registry-storage"
  Warning  FailedScheduling  <unknown>          default-scheduler  0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>          default-scheduler  0/4 nodes are available: 4 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>          default-scheduler  AssumePod failed: pod eb739d55-e5b1-409f-bd60-12ac0409f1d6 is in the cache, so can't be assumed
  Warning  FailedScheduling  <unknown>          default-scheduler  AssumePod failed: pod eb739d55-e5b1-409f-bd60-12ac0409f1d6 is in the cache, so can't be assumed
  Warning  FailedScheduling  <unknown>          default-scheduler  Failed to bind volumes: provisioning failed for PVC "image-registry-storage"
  Warning  FailedScheduling  <unknown>          default-scheduler  Failed to bind volumes: provisioning failed for PVC "image-registry-storage"
  Warning  FailedScheduling  <unknown>          default-scheduler  Failed to bind volumes: provisioning failed for PVC "image-registry-storage"
  Warning  FailedScheduling  54m (x5 over 54m)  default-scheduler  0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.

$oc get node
NAME                          STATUS   ROLES    AGE   VERSION
wxj-osp1-dxx4r-master-0       Ready    master   69m   v1.17.1
wxj-osp1-dxx4r-master-1       Ready    master   70m   v1.17.1
wxj-osp1-dxx4r-master-2       Ready    master   70m   v1.17.1
wxj-osp1-dxx4r-worker-h2mvk   Ready    worker   61m   v1.17.1

See operator log http://pastebin.test.redhat.com/844534

Comment 5 Adam Kaplan 2020-03-18 18:04:38 UTC
@XiuJuan the image registry operator has no ability to create persistent volumes for PVCs. If nothing on the cluster can auto-provision PVs for a given PVC, then this behavior is to be expected.

Comment 6 Adam Kaplan 2020-03-18 18:07:18 UTC
@Fedosin is there a default PVC auto-provisioner on OpenStack that we assume exists for IPI installs?
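
For reference, one quick way to check for (and, if needed, designate) a default class; the annotation is the standard Kubernetes mechanism, and "standard" is simply the class name seen in comment 4:

$ oc get storageclass
$ oc annotate storageclass standard storageclass.kubernetes.io/is-default-class="true"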

Comment 10 errata-xmlrpc 2020-05-13 22:00:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

Comment 11 Red Hat Bugzilla 2023-09-18 00:20:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days