Bug 2007717 - ODF 4.9 is failing to deploy
Summary: ODF 4.9 is failing to deploy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: build
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.9.0
Assignee: Boris Ranto
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-09-24 16:57 UTC by Sridhar Venkat (IBM)
Modified: 2023-08-09 16:37 UTC (History)

Fixed In Version: odf-operator.v4.9.0-161.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-13 17:46:33 UTC
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2021:5086  0        None      None    None     2021-12-13 17:46:43 UTC

Description Sridhar Venkat (IBM) 2021-09-24 16:57:59 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
During the deployment of ODF 4.9, the odf-operator CSV fails to install.

Version of all relevant components (if applicable):
4.9

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes
Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1
Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
N/A

If this is a regression, please provide more details to justify this:

Yes. This was working in previous levels.

Steps to Reproduce:
1. Deploy ODF 4.9 on OCP 4.9.


Actual results:

[root@rdr-sri-b76e-lon06-bastion-0 ~]# oc describe pod odf-operator-controller-manager-655446dd6-rgbzd -n openshift-storage
Events:
  Type     Reason          Age                   From               Message
  ----     ------          ----                  ----               -------
  Normal   Scheduled       42m                   default-scheduler  Successfully assigned openshift-storage/odf-operator-controller-manager-655446dd6-rgbzd to rdr-sri-b76e-lon06-worker-0
  Normal   AddedInterface  42m                   multus             Add eth0 [10.128.2.12/23] from openshift-sdn
  Warning  Failed          41m                   kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = can't talk to a V1 container registry
  Normal   Pulling         41m                   kubelet            Pulling image "quay.io/rhceph-dev/odf-operator@sha256:55a4880999850da3917934396df84c93ed5678deb035b202b63b94e3883e7544"
  Normal   Pulled          41m                   kubelet            Successfully pulled image "quay.io/rhceph-dev/odf-operator@sha256:55a4880999850da3917934396df84c93ed5678deb035b202b63b94e3883e7544" in 7.796474757s
  Normal   Started         41m                   kubelet            Started container manager
  Normal   Created         41m                   kubelet            Created container manager
  Warning  Failed          39m                   kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = reading manifest v4.7.0 in registry.redhat.io/openshift4/ose-kube-rbac-proxy: received unexpected HTTP status: 503 Service Unavailable
  Normal   Pulling         39m (x3 over 42m)     kubelet            Pulling image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0"
  Warning  Failed          37m (x3 over 41m)     kubelet            Error: ErrImagePull
  Warning  Failed          36m (x6 over 41m)     kubelet            Error: ImagePullBackOff
  Warning  Failed          26m (x4 over 37m)     kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = pinging container registry registry.redhat.io: invalid status code from registry 503 (Service Unavailable)
  Warning  Failed          9m38s                 kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = pinging container registry registry.redhat.io: invalid status code from registry 504 (Gateway Timeout)
  Normal   BackOff         2m33s (x93 over 41m)  kubelet            Back-off pulling image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0"
[root@rdr-sri-b76e-lon06-bastion-0 ~]# oc get csv -A
NAMESPACE                              NAME                                        DISPLAY                       VERSION              REPLACES   PHASE
openshift-local-storage                local-storage-operator.4.9.0-202109071344   Local Storage                 4.9.0-202109071344              Succeeded
openshift-operator-lifecycle-manager   packageserver                               Package Server                0.18.3                          Succeeded
openshift-storage                      noobaa-operator.v4.9.0-158.ci               NooBaa Operator               4.9.0-158.ci                    Succeeded
openshift-storage                      ocs-operator.v4.9.0-158.ci                  OpenShift Container Storage   4.9.0-158.ci                    Succeeded
openshift-storage                      odf-operator.v4.9.0-158.ci                  OpenShift Data Foundation     4.9.0-158.ci                    Failed
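
For triage, the events above can be reduced to the image that fails and the distinct registry errors. The following is a minimal sketch, assuming the `oc describe pod` output has been saved as text; the function name `failed_pulls` and the shortened `sample` are illustrative only and not part of any Red Hat tooling.

```python
import re

# Matches kubelet "Failed to pull image" events and captures the image
# reference and the error text after the gRPC "desc =" marker.
PULL_FAILURE = re.compile(r'Failed to pull image "([^"]+)":.*?desc = (.*)$')


def failed_pulls(describe_output):
    """Return {image: [error, ...]} for each failed pull in the events text."""
    failures = {}
    for line in describe_output.splitlines():
        match = PULL_FAILURE.search(line)
        if match:
            image, error = match.groups()
            failures.setdefault(image, []).append(error.strip())
    return failures


# Three event lines taken from the output above (columns shortened).
sample = '''\
  Warning  Failed  41m    kubelet  Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = can't talk to a V1 container registry
  Normal   Pulled  41m    kubelet  Successfully pulled image "quay.io/rhceph-dev/odf-operator@sha256:55a4880999850da3917934396df84c93ed5678deb035b202b63b94e3883e7544" in 7.796474757s
  Warning  Failed  9m38s  kubelet  Failed to pull image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0": rpc error: code = Unknown desc = pinging container registry registry.redhat.io: invalid status code from registry 504 (Gateway Timeout)
'''

failures = failed_pulls(sample)
for image, errors in failures.items():
    print(image)
    for error in errors:
        print("  -", error)
```

In this output only the `ose-kube-rbac-proxy:v4.7.0` image fails; the `odf-operator` image itself is pulled successfully, which suggests the problem is the rbac proxy image reference rather than the operator build.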

Expected results:


Additional info:

Comment 2 Sridhar Venkat (IBM) 2021-09-24 19:38:55 UTC
This is for IBM System P environment.

Comment 3 Nitin Goyal 2021-09-25 16:09:24 UTC
@svenkat Can you please try this again? It should now be able to pull the image. If it still does not work, please let us know.

Comment 5 Sridhar Venkat (IBM) 2021-09-28 01:44:39 UTC
We are able to deploy ODF 4.9. The issue is now resolved. This can be closed.

Comment 6 Nitin Goyal 2021-09-28 02:05:20 UTC
@branto Can you please move this bug to the build component and set it to ON_QA so that it can be verified?

Comment 7 Boris Ranto 2021-09-28 07:27:07 UTC
This was fixed by properly overriding the rbac proxy image in the DS builds:

https://gitlab.cee.redhat.com/ceph/rhcs-jenkins-jobs/-/merge_requests/748

Moving directly to verified as per comment #5.
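
The contents of the merge request are not reproduced here, so the following is only a rough sketch of what an image override over a CSV could look like, not the actual change. It assumes the CSV has been loaded into a Python dict with the usual OLM layout (`spec.install.spec.deployments[].spec.template.spec.containers[]`). The helper names, the `kube-rbac-proxy` container name, and the replacement image `registry.example.com/mirror/ose-kube-rbac-proxy:v4.9.0` are hypothetical.

```python
import copy


def repository(image):
    """Strip the digest or tag from an image reference, keeping the repository."""
    name = image.split("@", 1)[0]
    # Only treat ":" as a tag separator if it appears after the last "/",
    # so a registry port such as "host:5000/repo" is left alone.
    if ":" in name.rsplit("/", 1)[-1]:
        name = name[: name.rindex(":")]
    return name


def override_image(csv, repo_name, new_image):
    """Return a patched copy of the CSV and the number of containers changed."""
    patched = copy.deepcopy(csv)
    count = 0
    for deployment in patched["spec"]["install"]["spec"]["deployments"]:
        for container in deployment["spec"]["template"]["spec"]["containers"]:
            if repository(container["image"]).endswith("/" + repo_name):
                container["image"] = new_image
                count += 1
    return patched, count


# Trimmed-down, hypothetical CSV holding only the fields the sketch needs.
csv = {"spec": {"install": {"spec": {"deployments": [{
    "name": "odf-operator-controller-manager",
    "spec": {"template": {"spec": {"containers": [
        {"name": "kube-rbac-proxy",  # assumed container name
         "image": "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0"},
        {"name": "manager",
         "image": "quay.io/rhceph-dev/odf-operator@sha256:55a4880999850da3917934396df84c93ed5678deb035b202b63b94e3883e7544"},
    ]}}},
}]}}}}

patched, count = override_image(
    csv,
    "ose-kube-rbac-proxy",
    "registry.example.com/mirror/ose-kube-rbac-proxy:v4.9.0",  # hypothetical image
)
print(count, "container image(s) overridden")
```

Matching on the repository name rather than the full reference means the override applies regardless of which tag or digest the upstream CSV pins.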

Comment 12 errata-xmlrpc 2021-12-13 17:46:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086

