Bug 2075581
| Summary: | [IBM Z] : ODF 4.11.0-38 deployment leaves the storagecluster in "Progressing" state although all the openshift-storage pods are up and Running | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Sravika <sbalusu> |
| Component: | ocs-operator | Assignee: | Travis Nielsen <tnielsen> |
| Status: | CLOSED ERRATA | QA Contact: | Elad <ebenahar> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.11 | CC: | bniver, jarrpa, madam, muagarwa, nberry, nigoyal, ocs-bugs, odf-bz-bot, prsurve, sostapov, svenkat, tmuthami, tnielsen, vavuthu |
| Target Milestone: | --- | ||
| Target Release: | ODF 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.11.0-63 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-08-24 13:51:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Nitin, PTAL. The StorageCluster is waiting on NooBaa, as shown in the StorageCluster status:
conditions:
- lastHeartbeatTime: "2022-04-14T15:27:04Z"
lastTransitionTime: "2022-04-14T14:49:18Z"
message: Reconcile completed successfully
reason: ReconcileCompleted
status: "True"
type: ReconcileComplete
- lastHeartbeatTime: "2022-04-14T14:45:51Z"
lastTransitionTime: "2022-04-14T14:45:51Z"
message: Initializing StorageCluster
reason: Init
status: "False"
type: Available
- lastHeartbeatTime: "2022-04-14T15:27:04Z"
lastTransitionTime: "2022-04-14T14:45:51Z"
message: Waiting on Nooba instance to finish initialization
reason: NoobaaInitializing
status: "True"
type: Progressing
- lastHeartbeatTime: "2022-04-14T14:45:51Z"
lastTransitionTime: "2022-04-14T14:45:51Z"
message: Initializing StorageCluster
reason: Init
status: "False"
type: Degraded
- lastHeartbeatTime: "2022-04-14T14:45:51Z"
lastTransitionTime: "2022-04-14T14:45:51Z"
message: Initializing StorageCluster
reason: Init
status: Unknown
type: Upgradeable
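A quick way to watch just the Progressing condition (a sketch; assumes the default resource name ocs-storagecluster used above and standard kubectl jsonpath filtering):

# oc -n openshift-storage get storagecluster ocs-storagecluster -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}{"\n"}'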
Looking at the NooBaa status, it reports that the object storage is not ready:
conditions:
- lastHeartbeatTime: "2022-04-14T14:49:18Z"
lastTransitionTime: "2022-04-14T14:49:18Z"
message: Ceph objectstore user "noobaa-ceph-objectstore-user" is not ready
reason: TemporaryError
status: "False"
type: Available
- lastHeartbeatTime: "2022-04-14T14:49:18Z"
lastTransitionTime: "2022-04-14T14:49:18Z"
message: Ceph objectstore user "noobaa-ceph-objectstore-user" is not ready
reason: TemporaryError
status: "True"
type: Progressing
- lastHeartbeatTime: "2022-04-14T14:49:18Z"
lastTransitionTime: "2022-04-14T14:49:18Z"
message: Ceph objectstore user "noobaa-ceph-objectstore-user" is not ready
reason: TemporaryError
status: "False"
type: Degraded
- lastHeartbeatTime: "2022-04-14T14:49:18Z"
lastTransitionTime: "2022-04-14T14:49:18Z"
message: Ceph objectstore user "noobaa-ceph-objectstore-user" is not ready
reason: TemporaryError
status: "False"
type: Upgradeable
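Since the NooBaa conditions blame the Ceph objectstore user, that CR can be inspected directly (hedged: this assumes "noobaa-ceph-objectstore-user" is a CephObjectStoreUser in the same namespace and that the NooBaa CR is named "noobaa", as in a default ODF install):

# oc -n openshift-storage get cephobjectstoreuser noobaa-ceph-objectstore-user -o yaml
# oc -n openshift-storage get noobaa noobaa -o jsonpath='{.status.conditions}{"\n"}'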
Looking at the CephObjectStore, the connection is refused:
status:
bucketStatus:
details: 'failed to get details from ceph object user "rook-ceph-internal-s3-user-checker-f770bb8b-fafb-44d7-b909-13581aeb1e46":
Get "https://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:443/admin/user?display-name=rook-ceph-internal-s3-user-checker-f770bb8b-fafb-44d7-b909-13581aeb1e46&format=json&uid=rook-ceph-internal-s3-user-checker-f770bb8b-fafb-44d7-b909-13581aeb1e46":
dial tcp 172.30.244.15:443: connect: connection refused'
health: Failure
lastChanged: "2022-04-14T15:29:09Z"
lastChecked: "2022-04-14T15:30:10Z"
info:
endpoint: http://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:80
secureEndpoint: https://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:443
observedGeneration: 1
phase: Failure
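A "connection refused" against the RGW service ClusterIP usually just means no pod is backing the service; a quick sanity check (assuming the service and label names shown above):

# oc -n openshift-storage get endpoints rook-ceph-rgw-ocs-storagecluster-cephobjectstore
# oc -n openshift-storage get pod -l app=rook-ceph-rgw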
@tnielsen Can someone from Rook please take a look?
The rgw deployment status shows:
message: 'pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-74bcdc586b-"
is forbidden: error looking up service account openshift-storage/rook-ceph-rgw:
serviceaccount "rook-ceph-rgw" not found'
The RGW pod failed to start because the new rook-ceph-rgw service account added in 4.11 is evidently missing from the CSV.
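One way to confirm whether the CSV actually ships that service account (a sketch; assumes the usual OLM CSV layout where service accounts are listed under spec.install.spec.permissions):

# oc -n openshift-storage get csv ocs-operator.v4.11.0 -o jsonpath='{.spec.install.spec.permissions[*].serviceAccountName}{"\n"}'
# oc -n openshift-storage get serviceaccount rook-ceph-rgw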
With the resync to downstream 4.11 in https://github.com/red-hat-storage/rook/pull/370, the next build should pick up the fix to create the rook-ceph-rgw service account with the CSV.

We are still seeing the problem. The RGW pod is not getting created:
[root@nx124-411-592e-sao01-bastion-0 ~]# oc get csv odf-operator.v4.11.0 -n openshift-storage -o yaml | grep "full_version"
full_version: 4.11.0-46
[root@nx124-411-592e-sao01-bastion-0 ~]# oc rsh -n openshift-storage rook-ceph-tools-559b64cbb4-r5swn ceph -s
cluster:
id: 86f69cae-50b3-4f73-b12b-4c30ba08bad9
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,c,b (age 4h)
mgr: a(active, since 4h)
mds: 1/1 daemons up, 1 hot standby
osd: 3 osds: 3 up (since 4h), 3 in (since 4h)
data:
volumes: 1/1 healthy
pools: 11 pools, 177 pgs
objects: 1.15k objects, 3.5 GiB
usage: 11 GiB used, 1.5 TiB / 1.5 TiB avail
pgs: 177 active+clean
io:
client: 853 B/s rd, 10 KiB/s wr, 1 op/s rd, 1 op/s wr
[root@nx124-411-592e-sao01-bastion-0 ~]# oc -n openshift-storage get Pod -n openshift-storage --selector=app=rook-ceph-rgw
No resources found in openshift-storage namespace.
[root@nx124-411-592e-sao01-bastion-0 ~]#
I see 4.11.0-46 was published on 4/20 and the comment by Travis was on the same day; I will look for the next build and check.

Thanks for double-checking the next build to make sure the fix is in... Per discussion in gchat, another fix was needed for the service account generation in the CSV. The follow-up fix is merged downstream now with https://github.com/red-hat-storage/rook/pull/372

We are still waiting for a stable build; the latest build again didn't pass the deployment. Please wait for the next stable build.

[root@nx124-411-94c4-syd04-bastion-0 ~]# oc get csv odf-operator.v4.11.0 -o yaml -n openshift-storage | grep full
full_version: 4.11.0-51
[root@nx124-411-94c4-syd04-bastion-0 ~]# oc -n openshift-storage get Pod -n openshift-storage --selector=app=rook-ceph-rgw
No resources found in openshift-storage namespace.
[root@nx124-411-94c4-syd04-bastion-0 ~]#
Checked with build 51 of 4.11; the RGW pod is still missing.
[root@nx124-411-94c4-syd04-bastion-0 ~]# oc -n openshift-storage get StorageCluster ocs-storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   56m   Ready              2022-04-27T16:44:54Z   4.11.0
[root@nx124-411-94c4-syd04-bastion-0 ~]#

But the storage cluster is good, in Ready state.

[root@nx124-411-94c4-syd04-bastion-0 ~]# oc get storagesystem -n openshift-storage
NAME                               STORAGE-SYSTEM-KIND                  STORAGE-SYSTEM-NAME
ocs-storagecluster-storagesystem   storagecluster.ocs.openshift.io/v1   ocs-storagecluster
[root@nx124-411-94c4-syd04-bastion-0 ~]#
[root@nx124-411-94c4-syd04-bastion-0 ~]# oc get cephcluster -n openshift-storage
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          57m   Ready   Cluster created successfully   HEALTH_OK
[root@nx124-411-94c4-syd04-bastion-0 ~]#

Sridhar, what does the following show?
- oc -n openshift-storage describe deploy <rook-ceph-rgw-deployment>
- oc -n openshift-storage get svc

@tnielsen: Please find the output as follows:
# oc -n openshift-storage describe deploy rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
Namespace: openshift-storage
CreationTimestamp: Thu, 28 Apr 2022 11:38:41 +0200
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph-version=16.2.7-107
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
rgw=ocs-storagecluster-cephobjectstore
rook-version=v4.11.0-0.354c987b60b7e13e92ac0d69e8504f3cb6c11279
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Annotations: banzaicloud.com/last-applied:
{"metadata":{"labels":{"app":"rook-ceph-rgw","app.kubernetes.io/component":"cephobjectstores.ceph.rook.io","app.kubernetes.io/created-by":...
deployment.kubernetes.io/revision: 1
Selector: app=rook-ceph-rgw,ceph_daemon_id=ocs-storagecluster-cephobjectstore,rgw=ocs-storagecluster-cephobjectstore,rook_cluster=openshift-storage,rook_object_store=ocs-storagecluster-cephobjectstore
Replicas: 1 desired | 0 updated | 0 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 0 max surge
Pod Template:
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
rgw=ocs-storagecluster-cephobjectstore
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Service Account: rook-ceph-rgw
Init Containers:
chown-container-data-dir:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
chown
Args:
--verbose
--recursive
ceph:ceph
/var/log/ceph
/var/lib/ceph/crash
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 2
memory: 4Gi
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
Containers:
rgw:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
radosgw
Args:
--fsid=75dffa61-aa5b-4948-9fe4-55ec8571d348
--keyring=/etc/ceph/keyring-store/keyring
--log-to-stderr=true
--err-to-stderr=true
/etc/ceph/private from rook-ceph-rgw-cert (ro)
/etc/ceph/rgw from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
log-collector:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
/bin/bash
-x
-e
-m
-c
CEPH_CLIENT_ID=ceph-client.rgw.ocs.storagecluster.cephobjectstore.a
PERIODICITY=24h
LOG_ROTATE_CEPH_FILE=/etc/logrotate.d/ceph
if [ -z "$PERIODICITY" ]; then
PERIODICITY=24h
fi
# edit the logrotate file to only rotate a specific daemon log
# otherwise we will logrotate log files without reloading certain daemons
# this might happen when multiple daemons run on the same machine
sed -i "s|*.log|$CEPH_CLIENT_ID.log|" "$LOG_ROTATE_CEPH_FILE"
while true; do
sleep "$PERIODICITY"
echo "starting log rotation"
logrotate --verbose --force "$LOG_ROTATE_CEPH_FILE"
echo "I am going to sleep now, see you in $PERIODICITY"
done
sed -i "s|*.log|$CEPH_CLIENT_ID.log|" "$LOG_ROTATE_CEPH_FILE" [0/1796]
while true; do
sleep "$PERIODICITY"
echo "starting log rotation"
logrotate --verbose --force "$LOG_ROTATE_CEPH_FILE"
echo "I am going to sleep now, see you in $PERIODICITY"
done
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/log/ceph from rook-ceph-log (rw)
Volumes:
rook-config-override:
Type: Projected (a volume that contains injected data from multiple sources)
ConfigMapName: rook-config-override
ConfigMapOptional: <nil>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring:
Type: Secret (a volume populated by a Secret)
SecretName: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring
Optional: false
rook-ceph-log:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/log
HostPathType:
rook-ceph-crash:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/crash
HostPathType:
ceph-daemon-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types
Optional: false
rook-ceph-rgw-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ocs-storagecluster-cos-ceph-rgw-tls-cert
Optional: false
Priority Class Name: openshift-user-critical
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
ReplicaFailure True FailedCreate
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-f98b59bd5 (0/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 30m deployment-controller Scaled up replica set rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-f98b59bd5 to 1
# oc -n openshift-storage get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-addons-controller-manager-metrics-service ClusterIP 172.30.43.175 <none> 8443/TCP 38m
csi-cephfsplugin-metrics ClusterIP 172.30.5.60 <none> 8080/TCP,8081/TCP 37m
csi-rbdplugin-metrics ClusterIP 172.30.43.190 <none> 8080/TCP,8081/TCP 37m
noobaa-db-pg ClusterIP 172.30.210.52 <none> 5432/TCP 34m
noobaa-mgmt LoadBalancer 172.30.52.103 <pending> 80:32344/TCP,443:30675/TCP,8445:30630/TCP,8446:31687/TCP 34m
noobaa-operator-service ClusterIP 172.30.60.183 <none> 443/TCP 34m
ocs-metrics-exporter ClusterIP 172.30.43.168 <none> 8080/TCP,8081/TCP 34m
odf-console-service ClusterIP 172.30.111.187 <none> 9001/TCP 39m
odf-operator-controller-manager-metrics-service ClusterIP 172.30.226.23 <none> 8443/TCP 39m
rook-ceph-mgr ClusterIP 172.30.114.236 <none> 9283/TCP 35m
rook-ceph-mon-a ClusterIP 172.30.171.54 <none> 6789/TCP,3300/TCP 36m
rook-ceph-mon-b ClusterIP 172.30.17.54 <none> 6789/TCP,3300/TCP 36m
rook-ceph-mon-c ClusterIP 172.30.71.96 <none> 6789/TCP,3300/TCP 36m
rook-ceph-rgw-ocs-storagecluster-cephobjectstore ClusterIP 172.30.102.80 <none> 80/TCP,443/TCP 35m
s3 LoadBalancer 172.30.7.38 <pending> 80:32395/TCP,443:30633/TCP,8444:30869/TCP,7004:30373/TCP 34m
sts LoadBalancer 172.30.100.71 <pending> 443:30487/TCP 34m
From my environment:
[root@nx124-411-b853-syd04-bastion-0 ~]# oc get deployment rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a -n openshift-storage
NAME READY UP-TO-DATE AVAILABLE AGE
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a 0/1 0 0 7h37m
[root@nx124-411-b853-syd04-bastion-0 ~]# oc get csv -A
NAMESPACE NAME DISPLAY VERSION REPLACES PHASE
openshift-local-storage local-storage-operator.4.11.0-202204220613 Local Storage 4.11.0-202204220613 Succeeded
openshift-operator-lifecycle-manager packageserver Package Server 0.19.0 Succeeded
openshift-storage mcg-operator.v4.11.0 NooBaa Operator 4.11.0 Succeeded
openshift-storage ocs-operator.v4.11.0 OpenShift Container Storage 4.11.0 Succeeded
openshift-storage odf-csi-addons-operator.v4.11.0 CSI Addons 4.11.0 Succeeded
openshift-storage odf-operator.v4.11.0 OpenShift Data Foundation 4.11.0 Succeeded
[root@nx124-411-b853-syd04-bastion-0 ~]# oc get csv odf-operator.v4.11.0 -n openshift-storage -o yaml | grep "full"
full_version: 4.11.0-51
[root@nx124-411-b853-syd04-bastion-0 ~]# oc get csv ocs-operator.v4.11.0 -n openshift-storage -o yaml | grep "full"
full_version: 4.11.0-51
[root@nx124-411-b853-syd04-bastion-0 ~]#
[root@nx124-411-b853-syd04-bastion-0 ~]# oc -n openshift-storage describe deploy rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
Namespace: openshift-storage
CreationTimestamp: Thu, 28 Apr 2022 01:19:41 -0400
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph-version=16.2.7-107
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
rgw=ocs-storagecluster-cephobjectstore
rook-version=v4.11.0-0.354c987b60b7e13e92ac0d69e8504f3cb6c11279
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Annotations: banzaicloud.com/last-applied:
{"metadata":{"labels":{"app":"rook-ceph-rgw","app.kubernetes.io/component":"cephobjectstores.ceph.rook.io","app.kubernetes.io/created-by":...
deployment.kubernetes.io/revision: 1
Selector: app=rook-ceph-rgw,ceph_daemon_id=ocs-storagecluster-cephobjectstore,rgw=ocs-storagecluster-cephobjectstore,rook_cluster=openshift-storage,rook_object_store=ocs-storagecluster-cephobjectstore
Replicas: 1 desired | 0 updated | 0 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 0 max surge
Pod Template:
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
rgw=ocs-storagecluster-cephobjectstore
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Service Account: rook-ceph-rgw
Init Containers:
chown-container-data-dir:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
chown
Args:
--verbose
--recursive
ceph:ceph
/var/log/ceph
/var/lib/ceph/crash
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 2
memory: 4Gi
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
Containers:
rgw:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
radosgw
Args:
--fsid=b31cdec5-ff50-4967-b200-81a7c3fbfa1a
--keyring=/etc/ceph/keyring-store/keyring
--log-to-stderr=true
--err-to-stderr=true
--mon-cluster-log-to-stderr=true
--log-stderr-prefix=debug
--default-log-to-file=false
--default-mon-cluster-log-to-file=false
--mon-host=$(ROOK_CEPH_MON_HOST)
--mon-initial-members=$(ROOK_CEPH_MON_INITIAL_MEMBERS)
--id=rgw.ocs.storagecluster.cephobjectstore.a
--setuser=ceph
--setgroup=ceph
--foreground
--rgw-frontends=beast port=8080 ssl_port=443 ssl_certificate=/etc/ceph/private/rgw-cert.pem ssl_private_key=/etc/ceph/private/rgw-key.pem
--host=$(POD_NAME)
--rgw-mime-types-file=/etc/ceph/rgw/mime.types
--rgw-realm=ocs-storagecluster-cephobjectstore
--rgw-zonegroup=ocs-storagecluster-cephobjectstore
--rgw-zone=ocs-storagecluster-cephobjectstore
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 2
memory: 4Gi
Liveness: tcp-socket :8080 delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/swift/healthcheck delay=10s timeout=1s period=10s #success=1 #failure=3
Startup: tcp-socket :8080 delay=10s timeout=1s period=10s #success=1 #failure=18
Environment:
CONTAINER_IMAGE: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
POD_NAME: (v1:metadata.name)
POD_NAMESPACE: (v1:metadata.namespace)
NODE_NAME: (v1:spec.nodeName)
POD_MEMORY_LIMIT: 4294967296 (limits.memory)
POD_MEMORY_REQUEST: 4294967296 (requests.memory)
POD_CPU_LIMIT: 2 (limits.cpu)
POD_CPU_REQUEST: 2 (requests.cpu)
ROOK_CEPH_MON_HOST: <set to the key 'mon_host' in secret 'rook-ceph-config'> Optional: false
ROOK_CEPH_MON_INITIAL_MEMBERS: <set to the key 'mon_initial_members' in secret 'rook-ceph-config'> Optional: false
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring (ro)
/etc/ceph/private from rook-ceph-rgw-cert (ro)
/etc/ceph/rgw from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
log-collector:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
/bin/bash
-x
-e
-m
-c
CEPH_CLIENT_ID=ceph-client.rgw.ocs.storagecluster.cephobjectstore.a
PERIODICITY=24h
LOG_ROTATE_CEPH_FILE=/etc/logrotate.d/ceph
if [ -z "$PERIODICITY" ]; then
PERIODICITY=24h
fi
# edit the logrotate file to only rotate a specific daemon log
# otherwise we will logrotate log files without reloading certain daemons
# this might happen when multiple daemons run on the same machine
sed -i "s|*.log|$CEPH_CLIENT_ID.log|" "$LOG_ROTATE_CEPH_FILE"
while true; do
sleep "$PERIODICITY"
echo "starting log rotation"
logrotate --verbose --force "$LOG_ROTATE_CEPH_FILE"
echo "I am going to sleep now, see you in $PERIODICITY"
done
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/log/ceph from rook-ceph-log (rw)
Volumes:
rook-config-override:
Type: Projected (a volume that contains injected data from multiple sources)
ConfigMapName: rook-config-override
ConfigMapOptional: <nil>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring:
Type: Secret (a volume populated by a Secret)
SecretName: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring
Optional: false
rook-ceph-log:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/log
HostPathType:
rook-ceph-crash:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/crash
HostPathType:
ceph-daemon-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types
Optional: false
rook-ceph-rgw-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ocs-storagecluster-cos-ceph-rgw-tls-cert
Optional: false
Priority Class Name: openshift-user-critical
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
ReplicaFailure True FailedCreate
OldReplicaSets: <none>
NewReplicaSet: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-6f444f4574 (0/1 replicas created)
Events: <none>
[root@nx124-411-b853-syd04-bastion-0 ~]#
[root@nx124-411-b853-syd04-bastion-0 ~]#
[root@nx124-411-b853-syd04-bastion-0 ~]# oc -n openshift-storage get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-addons-controller-manager-metrics-service ClusterIP 172.30.18.152 <none> 8443/TCP 7h49m
csi-cephfsplugin-metrics ClusterIP 172.30.50.86 <none> 8080/TCP,8081/TCP 7h47m
csi-rbdplugin-metrics ClusterIP 172.30.235.99 <none> 8080/TCP,8081/TCP 7h47m
noobaa-db-pg ClusterIP 172.30.144.11 <none> 5432/TCP 7h38m
noobaa-mgmt LoadBalancer 172.30.121.155 <pending> 80:32409/TCP,443:32675/TCP,8445:31121/TCP,8446:32642/TCP 7h38m
noobaa-operator-service ClusterIP 172.30.16.88 <none> 443/TCP 7h38m
ocs-metrics-exporter ClusterIP 172.30.149.141 <none> 8080/TCP,8081/TCP 7h38m
odf-console-service ClusterIP 172.30.189.49 <none> 9001/TCP 7h50m
odf-operator-controller-manager-metrics-service ClusterIP 172.30.219.94 <none> 8443/TCP 7h51m
rook-ceph-mgr ClusterIP 172.30.166.11 <none> 9283/TCP 7h40m
rook-ceph-mon-h ClusterIP 172.30.171.184 <none> 6789/TCP,3300/TCP 3h33m
rook-ceph-mon-i ClusterIP 172.30.173.6 <none> 6789/TCP,3300/TCP 3h33m
rook-ceph-mon-j ClusterIP 172.30.246.166 <none> 6789/TCP,3300/TCP 3h33m
rook-ceph-rgw-ocs-storagecluster-cephobjectstore ClusterIP 172.30.2.139 <none> 80/TCP,443/TCP 7h39m
s3 LoadBalancer 172.30.137.88 <pending> 80:30549/TCP,443:32312/TCP,8444:30988/TCP,7004:30694/TCP 7h38m
sts LoadBalancer 172.30.252.157 <pending> 443:30302/TCP 7h38m
[root@nx124-411-b853-syd04-bastion-0 ~]#
One more question, what does this show: `oc -n openshift-storage get serviceaccount`? The problem is still the missing rook-ceph-rgw service account.

*** Bug 2079975 has been marked as a duplicate of this bug. ***

# oc -n openshift-storage get serviceaccount
NAME                                   SECRETS   AGE
builder                                2         33m
ceph-nfs-external-provisioner-runner   2         33m
csi-addons-controller-manager          2         32m
default                                2         33m
deployer                               2         33m
noobaa                                 2         33m
noobaa-endpoint                        2         33m
noobaa-odf-ui                          2         33m
ocs-metrics-exporter                   2         33m
ocs-operator                           2         33m
ocs-provider-server                    2         33m
odf-operator-controller-manager        2         33m
rook-ceph-cmd-reporter                 2         33m
rook-ceph-mgr                          2         32m
rook-ceph-osd                          2         33m
rook-ceph-purge-osd                    2         33m
rook-ceph-rgw                          2         33m
rook-ceph-system                       2         33m
rook-csi-cephfs-plugin-sa              2         33m
rook-csi-cephfs-provisioner-sa         2         33m
rook-csi-rbd-plugin-sa                 2         32m
rook-csi-rbd-provisioner-sa            2         32m

From IBM Power environment:

[root@nx124-411-2f02-syd04-bastion-0 ~]# oc -n openshift-storage get serviceaccount
NAME                                   SECRETS   AGE
builder                                2         5h33m
ceph-nfs-external-provisioner-runner   2         5h32m
csi-addons-controller-manager          2         5h30m
default                                2         5h33m
deployer                               2         5h33m
noobaa                                 2         5h32m
noobaa-endpoint                        2         5h32m
noobaa-odf-ui                          2         5h32m
ocs-metrics-exporter                   2         5h32m
ocs-operator                           2         5h32m
ocs-provider-server                    2         5h32m
odf-operator-controller-manager        2         5h32m
rook-ceph-cmd-reporter                 2         5h32m
rook-ceph-mgr                          2         5h32m
rook-ceph-osd                          2         5h32m
rook-ceph-purge-osd                    2         5h32m
rook-ceph-rgw                          2         5h32m
rook-ceph-system                       2         5h32m
rook-csi-cephfs-plugin-sa              2         5h32m
rook-csi-cephfs-provisioner-sa         2         5h32m
rook-csi-rbd-plugin-sa                 2         5h32m
rook-csi-rbd-provisioner-sa            2         5h32m
[root@nx124-411-2f02-syd04-bastion-0 ~]#

Same as what Sravika posted above.

Ok, good to see the rook-ceph-rgw service account has been created. Now there must be a different error preventing the pod from starting. The describe on the deployment doesn't show the errors though. What does describe show for the rgw replicaset? If the rgw pod doesn't exist, the rgw replicaset must give some indication of the error.

[root@nx124-411-2f02-syd04-bastion-0 ~]# oc describe replicaset rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7dfd5d9b98 -n openshift-storage
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7dfd5d9b98
Namespace: openshift-storage
Selector: app=rook-ceph-rgw,ceph_daemon_id=ocs-storagecluster-cephobjectstore,pod-template-hash=7dfd5d9b98,rgw=ocs-storagecluster-cephobjectstore,rook_cluster=openshift-storage,rook_object_store=ocs-storagecluster-cephobjectstore
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
pod-template-hash=7dfd5d9b98
rgw=ocs-storagecluster-cephobjectstore
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Annotations: banzaicloud.com/last-applied:
{"metadata":{"labels":{"app":"rook-ceph-rgw","app.kubernetes.io/component":"cephobjectstores.ceph.rook.io","app.kubernetes.io/created-by":...
deployment.kubernetes.io/desired-replicas: 1
deployment.kubernetes.io/max-replicas: 1
deployment.kubernetes.io/revision: 1
Controlled By: Deployment/rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
Replicas: 0 current / 1 desired
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=ocs-storagecluster-cephobjectstore
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=ocs-storagecluster-cephobjectstore
ceph_daemon_id=ocs-storagecluster-cephobjectstore
ceph_daemon_type=rgw
pod-template-hash=7dfd5d9b98
rgw=ocs-storagecluster-cephobjectstore
rook.io/operator-namespace=openshift-storage
rook_cluster=openshift-storage
rook_object_store=ocs-storagecluster-cephobjectstore
Service Account: rook-ceph-rgw
Init Containers:
chown-container-data-dir:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
chown
Args:
--verbose
--recursive
ceph:ceph
/var/log/ceph
/var/lib/ceph/crash
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 2
memory: 4Gi
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
Containers:
rgw:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
radosgw
Args:
--fsid=a6d856ac-f24b-4b9a-bf18-7ba8b4f710d8
--keyring=/etc/ceph/keyring-store/keyring
--log-to-stderr=true
--err-to-stderr=true
--mon-cluster-log-to-stderr=true
--log-stderr-prefix=debug
--default-log-to-file=false
--default-mon-cluster-log-to-file=false
--mon-host=$(ROOK_CEPH_MON_HOST)
--mon-initial-members=$(ROOK_CEPH_MON_INITIAL_MEMBERS)
--id=rgw.ocs.storagecluster.cephobjectstore.a
--setuser=ceph
--setgroup=ceph
--foreground
--rgw-frontends=beast port=8080 ssl_port=443 ssl_certificate=/etc/ceph/private/rgw-cert.pem ssl_private_key=/etc/ceph/private/rgw-key.pem
--host=$(POD_NAME)
--rgw-mime-types-file=/etc/ceph/rgw/mime.types
--rgw-realm=ocs-storagecluster-cephobjectstore
--rgw-zonegroup=ocs-storagecluster-cephobjectstore
--rgw-zone=ocs-storagecluster-cephobjectstore
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 2
memory: 4Gi
Liveness: tcp-socket :8080 delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/swift/healthcheck delay=10s timeout=1s period=10s #success=1 #failure=3
Startup: tcp-socket :8080 delay=10s timeout=1s period=10s #success=1 #failure=18
Environment:
CONTAINER_IMAGE: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
POD_NAME: (v1:metadata.name)
POD_NAMESPACE: (v1:metadata.namespace)
NODE_NAME: (v1:spec.nodeName)
POD_MEMORY_LIMIT: 4294967296 (limits.memory)
POD_MEMORY_REQUEST: 4294967296 (requests.memory)
POD_CPU_LIMIT: 2 (limits.cpu)
POD_CPU_REQUEST: 2 (requests.cpu)
ROOK_CEPH_MON_HOST: <set to the key 'mon_host' in secret 'rook-ceph-config'> Optional: false
ROOK_CEPH_MON_INITIAL_MEMBERS: <set to the key 'mon_initial_members' in secret 'rook-ceph-config'> Optional: false
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring (ro)
/etc/ceph/private from rook-ceph-rgw-cert (ro)
/etc/ceph/rgw from rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/rgw/ceph-ocs-storagecluster-cephobjectstore from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
log-collector:
Image: quay.io/rhceph-dev/rhceph@sha256:9f5f2f3444eb3c8aff5b8dde7ac3fe0bfab64a7ee5b90119af717e1e1d76a0eb
Port: <none>
Host Port: <none>
Command:
/bin/bash
-x
-e
-m
-c
CEPH_CLIENT_ID=ceph-client.rgw.ocs.storagecluster.cephobjectstore.a
PERIODICITY=24h
LOG_ROTATE_CEPH_FILE=/etc/logrotate.d/ceph
if [ -z "$PERIODICITY" ]; then
PERIODICITY=24h
fi
# edit the logrotate file to only rotate a specific daemon log
# otherwise we will logrotate log files without reloading certain daemons
# this might happen when multiple daemons run on the same machine
sed -i "s|*.log|$CEPH_CLIENT_ID.log|" "$LOG_ROTATE_CEPH_FILE"
while true; do
sleep "$PERIODICITY"
echo "starting log rotation"
logrotate --verbose --force "$LOG_ROTATE_CEPH_FILE"
echo "I am going to sleep now, see you in $PERIODICITY"
done
Environment: <none>
Mounts:
/etc/ceph from rook-config-override (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/log/ceph from rook-ceph-log (rw)
Volumes:
rook-config-override:
Type: Projected (a volume that contains injected data from multiple sources)
ConfigMapName: rook-config-override
ConfigMapOptional: <nil>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring:
Type: Secret (a volume populated by a Secret)
SecretName: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-keyring
Optional: false
rook-ceph-log:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/log
HostPathType:
rook-ceph-crash:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/openshift-storage/crash
HostPathType:
ceph-daemon-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-mime-types
Optional: false
rook-ceph-rgw-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ocs-storagecluster-cos-ceph-rgw-tls-cert
Optional: false
Priority Class Name: openshift-user-critical
Conditions:
Type Status Reason
---- ------ ------
ReplicaFailure True FailedCreate
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 4m35s (x139 over 12h) replicaset-controller Error creating: pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7dfd5d9b98-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "noobaa": Forbidden: not usable by user or serviceaccount, provider "noobaa-endpoint": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "rook-ceph": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "rook-ceph-csi": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
[root@nx124-411-2f02-syd04-bastion-0 ~]#
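The FailedCreate event above points at SCC admission rather than a missing service account. A way to check which service accounts the rook-ceph SCC currently admits (hedged: assumes the SCC is named rook-ceph and carries a users list, as in upstream Rook's OpenShift SCC yaml; the rgw entry would look like system:serviceaccount:openshift-storage:rook-ceph-rgw):

# oc get scc rook-ceph -o jsonpath='{.users}{"\n"}'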
Aha, if you look at the SCC, it must not have picked up the new rgw binding that was added here: https://github.com/rook/rook/pull/9964/files#diff-b218546ccf5d4a03e758f01e20b4eccde26c61450fccb27b1d21f1a122217e67R74
Now we need the OCS operator to update to the latest Rook with this update to the SCC.

Update to the OCS operator is in progress...

*** Bug 2081690 has been marked as a duplicate of this bug. ***

This fix merged, and the latest build should be working now.

Deployment succeeded on the vSphere platform with build ocs-registry:4.11.0-63.
job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/4157/consoleFull
logs:
2022-05-06 12:07:06  06:37:06 - MainThread - ocs_ci.ocs.resources.storage_cluster - INFO - Check if StorageCluster: ocs-storagecluster is in Succeeded phase
2022-05-06 12:07:06  06:37:06 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get StorageCluster ocs-storagecluster -n openshift-storage -o yaml
2022-05-06 12:07:06  06:37:06 - MainThread - ocs_ci.ocs.ocp - INFO - Resource ocs-storagecluster is in phase: Ready!
2022-05-06 12:07:37  06:37:35 - MainThread - ocs_ci.ocs.resources.storage_cluster - INFO - Verifying ceph health
2022-05-06 12:07:37  06:37:35 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc wait --for condition=ready pod -l app=rook-ceph-tools -n openshift-storage --timeout=120s
2022-05-06 12:07:37  06:37:36 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get pod -l 'app=rook-ceph-tools' -o jsonpath='{.items[0].metadata.name}'
2022-05-06 12:07:37  06:37:36 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage exec rook-ceph-tools-5cfbd9fdc8-htfgh -- ceph health
2022-05-06 12:07:37  06:37:36 - MainThread - ocs_ci.utility.utils - INFO - Ceph cluster health is HEALTH_OK.

ODF deployment on IBM Z succeeds with the latest build 4.11.0-63.
# oc get storagecluster -A
NAMESPACE NAME AGE PHASE EXTERNAL CREATED AT VERSION
openshift-storage ocs-storagecluster 6m44s Ready 2022-05-06T09:35:16Z 4.11.0
# oc get csv odf-operator.v4.11.0 -n openshift-storage -oyaml | grep full_version
full_version: 4.11.0-63
# oc get po -n openshift-storage | grep rgw
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7d7d4d8rsb6g 2/2 Running 0 8m39s
# oc get po -n openshift-storage
NAME READY STATUS RESTARTS AGE
csi-addons-controller-manager-5988b4d8b6-s2l9b 2/2 Running 0 12m
csi-cephfsplugin-b776p 3/3 Running 0 12m
csi-cephfsplugin-d99tm 3/3 Running 0 12m
csi-cephfsplugin-gj8p2 3/3 Running 0 12m
csi-cephfsplugin-provisioner-75bcbb797-bdkhq 6/6 Running 0 12m
csi-cephfsplugin-provisioner-75bcbb797-qqggf 6/6 Running 0 12m
csi-rbdplugin-b9jkg 4/4 Running 0 12m
csi-rbdplugin-cndhd 4/4 Running 0 12m
csi-rbdplugin-provisioner-5b9f4659f8-qbvgl 7/7 Running 0 12m
csi-rbdplugin-provisioner-5b9f4659f8-wc9kd 7/7 Running 0 12m
csi-rbdplugin-zlb96 4/4 Running 0 12m
noobaa-core-0 1/1 Running 0 8m51s
noobaa-db-pg-0 1/1 Running 0 8m51s
noobaa-endpoint-79cf94ddc9-8xjd9 1/1 Running 0 7m5s
noobaa-operator-555bb8d4-rtq65 1/1 Running 1 (8m50s ago) 13m
ocs-metrics-exporter-85744bfc5d-gzpxt 1/1 Running 0 13m
ocs-operator-5778c4c589-h8hsv 1/1 Running 0 13m
odf-console-7f84466d46-l2594 1/1 Running 0 13m
odf-operator-controller-manager-54b75784c7-5rnlb 2/2 Running 0 13m
rook-ceph-crashcollector-worker-0.ocsm4205001.lnxero1.boe-2lzl8 1/1 Running 0 9m21s
rook-ceph-crashcollector-worker-1.ocsm4205001.lnxero1.boe-842hj 1/1 Running 0 9m56s
rook-ceph-crashcollector-worker-2.ocsm4205001.lnxero1.boe-ft2w7 1/1 Running 0 9m19s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-6c947c8dzhrbx 2/2 Running 0 9m21s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-7dcc78c9qmhgs 2/2 Running 0 9m19s
rook-ceph-mgr-a-577896b47f-8r7db 2/2 Running 0 10m
rook-ceph-mon-a-769d6d576-j845l 2/2 Running 0 11m
rook-ceph-mon-b-5c8c84f7bc-djsmd 2/2 Running 0 10m
rook-ceph-mon-c-677d84597-qlp8w 2/2 Running 0 10m
rook-ceph-operator-55f596475d-krlx8 1/1 Running 0 13m
rook-ceph-osd-0-78fff9f694-zd69k 2/2 Running 0 9m36s
rook-ceph-osd-1-8484c487cb-xmzgk 2/2 Running 0 9m37s
rook-ceph-osd-2-f776cd8f6-5vwnt 2/2 Running 0 9m35s
rook-ceph-osd-prepare-11714a315d0086dac219451384576567-twpnp 0/1 Completed 0 9m51s
rook-ceph-osd-prepare-54a70174d45406e740930e7995d8fed4-kf9tp 0/1 Completed 0 9m51s
rook-ceph-osd-prepare-e2a03e34e2eb44d41486e5c31a6ca356-d6mqz 0/1 Completed 0 9m51s
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7d7d4d8rsb6g 2/2 Running 0 9m6s
rook-ceph-tools-5cfbd9fdc8-ql5c2 1/1 Running 0 9m19s
It is looking good on the IBM Power platform as well:
[root@nx124-411-402a-syd04-bastion-0 ~]# oc -n openshift-storage get Pod -n openshift-storage --selector=app=rook-ceph-rgw
NAME READY STATUS RESTARTS AGE
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-6b48d7488lnj 2/2 Running 0 72m
[root@nx124-411-402a-syd04-bastion-0 ~]# oc get csv odf-operator.v4.11.0 -n openshift-storage -o yaml | grep "full"
full_version: 4.11.0-63
[root@nx124-411-402a-syd04-bastion-0 ~]#
Moving to VERIFIED based on comment #29.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.11.0 security, enhancement, & bugfix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6156
Description of problem (please be detailed as possible and provide log snippets):

ODF 4.11.0-38 deployment leaves the storagecluster in "Progressing" state although all the openshift-storage pods are up and Running.

# oc -n openshift-storage get StorageCluster ocs-storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   34m   Progressing              2022-04-14T14:45:50Z   4.11.0

# oc get storagesystem -n openshift-storage
NAMESPACE           NAME                               STORAGE-SYSTEM-KIND                  STORAGE-SYSTEM-NAME
openshift-storage   ocs-storagecluster-storagesystem   storagecluster.ocs.openshift.io/v1   ocs-storagecluster

# oc get cephcluster -n openshift-storage
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          37m   Ready   Cluster created successfully   HEALTH_OK

# oc get po -n openshift-storage
NAME                                                              READY   STATUS      RESTARTS      AGE
csi-addons-controller-manager-6fdddd684c-kc7rv                    2/2     Running     0             35m
csi-cephfsplugin-6v8f8                                            3/3     Running     0             34m
csi-cephfsplugin-g2mfl                                            3/3     Running     0             34m
csi-cephfsplugin-mdtcr                                            3/3     Running     0             34m
csi-cephfsplugin-provisioner-694458df4-7trk6                      6/6     Running     0             34m
csi-cephfsplugin-provisioner-694458df4-x8qw7                      6/6     Running     0             34m
csi-rbdplugin-hk5s7                                               4/4     Running     0             34m
csi-rbdplugin-provisioner-5ff764646d-79f6n                        7/7     Running     0             34m
csi-rbdplugin-provisioner-5ff764646d-xm8qj                        7/7     Running     0             34m
csi-rbdplugin-vxczr                                               4/4     Running     0             34m
csi-rbdplugin-z4ssw                                               4/4     Running     0             34m
noobaa-core-0                                                     1/1     Running     0             31m
noobaa-db-pg-0                                                    1/1     Running     0             31m
noobaa-endpoint-54565dbb4b-d9czf                                  1/1     Running     0             29m
noobaa-operator-9bc6685d9-c4hvc                                   1/1     Running     1 (31m ago)   36m
ocs-metrics-exporter-858dd8784d-vhjw9                             1/1     Running     0             36m
ocs-operator-67cbb8dfc5-dtrhj                                     1/1     Running     0             36m
odf-console-5f9bf644cd-85q55                                      1/1     Running     0             36m
odf-operator-controller-manager-6c8ccd88f8-bm72q                  2/2     Running     0             36m
rook-ceph-crashcollector-worker-0.ocsm4205001.lnxne.boe-85z9dwr   1/1     Running     0             32m
rook-ceph-crashcollector-worker-1.ocsm4205001.lnxne.boe-56qs4j9   1/1     Running     0             32m
rook-ceph-crashcollector-worker-2.ocsm4205001.lnxne.boe-56hhfzj   1/1     Running     0             32m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-8b9cf9b5gp2dg   2/2     Running     0             32m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-59b96569chjjt   2/2     Running     0             32m
rook-ceph-mgr-a-667fd9fd66-7jpfz                                  2/2     Running     0             32m
rook-ceph-mon-a-6bb76f6b79-js4dp                                  2/2     Running     0             34m
rook-ceph-mon-b-6db796ffdc-b4h8h                                  2/2     Running     0             33m
rook-ceph-mon-c-5f89c7bc9d-c6s2x                                  2/2     Running     0             33m
rook-ceph-operator-dd6cccfb8-rg968                                1/1     Running     0             36m
rook-ceph-osd-0-556bc8866-c66md                                   2/2     Running     0             32m
rook-ceph-osd-1-84d4f4b99b-r8jqd                                  2/2     Running     0             32m
rook-ceph-osd-2-5c69c4684f-jsz2z                                  2/2     Running     0             32m
rook-ceph-osd-prepare-636d124a0298dd81229f6875f74008ce-pw6hz      0/1     Completed   0             32m
rook-ceph-osd-prepare-aad2ba905033100da0435c48775d47ff-lkv9g      0/1     Completed   0             32m
rook-ceph-osd-prepare-d578d363cc45f10c7a4e945322c848c1-l7ttt      0/1     Completed   0             32m
rook-ceph-tools-68cf9db877-dnhh6                                  1/1     Running     0             32m

# oc -n openshift-storage rsh rook-ceph-tools-68cf9db877-dnhh6
sh-4.4$ ceph -s
  cluster:
    id:     6ba16244-ba70-4157-afac-be6eabc223d2
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 33m)
    mgr: a(active, since 32m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 32m), 3 in (since 32m)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 124 objects, 135 MiB
    usage:   350 MiB used, 1.5 TiB / 1.5 TiB avail
    pgs:     177 active+clean

  io:
    client: 853 B/s rd, 3.7 KiB/s wr, 1 op/s rd, 0 op/s wr

Version of all relevant components (if applicable):
ODF 4.11.0-38

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue reproducible? Yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy OCP 4.11
2. Deploy ODF 4.11.0-38
3.

Actual results:
Storage cluster is in "Progressing" state although all the openshift-storage pods are up and running

Expected results:
Storage cluster should be in "Ready" state

Additional info:
Must-gather logs: https://drive.google.com/file/d/1jefxEqQsBF5UjvblPvlm8-ADVRSGd-Zf/view?usp=sharing