Description of problem (please be as detailed as possible and provide log snippets):

Installed OCP v4.5 and OCS v4.4 on AWS m5.4xlarge instances with the files [1], [2], [3], [4] below, by executing

# oc create -f <file>

for each of [1], [2], [3], [4] after node labeling.

In this installation there were 20 OSDs per OCS node (m5.4xlarge), so 60 OSDs were planned in total on 3 nodes. 59 OSDs started, but the last one failed due to insufficient CPU. The limits/requests for the OSDs were not imposed, as shown below:

# oc describe node ip-10-0-217-151.eu-central-1.compute.internal | grep osd
openshift-storage  rook-ceph-osd-36-847d7b74b8-26rtm  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-37-6c95d74dbf-99ql7  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-41-64d956747b-h8v8g  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-42-6cb76fdd88-r2v6x  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-44-8c568d994-kp57c   0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-45-77b8d86799-vj4lh  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-46-6fdfbfd74f-c54fv  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-47-fb544b459-vbj9b   0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-48-7d68c66978-zzj74  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-49-66948f8b5d-dbgf2  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-50-597f7c8b48-8zv4q  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-51-5cc757cbf4-tb9wr  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-52-7f4b75dbc7-5vcfv  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-53-79466797f4-wz6jp  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-54-5977f9fbd7-6rslz  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-55-84cff6cdb-p85qq   0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-56-745bddfdfc-6hhwr  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-57-65564f96cf-mc2tr  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-58-7f74bc6b46-9b7nx  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m
openshift-storage  rook-ceph-osd-59-559645869c-fjvp8  0 (0%)  0 (0%)  0 (0%)  0 (0%)  24m

Version of all relevant components (if applicable):
OCP v4.5 + OCS v4.4 (quay.io/rhceph-dev/ocs-olm-operator:latest-4.4)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
NA

If this is a regression, please provide more details to justify this:
NA

Steps to Reproduce:
1. Install OCP v4.5 + OCS v4.4 (OCS v4.4 using the files below)
2. Check a node where OSDs are scheduled and inspect the limits/requests section

Actual results:
OSD pod limits/requests are not imposed

[1] # cat 1_redhat-src-catalog.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: local-storage
---
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: local-operator-group
  namespace: local-storage
spec:
  targetNamespaces:
  - local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: local-storage
spec:
  channel: "4.4"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

[2] # cat e2_example_localpv_cr.yaml

apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-disks
  namespace: local-storage
spec:
  logLevel: Normal
  managementState: Managed
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - ip-10-0-151-15
        - ip-10-0-171-234
        - ip-10-0-217-151
  storageClassDevices:
  - devicePaths:
    - /dev/nvme1n1
    - /dev/nvme2n1
    - /dev/nvme3n1
    - /dev/nvme4n1
    - /dev/nvme5n1
    - /dev/nvme6n1
    - /dev/nvme7n1
    - /dev/nvme8n1
    - /dev/nvme9n1
    - /dev/nvme10n1
    - /dev/nvme11n1
    - /dev/nvme12n1
    - /dev/nvme13n1
    - /dev/nvme14n1
    - /dev/nvme15n1
    - /dev/nvme16n1
    - /dev/nvme17n1
    - /dev/nvme18n1
    - /dev/nvme19n1
    - /dev/nvme20n1
    storageClassName: localblock-sc

[3] # cat 3_deploy-with-olm.yaml

---
apiVersion: v1
kind: Namespace
metadata:
  labels:
    openshift.io/cluster-monitoring: "true"
  name: openshift-storage
spec: {}
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  serviceAccount:
    metadata:
      creationTimestamp: null
  targetNamespaces:
  - openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ocs-catalogsource
  namespace: openshift-marketplace
spec:
  displayName: OpenShift Container Storage
  icon:
    base64data: PHN2ZyBpZD0iTGF5ZXJfMSIgZGF0YS1uYW1lPSJMYXllciAxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxOTIgMTQ1Ij48ZGVmcz48c3R5bGU+LmNscy0xe2ZpbGw6I2UwMDt9PC9zdHlsZT48L2RlZnM+PHRpdGxlPlJlZEhhdC1Mb2dvLUhhdC1Db2xvcjwvdGl0bGU+PHBhdGggZD0iTTE1Ny43Nyw2Mi42MWExNCwxNCwwLDAsMSwuMzEsMy40MmMwLDE0Ljg4LTE4LjEsMTcuNDYtMzAuNjEsMTcuNDZDNzguODMsODMuNDksNDIuNTMsNTMuMjYsNDIuNTMsNDRhNi40Myw2LjQzLDAsMCwxLC4yMi0xLjk0bC0zLjY2LDkuMDZhMTguNDUsMTguNDUsMCwwLDAtMS41MSw3LjMzYzAsMTguMTEsNDEsNDUuNDgsODcuNzQsNDUuNDgsMjAuNjksMCwzNi40My03Ljc2LDM2LjQzLTIxLjc3LDAtMS4wOCwwLTEuOTQtMS43My0xMC4xM1oiLz48cGF0aCBjbGFzcz0iY2xzLTEiIGQ9Ik0xMjcuNDcsODMuNDljMTIuNTEsMCwzMC42MS0yLjU4LDMwLjYxLTE3LjQ2YTE0LDE0LDAsMCwwLS4zMS0zLjQybC03LjQ1LTMyLjM2Yy0xLjcyLTcuMTItMy4yMy0xMC4zNS0xNS43My0xNi42QzEyNC44OSw4LjY5LDEwMy43Ni41LDk3LjUxLjUsOTEuNjkuNSw5MCw4LDgzLjA2LDhjLTYuNjgsMC0xMS42NC01LjYtMTcuODktNS42LTYsMC05LjkxLDQuMDktMTIuOTMsMTIuNSwwLDAtOC40MSwyMy43Mi05LjQ5LDI3LjE2QTYuNDMsNi40MywwLDAsMCw0Mi41Myw0NGMwLDkuMjIsMzYuMywzOS40NSw4NC45NCwzOS40NU0xNjAsNzIuMDdjMS43Myw4LjE5LDEuNzMsOS4wNSwxLjczLDEwLjEzLDAsMTQtMTUuNzQsMjEuNzctMzYuNDMsMjEuNzdDNzguNTQsMTA0LDM3LjU4LDc2LjYsMzcuNTgsNTguNDlhMTguNDUsMTguNDUsMCwwLDEsMS41MS03LjMzQzIyLjI3LDUyLC41LDU1LC41LDc0LjIyYzAsMzEuNDgsNzQuNTksNzAuMjgsMTMzLjY1LDcwLjI4LDQ1LjI4LDAsNTYuNy0yMC40OCw1Ni43LTM2LjY1LDAtMTIuNzItMTEtMjcuMTYtMzAuODMtMzUuNzgiLz48L3N2Zz4=
    mediatype: image/svg+xml
  image: quay.io/rhceph-dev/ocs-olm-operator:latest-4.4
  publisher: Red Hat
  sourceType: grpc
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-subscription
  namespace: openshift-storage
spec:
  channel: stable-4.4
  name: ocs-operator
  source: ocs-catalogsource
  sourceNamespace: openshift-marketplace

[4] # cat 4_3-nodes-small-scale-ocs-cluster-cr.yaml

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  namespace: openshift-storage
  name: example-storagecluster
spec:
  manageNodes: false
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
  - name: example-deviceset-0
    count: 20
    replica: 1
    resources:
      requests: {}
      limits: {}
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
            - key: topology.rook.io/rack
              operator: In
              values:
              - rack0
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - rook-ceph-osd
            topologyKey: topology.rook.io/rack
          weight: 100
    dataPVCTemplate:
      spec:
        storageClassName: localblock-sc
        accessModes:
        - ReadWriteOnce
        volumeMode: Block
        resources:
          requests:
            storage: 372Gi
    portable: false
  - name: example-deviceset-1
    count: 20
    replica: 1
    resources:
      requests: {}
      limits: {}
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
            - key: topology.rook.io/rack
              operator: In
              values:
              - rack1
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - rook-ceph-osd
            topologyKey: topology.rook.io/rack
          weight: 100
    dataPVCTemplate:
      spec:
        storageClassName: localblock-sc
        accessModes:
        - ReadWriteOnce
        volumeMode: Block
        resources:
          requests:
            storage: 372Gi
    portable: false
  - name: example-deviceset-2
    count: 20
    replica: 1
    resources:
      requests: {}
      limits: {}
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
            - key: topology.rook.io/rack
              operator: In
              values:
              - rack2
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - rook-ceph-osd
            topologyKey: topology.rook.io/rack
          weight: 100
    dataPVCTemplate:
      spec:
        storageClassName: localblock-sc
        accessModes:
        - ReadWriteOnce
        volumeMode: Block
        resources:
          requests:
            storage: 372Gi
    portable: false
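For reference, a hypothetical variant of the first device set from [4] that pins explicit OSD resources instead of leaving the requests/limits maps empty. The cpu/memory values are illustrative only (they mirror the per-OSD defaults observed in the 3-OSD retest: 1/2 CPU and 4Gi/8Gi memory), not a recommendation:

```yaml
# Sketch: explicit OSD requests/limits on a storageDeviceSet entry.
# Values are illustrative, not tuned recommendations.
storageDeviceSets:
- name: example-deviceset-0
  count: 20
  replica: 1
  resources:
    requests:
      cpu: "1"
      memory: 4Gi
    limits:
      cpu: "2"
      memory: 8Gi
```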
Retested on a new cluster with only 3 OSDs, and I see the limits/requests imposed:

# oc describe node ip-10-0-132-125.eu-central-1.compute.internal | grep osd
openshift-storage  rook-ceph-osd-0-7767589b7d-bt2p6  1 (6%)  2 (12%)  4Gi (6%)  8Gi (13%)  2m8s
openshift-storage  rook-ceph-osd-1-649f9b9f69-nmr2l  1 (6%)  2 (12%)  4Gi (6%)  8Gi (13%)  108s
openshift-storage  rook-ceph-osd-2-859656db56-2ngtp  1 (6%)  2 (12%)  4Gi (6%)  8Gi (13%)  88s

I will retest with more OSDs, as I am not sure what caused the problem I reported - the only difference between these two tests is the number of OSDs.
With 15 OSDs per node, the OSD limits/requests are not set:

# oc describe node ip-10-0-132-125.eu-central-1.compute.internal
Name:               ip-10-0-132-125.eu-central-1.compute.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5.4xlarge
                    beta.kubernetes.io/os=linux
                    cluster.ocs.openshift.io/openshift-storage=
                    computenode=true
                    failure-domain.beta.kubernetes.io/region=eu-central-1
                    failure-domain.beta.kubernetes.io/zone=eu-central-1a
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-0-132-125
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=m5.4xlarge
                    node.openshift.io/os_id=rhcos
                    topology.kubernetes.io/region=eu-central-1
                    topology.kubernetes.io/zone=eu-central-1a
                    topology.rook.io/rack=rack0
Annotations:        csi.volume.kubernetes.io/nodeid: {"openshift-storage.rbd.csi.ceph.com":"ip-10-0-132-125.eu-central-1.compute.internal"}
                    machine.openshift.io/machine: openshift-machine-api/ekuric-cz-dfc2t-worker-eu-central-1a-b78dg
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-6c51e8ad865c75f80d04846fe575a5c1
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-6c51e8ad865c75f80d04846fe575a5c1
                    machineconfiguration.openshift.io/reason:
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 25 Jun 2020 10:14:10 +0000
Taints:             <none>
Unschedulable:      false
Conditions:
  Type            Status  LastHeartbeatTime                 LastTransitionTime                Reason                      Message
  ----            ------  -----------------                 ------------------                ------                      -------
  MemoryPressure  False   Thu, 25 Jun 2020 13:09:49 +0000   Thu, 25 Jun 2020 10:14:10 +0000   KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Thu, 25 Jun 2020 13:09:49 +0000   Thu, 25 Jun 2020 10:14:10 +0000   KubeletHasNoDiskPressure    kubelet has no disk pressure
  PIDPressure     False   Thu, 25 Jun 2020 13:09:49 +0000   Thu, 25 Jun 2020 10:14:10 +0000   KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready           True    Thu, 25 Jun 2020 13:09:49 +0000   Thu, 25 Jun 2020 10:15:21 +0000   KubeletReady                kubelet is posting ready status
Addresses:
  InternalIP:   10.0.132.125
  Hostname:     ip-10-0-132-125.eu-central-1.compute.internal
  InternalDNS:  ip-10-0-132-125.eu-central-1.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         16
  ephemeral-storage:           104305644Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      65142676Ki
  pods:                        250
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         15500m
  ephemeral-storage:           95054339528
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      63991700Ki
  pods:                        250
System Info:
  Machine ID:                 ec250543790fc4618e2cc8c36c245f5e
  System UUID:                ec250543-790f-c461-8e2c-c8c36c245f5e
  Boot ID:                    7c51a384-a74a-436e-bf4a-e335f5c9fe61
  Kernel Version:             4.18.0-193.9.1.el8_2.x86_64
  OS Image:                   Red Hat Enterprise Linux CoreOS 45.82.202006190229-0 (Ootpa)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.18.1-13.dev.rhaos4.5.git6d00f64.el8
  Kubelet Version:            v1.18.3+91d0edd
  Kube-Proxy Version:         v1.18.3+91d0edd
PodCIDR:     10.128.5.0/24
ProviderID:  aws:///eu-central-1a/i-08aecdca43bdb98d3
Non-terminated Pods: (34 in total)
  Namespace                               Name                                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                               ----                                                              ------------  ----------  ---------------  -------------  ---
  local-storage                           local-disks-local-diskmaker-kvtft                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m28s
  local-storage                           local-disks-local-provisioner-8sbtk                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m28s
  local-storage                           local-storage-operator-7d5f5bfcdc-ch4ff                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         11m
  openshift-cluster-node-tuning-operator  tuned-pmvhb                                                       10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         3h
  openshift-cluster-storage-operator      csi-snapshot-controller-6cb687f646-pdc6q                          10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         178m
  openshift-dns                           dns-default-m2pc2                                                 65m (0%)      0 (0%)      110Mi (0%)       512Mi (0%)     3h
  openshift-image-registry                node-ca-z45pn                                                     10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         3h
  openshift-machine-config-operator       machine-config-daemon-strx2                                       40m (0%)      0 (0%)      100Mi (0%)       0 (0%)         3h
  openshift-monitoring                    node-exporter-twsf7                                               9m (0%)       0 (0%)      210Mi (0%)       0 (0%)         3h
  openshift-monitoring                    thanos-querier-6574698497-tlbvc                                   8m (0%)       0 (0%)      72Mi (0%)        0 (0%)         178m
  openshift-multus                        multus-76h9m                                                      10m (0%)      0 (0%)      150Mi (0%)       0 (0%)         3h
  openshift-sdn                           ovs-xtvwv                                                         100m (0%)     0 (0%)      400Mi (0%)       0 (0%)         3h
  openshift-sdn                           sdn-5mkb6                                                         100m (0%)     0 (0%)      200Mi (0%)       0 (0%)         3h
  openshift-storage                       aws-s3-provisioner-5889d665c7-hmsv9                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m19s
  openshift-storage                       csi-cephfsplugin-m8ccr                                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m55s
  openshift-storage                       csi-cephfsplugin-provisioner-b44f64fd8-lb5rg                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m55s
  openshift-storage                       csi-rbdplugin-ldkzx                                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m55s
  openshift-storage                       rook-ceph-crashcollector-ip-10-0-132-125-57b7647f9c-lp56b         0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m26s
  openshift-storage                       rook-ceph-drain-canary-81ddaa1b00fdb10757b104fffdef58c8-5d9rk74   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m43s
  openshift-storage                       rook-ceph-mgr-a-84fb9f9669-md7rr                                  1 (6%)        1 (6%)      3Gi (4%)         3Gi (4%)       5m50s
  openshift-storage                       rook-ceph-mon-a-56c78cb9f-rkg6v                                   1 (6%)        1 (6%)      2Gi (3%)         2Gi (3%)       6m27s
  openshift-storage                       rook-ceph-operator-7d4d6bc6cd-tqfhf                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m22s
  openshift-storage                       rook-ceph-osd-0-585f6bccff-txg6g                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m43s
  openshift-storage                       rook-ceph-osd-1-6f76f94f44-6jphk                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m41s
  openshift-storage                       rook-ceph-osd-10-694c486f4b-xxtbs                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m20s
  openshift-storage                       rook-ceph-osd-11-79cc8744f6-mwvg9                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m37s
  openshift-storage                       rook-ceph-osd-2-796bd4b4b6-4whqh                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m35s
  openshift-storage                       rook-ceph-osd-3-5d7869f55c-5td9t                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m33s
  openshift-storage                       rook-ceph-osd-4-669df679cf-2lmv6                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m31s
  openshift-storage                       rook-ceph-osd-5-5856d8bb9d-cz9b9                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m25s
  openshift-storage                       rook-ceph-osd-6-8587d9cc6b-bqzj9                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m22s
  openshift-storage                       rook-ceph-osd-7-7b9fdd848-plxvp                                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m26s
  openshift-storage                       rook-ceph-osd-8-6bf67c87d5-m8zfv                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m29s
  openshift-storage                       rook-ceph-osd-9-5b7bcfc864-c86mb                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m39s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         2362m (15%)   2 (12%)
  memory                      6472Mi (10%)  5632Mi (9%)
  ephemeral-storage           0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:  <none>
Michael, who's looking at this?
No one's looking into this at this time. This is also a scale test for something that is outside our current support statement, so I'm going to say this is not critical to the release. Moving this to OCS 4.6. That said... Elvir, I don't entirely understand the issue. You set resource requests and limits to nothing, and one of the nodes ran out of CPU for the OSDs. This is expected behavior. What's the problem?
(In reply to Jose A. Rivera from comment #5)
> No one's looking into this at this time. This is also a scale test for
> something that is outside our current support statement, so I'm going to say
> this is not critical to the release. Moving this to OCS 4.6.
>
> That said... Elvir, I don't entirely understand the issue. You set resource
> requests and limits to nothing, and one of the nodes ran out of CPU for the

I set them to nothing expecting that the operator would pick up the default values, as it does with 3 OSDs per node.

> OSDs. This is expected behavior. What's the problem?

The problem is that the OSDs run, but on the node itself no limits/requests are set for these OSD pods:

openshift-storage  rook-ceph-osd-10-694c486f4b-xxtbs  0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m20s
openshift-storage  rook-ceph-osd-11-79cc8744f6-mwvg9  0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m37s
openshift-storage  rook-ceph-osd-2-796bd4b4b6-4whqh   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m35s
openshift-storage  rook-ceph-osd-3-5d7869f55c-5td9t   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m33s
openshift-storage  rook-ceph-osd-4-669df679cf-2lmv6   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m31s
openshift-storage  rook-ceph-osd-5-5856d8bb9d-cz9b9   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m25s
openshift-storage  rook-ceph-osd-6-8587d9cc6b-bqzj9   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m22s
openshift-storage  rook-ceph-osd-7-7b9fdd848-plxvp    0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m26s
openshift-storage  rook-ceph-osd-8-6bf67c87d5-m8zfv   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m29s
openshift-storage  rook-ceph-osd-9-5b7bcfc864-c86mb   0 (0%)  0 (0%)  0 (0%)  0 (0%)  4m39s

So far I only see this when there are multiple OSDs per node. I understand that this test might be at the limit of this machine type, but these pods will not start at all if they do not get the requested resources.
The default resources are only applied if you specify NOTHING for the resources field. You specified something (empty requests and limits maps), so the defaults will not apply.
So if you replaced

  - resources:
      requests: {}
      limits: {}

with this

  - resources: {}

and the result was no requests/limits on the OSD pods, then that would indeed be a bug. If the result is that requests/limits are present on the OSD pods, then it is not a bug.
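To make that concrete, a sketch (based on the explanation above, applied to the reporter's 4_3-nodes-small-scale-ocs-cluster-cr.yaml) of a device set entry that should receive the operator's default OSD requests/limits:

```yaml
# Hypothetical edit to the reporter's StorageCluster CR: leave
# "resources" as an empty map (or omit the field entirely) so the
# OCS operator fills in its default OSD requests/limits, instead of
# specifying empty requests/limits maps, which suppresses the defaults.
storageDeviceSets:
- name: example-deviceset-0
  count: 20
  replica: 1
  resources: {}
```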
Since nothing has been brought up about this since July, I'm moving this to OCS 4.7. Elvir, please confirm whether this worked for you or not.
Since this has been hanging around for months, it seems that this is not a critical or even recurring issue. As such, closing this BZ as NOTABUG. Elvir, if you still need us to look at this, feel free to reopen the BZ.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days