Bug 1966662
| Summary: | cannot set just CPU or just memory in StorageDeviceSet resources | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Ben England <bengland> |
| Component: | ocs-operator | Assignee: | Jose A. Rivera <jrivera> |
| Status: | CLOSED WONTFIX | QA Contact: | Raz Tamir <ratamir> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.8 | CC: | ekuric, jhopper, jrivera, kramdoss, madam, mmuench, ocs-bugs, odf-bz-bot, owasserm, sostapov |
| Target Milestone: | --- | Keywords: | Performance |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-11 15:50:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Ben England
2021-06-01 15:57:41 UTC
ocs-op is setting the resources. José, has anything changed for the OSD recently? Thanks!

Also see bz https://bugzilla.redhat.com/show_bug.cgi?id=1850954, which was closed as NOTABUG. It was resolved by saying Elko should have specified resources: {}, but I didn't specify memory resources at all and this happened to me. I did specify CPU resources. Specifically, I put:

storageDeviceSets:
- config: {}
  count: 12
  dataPVCTemplate:
    metadata: {}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: "1"
      storageClassName: localblock
      volumeMode: Block
  name: ocs-deviceset-nvmdevs
  replica: 1
  resources:
    limits:
      cpu: 10
    requests:
      cpu: 5

BTW, how would Kubernetes be able to schedule pods correctly or avoid an OOM-kill situation if it does not know what memory is required by the Ceph OSDs?

(In reply to Ben England from comment #4)
> I didn't specify memory resources at all and this happened to me. I did
> specify CPU resources.

I see, the bug is that you only overrode the CPU request/limits, but this affected/cleared the OSD memory settings as well. Two action items:
- doc bug to make sure users know they have to set the memory requests/limits as well (for older versions)
- fix it in OCS Operator

@Jose, can you look at it?

I don't see anything that needs fixing. With an explicitly defined Resources field, the behavior of completely overriding the defaults is working as intended. There is no way (that I'm aware of) to determine whether an empty field means "use the default" or "use nothing". I guess we can try and make that more explicit? But I don't even know if we have documentation for this in the first place, since I don't know if we officially support this in GA. Maybe a KCS is in order. Leaving this open for now, but moving to ODF 4.9.

(In reply to Orit Wasserman from comment #5)
> I see, the bug is that you only overrode the CPU request/limits, but this
> affected/cleared the OSD memory settings as well.

@Orit - if there is no limit set for memory, does the OSD pod go ahead and consume all of the available memory in the system? My understanding was that it shouldn't go beyond 5G irrespective of whether a limit is defined or not.

(In reply to krishnaram Karthick from comment #7)
> @Orit - if there is no limit set for memory, does the OSD pod go ahead and
> consume all of the available memory in the system?

The main issue is not the limits but the empty requests: it means Kubernetes may place other pods that use memory and not leave enough for the OSD. OSD memory usage depends on the workload type and disk size; it could be more than 5G in your case.

I have an AWS cluster where I can experiment with this further. The plan is to try reproducing the null memory limit using the same CR as before (CPU specified, no memory specified), then try setting the memory limit explicitly, then try not specifying resources at all. I think it's just an easily fixed bug in the rook-ceph-operator about how it interprets the resources: section of the storagecluster CR, but I have to prove it.

I reproduced and isolated the problem in AWS with a similar set of OCS storagecluster CRs. The good news is that this problem can be worked around by editing the live storagecluster CR; ocs-operator will then redeploy the OSDs with the right limits.
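To make the diagnosis above concrete, here is a minimal sketch of why a CPU-only override ends up with no memory request or limit on the OSD pods. It is not ocs-operator code; it only assumes the standard corev1.ResourceRequirements type, which the operator snippet quoted at the end of this report suggests, and "osdDefaults" stands in for defaults.DaemonResources["osd"] with the 2 CPU / 5Gi values seen in the reproduction below:

```go
// Hedged illustration only, not ocs-operator code.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// osdDefaults stands in for ocs-operator's defaults.DaemonResources["osd"].
var osdDefaults = corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("2"),
		corev1.ResourceMemory: resource.MustParse("5Gi"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("2"),
		corev1.ResourceMemory: resource.MustParse("5Gi"),
	},
}

func main() {
	// What the StorageDeviceSet carries after a CPU-only override like the
	// one in comment #4 (cpu request 5, limit 10, no memory at all).
	osd := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("5")},
		Limits:   corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("10")},
	}

	// The current check applies the defaults only when *both* maps are nil;
	// a CPU-only override makes them non-nil, so defaulting is skipped.
	if osd.Requests == nil && osd.Limits == nil {
		osd = osdDefaults
	}

	_, hasMem := osd.Requests[corev1.ResourceMemory]
	fmt.Println("memory request set:", hasMem) // prints "memory request set: false"
}
```

Because Requests and Limits are maps, setting any single resource makes them non-nil, so a whole-struct nil check can no longer tell a partial override apart from a complete one.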
The problem is exactly what I thought. There are actually 4 cases to consider:
- neither memory nor CPU resources specified, "resources: {}" - no problem
- CPU specified, but not memory - memory is unbounded
- both CPU and memory specified - no problem
- memory specified but not CPU - CPU is unbounded
What you want is that if one resource is specified, the other defaults just as it does in the case where neither resource is specified, correct? This is a much more intuitive, surprise-free behavior.
when neither memory nor CPU is specified:
storageDeviceSets:
- resources: {}
  ...
I get:
$ ocos describe node | awk '/CPU/||/ceph-osd/'
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
openshift-storage rook-ceph-osd-0-547788c87c-r622l 2 (12%) 2 (12%) 5Gi (8%) 5Gi (8%)
openshift-storage rook-ceph-osd-2-84c49b497d-6jbss 2 (12%) 2 (12%) 5Gi (8%) 5Gi (8%)
openshift-storage rook-ceph-osd-1-677445dbb6-5b649 2 (12%) 2 (12%) 5Gi (8%) 5Gi (8%)
when CPU is specified but not memory:
storageDeviceSets:
- resources:
    limits:
      cpu: 4
    requests:
      cpu: 2
  ...
I get:
ocos describe node | awk '/CPU/||/ceph-osd/'
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
openshift-storage rook-ceph-osd-0-5d6bdf4f78-qs4zk 2 (12%) 4 (25%) 0 (0%) 0 (0%)
openshift-storage rook-ceph-osd-1-84bb6c56b9-qnkws 2 (12%) 4 (25%) 0 (0%) 0 (0%)
openshift-storage rook-ceph-osd-2-69c469ff64-dvh8p 2 (12%) 4 (25%) 0 (0%) 0 (0%)
When both CPU and memory are specified, such as:
- resources:
    limits:
      cpu: 4
      memory: "6Gi"
    requests:
      cpu: 2
      memory: "6Gi"
  ...
I get:
ocos describe node | awk '/CPU/||/ceph-osd/' | tee aws-cpu-mem-limit-specified.log
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
openshift-storage rook-ceph-osd-0-77bbbf45c4-npjdr 2 (12%) 4 (25%) 6Gi (9%) 6Gi (9%) 97s
openshift-storage rook-ceph-osd-1-5b97995cf5-tmjcr 2 (12%) 4 (25%) 6Gi (9%) 6Gi (9%) 98s
openshift-storage rook-ceph-osd-2-cb7ccb8-w8r4g 2 (12%) 4 (25%) 6Gi (9%) 6Gi (9%) 92s
when memory is specified but not CPU:
storageDeviceSets:
- resources:
    limits:
      memory: "6Gi"
    requests:
      memory: "6Gi"
  ...
I get:
$ ocos describe node | awk '/CPU/||/ceph-osd/'
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
openshift-storage rook-ceph-osd-1-6d598b6744-glfqs 0 (0%) 0 (0%) 6Gi (9%) 6Gi (9%)
openshift-storage rook-ceph-osd-2-56d5c5db9f-c9r8r 0 (0%) 0 (0%) 6Gi (9%) 6Gi (9%)
openshift-storage rook-ceph-osd-0-859774c898-4gqcq 0 (0%) 0 (0%) 6Gi (9%) 6Gi (9%)
The fix is pretty simple. See here, where the defaulting for OSD memory and CPU limits happens:

https://github.com/openshift/ocs-operator/blob/master/controllers/storagecluster/cephcluster.go#L517

    if resources.Requests == nil && resources.Limits == nil {
        resources = defaults.DaemonResources["osd"]
    }

It clearly shows that defaulting only happens if both resources.Requests and resources.Limits are nil. What it should do, in pseudocode, is:

    if resources CPU request is undefined:
        default to defaults.DaemonResources["osd"].requests.cpu
    if resources memory request is undefined:
        default to defaults.DaemonResources["osd"].requests.memory
    if resources CPU limit is undefined:
        default to defaults.DaemonResources["osd"].limits.cpu
    if resources memory limit is undefined:
        default to defaults.DaemonResources["osd"].limits.memory
    if resources CPU request > resources CPU limit:
        error: requests must be <= limits
    if resources memory request > resources memory limit:
        error: requests must be <= limits

Make sense? This way, the user can default any subset of these four fields and still get a sane answer. Why work so hard at it? The user may want to override the CPU request/limit without overriding the memory request/limit, and vice versa, and may not know enough to specify both of them correctly.

Since there's been no further discussion nor customer input, closing this as WONTFIX. If there's any further demand for this, feel free to reopen this BZ.
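For concreteness, the per-field defaulting described in the pseudocode above could look roughly like the following Go sketch. The package and the names defaultOSDResources, fillMissing, and osdDefaults are hypothetical, not actual ocs-operator identifiers; this is only one way to express the idea:

```go
// Hedged sketch of per-field defaulting; not actual ocs-operator code.
package osdresources

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// fillMissing copies any resource (cpu, memory, ...) the user left unset
// from the defaults, without touching values the user did set. It mutates
// and returns the user-supplied list.
func fillMissing(user, def corev1.ResourceList) corev1.ResourceList {
	if user == nil {
		user = corev1.ResourceList{}
	}
	for name, qty := range def {
		if _, ok := user[name]; !ok {
			user[name] = qty
		}
	}
	return user
}

// defaultOSDResources merges a StorageDeviceSet's resources with the OSD
// defaults field by field, then checks that requests do not exceed limits.
func defaultOSDResources(user, osdDefaults corev1.ResourceRequirements) (corev1.ResourceRequirements, error) {
	user.Requests = fillMissing(user.Requests, osdDefaults.Requests)
	user.Limits = fillMissing(user.Limits, osdDefaults.Limits)

	for _, name := range []corev1.ResourceName{corev1.ResourceCPU, corev1.ResourceMemory} {
		req, hasReq := user.Requests[name]
		limit, hasLimit := user.Limits[name]
		if hasReq && hasLimit && req.Cmp(limit) > 0 {
			return user, fmt.Errorf("%s request %s exceeds limit %s", name, req.String(), limit.String())
		}
	}
	return user, nil
}
```

With defaulting like this, the CPU-only override from comment #4 would keep its cpu values and pick up the default 5Gi memory request/limit, which is the surprise-free behavior Ben asks for above.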