Bug 1827157
| Summary: | OSD hitting default CPU limit on AWS i3en.2xlarge instances limiting performance | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Manoj Pillai <mpillai> |
| Component: | ocs-operator | Assignee: | Jose A. Rivera <jarrpa> |
| Status: | CLOSED ERRATA | QA Contact: | krishnaram Karthick <kramdoss> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.3 | CC: | ebenahar, ekuric, jarrpa, kramdoss, madam, mbukatov, muagarwa, ocs-bugs, owasserm, sostapov |
| Target Milestone: | --- | Keywords: | AutomationBackLog, Performance |
| Target Release: | OCS 4.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-12-17 06:22:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Manoj Pillai
2020-04-23 11:29:15 UTC
(In reply to Manoj Pillai from comment #0)

Repeating the same on an fio random write test, I get an improvement in IOPS of ~56% going from OSDs with a CPU limit of 2 to OSDs with a CPU limit of 3. The CPU usage while the test is running with limit 3:

```
    PID USER       PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 429126 167        20   0 5466568   2.4g  34692 S 277.8  3.9  41:12.24 ceph-osd
  26172 1000140+   20   0 1551596   1.3g  43684 S  15.9  2.1  22:24.34 promethe+
   1412 root       20   0 2856260 246516  96412 S   8.8  0.4  14:40.98 hyperkube

    PID USER       PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 243448 167        20   0 5567764   2.4g  34136 S 284.0  3.9  42:23.24 ceph-osd

    PID USER       PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 322644 167        20   0 5639956   2.5g  35392 S 284.4  4.0  45:00.73 ceph-osd
```

The outcome I'm looking for here is that the OSD CPU limit picked by OCS should not leave something on the table as far as performance is concerned. A single static limit will probably not work well given the range of devices and configurations OCS needs to handle.

(In reply to Manoj Pillai from comment #3)
> The outcome I'm looking for here is that the OSD CPU limit picked by OCS
> should not leave something on the table as far as performance is concerned.
> A single static limit will probably not work well given the range of devices
> and configurations OCS needs to handle.

The other way around: we should limit the range of devices and configurations we support (ESPECIALLY in the cloud!) and be opinionated about the deployment, not leaving it to the user to choose whatever they feel may be the correct settings, as they are likely clueless about it.

We can and should consider two deployment options:
1. Converged vs. dedicated: are the OCS nodes also running app workloads, or are they dedicated to OCS pods?
2. OSD-only nodes vs. OCS nodes: do we want to run the control plane pods (MGR, MONs, RGWs to some extent?, MDS, Noobaa, ...?) on other workers, and have some nodes dedicated to OSDs only?
This should give us a large enough matrix of options; add to it the different possible machine types (just the M5 4xlarge and i3en.2xlarge...) and you get enough options, I think, for deployment.

An alternative to this large matrix is 'performance' and 'balanced' profiles for the OSDs, perhaps.

In general, we are only able to provide static resource limits; they need to be manually changed. And we don't want to introduce the ability to arbitrarily change these values, as that puts them at high risk of not being supported. Having an option of various tunings for performance vs. resource consumption may be viable.

However, since nothing is crashing and no data is being lost, this is not a blocker and not something we need to consider right away. Moving this to OCS 4.6.

(In reply to Yaniv Kaul from comment #4)
> (In reply to Manoj Pillai from comment #3)
> > The outcome I'm looking for here is that the OSD CPU limit picked by OCS
> > should not leave something on the table as far as performance is concerned.
> > A single static limit will probably not work well given the range of devices
> > and configurations OCS needs to handle.
>
> The other way around: we should limit the range of devices and configuration
> we support (ESPECIALLY in the cloud!) and be opinionated about the
> deployment, not leaving it to the user to choose
> whatever they feel may be the correct settings, as they are likely clueless
> about it.

Agreed, it should not be left to the user to choose the correct settings. OCS should choose, and that choice should be based on some knowledge of the configuration. An operator, after all, is expected to encode the knowledge of a smart admin.

> We can and should consider 2 deployment options:
> 1. Converged and dedicated - are the OCS nodes running also app workloads,
> or are they dedicated to OCS pods?
> 2. OSD-only nodes vs. OCS nodes - do we want to run the control plane pods
> (MGR, MONs, RGWs to some extent?, MDS, Noobaa, ...?)
> on other workers, and
> have some nodes dedicated to OSDs only.
>
> This should give us a large enough matrix of options, and add to it the
> different possible machine types (just the M5 4xlarge and i3en.2xlarge...)
> and you get enough options, I think, for deployment.

AFAICT, our docs currently specify minimum requirements for OCS nodes; they do not enumerate the supported instances. Enumerating the supported instances would probably make this specific problem somewhat simpler. Not sure if you're saying we should do that.

> An alternative to this large matrix is 'performance' and 'balanced' profiles
> for the OSDs perhaps.

Is any of the above work in progress, or does it need to start?

(In reply to Manoj Pillai from comment #6)
> (In reply to Yaniv Kaul from comment #4)
> >
> > An alternative to this large matrix is 'performance' and 'balanced' profiles
> > for the OSDs perhaps.
>
> Is any of the above work in progress, or does it need to start?

This is not being worked on right now, and it is not on our radar for the immediate future. As such, moving this to OCS 4.7.

My mistake, we have a JIRA for this already: https://issues.redhat.com/browse/KNIP-1472

As such, bringing it back to OCS 4.6 and moving it to MODIFIED.

(In reply to Jose A. Rivera from comment #9)
> My mistake, we have a JIRA for this already:
> https://issues.redhat.com/browse/KNIP-1472
>
> As such, bringing it back to OCS 4.6 and moving it to MODIFIED.

I don't understand how KNIP-1472 provides a complete solution for what is asked here. Can you please help me understand how we are addressing the 'performance' part with KNIP-1472? IIUC, KNIP-1472 provides a way to deploy OCS with fewer resources (for entry-level deployments): lower CPU request, lower performance. But do we have a way to give more CPU to OCS deployments that need better performance?

Based on comment#14, moving this bug to assigned as I believe there is still some work left.
This is not a release blocker; moving it out of 4.6 till we have clarity.

(In reply to krishnaram Karthick from comment #15)
> Based on comment#14, moving this bug to assigned as I believe there is still
> some work left.

Doesn't the same (manual) way to deploy OCS with fewer resources also allow you to deploy it with more resources?

(In reply to Yaniv Kaul from comment #17)
> (In reply to krishnaram Karthick from comment #15)
> > Based on comment#14, moving this bug to assigned as I believe there is still
> > some work left.
>
> Doesn't the same (manual) way to deploy OCS with fewer resources also allow
> you to deploy it with more resources?

Yes, but I'd at least like to see the recommended values if someone expects the best performance out of OCS. We ideally want our operator to handle all of this, as Manoj has requested in comment#6.

(In reply to krishnaram Karthick from comment #18)
> (In reply to Yaniv Kaul from comment #17)
> > (In reply to krishnaram Karthick from comment #15)
> > > Based on comment#14, moving this bug to assigned as I believe there is
> > > still some work left.
> >
> > Doesn't the same (manual) way to deploy OCS with fewer resources also allow
> > you to deploy it with more resources?
>
> Yes, but I'd at least like to see the recommended values if someone expects
> the best performance out of OCS. We ideally want our operator to handle all
> of this, as Manoj has requested in comment#6.

I'd argue that this is a default work item that we need to complete. The operator doesn't know right now what the user wants: a dedicated node for storage, consume all resources, have all (OSDs, MDS, Noobaa) on the nodes, or just OSDs, etc.

We'll enable some kind of simple configuration 'profiles' in the future, and those will make the decision. Right now, let's go with the command line manual settings. Please ensure it works.
(In reply to Yaniv Kaul from comment #19)
> (In reply to krishnaram Karthick from comment #18)
> > (In reply to Yaniv Kaul from comment #17)
> > > (In reply to krishnaram Karthick from comment #15)
> > > > Based on comment#14, moving this bug to assigned as I believe there is
> > > > still some work left.
> > >
> > > Doesn't the same (manual) way to deploy OCS with fewer resources also
> > > allow you to deploy it with more resources?
> >
> > Yes, but I'd at least like to see the recommended values if someone expects
> > the best performance out of OCS. We ideally want our operator to handle all
> > of this, as Manoj has requested in comment#6.
>
> I'd argue that this is a default work item that we need to complete. The
> operator doesn't know right now what the user wants: a dedicated node for
> storage, consume all resources, have all (OSDs, MDS, Noobaa) on the nodes,
> or just OSDs, etc.
>
> We'll enable some kind of simple configuration 'profiles' in the future, and
> those will make the decision. Right now, let's go with the command line
> manual settings. Please ensure it works.

Manoj has already tried this as part of the original test, i.e., by trying to have the CPU request set to 3. What we don't have for the 'performance' profile is a recommendation for CPU & RAM configurations like we have for the 'balanced' profile in KNIP-1472.

(In reply to krishnaram Karthick from comment #20)
> (In reply to Yaniv Kaul from comment #19)
> >
> > We'll enable some kind of simple configuration 'profiles' in the future, and
> > those will make the decision. Right now, let's go with the command line
> > manual settings. Please ensure it works.
>
> Manoj has already tried this as part of the original test, i.e., by trying
> to have the CPU request set to 3.
> What we don't have for the 'performance' profile is a recommendation for CPU
> & RAM configurations like we have for the 'balanced' profile in KNIP-1472.
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=1828883#c8

In that case, a CPU limit of 5 was roughly the right setting for a 'performance profile'. Hopefully, you can build on that.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
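As an aside, the "command line manual settings" route discussed in comments #17-#20 amounts to overriding the OSD resource requirements on the StorageCluster CR. Below is a minimal sketch of building such an override as a JSON Patch body; the `spec.storageDeviceSets[0].resources` field path is my reading of the ocs-operator CRD, and the CPU value of 3 is taken from comment #3 as an illustration, not a supported 'performance' profile.

```python
import json

def osd_resources_patch(cpu: str, memory: str) -> str:
    """Build a JSON Patch (RFC 6902) body that sets CPU/memory requests and
    limits on the first storageDeviceSet of a StorageCluster CR.

    The field path (spec.storageDeviceSets[0].resources) is an assumption
    based on the ocs-operator CRD; the values passed in are illustrative.
    """
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    # Intended usage (cluster name and namespace are the usual defaults,
    # but verify them on your deployment):
    #   oc -n openshift-storage patch storagecluster ocs-storagecluster \
    #      --type json -p '<output of this function>'
    return json.dumps(
        [{"op": "add",
          "path": "/spec/storageDeviceSets/0/resources",
          "value": resources}]
    )

if __name__ == "__main__":
    print(osd_resources_patch("3", "8Gi"))
```

A JSON-type patch is used rather than a merge patch because a merge patch replaces list entries such as `storageDeviceSets` wholesale, which would drop the device set's other fields.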