Bug 1465501
Summary: | [RFE] Provide an OpenShift application node cluster object | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Peter McGowan <pmcgowan> | ||||||
Component: | Providers | Assignee: | Loic Avenel <lavenel> | ||||||
Status: | CLOSED WONTFIX | QA Contact: | Dave Johnson <dajohnso> | ||||||
Severity: | unspecified | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 5.8.0 | CC: | fdupont, fsimonce, gblomqui, jfrey, jhardy, jmarc, ncatling, obarenbo, pmcgowan, slopez | ||||||
Target Milestone: | GA | Keywords: | FutureFeature | ||||||
Target Release: | cfme-future | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2018-07-01 18:38:44 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | Container Management | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | 1490131 | ||||||||
Bug Blocks: | |||||||||
Attachments: |
Description
Peter McGowan
2017-06-27 14:20:44 UTC
*** Bug 1465499 has been marked as a duplicate of this bug. ***

(In reply to Peter McGowan from comment #0)
> If we could have a way of modelling OpenShift nodes as a cluster, (rolling
> up node C&U metrics into cluster metrics),

Peter we do have rollups for the cluster. If you go into the C&U of the provider you should see the graphs. Each provider is a cluster, I don't think we need a new object to represent it (or at least we should have a good reason to introduce it).

Is that enough for this RFE?

Hi Federico

I only see this for the underlying provider (e.g. VMware) cluster, not the cluster of OpenShift app nodes (which may be VMs running on the underlying hardware).

To be able to make scaling decisions for the OpenShift app nodes, we need to be able to see utilization stats for the group of app nodes (possibly VMs) that make up the app node cluster.

Created attachment 1293313 [details]
Screenshot from 2017-06-30 18-10-56.png

(In reply to Peter McGowan from comment #4)
> Hi Federico
>
> I only see this for the underlying provider (e.g. VMware) cluster, not the
> cluster of OpenShift app nodes (which may be VMs running on the underlying
> hardware).

Peter, the screenshot I am attaching shows you the OpenShift C&U of the entire provider (cluster). As mentioned above it's in the OpenShift Provider page under "Monitoring => Utilization". Let me know if you can't find it.

(In reply to Peter McGowan from comment #4)
> To be able to make scaling decisions for the OpenShift app nodes, we need to
> be able to see utilization stats for the group of app nodes (possibly VMs)
> that make up the app node cluster.

This feature (nodes elasticity) is currently scheduled as an OpenShift feature.

Created attachment 1293787 [details]
Cluster utilization alert screenshot
It would be useful to be able to create a cluster utilization alert, as shown in the screenshot. This doesn't seem to be possible as the OpenShift cluster doesn't appear in the list of Cluster / Deployment Roles (presumably because it's not of type ems_cluster).
(In reply to Peter McGowan from comment #7)
> Created attachment 1293787 [details]
> Cluster utilization alert screenshot
>
> It would be useful to be able to create a cluster utilization alert, as
> shown in the screenshot. This doesn't seem to be possible as the OpenShift
> cluster doesn't appear in the list of Cluster / Deployment Roles (presumably
> because it's not of type ems_cluster).

Peter, to clarify, you want to use current CloudForms Alert mechanism to generate an Alert and react to it?

Loic, that's correct. Specifically I'd like to be able to create an alert on real-time performance utilization of an OpenShift cluster. This alert might possibly be used to scale out (or scale back) the cluster, but might also be useful to send into an external monitoring system.

(In reply to Federico Simoncelli from comment #6)
> (In reply to Peter McGowan from comment #4)
> > To be able to make scaling decisions for the OpenShift app nodes, we need to
> > be able to see utilization stats for the group of app nodes (possibly VMs)
> > that make up the app node cluster.
>
> This feature (nodes elasticity) is currently scheduled as an OpenShift
> feature.

Federico,

Do you have any pointer to that feature ? I would like to understand the underlying technical implementation, to see how we could leverage it from CloudForms.

Thanks.

(In reply to Fabien Dupont from comment #10)
> (In reply to Federico Simoncelli from comment #6)
> > (In reply to Peter McGowan from comment #4)
> > > To be able to make scaling decisions for the OpenShift app nodes, we need to
> > > be able to see utilization stats for the group of app nodes (possibly VMs)
> > > that make up the app node cluster.
> >
> > This feature (nodes elasticity) is currently scheduled as an OpenShift
> > feature.
>
> Federico,
>
> Do you have any pointer to that feature ? I would like to understand the
> underlying technical implementation, to see how we could leverage it from
> CloudForms.
Hi Fabien, you can find the documentation here: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler

Thank you Federico.

My first remark is that it only covers AWS and GCE. Hence, it doesn't take into account other infrastructures, such as Microsoft Azure, OpenStack and virtualization platforms. We are able to manage all these environments. IMHO, it would be of great value to give our customers the ability to (auto) scale their OpenShift platforms in a consistent way across all platforms.

To achieve that, we have to identify the relevant metrics (CPU, memory, storage I/O, network I/O, pods per node (correlated to overcommit), etc.) and events (for example, a container that cannot be deployed because the nodes matching its node selector are full) to have a clear status of the whole platform and make a decision.

Then, we can trigger the automation to achieve this: scale out/in the nodes, relabel empty nodes?, run the scaleup.yml playbook, evacuate and delete nodes, etc.
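
To make the "metrics plus events, then a decision" idea above concrete, here is a minimal Python sketch. It is not CloudForms or OpenShift code: `NodeSample`, `rollup`, `decide`, the 80%/30% thresholds and the `min_nodes` floor are all hypothetical names and values chosen for illustration. It assumes per-node CPU, memory and pod counts have already been collected, and that the number of unschedulable pods stands in for the "nodes are full" event.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeSample:
    """One utilization sample for an application node (hypothetical shape)."""
    name: str
    cpu_pct: float
    mem_pct: float
    pods: int
    pod_capacity: int


def rollup(nodes):
    """Roll per-node figures up into figures for the whole node group."""
    return {
        "cpu_pct": mean(n.cpu_pct for n in nodes),
        "mem_pct": mean(n.mem_pct for n in nodes),
        "pod_fill_pct": 100.0 * sum(n.pods for n in nodes)
        / sum(n.pod_capacity for n in nodes),
    }


def decide(nodes, unschedulable_pods=0, high=80.0, low=30.0, min_nodes=2):
    """Return 'scale_out', 'scale_in' or 'hold' (thresholds are assumptions)."""
    busiest = max(rollup(nodes).values())
    # An unschedulable pod is treated as an event that forces a scale-out.
    if unschedulable_pods > 0 or busiest >= high:
        return "scale_out"
    # Only shrink when every rolled-up metric is low and the floor allows it.
    if busiest <= low and len(nodes) > min_nodes:
        return "scale_in"
    return "hold"
```

The returned string would then be the trigger for the automation steps listed above (for example running the scaleup.yml playbook on `scale_out`, or evacuating and deleting a node on `scale_in`); storage and network I/O could be added to `rollup` in the same way.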