Bug 1960671
| Summary: | Add nodeAffinity to CSI provisioner pods to bring them up only on OCS labelled nodes | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Neha Berry <nberry> |
| Component: | ocs-operator | Assignee: | Jose A. Rivera <jrivera> |
| Status: | CLOSED WONTFIX | QA Contact: | Elad <ebenahar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.8 | CC: | jrivera, madam, muagarwa, ocs-bugs, odf-bz-bot, sostapov, tnielsen |
| Target Milestone: | --- | Keywords: | AutomationBackLog |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-01-20 16:00:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Neha Berry
2021-05-14 14:43:34 UTC
Agreed, we should have the CSI provisioners run only on OCS nodes. This setting just needs to be added to the configmap rook-ceph-operator-config:

CSI_PROVISIONER_NODE_AFFINITY: "cluster.ocs.openshift.io/openshift-storage"

I completely disagree with this, as I disagree with constraining anything that's not OSD Pods in general. It really limits our ability to take advantage of additional resources in the rest of the cluster. Besides that, I see no reason why an admin *should* care what nodes they're running on; as best I know, they only care because *we tell them to care*. That said, we're already doing it, so we might as well be consistent. Hopefully I can get this reverted or at least made optional in the future. :P

Not sure if this should go into 4.8 yet, but giving devel_ack+.

If we don't already have a PR for this, then I would suggest moving this out of 4.8.

Even with no PR, I don't think it's valid for us to try and resolve this, for a few reasons:

1. It's not quite as simple as just adding NodeAffinity, because we have to consider the scenario where we have no OCS nodes (e.g. external RHGS).
2. This does nothing to actually conserve resources. The number of provisioner pods does not change. If anything, this restricts our ability to make use of available resources in OCP, leading to a guaranteed reduction in resources on our actual storage nodes.
3. This reduces the resiliency of the CSI driver. In the event that the two OCS nodes hosting the provisioner Pods go down, the Pods may be entirely unavailable for several minutes, in which case several CSI functions will be entirely unavailable. From a pure statistical point of view, the more nodes we have available for scheduling, the better our chances of surviving node failures.

While we still need consensus with QE, since it is not a regression it is not dire enough to warrant attention for OCS 4.8. Moving to ODF 4.9.
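For reference, the setting proposed at the top of this thread would look roughly like the sketch below when written out as a full ConfigMap. The ConfigMap name, the key, and the value are taken from the comment above; the `openshift-storage` namespace is an assumption based on a default install, and a bare label key is assumed to mean "schedule only on nodes carrying this label", as the comment implies.

```yaml
# Sketch only: the namespace is an assumption, not stated in this bug.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-operator-config
  namespace: openshift-storage   # assumed default namespace
data:
  # Restrict CSI provisioner pods to nodes that carry the OCS storage label.
  CSI_PROVISIONER_NODE_AFFINITY: "cluster.ocs.openshift.io/openshift-storage"
```

Note that this only illustrates the request; as discussed in this thread, it does not cover clusters that have no OCS-labelled nodes.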
I understand this particular issue is still leading to repeat customer confusion, but I still stand by my statements. This is not a technical problem; it's one of perception and documentation. As such, I think the doc BZ https://bugzilla.redhat.com/show_bug.cgi?id=1960066 is more appropriate.

On a more technical note, we just had an ocs-operator BZ triage meeting with QE. We are in general consensus that it makes no sense to constrain *any* OCS Pods that aren't Ceph OSD Pods by default. With the current subscription model of charging for OCS SKUs on a per-core basis, it really doesn't matter what *specific nodes* most of our Pods are running on. The idea that isolating them as such would provide more stability for either us or other applications is not really valid, as it only constrains the potential compute resources we could use to recover from failure scenarios. OSD Pods are an exception since they are by far our most cumbersome resource, and in many cases have to be limited to the reach of their associated PVs anyway. All other Pods should be fairly lightweight by comparison and not mess with the stability of other applications or nodes.

*If* we were to be doing any default NodeAffinity for non-OSD Pods, it should really be to the master nodes, because we really are infrastructure even if OCP doesn't recognize us as such.

Of course, we should absolutely allow admins to *optionally* constrain any and all Pods to arbitrary nodes. They can do this today; it just needs to be better documented if we want to officially support it.

Given the need for further discussion, and the long-standing fact we're still not blocked by this, moving this to ODF 4.10 and giving devel_ack-.

Oops, too far!

It's been well over 6 months since this was raised. I have continued to strongly oppose it and thus far no one has had any substantial arguments against it. :) As such, closing this as WONTFIX.