Had some issues installing OCS 4.2.1 on IBM Cloud. One of them is that IBM Cloud sets the kubelet root-dir to /var/data/kubelet. The CSI plugin pods don't pick up this change and try to use the default /var/lib/kubelet. The Rook operator has an option to set ROOK_CSI_KUBELET_DIR_PATH, so we just need to expose that in the OCS operator. This might of course surface with other cloud providers too, so we might as well do it.
Is that with UPI or IPI? Why is it set differently? How does OCP cope with that?
UPI of course. I don't think there's IPI for IBM Cloud. It is set differently because that's how it has been used in all IBM environments for a while (not only Cloud). Not sure how they install (most likely Ansible), so they provide the path they want. I can ask how they install; however, this is something that might show up in other places as well (other clouds or on-prem bare metal).
Eric - this is what we've briefly discussed at LX. Should we 'auto-detect' this, or can they move to a standard location?
If this is really needed, 1. OLM already supports overriding Env Vars in deployments. 2. Rook operator will be using Config Map to mix and match with Env Vars for any CSI configuration. Do we still need to expose it in OCS or will the already available ways work?
This will be available in Rook-Ceph 1.3 for OCS 4.5. We still need to discuss whether we want to automate something via the ocs-operator, or leverage OLM's ability to override environment variables as part of the Subscription process. Either way, ACKing this for 4.5.
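For reference, the OLM route mentioned above could look roughly like this. It is only a sketch: OLM Subscriptions accept environment overrides under `spec.config.env`, but the package name, namespace, channel and catalog source below are placeholder assumptions, and whether the variable actually reaches the Rook operator depends on how its deployment is packaged in the installed CSV.

```python
import json


def subscription_with_kubelet_dir(
    kubelet_dir,
    name="ocs-operator",                     # assumed package name
    namespace="openshift-storage",           # assumed namespace
    channel="stable-4.5",                    # assumed channel
    source="redhat-operators",               # assumed catalog source
    source_namespace="openshift-marketplace",
):
    """Build an OLM Subscription manifest that overrides ROOK_CSI_KUBELET_DIR_PATH."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "name": name,
            "channel": channel,
            "source": source,
            "sourceNamespace": source_namespace,
            # OLM applies these env vars to the operator deployments it manages
            # for this Subscription.
            "config": {
                "env": [
                    {"name": "ROOK_CSI_KUBELET_DIR_PATH", "value": kubelet_dir}
                ]
            },
        },
    }


if __name__ == "__main__":
    # JSON is valid YAML, so the output can be piped to `oc apply -f -`.
    print(json.dumps(subscription_with_kubelet_dir("/var/data/kubelet"), indent=2))
```

The upside of this route is that nothing has to change in the ocs-operator; the downside is that every cluster with a non-default kubelet path needs a customized Subscription.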
*** Bug 1823417 has been marked as a duplicate of this bug. ***
@umanga any decision here? Is this still on track?
Yes, we have added a default config to ocs-operator which an admin can change. Currently, we are blocked on upgrading Rook to v1.3. It should be done in time for 4.5.
Thanks! Acking
If the fix works, there is no need to test on IBM Cloud; you just need to test it on a k8s cluster that has kubelet in a non-default path. We can also provide an RC of 4.5 (when available) so they can test it themselves, if needed.
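To confirm that a test cluster really runs kubelet with a non-default path, the kubelet command line on a node can be checked. A minimal sketch, assuming the path is set through kubelet's `--root-dir` flag (default /var/lib/kubelet); the helper name is hypothetical, and the argument list would have to come from the node itself, e.g. from the kubelet process command line.

```python
DEFAULT_KUBELET_ROOT_DIR = "/var/lib/kubelet"


def kubelet_root_dir(argv):
    """Return the kubelet root dir from a kubelet argument list.

    Handles both `--root-dir=/path` and `--root-dir /path`; falls back to
    the kubelet default when the flag is absent.
    """
    for i, arg in enumerate(argv):
        if arg.startswith("--root-dir="):
            return arg.split("=", 1)[1]
        if arg == "--root-dir" and i + 1 < len(argv):
            return argv[i + 1]
    return DEFAULT_KUBELET_ROOT_DIR


if __name__ == "__main__":
    # Hypothetical kubelet invocation resembling the IBM Cloud setup.
    args = ["kubelet", "--root-dir=/var/data/kubelet", "--v=2"]
    root = kubelet_root_dir(args)
    print(root, "(non-default)" if root != DEFAULT_KUBELET_ROOT_DIR else "(default)")
```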
Removing needinfo as per comments #14 and #15
The idea behind the variable is to help automation. This workaround is known (this is how they install it right now on IBM Cloud), but they can't keep spinning up clusters manually.
*** Bug 1857114 has been marked as a duplicate of this bug. ***
Removing needinfo which came as part of the duped BZ.
IIUC, all of our current deployments have the kubelet root-dir at /var/lib/kubelet, but IBM ROKS uses a different one (/var/data/kubelet). We haven't changed the kubelet dir path on an existing deployed cluster. We can look into how to change it; I thought it depended on some specific installed package, which may be different on a ROKS system. Do you have any documentation on how to change the kubelet root-dir on one of our regular clusters, on AWS for example? Thanks
On Friday I asked Akash to update this BZ to say that this feature is no longer needed for ROKS clusters running on OCP 4.4 and higher. Akash, please confirm it here. IMO this feature can be removed if what we discussed in the mail thread and on Slack is no longer needed for you. Thanks, Petr
OCP 4.4 on IBM Cloud now has the kubelet path set to "/var/lib/kubelet", as needed by OCS. So we no longer need to validate this fix, as indicated by Petr.
Moving this BZ to VERIFIED based on the latest 4.5 regression cycle results and the fact that the new variable no longer needs to be changed with OCP 4.4.
=====================
ocs-operator.v4.5.0-518.ci
OCP 4.5.0-0.nightly-2020-08-07-024812
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754