Bug 2249735
| Summary: | [4.14 clone] the multus network address detection job does not derive placement configs from CephCluster "all" placement | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Mudit Agarwal <muagarwa> |
| Component: | rook | Assignee: | Blaine Gardner <brgardne> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Oded <oviner> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.14 | CC: | bkunal, brgardne, clacroix, etamir, kramdoss, mcaldeir, muagarwa, nberry, odf-bz-bot, oviner, pbalogh, rcyriac, smulay, srai, tnielsen |
| Target Milestone: | --- | | |
| Target Release: | ODF 4.14.5 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 4.14.1-8 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 2249678 | Environment: | |
| Last Closed: | 2024-03-18 11:40:50 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 2249678 | | |
| Bug Blocks: | | | |
Description
Mudit Agarwal
2023-11-15 04:20:46 UTC
This is ready to be merged for 4.14.z here, whenever it is appropriate to do so: https://github.com/red-hat-storage/rook/pull/537

Hi Blaine,
Can you check my test procedure and answer the question in section 6?
Can you check my test procedure and answer the question in section 6?
Test Process:
1. Install OCP 4.14
2. Install ODF 4.14
3. Run the validation tool
4. Install StorageCluster with multus
5. Taint all nodes in the OpenShift cluster:
   kubectl taint nodes --all node-role.kubernetes.io/storage=true:NoSchedule
6. Set the toleration in the CephCluster. [Question: do I need to append an item to the existing tolerations list, or should the list contain only one entry?]
For example:
placement:
  all:
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/storage
      operator: Equal
      value: "true"

OR:

placement:
  all:
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/storage
      operator: Equal
      value: "true"
    - effect: NoSchedule
      key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
7. Verify rook-ceph-network-*-canary job Completed
$ oc get jobs -n openshift-storage
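For step 7, one way to also confirm the canary job actually inherited the CephCluster "all" placement (a sketch; the CephCluster name `ocs-storagecluster-cephcluster` assumes the default ODF install):

# Placement configured on the CephCluster "all" section:
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster \
  -o jsonpath='{.spec.placement.all}{"\n"}'

# Tolerations the canary job pod template actually received -- with the fix,
# the custom toleration should appear here:
oc -n openshift-storage get job rook-ceph-network-cluster-canary \
  -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'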
I have some small concerns, and I think I can answer your question as well:
2. I think this is probably a typo. ODF shouldn't be installed until after running the validation tool and tainting nodes. Instead, this is the time to apply the new taint to the nodes.
  - As a note, the important thing here is that the taint used is not the default taint/toleration built into ODF ("node.ocs.openshift.io/storage=true:NoSchedule")
3. Yes, and an additional need: The validation tool will need to be configured with a toleration for the taint. The latest tool version on the KCS supports a config file for configuring tolerations. `rook multus validation config converged` will print out a config file that's documented with comments that you can use as a starting point (a rough sketch follows after this list). Ping me if you need more help setting up the config file.
4. This is good, with one caveat: The install must use the 'cluster' Multus network. It doesn't matter if 'public' is used or not, but 'cluster' must be used.
5. As noted, this should be step 2
6. I have one concern in addition to trying to answer your question:
- My concern: Depending on ocs-operator's reconcile strategy, ocs-operator might override the CephCluster placement settings. Setting the toleration via StorageCluster during the initial deployment seems like it might be the best place to specify this. Hopefully that means that there won't be any CI behavior changes based on the ocs-op reconcile strategy.
- It seems best to me to only specify a single toleration. It's simpler, plus doing that should also ensure that the test isn't implicitly using the default toleration as well -- helping prevent any false positives if there were to be a regression in the future.
- Thus, this probably becomes step 2+4+6 all in one: "Install ODF4.14 with Multus cluster network and custom 'all' placement"
7. Yes, exactly. This is an important validation to check that there is no regression when upgrading from one ODF version to the next, so also make sure this test is run for upgrades if that isn't part of the current plan.
- For upgrades, you can make sure it is the correct canary job (i.e., not an older version of the job) by ensuring the canary job is configured with the same RHCS/Ceph container image as the CephCluster spec.
8. Additionally, this environment can be used for other test suites, and it is a good idea to use the non-default environment for them to ensure there aren't other errors as well. I assume that is part of the plan, but it seemed worth mentioning.
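For item 3 above, a rough sketch of generating and using the validation tool's config file (the `config converged` subcommand is the one mentioned above; the exact flag for passing the edited config to `run` is an assumption and may differ by tool version, so check the tool's help output):

# Print a commented starter config for a converged cluster and save it:
rook multus validation config converged > multus-validation.yaml

# Edit multus-validation.yaml to add a toleration for the custom taint where the
# commented template indicates tolerations go, e.g.:
#   - key: node-role.kubernetes.io/storage
#     operator: Equal
#     value: "true"
#     effect: NoSchedule

# Run the validation with the edited config (flag name assumed):
rook multus validation run --config multus-validation.yaml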
That all would make the new procedure:
1. Install OCP
2. Apply a unique taint to all non-control-plane nodes
3. Run multus validation tool with toleration config (important to also ensure that there are no errors with validation tool)
4. Install ODF and StorageCluster with
- Multus 'cluster' network
- Custom 'all' toleration for unique taint from step 2
5. Verify rook-ceph-network-cluster-canary job "Completed" with the expected RHCS container image (see the sketch below)
6. Continue with other ODF test suites.
As an overall note: the test I've suggested assumes the whole cluster is only storage nodes with no worker nodes. This is valid, but I also understand that there could be CI automations that expect one or more worker nodes. If the test needs worker nodes, the procedure will have to factor in adding a node label and placement selector as well.
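A possible way to script step 5 and the upgrade check from item 7 above (a sketch; the CephCluster name `ocs-storagecluster-cephcluster` assumes the default ODF install):

# Wait for the canary job to complete:
oc -n openshift-storage wait --for=condition=complete \
  job/rook-ceph-network-cluster-canary --timeout=300s

# Ceph image configured on the CephCluster spec:
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster \
  -o jsonpath='{.spec.cephVersion.image}{"\n"}'

# Image the canary job runs -- the two should match, especially after an upgrade:
oc -n openshift-storage get job rook-ceph-network-cluster-canary \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'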
There is no option to install ODF 4.14.1 when all worker nodes are tainted:
kubectl taint nodes argo005.ceph.redhat.com node-role.kubernetes.io/storage=true:NoSchedule
node/argo005.ceph.redhat.com tainted
$ kubectl describe nodes argo005.ceph.redhat.com | grep Taints
Taints: node-role.kubernetes.io/storage=true:NoSchedule
$ oc get job 1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757 -n openshift-marketplace
NAME COMPLETIONS DURATION AGE
1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757 0/1 21h 21h
$ oc describe job 1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757 -n openshift-marketplace
Name: 1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
Namespace: openshift-marketplace
Selector: batch.kubernetes.io/controller-uid=2758a2bc-bb6c-4a44-9032-ddf9930e4db6
Labels: batch.kubernetes.io/controller-uid=2758a2bc-bb6c-4a44-9032-ddf9930e4db6
batch.kubernetes.io/job-name=1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
controller-uid=2758a2bc-bb6c-4a44-9032-ddf9930e4db6
job-name=1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
Annotations: batch.kubernetes.io/job-tracking:
Parallelism: 1
Completions: 1
Completion Mode: NonIndexed
Start Time: Wed, 29 Nov 2023 17:53:48 +0200
Active Deadline Seconds: 600s
Pods Statuses: 0 Active (0 Ready) / 0 Succeeded / 1 Failed
Pod Template:
Labels: batch.kubernetes.io/controller-uid=2758a2bc-bb6c-4a44-9032-ddf9930e4db6
batch.kubernetes.io/job-name=1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
controller-uid=2758a2bc-bb6c-4a44-9032-ddf9930e4db6
job-name=1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
Init Containers:
util:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8af0a4afdd1d4b263f8365a765bbab04fe8b271710a52b394b285dd29497143a
Port: <none>
Host Port: <none>
Command:
/bin/cp
-Rv
/bin/cpb
/util/cpb
Requests:
cpu: 10m
memory: 50Mi
Environment: <none>
Mounts:
/util from util (rw)
pull:
Image: quay.io/rhceph-dev/odf4-odf-operator-bundle@sha256:d4c5bf429fed12ff3a3330d56fcb80af3651ed5edc73f3080cbf3aa614554e6b
Port: <none>
Host Port: <none>
Command:
/util/cpb
/bundle
Requests:
cpu: 10m
memory: 50Mi
Environment: <none>
Mounts:
/bundle from bundle (rw)
/util from util (rw)
Containers:
extract:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ebbbc7f05e939be5adfd0220304888d422cedf8a6807b6ac4da531d2ed6e88a
Port: <none>
Host Port: <none>
Command:
opm
alpha
bundle
extract
-m
/bundle/
-n
openshift-marketplace
-c
1bd180a90a1d205118da2402a530a9c94838fd0a90283339b7e5c68602f3757
-z
Requests:
cpu: 10m
memory: 50Mi
Environment:
CONTAINER_IMAGE: quay.io/rhceph-dev/odf4-odf-operator-bundle@sha256:d4c5bf429fed12ff3a3330d56fcb80af3651ed5edc73f3080cbf3aa614554e6b
Mounts:
/bundle from bundle (rw)
Volumes:
bundle:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
util:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
Events: <none>
$ oc describe pod redhat-operators-4xkhr -n openshift-marketplace
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11m (x259 over 21h) default-scheduler 0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/storage: true}. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
Verification of this fix won't happen in the 4.14.1 timeline. It was agreed to move the bug to 4.14.2 for verification.

I had a chat with Oded today to help get this test running. I had suggested that the ODF document linked -> [1] <- seems like it is the right one to allow ODF to be deployed onto nodes that have custom taints. Oded said the procedure was not working. Oded was also unable to find anyone on the QE team who was familiar with testing that feature.

Given that, it seems worth asking whether ODF supports users supplying their own taints/tolerations, affinities, or node selectors. @etamir, @bkunal is this supported for customers? It seems to me that it is at least partially supported since the StorageCluster spec has a `placement` configuration that allows specifying custom placement. But the procedure for allowing the ODF/OCS operators to run on custom-placed nodes is possibly untested.

[1] https://docs.openshift.com/container-platform/4.14/nodes/scheduling/nodes-scheduler-taints-tolerations.html#nodes-scheduler-taints-tolerations-projects_nodes-scheduler-taints-tolerations

-----

In the meantime, I think Oded can continue to test this by modifying the procedure.
1. Install OCP with 4 nodes
2. Reserve 3 of the 4 nodes for the StorageCluster using unique taints and node labels (i.e., not the preferred ODF ones)
   - On 3 nodes, apply these:
     - kubectl taint nodes <node names> custom-storage=true:NoSchedule
     - kubectl label nodes <node names> custom-storage=true
3. Install ODF without installing the StorageCluster yet
   - All ODF operators should schedule to the node that does not have the above taint+label
4. Configure Network Attachment Definition(s)
5. Run multus validation tool
6. Install StorageCluster with the following modification to the spec
   placement:
     all:
       nodeAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
           nodeSelectorTerms:
           - matchExpressions:
             - key: custom-storage
               operator: In
               values:
               - "true"
       tolerations:
       - effect: NoSchedule
         key: node-role.kubernetes.io/storage
         operator: Equal
         value: "true"
7. Verify rook-ceph-network-cluster-canary job "Completed" with the expected RHCS container image using the `--watch` flag of the kubectl command (I suggest json output for parsing)
   - kubectl -n openshift-storage get job rook-ceph-network-cluster-canary --watch -o json

QE needs clarification and hence more time to verify the bug. It was decided to move the bug to 4.14.3 here - https://chat.google.com/room/AAAAREGEba8/1WifqGfpy5U

Hi,
I am working with this KCS: https://access.redhat.com/articles/6408481

Procedure:
1. Install OCP with 6 worker nodes and 3 master nodes
$ oc get nodes
NAME              STATUS   ROLES                  AGE   VERSION
compute-0         Ready    worker                 8h    v1.27.8+4fab27b
compute-1         Ready    worker                 8h    v1.27.8+4fab27b
compute-2         Ready    worker                 8h    v1.27.8+4fab27b
compute-3         Ready    worker                 8h    v1.27.8+4fab27b
compute-4         Ready    worker                 8h    v1.27.8+4fab27b
compute-5         Ready    worker                 8h    v1.27.8+4fab27b
control-plane-0   Ready    control-plane,master   9h    v1.27.8+4fab27b
control-plane-1   Ready    control-plane,master   9h    v1.27.8+4fab27b
control-plane-2   Ready    control-plane,master   9h    v1.27.8+4fab27b
2. Install ODF Operator
3. Install StorageCluster with multus [cluster-net + public-net]
4. Add taint and label to compute-0 node
$ kubectl taint nodes compute-0 custom-storage=true:NoSchedule
$ kubectl label nodes compute-0 custom-storage=true
5. Edit StorageCluster
   placement:
     all:
       tolerations:
       - effect: NoSchedule
         key: custom-storage
         operator: In
         value: "true"
     mds:
       tolerations:
       - effect: NoSchedule
         key: custom-storage
         operator: In
         value: "true"
     noobaa-core:
       tolerations:
       - effect: NoSchedule
         key: custom-storage
         operator: In
         value: "true"
     rgw:
       tolerations:
       - effect: NoSchedule
         key: custom-storage
         operator: In
         value: "true"
6. Run "oc get pods -w" and "oc get job -w"
$ oc get jobs -w
NAME                               COMPLETIONS   DURATION   AGE
rook-ceph-network-public-canary    0/1                      0s
rook-ceph-network-cluster-canary   0/1                      0s
rook-ceph-network-cluster-canary   0/1           0s         0s
rook-ceph-network-public-canary    0/1           0s         0s
rook-ceph-network-cluster-canary   0/1           4s         4s
rook-ceph-network-cluster-canary   0/1           5s         5s
rook-ceph-network-public-canary    0/1           5s         5s
rook-ceph-network-cluster-canary   0/1           6s         6s
rook-ceph-network-public-canary    0/1           6s         6s
rook-ceph-network-public-canary    0/1           7s         7s

$ oc get pods -w
rook-ceph-network-public-canary-z7dhd    0/1   Pending           0   0s
rook-ceph-network-cluster-canary-lvhfn   0/1   Pending           0   0s
rook-ceph-network-cluster-canary-lvhfn   0/1   Pending           0   0s
rook-ceph-network-public-canary-z7dhd    0/1   Pending           0   0s
rook-ceph-network-public-canary-z7dhd    0/1   Pending           0   0s
rook-ceph-network-cluster-canary-lvhfn   0/1   Pending           0   0s
rook-ceph-network-cluster-canary-lvhfn   0/1   Init:0/1          0   0s
rook-ceph-network-public-canary-z7dhd    0/1   Init:0/1          0   0s
rook-ceph-network-cluster-canary-lvhfn   0/1   Init:0/1          0   1s
rook-ceph-network-cluster-canary-lvhfn   0/1   Init:0/1          0   2s
rook-ceph-network-public-canary-z7dhd    0/1   Init:0/1          0   2s
rook-ceph-network-public-canary-z7dhd    0/1   Init:0/1          0   3s
rook-ceph-network-cluster-canary-lvhfn   0/1   PodInitializing   0   4s
rook-ceph-network-cluster-canary-lvhfn   0/1   PodInitializing   0   4s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   4s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   5s
rook-ceph-network-public-canary-z7dhd    0/1   PodInitializing   0   5s
rook-ceph-network-public-canary-z7dhd    0/1   PodInitializing   0   5s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   5s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   6s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   6s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   6s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   6s
rook-ceph-network-cluster-canary-lvhfn   0/1   Terminating       0   6s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   7s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   7s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   7s
rook-ceph-network-public-canary-z7dhd    0/1   Terminating       0   7s

Blaine, can you check this procedure?

Moving the bug to 4.14.4 as we are doing a quick 4.14.3 to include a critical fix at RGW (2254303) before the shutdown.

I added the flag, please update the doc text.

The rook-ceph operator pod is in Pending state because:
---- ------ ---- ---- -------
Warning FailedScheduling 4m29s default-scheduler 0/6 nodes are available: 3 node(s) had untolerated taint {custom-storage: true}, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
1. Install ODF 4.14.4 operator:
$ oc get csv -n openshift-storage
NAME DISPLAY VERSION REPLACES PHASE
mcg-operator.v4.14.4-rhodf NooBaa Operator 4.14.4-rhodf mcg-operator.v4.14.3-rhodf Succeeded
ocs-operator.v4.14.4-rhodf OpenShift Container Storage 4.14.4-rhodf ocs-operator.v4.14.3-rhodf Succeeded
odf-csi-addons-operator.v4.14.4-rhodf CSI Addons 4.14.4-rhodf odf-csi-addons-operator.v4.14.3-rhodf Succeeded
odf-operator.v4.14.4-rhodf OpenShift Data Foundation 4.14.4-rhodf odf-operator.v4.14.3-rhodf Succeeded
2. Create public and private NADs:
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: public-net
  namespace: default
  labels: {}
  annotations: {}
spec:
  config: '{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam": { "type": "whereabouts", "range": "192.168.20.0/24" } }'
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: cluster-net
  namespace: default
  labels: {}
  annotations: {}
spec:
  config: '{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam": { "type": "whereabouts", "range": "192.168.30.0/24" } }'
3. Taint and label nodes:
kubectl taint nodes compute-0 custom-storage=true:NoSchedule
kubectl label nodes compute-0 custom-storage=true
kubectl taint nodes compute-1 custom-storage=true:NoSchedule
kubectl label nodes compute-1 custom-storage=true
kubectl taint nodes compute-2 custom-storage=true:NoSchedule
kubectl label nodes compute-2 custom-storage=true
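Equivalent to the commands above, the taint/label pair can be applied to all three nodes in one loop:

for n in compute-0 compute-1 compute-2; do
  # NoSchedule taint plus matching label used by the StorageCluster placement below
  kubectl taint nodes "$n" custom-storage=true:NoSchedule
  kubectl label nodes "$n" custom-storage=true
done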
4. Apply StorageSystem:
---
apiVersion: odf.openshift.io/v1alpha1
kind: StorageSystem
metadata:
  name: ocs-storagecluster-storagesystem
  namespace: openshift-storage
spec:
  kind: storagecluster.ocs.openshift.io/v1
  name: ocs-storagecluster
  namespace: openshift-storage
5. Create thin storage class:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  name: thin-csi-odf
parameters:
  StoragePolicyName: "vSAN Default Storage Policy"
provisioner: csi.vsphere.vmware.com
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
6. Create StorageCluster:
---
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  resources:
    mds:
      Limits: null
      Requests: null
    mgr:
      Limits: null
      Requests: null
    mon:
      Limits: null
      Requests: null
    noobaa-core:
      Limits: null
      Requests: null
    noobaa-db:
      Limits: null
      Requests: null
    noobaa-endpoint:
      limits:
        cpu: 1
        memory: 500Mi
      requests:
        cpu: 1
        memory: 500Mi
    rgw:
      Limits: null
      Requests: null
  storageDeviceSets:
  - count: 1
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 256Gi
        storageClassName: thin-csi-odf
        volumeMode: Block
    name: ocs-deviceset
    portable: true
    replica: 3
    resources:
      Limits: null
      Requests: null
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: custom-storage
              operator: In
              values:
              - "true"
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/storage
        operator: Equal
        value: "true"
  network:
    provider: multus
    selectors:
      cluster: default/cluster-net
      public: default/public-net
---
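After the StorageCluster is created, one way to confirm ocs-operator propagated the custom placement and the Multus selectors down to the generated CephCluster (a sketch; the CephCluster name assumes the default `ocs-storagecluster` naming):

# "all" placement the CephCluster ended up with (should show the custom toleration/affinity):
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster \
  -o jsonpath='{.spec.placement.all}{"\n"}'

# Multus selectors the CephCluster ended up with (should show default/cluster-net and default/public-net):
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster \
  -o jsonpath='{.spec.network.selectors}{"\n"}'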
7. Check rook-ceph operator pod:
$ oc get pod rook-ceph-operator-7b7b6b8d5c-q6kzt
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-7b7b6b8d5c-q6kzt 0/1 Pending 0 3m8s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m29s default-scheduler 0/6 nodes are available: 3 node(s) had untolerated taint {custom-storage: true}, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
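To see why the operator cannot land anywhere, its tolerations can be compared against the node taints (a sketch; the deployment name is the default `rook-ceph-operator`):

# Taints on the labeled storage nodes:
oc get nodes -l custom-storage=true \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.taints}{"\n"}{end}'

# Tolerations on the rook-ceph operator deployment -- the custom-storage taint is
# not tolerated here, which matches the FailedScheduling event above:
oc -n openshift-storage get deployment rook-ceph-operator \
  -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'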
I ran this procedure here: https://bugzilla.redhat.com/show_bug.cgi?id=2249735#c23. But Blaine thinks this is not the right process.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.