Bug 1933184 - openshift-cluster-csi-drivers DaemonSets should use maxUnavailable: 10%
Summary: openshift-cluster-csi-drivers DaemonSets should use maxUnavailable: 10%
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Jan Safranek
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks: 1996070
Reported: 2021-02-25 21:11 UTC by W. Trevor King
Modified: 2021-08-20 13:00 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1933174
Environment:
Last Closed: 2021-07-27 22:48:28 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift aws-ebs-csi-driver-operator pull 114 0 None closed Bug 1933184: Add maxUnavailable to DaemonSets 2021-03-05 23:56:41 UTC
Github openshift csi-driver-manila-operator pull 92 0 None closed Bug 1933184: Add maxUnavailable to DaemonSets 2021-03-05 23:56:42 UTC
Github openshift gcp-pd-csi-driver-operator pull 15 0 None closed Bug 1933184: Add maxUnavailable to DaemonSets 2021-03-05 23:56:43 UTC
Github openshift openstack-cinder-csi-driver-operator pull 28 0 None closed Bug 1933184: Add maxUnavailable to DaemonSets 2021-03-05 23:56:44 UTC
Github openshift ovirt-csi-driver-operator pull 50 0 None closed Bug 1933184: Add maxUnavailable to DaemonSets 2021-03-05 23:56:45 UTC
Github openshift ovirt-csi-driver-operator pull 51 0 None closed Bug 1933184: Fix maxUnavailable value to 10% 2021-03-05 23:56:47 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:48:48 UTC

Description W. Trevor King 2021-02-25 21:11:24 UTC
+++ This bug was initially created as a clone of Bug #1933174 +++

+++ This bug was initially created as a clone of Bug #1933173 +++

The CSI driver DaemonSets currently use maxUnavailable: 1, but we want maxUnavailable: 10% so that rolling updates scale better on clusters with large node counts.  Checking a recent 4.8 CI release:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.8/1364411741282242560/artifacts/e2e-gcp/daemonsets.json | jq -r '.items[] | select(.spec.template.spec.nodeSelector["node-role.kubernetes.io/master"] != "" and .spec.updateStrategy.rollingUpdate.maxUnavailable != "10%") | .metadata.namespace + " " + .metadata.name + " " + (.spec.template.spec.nodeSelector | tostring) + " " + (.spec.updateStrategy | tostring)'
...
openshift-cluster-csi-drivers gcp-pd-csi-driver-node {"kubernetes.io/os":"linux"} {"rollingUpdate":{"maxUnavailable":1},"type":"RollingUpdate"}
...

Like bug 1933159, but for a different DaemonSet.  There might be other CSI DaemonSets besides the GCP one; I only audited a GCP job.
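
For auditing other jobs, the same check can be sketched in Python. This is a simplified illustration, not the query used above: it only compares rollingUpdate.maxUnavailable and skips the nodeSelector condition. The function name non_matching_daemonsets and the sample data are hypothetical; the sample mirrors the single GCP entry shown in the output.

```python
def non_matching_daemonsets(daemonset_list, wanted="10%"):
    """Return (namespace, name, maxUnavailable) for every DaemonSet whose
    rolling-update maxUnavailable differs from `wanted`."""
    results = []
    for item in daemonset_list.get("items", []):
        strategy = item.get("spec", {}).get("updateStrategy", {})
        # rollingUpdate may be absent or null (e.g. for OnDelete strategies).
        rolling = strategy.get("rollingUpdate") or {}
        value = rolling.get("maxUnavailable")
        if value != wanted:
            meta = item["metadata"]
            results.append((meta["namespace"], meta["name"], value))
    return results


# Hypothetical sample shaped like a `daemonsets.json` artifact (a DaemonSetList).
sample = {
    "items": [
        {
            "metadata": {
                "namespace": "openshift-cluster-csi-drivers",
                "name": "gcp-pd-csi-driver-node",
            },
            "spec": {
                "updateStrategy": {
                    "type": "RollingUpdate",
                    "rollingUpdate": {"maxUnavailable": 1},
                }
            },
        },
        {
            "metadata": {"namespace": "example-ns", "name": "already-fixed"},
            "spec": {
                "updateStrategy": {
                    "type": "RollingUpdate",
                    "rollingUpdate": {"maxUnavailable": "10%"},
                }
            },
        },
    ]
}

for namespace, name, value in non_matching_daemonsets(sample):
    print(namespace, name, value)
```

To run it against a real artifact, load the file with json.load and pass the resulting dictionary in place of sample.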

Comment 3 Qin Ping 2021-03-08 05:47:20 UTC
Verified aws, gcp and manila PRs with 4.8.0-0.nightly-2021-03-06-055252. 
On a cluster with 3 masters and 9 worker nodes, 2 pods created by the DaemonSet are updated at the same time.
NAME                                            READY   STATUS        RESTARTS   AGE
aws-ebs-csi-driver-controller-b4f46757f-w9tvr   6/6     Running       0          25m
aws-ebs-csi-driver-node-2hk8n                   3/3     Running       0          4m4s
aws-ebs-csi-driver-node-572zk                   3/3     Running       0          3m56s
aws-ebs-csi-driver-node-76mpt                   0/3     Terminating   0          3m43s
aws-ebs-csi-driver-node-dspl7                   3/3     Running       0          3m30s
aws-ebs-csi-driver-node-dwbqr                   0/3     Terminating   0          3m37s
aws-ebs-csi-driver-node-rh5hm                   3/3     Running       0          4m
aws-ebs-csi-driver-node-vpxkg                   3/3     Running       0          2m44s
aws-ebs-csi-driver-node-vsxwf                   3/3     Running       0          2m42s
aws-ebs-csi-driver-node-w4fts                   3/3     Running       0          2m57s
aws-ebs-csi-driver-node-wbhdr                   3/3     Running       0          3m10s
aws-ebs-csi-driver-node-xz5nc                   3/3     Running       0          3m52s
aws-ebs-csi-driver-node-z25hw                   3/3     Running       0          3m24s
aws-ebs-csi-driver-operator-65c745fb6-p6q55     1/1     Running       0          25m

The ovirt operator in this payload image has the fix too, so I'll mark this bug as verified.
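
The two Terminating pods in the listing above are consistent with how Kubernetes resolves a percentage maxUnavailable for DaemonSets: the percentage is applied to the desired pod count and rounded up. The listing shows 12 node pods (one per node), so 10% gives 1.2, which rounds up to 2. A minimal sketch of that arithmetic follows; the helper name max_unavailable_pods is hypothetical and is not part of any Kubernetes client library.

```python
def max_unavailable_pods(value, desired_pods):
    """Resolve a maxUnavailable setting to a pod count.

    `value` is either an absolute integer or a percentage string such as "10%".
    Percentages are rounded up, matching the documented DaemonSet behaviour.
    """
    if isinstance(value, int):
        return value
    percent = int(value.rstrip("%"))
    # Integer ceiling division: ceil(percent * desired_pods / 100).
    return -(-percent * desired_pods // 100)


print(max_unavailable_pods(1, 12))       # 1  (the old setting)
print(max_unavailable_pods("10%", 12))   # 2  (3 masters + 9 workers)
print(max_unavailable_pods("10%", 100))  # 10
```

With the old absolute value of 1, a 100-node cluster would update one pod at a time; with 10% it can update 10 in parallel.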

Comment 6 errata-xmlrpc 2021-07-27 22:48:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

