Bug 1713479 - During upgrade, LocalStorageCapacityIsolation feature gate is turned on temporarily
Summary: During upgrade, LocalStorageCapacityIsolation feature gate is turned on temporarily
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 4.2.0
Assignee: Lukasz Szaszkiewicz
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On: 1713207
Blocks:
 
Reported: 2019-05-23 19:59 UTC by Clayton Coleman
Modified: 2023-09-14 05:29 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1713207
Environment:
Last Closed: 2019-10-16 06:29:21 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHBA-2019:2922 (last updated 2019-10-16 06:29:35 UTC)

Description Clayton Coleman 2019-05-23 19:59:21 UTC
During an upgrade, it looks like LocalStorageCapacityIsolation is "on" long enough for the deployment (copied below) to be created on the cluster with sizeLimit = 1Gi, which then causes the deployment to hot loop due to the other bug.

We need to identify why the gate was enabled during the upgrade test.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_support-operator/9/pull-ci-openshift-support-operator-master-e2e-aws-upgrade/7

support-operator has sizeLimit: 1Gi set, but it shouldn't.
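For reference, the volume stanza in question, as it ended up persisted on the Deployment (taken from the spec dumps quoted later in this description):

      volumes:
      - name: snapshots
        emptyDir:
          sizeLimit: 1Gi   # field guarded by the LocalStorageCapacityIsolation gate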

-----------------------------

+++ This bug was initially created as a clone of Bug #1713207 +++

Attempting to merge the support operator has triggered some form of bug in the replica set controller - the first time the operator deployment is updated it goes into an infinite loop of collisions, creating and deleting the pod endlessly. This is 100% reproducible on update from the first to second deployment (so maybe deploy the support operator and then tweak its image location to point to an identical mirror).

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_support-operator/9/pull-ci-openshift-support-operator-master-e2e-aws-upgrade/7

...
I0523 04:02:56.675984       1 sync.go:251] Found a hash collision for deployment "support-operator" - bumping collisionCount (6->7) to resolve it
I0523 04:02:56.676020       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: replicasets.apps "support-operator-84cbf58c9c" already exists
...

later

...
I0523 04:08:52.718873       1 sync.go:251] Found a hash collision for deployment "support-operator" - bumping collisionCount (81->82) to resolve it
I0523 04:08:52.718912       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: replicasets.apps "support-operator-599dc4958" already exists
I0523 04:08:52.737010       1 replica_set.go:477] Too few replicas for ReplicaSet openshift-support/support-operator-6dd85c97cc, need 1, creating 1
I0523 04:08:52.737826       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"32661", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set support-operator-6dd85c97cc to 1
I0523 04:08:52.746483       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-6dd85c97cc", UID:"79c39631-7d10-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"32662", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: support-operator-6dd85c97cc-pzcwm
I0523 04:08:52.758531       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on deployments.apps "support-operator": the object has been modified; please apply your changes to the latest version and try again
I0523 04:08:52.775403       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on replicasets.apps "support-operator-6dd85c97cc": the object has been modified; please apply your changes to the latest version and try again
I0523 04:08:52.791419       1 replica_set.go:516] Too many replicas for ReplicaSet openshift-support/support-operator-6dd85c97cc, need 0, deleting 1
I0523 04:08:52.791472       1 controller_utils.go:598] Controller support-operator-6dd85c97cc deleting pod openshift-support/support-operator-6dd85c97cc-pzcwm
I0523 04:08:52.792028       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"32663", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set support-operator-6dd85c97cc to 0
I0523 04:08:52.804290       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-6dd85c97cc", UID:"79c39631-7d10-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"32671", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: support-operator-6dd85c97cc-pzcwm
I0523 04:09:04.205503       1 sync.go:251] Found a hash collision for deployment "support-operator" - bumping collisionCount (82->83) to resolve it
I0523 04:09:04.205548       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: replicasets.apps "support-operator-6dd85c97cc" already exists
I0523 04:09:04.227864       1 replica_set.go:477] Too few replicas for ReplicaSet openshift-support/support-operator-575c956f88, need 1, creating 1
I0523 04:09:04.228601       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"32751", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set support-operator-575c956f88 to 1
I0523 04:09:04.241293       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-575c956f88", UID:"809c4333-7d10-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"32752", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: support-operator-575c956f88-9cdvq
I0523 04:09:04.242974       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on deployments.apps "support-operator": the object has been modified; please apply your changes to the latest version and try again
I0523 04:09:04.259644       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on replicasets.apps "support-operator-575c956f88": the object has been modified; please apply your changes to the latest version and try again
I0523 04:09:04.276331       1 replica_set.go:516] Too many replicas for ReplicaSet openshift-support/support-operator-575c956f88, need 0, deleting 1
...

One of the earlier chunks.

I0523 04:02:56.174997       1 deployment_controller.go:484] Error syncing deployment openshift-image-registry/image-registry: Operation cannot be fulfilled on replicasets.apps "image-registry-5f788b4c79": the object has been modified; please apply your changes to the latest version and try again
I0523 04:02:56.176247       1 sync.go:251] Found a hash collision for deployment "support-operator" - bumping collisionCount (5->6) to resolve it
I0523 04:02:56.176270       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: replicasets.apps "support-operator-578db4fdf8" already exists
I0523 04:02:56.176681       1 replica_set.go:516] Too many replicas for ReplicaSet openshift-marketplace/certified-operators-6f96675f4, need 0, deleting 1
I0523 04:02:56.176729       1 controller_utils.go:598] Controller certified-operators-6f96675f4 deleting pod openshift-marketplace/certified-operators-6f96675f4-jxvdg
I0523 04:02:56.182351       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-marketplace", Name:"certified-operators", UID:"4c0b1ed2-7d0d-11e9-8cad-0e4e0ac820e6", APIVersion:"apps/v1", ResourceVersion:"21199", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set certified-operators-6f96675f4 to 0
I0523 04:02:56.205528       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"21204", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set support-operator-84cbf58c9c to 1
I0523 04:02:56.210491       1 replica_set.go:477] Too few replicas for ReplicaSet openshift-support/support-operator-84cbf58c9c, need 1, creating 1
I0523 04:02:56.215001       1 replica_set.go:477] Too few replicas for ReplicaSet openshift-image-registry/image-registry-5f788b4c79, need 1, creating 1
I0523 04:02:56.215658       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-image-registry", Name:"image-registry", UID:"4adaec14-7d0d-11e9-ae07-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21183", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set image-registry-5f788b4c79 to 1
I0523 04:02:56.229165       1 deployment_controller.go:484] Error syncing deployment openshift-marketplace/certified-operators: Operation cannot be fulfilled on deployments.apps "certified-operators": the object has been modified; please apply your changes to the latest version and try again
I0523 04:02:56.236074       1 deployment_controller.go:484] Error syncing deployment openshift-image-registry/image-registry: Operation cannot be fulfilled on deployments.apps "image-registry": the object has been modified; please apply your changes to the latest version and try again
I0523 04:02:56.236160       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on deployments.apps "support-operator": the object has been modified; please apply your changes to the latest version and try again
I0523 04:02:56.242832       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-84cbf58c9c", UID:"a53fb46f-7d0f-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21213", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: support-operator-84cbf58c9c-fdpqd
I0523 04:02:56.266646       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-marketplace", Name:"certified-operators-6f96675f4", UID:"a50342ab-7d0f-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21205", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: certified-operators-6f96675f4-jxvdg
I0523 04:02:56.266688       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"21218", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set support-operator-84cbf58c9c to 0
I0523 04:02:56.316681       1 replica_set.go:516] Too many replicas for ReplicaSet openshift-support/support-operator-84cbf58c9c, need 0, deleting 1
I0523 04:02:56.316846       1 controller_utils.go:598] Controller support-operator-84cbf58c9c deleting pod openshift-support/support-operator-84cbf58c9c-fdpqd
I0523 04:02:56.332971       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: Operation cannot be fulfilled on deployments.apps "support-operator": the object has been modified; please apply your changes to the latest version and try again
I0523 04:02:56.408169       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-image-registry", Name:"image-registry-5f788b4c79", UID:"a05fffb6-7d0f-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21216", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: image-registry-5f788b4c79-xq6w8
I0523 04:02:56.574779       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-84cbf58c9c", UID:"a53fb46f-7d0f-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21225", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: support-operator-84cbf58c9c-fdpqd
I0523 04:02:56.675984       1 sync.go:251] Found a hash collision for deployment "support-operator" - bumping collisionCount (6->7) to resolve it
I0523 04:02:56.676020       1 deployment_controller.go:484] Error syncing deployment openshift-support/support-operator: replicasets.apps "support-operator-84cbf58c9c" already exists
I0523 04:02:56.738921       1 replica_set.go:477] Too few replicas for ReplicaSet openshift-support/support-operator-6954696768, need 1, creating 1
I0523 04:02:56.745524       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-support", Name:"support-operator", UID:"7fca421a-7d0c-11e9-abe1-129a16ed0c20", APIVersion:"apps/v1", ResourceVersion:"21312", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set support-operator-6954696768 to 1
I0523 04:02:56.759711       1 event.go:209] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"openshift-support", Name:"support-operator-6954696768", UID:"a591891a-7d0f-11e9-b30f-0af47f16c66e", APIVersion:"apps/v1", ResourceVersion:"21318", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: support-operator-6954696768-ctvrw
I0523 04:02:56.794452       1 event.go:209] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-marketplace", Name:"community-operators", UID:"4c45e496-7d0d-11e9-8cad-0e4e0ac820e6", APIVersion:"apps/v1", ResourceVersion:"21320", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set community-operators-5b69c4fbff to 0

Setting to urgent because this blocks rolling out support operator - a workaround would let us drop this to high.  Does not appear to be a 4.1 issue, just 4.2 post-rebase.

--- Additional comment from Clayton Coleman on 2019-05-23 14:55:49 EDT ---

I tracked this down:

ReplicaSet spec

{"containers":[{"args":["start","-v=4","--config=/etc/support-operator/server.yaml"],"env":[{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}},{"name":"RELEASE_VERSION","value":"0.0.1-2019-05-23-032421"}],"image":"registry.svc.ci.openshift.org/ci-op-pwcc6sq3/stable@sha256:44fe273f63edcec5f1e3bf999c4f08d34a5db02b426a03105657b1db3a5aeffb","imagePullPolicy":"IfNotPresent","name":"operator","ports":[{"containerPort":8443,"name":"https","protocol":"TCP"}],"resources":{"requests":{"cpu":"10m","memory":"30Mi"}},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"FallbackToLogsOnError","volumeMounts":[{"mountPath":"/var/lib/support-operator","name":"snapshots"}]}],"dnsPolicy":"ClusterFirst","nodeSelector":{"beta.kubernetes.io/os":"linux","node-role.kubernetes.io/master":""},"priorityClassName":"system-cluster-critical","restartPolicy":"Always","schedulerName":"default-scheduler","securityContext":{},"serviceAccount":"operator","serviceAccountName":"operator","terminationGracePeriodSeconds":30,"tolerations":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master","operator":"Exists"},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":900},{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":900}],"volumes":[{"emptyDir":{},"name":"snapshots"}]}

Deployment spec

{"containers":[{"args":["start","-v=4","--config=/etc/support-operator/server.yaml"],"env":[{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}},{"name":"RELEASE_VERSION","value":"0.0.1-2019-05-23-032421"}],"image":"registry.svc.ci.openshift.org/ci-op-pwcc6sq3/stable@sha256:44fe273f63edcec5f1e3bf999c4f08d34a5db02b426a03105657b1db3a5aeffb","imagePullPolicy":"IfNotPresent","name":"operator","ports":[{"containerPort":8443,"name":"https","protocol":"TCP"}],"resources":{"requests":{"cpu":"10m","memory":"30Mi"}},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"FallbackToLogsOnError","volumeMounts":[{"mountPath":"/var/lib/support-operator","name":"snapshots"}]}],"dnsPolicy":"ClusterFirst","nodeSelector":{"beta.kubernetes.io/os":"linux","node-role.kubernetes.io/master":""},"priorityClassName":"system-cluster-critical","restartPolicy":"Always","schedulerName":"default-scheduler","securityContext":{},"serviceAccount":"operator","serviceAccountName":"operator","terminationGracePeriodSeconds":30,"tolerations":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master","operator":"Exists"},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":900},{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":900}],"volumes":[{"emptyDir":{"sizeLimit":"1Gi"},"name":"snapshots"}]}

The only difference is emptyDir: {sizeLimit: "1Gi"}, which is present in the Deployment spec but absent from the ReplicaSet spec.

Also, we create 487 replica sets (which are still around at the end of the run), which means that revision history cleanup (revisionHistoryLimit, default 10) is not working for some reason.  That is a separate bug.
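(For reference, the knob that should be bounding this is the Deployment's revisionHistoryLimit, which defaults to 10 when unset; the manifest below does not set it. A minimal fragment of where it sits:)

  spec:
    revisionHistoryLimit: 10   # old ReplicaSets beyond this count should be garbage collected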

--- Additional comment from Clayton Coleman on 2019-05-23 14:57:34 EDT ---

Here's the deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: support-operator
  namespace: openshift-support
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: support-operator
  template:
    metadata:
      labels:
        app: support-operator
    spec:
      serviceAccountName: operator
      priorityClassName: system-cluster-critical
      nodeSelector:
        beta.kubernetes.io/os: linux
        node-role.kubernetes.io/master: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 900
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 900
      volumes:
      - name: snapshots
        emptyDir:
          sizeLimit: 1Gi
      containers:
      - name: operator
        image: quay.io/openshift/origin-support-operator:latest
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - name: snapshots
          mountPath: /var/lib/support-operator
        ports:
        - containerPort: 8443
          name: https
        resources:
          requests:
            cpu: 10m
            memory: 30Mi
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: RELEASE_VERSION
          value: "0.0.1-snapshot"
        args:
        - start
        - -v=4
        - --config=/etc/support-operator/server.yaml

This is going to stay urgent because I'm not positive this isn't a 1.13 bug too.

--- Additional comment from Clayton Coleman on 2019-05-23 15:44:17 EDT ---

Ok, so the deployment had sizeLimit set (which means LocalStorageCapacityIsolation was on long enough for it to get set).  Then LocalStorageCapacityIsolation got turned off, and the replica set controller went into a hot loop.

https://github.com/kubernetes/kubernetes/issues/57167
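To make the mechanism concrete (consistent with the spec dumps earlier in this description): the Deployment template keeps the field that was written while the gate was on, while ReplicaSets persisted after the gate went off have it dropped, so the controller never finds a matching ReplicaSet and keeps bumping collisionCount. Roughly:

Deployment .spec.template volumes (written while the gate was on):
      volumes:
      - name: snapshots
        emptyDir:
          sizeLimit: 1Gi

ReplicaSet .spec.template volumes as persisted with the gate off (sizeLimit dropped):
      volumes:
      - name: snapshots
        emptyDir: {}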

--- Additional comment from Clayton Coleman on 2019-05-23 15:49:58 EDT ---

So the bugs that need to be tracked down:

1. Why does the hot loop not trigger revision cleanup?
2. Why is LocalStorageCapacityIsolation off in 4.1 (tested just now on a 4.1 cluster)?
3. Why, during an upgrade from 4.2 to 4.2 when the operator is installed, is sizeLimit allowed to be set but then immediately dropped?
4. How can we fix the broken deployment / replica set hot loop? As it stands, toggling certain feature gates (those that control DisabledFields) could cause a bunch of deployments to go haywire.

Comment 1 Clayton Coleman 2019-05-23 21:10:33 UTC
This likely is an issue in 4.1 and could potentially be impactful.  Once we identify the cause we will need to backport.

Urgent because feature gates flipping on during an upgrade is bad (if true)

Comment 2 Michal Fojtik 2019-05-24 08:48:09 UTC
(In reply to Clayton Coleman from comment #1)
> This likely is an issue in 4.1 and could potentially be impactful.  Once we
> identify the cause we will need to backport.
> 
> Urgent because feature gates flipping on during an upgrade is bad (if true)

It looks like that feature gate is set to off by the scheduler operator. If the support operator or anything else is faster than the scheduler operator, there might be a revision of the kube-apiserver with that gate on (because it defaults to on and is beta in upstream kube).

That feature gate logically belongs to the scheduler, so I don't think we want to move it to the kube-apiserver.
One option could be to make the support operator observe the kube-apiserver config and wait for that gate to be off before creating replicas?

/cc David

xrefs:

https://github.com/openshift/api/blob/master/config/v1/types_feature.go#L68
https://github.com/openshift/cluster-kube-scheduler-operator/blob/master/pkg/operator/target_config_reconciler_v311_00.go#L207
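For context, the cluster-level object the linked code renders gates from is the FeatureGate config resource; a minimal sketch of the default object (field names per the linked types_feature.go, values assumed):

apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: ""   # default feature set; operators translate this into per-component feature-gates arguments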

Comment 3 Michal Fojtik 2019-05-24 09:02:18 UTC
> One option could be to make the support operator observe the kube-apiserver config and wait for that gate to be off before creating replicas?

Bad suggestion. We probably don't want to introduce something like this into other operators; there should be some generic fix.

Comment 4 Michal Fojtik 2019-05-24 09:05:45 UTC
From the event logs, I don't see evidence of the feature gate configuration changing before or after the upgrade.

2019-05-23 05:41:08 +0200 CEST to 2019-05-23 05:41:08 +0200 CEST (1) "openshift-kube-apiserver-operator" ObserveFeatureFlagsUpdated Updated apiServerArguments.feature-gates to ExperimentalCriticalPodAnnotation=true,RotateKubeletServerCertificate=true,SupportPodPidsLimit=true,LocalStorageCapacityIsolation=false

....

2019-05-23 05:47:42 +0200 CEST to 2019-05-23 05:47:42 +0200 CEST (1) "openshift-kube-apiserver-operator" ObservedConfigChanged Writing updated observed config: {"admissionPluginConfig":{"network.openshift.io/RestrictedEndpointsAdmission":{"configuration":{"restrictedCIDRs":["10.128.0.0/14","172.30.0.0/16"]}}},"apiServerArguments":{"cloud-provider":["aws"],"feature-gates":["ExperimentalCriticalPodAnnotation=true","RotateKubeletServerCertificate=true","SupportPodPidsLimit=true","LocalStorageCapacityIsolation=false"]},"

....

2019-05-23 05:58:25 +0200 CEST to 2019-05-23 05:58:25 +0200 CEST (2) "openshift-kube-apiserver-operator" OperatorVersionChanged clusteroperator/kube-apiserver version "raw-internal" changed from "0.0.1-2019-05-23-032300" to "0.0.1-2019-05-23-032421"

....

2019-05-23 05:58:30 +0200 CEST to 2019-05-23 05:58:30 +0200 CEST (1) "openshift-kube-apiserver-operator" ConfigMapUpdated Updated ConfigMap/kube-apiserver-pod -n openshift-kube-apiserver: cause by changes in data.pod.yaml
2019-05-23 05:58:30 +0200 CEST to 2019-05-23 05:58:30 +0200 CEST (1) "openshift-kube-apiserver-operator" RevisionTriggered new revision 7 triggered by "configmap/kube-apiserver-pod has changed"

....

(No evidence of the feature gate config changing/flipping.) I can see room for a race before the scheduler operator makes the initial change, but that should happen shortly after install, so it should not cause problems during upgrade.

Comment 5 Lukasz Szaszkiewicz 2019-06-14 07:39:51 UTC
We decided to explicitly set the feature gates in bootstrap-config-overrides for the kube-apiserver-operator; link to the PR: https://github.com/openshift/cluster-kube-apiserver-operator/pull/506
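A rough sketch of the intended effect (the exact override lives in the linked PR; the argument values mirror the observed config quoted in comment 4):

apiServerArguments:
  feature-gates:
  - ExperimentalCriticalPodAnnotation=true
  - RotateKubeletServerCertificate=true
  - SupportPodPidsLimit=true
  - LocalStorageCapacityIsolation=false   # pinned explicitly so no early revision briefly runs with the gate on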

Comment 7 Xingxing Xia 2019-07-25 07:41:27 UTC
Verified in 4.2.0-0.nightly-2019-07-21-222447 to 4.2.0-0.nightly-2019-07-22-160516 upgrade:
Because of comment 5's explicit fix, LocalStorageCapacityIsolation is always false. In addition, per https://github.com/kubernetes/kubernetes/issues/57167, created a deployment whose spec.template includes:
      volumes:
      - emptyDir:
          sizeLimit: "0"
        name: foo

Did not hit issue 57167.
Finally, observed the openshift-kube-apiserver-operator log and did not see LocalStorageCapacityIsolation changing.
So moving bug to VERIFIED.
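
(For completeness, a self-contained deployment along the lines of the snippet above, with hypothetical names and image, would be:)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sizelimit-test        # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sizelimit-test
  template:
    metadata:
      labels:
        app: sizelimit-test
    spec:
      containers:
      - name: test
        image: registry.access.redhat.com/ubi8/ubi   # any image works for this check
        command: ["sleep", "infinity"]
        volumeMounts:
        - name: foo
          mountPath: /tmp/foo
      volumes:
      - name: foo
        emptyDir:
          sizeLimit: "0"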

Comment 9 errata-xmlrpc 2019-10-16 06:29:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

Comment 10 Red Hat Bugzilla 2023-09-14 05:29:06 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

