Description of problem:
This issue can be reproduced on IPI on OSP or IPI on BM, since those platforms run static pods (coredns, keepalived and mdns-publisher) that have resource requests.
related bug: https://bugzilla.redhat.com/show_bug.cgi?id=1753067
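For context, each of those static pods carries a resources.requests stanza along these lines (this is a sketch; the image reference and request values are illustrative, not taken from the shipped manifests):

```yaml
# Fragment of a static pod manifest (e.g. coredns); values are illustrative.
spec:
  containers:
  - name: coredns
    image: quay.io/openshift/coredns:latest  # placeholder image reference
    resources:
      requests:
        cpu: 100m     # illustrative request
        memory: 200Mi # illustrative request
```

Because these requests are only accounted for once the mirror pod exists, they matter for the scheduling race described below.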
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create ClusterAutoscaler and MachineAutoscaler CRs.
2. oc adm new-project openshift-kni-infra
3. Create a deployment whose pods force the cluster to scale up; its container is:
   - name: busybox
     image: busybox
     command: ["/bin/sh", "-c", "echo 'this should be in the logs' && sleep 86400"]
4. Check the pods.
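The steps above can be sketched as a full reproducer deployment (the deployment name, replica count, and memory request are illustrative assumptions; only the container name and command come from the steps above):

```yaml
# Hypothetical reproducer; replicas and requests are sized to force a scale-up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-up
spec:
  replicas: 30
  selector:
    matchLabels:
      app: scale-up
  template:
    metadata:
      labels:
        app: scale-up
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sh", "-c", "echo 'this should be in the logs' && sleep 86400"]
        resources:
          requests:
            memory: 2Gi  # illustrative; large enough to trigger autoscaling
```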
After a while, some pods go OutOfmemory:
$ oc get pod
NAME READY STATUS RESTARTS AGE
cluster-autoscaler-default-5dd4b8d85-dtrzz 1/1 Running 0 23h
cluster-autoscaler-operator-59b86c4d95-4r5wb 1/1 Running 0 24h
machine-api-controllers-776587cf7d-9ddqx 3/3 Running 0 24h
machine-api-operator-5bc8f8df49-pnf4c 1/1 Running 0 24h
scale-up-5f76786964-24tlg 1/1 Running 0 9m35s
scale-up-5f76786964-252lw 0/1 OutOfmemory 0 56s
scale-up-5f76786964-2cvmm 0/1 OutOfmemory 0 38s
scale-up-5f76786964-2k7cc 0/1 OutOfmemory 0 4m39s
scale-up-5f76786964-2kngx 0/1 OutOfmemory 0 64s
scale-up-5f76786964-2wmk2 0/1 OutOfmemory 0 60s
scale-up-5f76786964-2z5jc 1/1 Running 0 9m35s
scale-up-5f76786964-4fbv4 0/1 OutOfmemory 0 4m33s
scale-up-5f76786964-4fdlx 0/1 OutOfmemory 0 49s
scale-up-5f76786964-4n5c4 1/1 Running 0 9m35s
scale-up-5f76786964-4n8tr 0/1 OutOfmemory 0 73s
Expected results:
The autoscaler works well and the scaled-up pods run.
This issue happens after applying the workaround mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1753067#c3.
Can you share the autoscaler logs?
Created attachment 1616672 [details]
Does every node require all three (coredns, keepalived and mdns-publisher) static pods?
@Joel Smith, I think you are right: the static pods' mirror pods are created after the scheduler may already have scheduled other workloads to the node, so those workload pods end up in OutOfmemory status. The autoscaler is working as expected, but it only handles pods in the Pending state, while the added workload pods are always OutOfmemory.
I would be curious whether the following patch helps this issue. This BZ was created at around the same time as it merged.
I didn't manage to test the backport yet, but I'll try to do it next sprint.
This appears to be fixed in 4.5, based upon my testing.
Whether because of https://github.com/openshift/origin/pull/23812 or something else, the current behavior is that a static pod will preempt a pod that has been scheduled to a node if the node doesn't have enough resources for the static pod.
I have tried a few scenarios for autoscaling, all including static pods on all worker nodes. I have not seen a pod in OutOfMemory status; they correctly stay Pending, trigger a scaling event, and eventually deploy to the new node. Tested on the latest release (4.5.2).
I would say this is verified, but I am seeing unexpected behavior with the autoscaler: multiple nodes get spun up when only one should be needed to satisfy the memory requests, and removing the reproducer deployment doesn't scale back down completely. I am double-checking my math and the expected behavior of the autoscaler in the latest code.
I am going to stand by my statement that this has been verified as fixed. All other issues are unrelated and I can track them down separately. I am a little concerned that it is not clear what actually fixed it, but it is fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.