Bug 1912975
Summary: | Containers stuck in ContainerCreating creating 1000 namespaces on 100 nodes with 1000 deployments
---|---
Product: | OpenShift Container Platform
Component: | Networking
Networking sub component: | ovn-kubernetes
Version: | 4.6.z
Target Release: | 4.8.0
Status: | CLOSED UPSTREAM
Severity: | high
Priority: | high
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Mike Fiedler <mifiedle>
Assignee: | Mohamed Mahmoud <mmahmoud>
QA Contact: | Anurag saxena <anusaxen>
CC: | aconstan, anbhat, vpickard
Type: | Bug
Last Closed: | 2021-05-14 16:27:43 UTC
Bug Blocks: | 1883917, 1908472

Description
Mike Fiedler
2021-01-05 17:38:31 UTC
Created attachment 1744664
Journal from one node in the cluster this bz is being reported on.
Unfortunately, the cluster degraded to the point that the API became unavailable and I could not get a must-gather. The masters were inaccessible from an ssh bastion, but I was able to get the journal off of one worker. Let me know what else is needed for the next repro of this issue.
Reproduced on 4.6.0-0.nightly-2021-01-18-070340. Still blocks verification of bug 1883917.

Reassigning to Ben since I'm on leave; please reassign to someone in the team.

Could not reproduce this on 4.8.0-0.nightly-2021-05-13-222446.
Created 2000 pods in 1000 namespaces.
Created 5000 pods in 2500 namespaces. CNI Request ADD latency increased significantly by the end of this run, to ~12s, but everything started successfully.
The error event and ContainerCreating issue described in this bug were not seen. Closing as fixed upstream.