Bug 1814187
Summary: | [RHV] Master becomes NotReady after several days running | ||||||
---|---|---|---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Wei Sun <wsun> | ||||
Component: | Node | Assignee: | Ryan Phillips <rphillips> | ||||
Status: | CLOSED NOTABUG | QA Contact: | Jan Zmeskal <jzmeskal> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 4.4 | CC: | aos-bugs, jcall, jokerman, jzmeskal, lsvaty, lxia, pelauter, rgolan, rphillips, schoudha, scuppett, wsun, wzheng, xxia | ||||
Target Milestone: | --- | Keywords: | Reopened, TestBlocker, TestBlockerForLayeredProduct | ||||
Target Release: | 4.4.z | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | 1811924 | Environment: | |||||
Last Closed: | 2020-04-15 18:20:41 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | 1802687, 1811924 | ||||||
Bug Blocks: | |||||||
Attachments: |
Comment 1
Wei Sun
2020-03-17 10:21:46 UTC
4.4.0-0.nightly-2020-02-27-020932 is too old. There is a fix in the attached bug that merged.

*** This bug has been marked as a duplicate of bug 1802687 ***

Bug 1802687 and 1800319 have the same title and a target release of 4.5. A search at https://bugzilla.redhat.com/buglist.cgi?classification=Red%20Hat&j_top=OR&list_id=10920800&product=OpenShift%20Container%20Platform&product=OpenShift%20Online&query_format=advanced&short_desc=A%20pod%20that%20gradually%20leaks&short_desc_type=regexp didn't find any bugs for target release 4.4. Could you help clarify whether a 4.4 bug tracker exists? Thanks.

Can one of you increase the log level of kubelet to see if we can get more info there?

Created attachment 1675002 [details]
Journal logs from master-0 with kubelet log level 6
We are no longer seeing this issue since we redeployed the cluster with faster storage.

This issue is NOT a blocker for GA of OCP 4.4 on RHV IPI.

The problem was identified as performance-related: bumping the specs to (and beyond) the recommended IOPS, RAM, CPU, and disk size solved these issues. This issue is not RHV specific and, if reproduced, should be resolved with the OCP Performance team or documented as updated minimum requirements. The only thing to consider here is proper error handling when requirements are insufficient, as the reporter had; that should be tracked in a separate bug if necessary.