Bug 1459265
Summary: | journalctl on node repeats: du and find on following dirs took | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Phil Cameron <pcameron>
Component: | Node | Assignee: | Seth Jennings <sjenning>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Xiaoli Tian <xtian>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 3.5.0 | CC: | aos-bugs, aos-storage-staff, decarr, eparis, fshaikh, gblomqui, jokerman, mmccomas, schoudha
Target Milestone: | --- | Keywords: | Reopened
Target Release: | 3.11.z | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-06-18 14:09:37 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Phil Cameron
2017-06-06 16:47:15 UTC
> Can you provide a dump of the PV, PVC, and pods in use?

No PV/PVC configured.

```
# oc get po
NAME                      READY     STATUS    RESTARTS   AGE
docker-registry-4-2hdmp   1/1       Running   0          1h
hello-rc-c9m05            1/1       Running   0          4d
```

hello-rc is "hello openshift!"

This occurs because the node is running low on resources (https://github.com/kubernetes/kubernetes/issues/42164), which can easily happen because of https://bugzilla.redhat.com/show_bug.cgi?id=1459252; so I would say https://bugzilla.redhat.com/show_bug.cgi?id=1459252 is the root cause and this is just a symptom.

This is a Dell R730: 24 physical CPUs, 256G memory, 10GbE networking. Which resource is running short?

```
top:
Tasks: 505 total, 2 running, 503 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.8 us, 1.3 sy, 0.0 ni, 95.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26386145+total, 20777251+free, 37240100 used, 18848852 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 21836817+avail Mem
```

> Is this trying to delete something on disk? If so, where/what is it?

No, it's cadvisor keeping track of filesystem stats and taking too long for some reason.

This is out of scope for storage; I think this is a metrics issue, so bouncing this to Solly on the kube team to further debug.

Spoke with Solly Ross. The problem was caused by the go v1.8.1 build: files/directories were created and not cleaned up. Based on this, the message is correct, so not a bug. This was fixed in cadvisor (https://github.com/google/cadvisor/pull/1766) for OCP 3.7+.
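For anyone hitting the same message, a minimal diagnostic sketch is to watch the journal for the cadvisor warning and then time the same filesystem walk by hand. The unit name and container path below are assumptions (typical for OCP 3.5 nodes using the docker runtime), not details taken from this bug; adjust them to your environment.

```sh
# Illustrative only; atomic-openshift-node and /var/lib/docker/containers
# are assumed defaults, not confirmed by this bug report.

# Watch for the repeated cadvisor warning in the node's journal:
journalctl -u atomic-openshift-node -f | grep "du and find on following dirs took"

# cadvisor computes per-container disk usage by running du/find over
# container directories; timing the same walk by hand shows whether
# the filesystem itself is the bottleneck:
for dir in /var/lib/docker/containers/*/; do
  time du -sx "$dir" > /dev/null
  time find "$dir" -xdev -type f > /dev/null
done
```

If the manual du/find complete quickly while the warning keeps repeating, that points at cadvisor itself rather than slow storage, which matches the resolution here (files/directories left behind by the go 1.8.1 build, fixed upstream in cadvisor PR 1766).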