Bug 1753067
Summary: [IPI][OSP] Kubelet fails to admit pods with "insufficient memory" because coredns, keepalived and mdns-publisher have no pods in kube-apiserver

| Field | Value | Field | Value |
|---|---|---|---|
| Product | OpenShift Container Platform | Reporter | weiwei jiang <wjiang> |
| Component | Installer | Assignee | Yossi Boaron <yboaron> |
| Installer sub component | OpenShift on OpenStack | QA Contact | weiwei jiang <wjiang> |
| Status | CLOSED ERRATA | Docs Contact | |
| Severity | high | | |
| Priority | high | CC | aos-bugs, bperkins, calfonso, jokerman, mfedosin, mpatel, schoudha, tsedovic, zhsun |
| Version | 4.2.0 | | |
| Target Milestone | --- | | |
| Target Release | 4.3.0 | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | | |
| Clones | 1757390 (view as bug list) | Environment | |
| Last Closed | 2020-01-23 11:06:16 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | | | |
| Bug Blocks | 1757390 | | |
Description (weiwei jiang, 2019-09-18 02:01:26 UTC)

Found the root cause: the coredns, keepalived and mdns-publisher static pods take an additional 3Gi of memory requests (1Gi each) on the worker, which makes the admission calculation fail. Because these pods have no corresponding pods in the kube-apiserver, the scheduler does not account for their requests, and the kubelet then rejects the pods it is sent with "insufficient memory".

```
sh-4.4# cat /etc/kubernetes/manifests/* | grep -A 3 resources:
    resources: {}
    volumeMounts:
    - name: kubeconfig
      mountPath: "/etc/kubernetes/kubeconfig"
--
    resources:
      requests:
        cpu: 150m
        memory: 1Gi
--
    resources: {}
    volumeMounts:
    - name: resource-dir
      mountPath: "/config"
--
    resources:
      requests:
        cpu: 150m
        memory: 1Gi
--
    resources: {}
    volumeMounts:
    - name: kubeconfig
      mountPath: "/etc/kubernetes/kubeconfig"
--
    resources:
      requests:
        cpu: 150m
        memory: 1Gi
```
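The per-worker arithmetic can be reproduced straight from those manifests. A minimal sketch, assuming `oc debug` access to the node (the node name is illustrative, taken from the listings below):

```
# Show only the memory requests declared in the static pod manifests on the
# affected worker. Three components at 1Gi each account for the extra 3Gi
# that the kubelet reserves locally but the scheduler cannot see.
oc debug node/share-0919a-qn4pn-worker-v7g8v -- \
  chroot /host sh -c "grep -h -A 3 'resources:' /etc/kubernetes/manifests/* | grep 'memory:'"
```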
Found a workaround: after creating the namespace manually, everything works:

```
oc adm new-project openshift-kni-infra
```

Verified on 4.2.0-0.nightly-2019-09-19-040356:

```
➜ ~ oc get pods -n openshift-openstack-infra
NAME                                            READY   STATUS    RESTARTS   AGE
coredns-share-0919a-qn4pn-master-0              1/1     Running   2          2m56s
coredns-share-0919a-qn4pn-master-2              1/1     Running   3          118m
coredns-share-0919a-qn4pn-worker-9mw25          1/1     Running   3          118m
coredns-share-0919a-qn4pn-worker-gzt8w          0/1     Pending   0          2s
coredns-share-0919a-qn4pn-worker-v7g8v          1/1     Running   3          2m56s
haproxy-share-0919a-qn4pn-master-0              2/2     Running   0          2m56s
haproxy-share-0919a-qn4pn-master-2              2/2     Running   2          118m
keepalived-share-0919a-qn4pn-master-0           1/1     Running   0          2m56s
keepalived-share-0919a-qn4pn-master-2           1/1     Running   1          118m
keepalived-share-0919a-qn4pn-worker-9mw25       1/1     Running   1          118m
keepalived-share-0919a-qn4pn-worker-gzt8w       0/1     Pending   0          2s
keepalived-share-0919a-qn4pn-worker-v7g8v       1/1     Running   1          2m56s
mdns-publisher-share-0919a-qn4pn-master-0       1/1     Running   0          2m56s
mdns-publisher-share-0919a-qn4pn-master-2       1/1     Running   1          118m
mdns-publisher-share-0919a-qn4pn-worker-9mw25   1/1     Running   1          118m
mdns-publisher-share-0919a-qn4pn-worker-gzt8w   0/1     Pending   0          2s
mdns-publisher-share-0919a-qn4pn-worker-v7g8v   1/1     Running   1          2m56s
```

```
➜ ~ oc get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP            NODE                             NOMINATED NODE   READINESS GATES
h-1-265hh    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-4sjmn    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-678m8    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-8tsgh    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-9md7j    1/1     Running   0          17s   10.128.2.20   share-0919a-qn4pn-worker-v7g8v   <none>           <none>
h-1-c957g    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-cj6mk    1/1     Running   0          17s   10.128.2.22   share-0919a-qn4pn-worker-v7g8v   <none>           <none>
h-1-ctpd8    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-deploy   1/1     Running   0          29s   10.131.0.28   share-0919a-qn4pn-worker-9mw25   <none>           <none>
h-1-h7rzz    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-hvh9v    1/1     Running   0          17s   10.131.0.29   share-0919a-qn4pn-worker-9mw25   <none>           <none>
h-1-jvnjw    1/1     Running   0          17s   10.131.0.30   share-0919a-qn4pn-worker-9mw25   <none>           <none>
h-1-nwdlx    1/1     Running   0          17s   10.131.0.31   share-0919a-qn4pn-worker-9mw25   <none>           <none>
h-1-pkmmm    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-pppsl    1/1     Running   0          17s   10.128.2.23   share-0919a-qn4pn-worker-v7g8v   <none>           <none>
h-1-px7ls    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-r7cbl    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-rn5tg    1/1     Running   0          17s   10.128.2.21   share-0919a-qn4pn-worker-v7g8v   <none>           <none>
h-1-rz2x2    1/1     Running   0          17s   10.131.0.32   share-0919a-qn4pn-worker-9mw25   <none>           <none>
h-1-tnxlb    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
h-1-x8bxq    0/1     Pending   0          17s   <none>        <none>                           <none>           <none>
```

Since the fix is targeted to 4.3.0, a 4.3 nightly build is needed before final verification.
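For reference, the manual workaround amounts to creating the missing namespace so that the kubelet can register mirror pods for its static pods, which makes their resource requests visible to the scheduler. A minimal sketch of a roughly equivalent object (the namespace name is the one used in the workaround above; the shipped 4.3 fix may create it through a different mechanism):

```
# Creating the namespace lets the kubelet sync mirror pods for the coredns,
# keepalived and mdns-publisher static pods into the kube-apiserver.
cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
EOF
```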
Checked with 4.3.0-0.nightly-2019-10-15-021732, and the issue is fixed.

```
➜ ~ oc get pods -n openshift-openstack-infra -o wide
NAME                                      READY   STATUS    RESTARTS   AGE   IP             NODE                       NOMINATED NODE   READINESS GATES
coredns-qe-wj-6lx69-master-0              1/1     Running   0          45m   192.168.0.29   qe-wj-6lx69-master-0       <none>           <none>
coredns-qe-wj-6lx69-master-1              1/1     Running   0          45m   192.168.0.15   qe-wj-6lx69-master-1       <none>           <none>
coredns-qe-wj-6lx69-master-2              1/1     Running   0          46m   192.168.0.20   qe-wj-6lx69-master-2       <none>           <none>
coredns-qe-wj-6lx69-worker-64svj          1/1     Running   0          29m   192.168.0.35   qe-wj-6lx69-worker-64svj   <none>           <none>
coredns-qe-wj-6lx69-worker-g7pvh          1/1     Running   0          37m   192.168.0.12   qe-wj-6lx69-worker-g7pvh   <none>           <none>
coredns-qe-wj-6lx69-worker-hdgql          1/1     Running   0          37m   192.168.0.41   qe-wj-6lx69-worker-hdgql   <none>           <none>
haproxy-qe-wj-6lx69-master-0              2/2     Running   0          45m   192.168.0.29   qe-wj-6lx69-master-0       <none>           <none>
haproxy-qe-wj-6lx69-master-1              2/2     Running   0          45m   192.168.0.15   qe-wj-6lx69-master-1       <none>           <none>
haproxy-qe-wj-6lx69-master-2              2/2     Running   0          45m   192.168.0.20   qe-wj-6lx69-master-2       <none>           <none>
keepalived-qe-wj-6lx69-master-0           1/1     Running   0          45m   192.168.0.29   qe-wj-6lx69-master-0       <none>           <none>
keepalived-qe-wj-6lx69-master-1           1/1     Running   0          45m   192.168.0.15   qe-wj-6lx69-master-1       <none>           <none>
keepalived-qe-wj-6lx69-master-2           1/1     Running   0          45m   192.168.0.20   qe-wj-6lx69-master-2       <none>           <none>
keepalived-qe-wj-6lx69-worker-64svj       1/1     Running   0          29m   192.168.0.35   qe-wj-6lx69-worker-64svj   <none>           <none>
keepalived-qe-wj-6lx69-worker-g7pvh       1/1     Running   0          37m   192.168.0.12   qe-wj-6lx69-worker-g7pvh   <none>           <none>
keepalived-qe-wj-6lx69-worker-hdgql       1/1     Running   0          37m   192.168.0.41   qe-wj-6lx69-worker-hdgql   <none>           <none>
mdns-publisher-qe-wj-6lx69-master-0       1/1     Running   0          45m   192.168.0.29   qe-wj-6lx69-master-0       <none>           <none>
mdns-publisher-qe-wj-6lx69-master-1       1/1     Running   0          46m   192.168.0.15   qe-wj-6lx69-master-1       <none>           <none>
mdns-publisher-qe-wj-6lx69-master-2       1/1     Running   0          45m   192.168.0.20   qe-wj-6lx69-master-2       <none>           <none>
mdns-publisher-qe-wj-6lx69-worker-64svj   1/1     Running   0          29m   192.168.0.35   qe-wj-6lx69-worker-64svj   <none>           <none>
mdns-publisher-qe-wj-6lx69-worker-g7pvh   1/1     Running   0          37m   192.168.0.12   qe-wj-6lx69-worker-g7pvh   <none>           <none>
mdns-publisher-qe-wj-6lx69-worker-hdgql   1/1     Running   0          37m   192.168.0.41   qe-wj-6lx69-worker-hdgql   <none>           <none>
```

```
➜ ~ oc get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP            NODE                       NOMINATED NODE   READINESS GATES
h-1-8vsk7    1/1     Running   0          40s   10.131.0.34   qe-wj-6lx69-worker-hdgql   <none>           <none>
h-1-ctdnp    1/1     Running   0          40s   10.129.2.25   qe-wj-6lx69-worker-64svj   <none>           <none>
h-1-deploy   1/1     Running   0          48s   10.129.2.24   qe-wj-6lx69-worker-64svj   <none>           <none>
h-1-dkbzb    0/1     Pending   0          40s   <none>        <none>                     <none>           <none>
h-1-fhckn    1/1     Running   0          40s   10.128.2.27   qe-wj-6lx69-worker-g7pvh   <none>           <none>
h-1-gxj98    1/1     Running   0          40s   10.128.2.28   qe-wj-6lx69-worker-g7pvh   <none>           <none>
h-1-mhddx    1/1     Running   0          40s   10.131.0.35   qe-wj-6lx69-worker-hdgql   <none>           <none>
h-1-njdrm    1/1     Running   0          40s   10.128.2.29   qe-wj-6lx69-worker-g7pvh   <none>           <none>
h-1-w477k    1/1     Running   0          40s   10.129.2.27   qe-wj-6lx69-worker-64svj   <none>           <none>
h-1-x27zn    0/1     Pending   0          40s   <none>        <none>                     <none>           <none>
h-1-z5vwf    1/1     Running   0          40s   10.129.2.26   qe-wj-6lx69-worker-64svj   <none>           <none>
```

```
➜ ~ oc version
Client Version: v4.3.0
Server Version: 4.3.0-0.nightly-2019-10-15-021732
Kubernetes Version: v1.16.0-beta.2+a6ff814
```
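A complementary spot check is that the worker's resource accounting now includes the infra pods' requests, since their mirror pods exist in the kube-apiserver. A minimal sketch, using a node name from the listing above:

```
# "Allocated resources" should now include the 1Gi memory request from each
# of the coredns, keepalived and mdns-publisher pods on this worker.
oc describe node qe-wj-6lx69-worker-hdgql | grep -A 8 'Allocated resources'
```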
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062