Bug 1388867
Summary: | node service restart failed when a pod is running on this node | ||||||
---|---|---|---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Johnny Liu <jialiu> | ||||
Component: | Networking | Assignee: | Dan Williams <dcbw> | ||||
Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 3.4.0 | CC: | aos-bugs, bbennett, dakini, ekuric, eparis, haowang, jeder, jialiu, tdawson, vlaad, wmeng | ||||
Target Milestone: | --- | Keywords: | TestBlocker | ||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | aos-scalability-34 | ||||||
Fixed In Version: | Doc Type: | No Doc Update | |||||
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2017-01-18 12:46:12 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1303130 | ||||||
Attachments: |
|
Description
Johnny Liu
2016-10-26 10:39:23 UTC
It is not related to plugin type; the problem exists in both subnet and multitenant environments. The most useful logs, from my viewpoint, are:

Oct 25 08:26:01 ip-172-18-24-156.ec2.internal atomic-openshift-node[92648]: I1025 08:26:01.979679 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]
Oct 25 08:26:31 ip-172-18-24-156.ec2.internal atomic-openshift-node[92648]: I1025 08:26:31.980867 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]
Oct 25 08:26:36 ip-172-18-24-156.ec2.internal atomic-openshift-node[92648]: I1025 08:26:36.981065 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]
Oct 25 08:27:02 ip-172-18-24-156.ec2.internal atomic-openshift-node[92947]: I1025 08:27:02.257550 92947 kubelet.go:2240] skipping pod synchronization - [network state unknown container runtime is down]

It seems that, after restarting, the node/kubelet either cannot get the correct pod status or cannot bring the existing pods back up.

Is this a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1388556 ?

Any chance you can get more of the node's logs, and better yet with --loglevel=5 ?

(In reply to Dan Williams from comment #3)
> Any chance you can get more of the node's logs, and better yet with
> --loglevel=5 ?

The node logs were collected at --loglevel=5.

I believe this should be fixed by https://github.com/openshift/origin/pull/11613 and more specifically https://github.com/openshift/origin/pull/11613/commits/d861f0630f5888756516277e6e5800a83089208c

Can't be MODIFIED until the PR is merged.

This has been merged into ose and is in OSE v3.4.0.22 or newer.

Verified this bug with atomic-openshift-3.4.0.22-1.git.0.5c56720.el7.x86_64, and PASS. The node now restarts successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066
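For anyone triaging a similar restart failure, the "skipping pod synchronization" messages quoted in the description can be tallied by reason to see which condition the kubelet is stuck on. The following is only a minimal sketch, not part of the fix: the function name `count_skip_reasons` and the `SAMPLE` list are hypothetical, and the regular expression assumes the message format shown in the log excerpt of this report.

```python
import re
from collections import Counter

# Assumption: the kubelet message looks like the lines quoted in this report,
# i.e. "skipping pod synchronization - [<reason>]".
SKIP_RE = re.compile(r"skipping pod synchronization - \[(?P<reason>[^\]]*)\]")

# The four log lines from the bug description, shortened to the message part.
SAMPLE = [
    "I1025 08:26:01.979679 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]",
    "I1025 08:26:31.980867 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]",
    "I1025 08:26:36.981065 92648 kubelet.go:2240] skipping pod synchronization - [SDN pod network is not ready]",
    "I1025 08:27:02.257550 92947 kubelet.go:2240] skipping pod synchronization - [network state unknown container runtime is down]",
]


def count_skip_reasons(lines):
    """Count how often each bracketed reason appears in the given log lines."""
    reasons = Counter()
    for line in lines:
        match = SKIP_RE.search(line)
        if match:
            reasons[match.group("reason")] += 1
    return reasons


if __name__ == "__main__":
    # Print the reasons, most frequent first.
    for reason, count in count_skip_reasons(SAMPLE).most_common():
        print(f"{count}  {reason}")
```

The same function could be fed the lines of a saved node log file instead of `SAMPLE`; a reason that keeps repeating after a restart points at the component that never became ready.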