Bug 1451192
Summary: atomic-openshift-node service entered failed state after restarting container-engine in containerized environment
Product: OpenShift Container Platform
Component: Installer
Version: 3.6.0
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Gan Huang <ghuang>
Assignee: Giuseppe Scrivano <gscrivan>
QA Contact: Gan Huang <ghuang>
CC: aos-bugs, jokerman, mmccomas
Last Closed: 2017-08-10 05:24:06 UTC
Bug Depends On: 1450307
Description
Gan Huang
2017-05-16 05:53:41 UTC
atomic-openshift-node needs to wait for other services (openvswitch, atomic-openshift-master, ovsdb-server, ovs-vswitchd...) to be started before it can be loaded so it takes some time. After a while it gets loaded correctly for me. Can you verify that?

I am going to do a change to the container-engine container to use systemd-notify so that it notifies systemd exactly when it is ready, although it won't change that atomic-openshift-node requires some time to be ready after container-engine is restarted.

atomic-openshift-node won't get active any more after restarting container-engine. openvswitch got active in about 18 seconds, atomic-openshift-master needs 28 seconds, and atomic-openshift-node never got active in my testing.

Looks like it's the issue: https://github.com/coreos/bugs/issues/1395#issuecomment-224741608

After modifying /etc/systemd/system/atomic-openshift-node.service:

    -Requires=openvswitch.service
    +Wants=openvswitch.service

atomic-openshift-node was able to get active automatically in 39 seconds. Hopefully useful for you.

Thanks for investigating it. I've opened a PR to add that patch: https://github.com/openshift/openshift-ansible/pull/4213

I've tested it locally and it still works for me (the node container restarts after some time).

Verified with openshift-ansible-3.6.98-1.git.0.e651d65.el7.noarch.rpm

atomic-openshift-node service got active after a while when restarting container-engine service.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
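
For anyone who wants to try the Requires= to Wants= change from this thread on a test host before picking up the fixed installer, here is a minimal sketch of the edit as a Python helper. It is not the code from the openshift-ansible PR; the function name `relax_dependency` and the sample unit text are hypothetical and only for illustration. The idea, as I understand it, is that Requires= propagates an explicit stop or restart of openvswitch to the node unit, while Wants= is a weaker dependency that does not.

```python
def relax_dependency(unit_text: str, dependency: str) -> str:
    """Return unit_text with `dependency` moved from Requires= to Wants=.

    Hypothetical helper for illustration; it only rewrites text and does
    not touch systemd itself.
    """
    out = []
    for line in unit_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Requires="):
            deps = stripped[len("Requires="):].split()
            if dependency in deps:
                # Keep any other hard requirements on the Requires= line.
                remaining = [d for d in deps if d != dependency]
                if remaining:
                    out.append("Requires=" + " ".join(remaining))
                # Re-add the dependency as a weak one.
                out.append("Wants=" + dependency)
                continue
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if unit_text.endswith("\n") else "")


if __name__ == "__main__":
    # Sample unit text is an assumption, not the real unit file contents.
    sample = "[Unit]\nAfter=openvswitch.service\nRequires=openvswitch.service\n"
    print(relax_dependency(sample, "openvswitch.service"))
```

If the result is written back to /etc/systemd/system/atomic-openshift-node.service, systemd needs a `systemctl daemon-reload` before it sees the change.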