Bug 1508346 - [CRI-O] Restart cri-o service encounter failure
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.8.0
Assigned To: Mrunal Patel
QA Contact: DeShuai Ma
Docs Contact:
Depends On:
Blocks:
Reported: 2017-11-01 05:35 EDT by DeShuai Ma
Modified: 2018-03-28 10:09 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-28 10:08:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0489 | None | None | None | 2018-03-28 10:09 EDT

Description DeShuai Ma 2017-11-01 05:35:54 EDT
Description of problem:
On some nodes, restarting cri-o (running as a system container) always fails.

Version-Release number of selected component (if applicable):
# ./rootfs/usr/bin/crio --version
crio version 1.0.2
commit: "29077fa6fbd85f0ca9c453ab1bf1ff7b02bc3f5c"

openshift v3.7.0-0.188.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8
OS: rhel-7.4

How reproducible:
Only in some environments

Steps to Reproduce:
[root@ip-172-18-3-194 netns]# systemctl restart cri-o
Job for cri-o.service failed because the control process exited with error code. See "systemctl status cri-o.service" and "journalctl -xe" for details.

//error log
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal systemd[1]: Starting crio daemon...
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.925081160-04:00" level=debug msg="[graphdriver] trying provided driver "overlay""
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.925218549-04:00" level=debug msg="overlay: overide_kernelcheck=1"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.928550627-04:00" level=warning msg="Using pre-4.0.0 kernel for overlay, mount failures may require kernel update"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.938039155-04:00" level=debug msg="backingFs=xfs,  projectQuotaSupported=false"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.957034087-04:00" level=warning msg="hooks path: "/usr/share/containers/oci/hooks.d" does not exist"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.957098896-04:00" level=warning msg="hooks path: "/etc/containers/oci/hooks.d" does not exist"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.957431126-04:00" level=info msg="CNI network openshift-sdn (type=openshift-sdn) is used from /etc/cni/net.d/80-openshift-network.conf"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.957644895-04:00" level=info msg="CNI network openshift-sdn (type=openshift-sdn) is used from /etc/cni/net.d/80-openshift-network.conf"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.978210108-04:00" level=debug msg="seccomp status: true"
Nov 01 05:01:02 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:02.979348865-04:00" level=debug msg="Golang's threads limit set to 52290"
Nov 01 05:01:06 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:06.742432894-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:10 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:10.500897129-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:14 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:14.261668911-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:18 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:18.032184561-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:21 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:21.771947837-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:25 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:25.510992870-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:29 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:29.251881146-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:32 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:32.996297655-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:36 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:36.737923757-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:40 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:40.477217623-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:44 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:44.220704877-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:48 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:47.998946313-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:51 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:51.738983656-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:55 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:55.487455974-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:01:59 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:01:59.239892295-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:03 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:03.039240830-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:06 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:06.802948555-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:10 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:10.548915463-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:14 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:14.438891402-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:18 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:18.185889009-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:21 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:21.927896222-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:25 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:25.683905109-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:29 ip-172-18-3-194.ec2.internal runc[87465]: time="2017-11-01 05:02:29.472924623-04:00" level=warning msg="failed to find container exit file: timed out waiting for the condition"
Nov 01 05:02:32 ip-172-18-3-194.ec2.internal systemd[1]: cri-o.service start operation timed out. Terminating.
Nov 01 05:02:33 ip-172-18-3-194.ec2.internal systemd[1]: cri-o.service: main process exited, code=exited, status=143/n/a
Nov 01 05:02:33 ip-172-18-3-194.ec2.internal systemd[1]: Failed to start crio daemon.
Nov 01 05:02:33 ip-172-18-3-194.ec2.internal systemd[1]: Unit cri-o.service entered failed state.
Nov 01 05:02:33 ip-172-18-3-194.ec2.internal systemd[1]: cri-o.service failed.

Actual results:


Expected results:


Additional info:
Comment 6 Giuseppe Scrivano 2017-11-20 10:19:21 EST
I could not reproduce this error here, but I've opened a PR that sets the start timeout to infinity, as the contrib/systemd/crio.service file already does:

https://github.com/projectatomic/atomic-system-containers/pull/148
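
Until a rebuilt system container is available, a similar effect can probably be had locally with a systemd drop-in. The sketch below is not the change made in the PR above; it only illustrates the idea of disabling the start timeout. The drop-in file name `10-start-timeout.conf` and the `UNIT_DIR` variable are hypothetical, and by default the script writes to a scratch directory so it can be tried without root.

```shell
# Hypothetical override: set UNIT_DIR=/etc/systemd/system on a real host.
# By default a scratch directory is used so nothing on the system changes.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
DROPIN_DIR="$UNIT_DIR/cri-o.service.d"

mkdir -p "$DROPIN_DIR"

# TimeoutStartSec=0 tells systemd not to enforce a start timeout.
cat > "$DROPIN_DIR/10-start-timeout.conf" <<'EOF'
[Service]
TimeoutStartSec=0
EOF

echo "wrote $DROPIN_DIR/10-start-timeout.conf"

# On a real host, reload systemd and retry the restart afterwards:
#   systemctl daemon-reload
#   systemctl restart cri-o
```

With the timeout disabled, crio gets as long as it needs to finish its startup checks instead of being terminated after the default limit.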
Comment 7 Antonio Murdaca 2017-11-20 10:21:50 EST
DeShuai, could you re-test this once we have the system containers built?
Comment 8 Giuseppe Scrivano 2017-11-20 10:25:23 EST
I am rebuilding gscrivano/cri-o-centos right now. It should be ready in a few minutes.
Comment 9 DeShuai Ma 2017-11-20 21:13:17 EST
I'll re-test it
Comment 11 DeShuai Ma 2018-01-04 03:36:03 EST
Verified on ocp-3.9
# openshift version
openshift v3.9.0-0.16.0
kubernetes v1.9.0-beta1
etcd 3.2.8

# cd /var/lib/containers/atomic/cri-o.0/
# ./rootfs/usr/bin/crio --version
crio version 1.8.2

Restarting cri-o no longer produces this error; 'systemctl restart cri-o' succeeds.
Comment 14 errata-xmlrpc 2018-03-28 10:08:55 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
