Bug 1942375 - CRI-O failing with error "reserving ctr name"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.6.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Kir Kolyshkin
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-24 09:45 UTC by Rutvik
Modified: 2021-07-27 22:55 UTC (History)

Fixed In Version: runc-1.0.0-86.rhaos4.6.git23384e2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:55:12 UTC
Target Upstream Version:
Embargoed:


Links
Red Hat Product Errata RHSA-2021:2438 (Private: 0, Priority: None, Status: None, Summary: None, Last Updated: 2021-07-27 22:55:34 UTC)

Description Rutvik 2021-03-24 09:45:48 UTC
Description of problem:

Following the z-stream fix in https://bugzilla.redhat.com/show_bug.cgi?id=1934656#c7, the customer successfully upgraded the cluster to v4.6.21, but the bare metal workers are still hitting the CRI-O issue, which leaves pods stuck in either the ContainerCreating or the Terminating phase.

Mar 23 20:42:22 [host_44] crio[3612]: time="2021-03-23 20:42:22.278230942Z" level=warning msg="Error reserving ctr name k8s_application_app_name3 for id 55f93365d7453a9f6e72aecc5baf3de1b4424871a9eeed771ff29dd3741c6411: name is reserved"

Mar 23 20:23:39 [host_44] crio[2973]: time="2021-03-23 20:23:39.206376994Z" level=warning msg="Stopping container cdfd6fb1057a7fe40f0b7a67882fa645dfc978470bdecce403c898f5fc11dce6 with stop signal timed out: timeout reached after 30 seconds waiting for container process to exit"
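
For triage, the "name is reserved" warnings can be pulled out of a saved CRI-O journal and counted per container name. The following is a minimal sketch, not part of the original report: the regular expression is modeled only on the warning quoted above (other CRI-O versions may word it differently), and the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the warning quoted in this report. This is an assumption:
# other CRI-O versions may phrase the message differently.
RESERVED_RE = re.compile(
    r"Error reserving ctr name (?P<name>\S+) for id (?P<id>[0-9a-f]+): name is reserved"
)


def find_reserved_names(lines):
    """Return (container name, container id) pairs, one per matching warning."""
    hits = []
    for line in lines:
        match = RESERVED_RE.search(line)
        if match:
            hits.append((match.group("name"), match.group("id")))
    return hits


def count_by_name(lines):
    """Count how many times each container name hit the warning."""
    return Counter(name for name, _ in find_reserved_names(lines))


# The two log lines from the description, used as sample input.
SAMPLE = [
    'Mar 23 20:42:22 [host_44] crio[3612]: time="2021-03-23 20:42:22.278230942Z" '
    'level=warning msg="Error reserving ctr name k8s_application_app_name3 for id '
    '55f93365d7453a9f6e72aecc5baf3de1b4424871a9eeed771ff29dd3741c6411: name is reserved"',
    'Mar 23 20:23:39 [host_44] crio[2973]: time="2021-03-23 20:23:39.206376994Z" '
    'level=warning msg="Stopping container '
    'cdfd6fb1057a7fe40f0b7a67882fa645dfc978470bdecce403c898f5fc11dce6 with stop signal '
    'timed out: timeout reached after 30 seconds waiting for container process to exit"',
]

for name, count in count_by_name(SAMPLE).items():
    print(name, count)
```

Any iterable of lines works as input, for example an open file holding journal output saved from an affected node. A name that keeps reappearing suggests the kubelet is retrying a create request that CRI-O has not finished handling.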


Version-Release number of selected component (if applicable):
v4.6.21

How reproducible:
Always on BareMetal nodes

Actual results:
level=warning msg="Error reserving ctr name

Expected results:
Pods should not be stuck in the ContainerCreating phase.

Additional info:
This issue usually affects only the bare metal workers, which are more heavily used than the other workers.

Comment 4 Peter Hunt 2021-04-01 19:45:26 UTC
Kir, can you take a look and see if there's anything fishy about runc here?

Comment 5 MinLi 2021-04-02 03:04:23 UTC
A similar bug in 4.6: https://bugzilla.redhat.com/show_bug.cgi?id=1934656

Comment 8 Kir Kolyshkin 2021-04-20 23:40:07 UTC
This might be a dupe of #1903228 -- alas, I don't have anything to say at this time.

Comment 9 Kir Kolyshkin 2021-04-27 22:48:15 UTC
Copying the status update I have provided at https://bugzilla.redhat.com/show_bug.cgi?id=1903228#c35:

It would make sense to test if my fix (upstream: https://github.com/opencontainers/runc/pull/2918, 4.6 backport: https://github.com/projectatomic/runc/pull/47) helps or not. I see that both PRs were merged, but I'm not sure if RPMs are available.

Comment 10 Peter Hunt 2021-04-28 15:41:27 UTC
RPMs are available now

Comment 14 MinLi 2021-05-13 06:04:06 UTC
$ oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-12-122225   True        False         102m    Cluster version is 4.8.0-0.nightly-2021-05-12-122225


sh-4.4# chroot /host 
sh-4.4# rpm -qa | grep runc 
runc-1.0.0-95.rhaos4.8.gitcd80260.el8.x86_64

Comment 17 Kir Kolyshkin 2021-05-27 02:47:23 UTC
An additional fix (https://github.com/projectatomic/runc/pull/52) went into runc-1.0.0-86.rhaos4.6.git23384e2, please re-test with that one.
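
When re-testing, it helps to confirm that the runc build on the node is at least the one named above. The following is a rough sketch under stated assumptions: the build-string layout is inferred only from the two strings in this report, the helper names are hypothetical, and a higher release number is assumed to mean a later build within the same `rhaos` stream. RPM's own version comparison remains the authoritative check.

```python
import re
from typing import NamedTuple, Optional


class RuncBuild(NamedTuple):
    version: str   # upstream version, e.g. "1.0.0"
    release: int   # leading release number, e.g. 86
    stream: str    # rhaos stream, e.g. "4.6"
    commit: str    # abbreviated git commit


# Layout inferred from the build strings quoted in this report (assumption).
BUILD_RE = re.compile(
    r"^runc-(?P<version>[\d.]+)-(?P<release>\d+)"
    r"\.rhaos(?P<stream>\d+\.\d+)\.git(?P<commit>[0-9a-f]+)"
)


def parse_build(text: str) -> Optional[RuncBuild]:
    """Split a runc build string into its parts, or return None if it does not match."""
    match = BUILD_RE.match(text)
    if match is None:
        return None
    return RuncBuild(
        match["version"], int(match["release"]), match["stream"], match["commit"]
    )


def has_fix(installed: str, fixed: str) -> Optional[bool]:
    """Compare release numbers, but only within the same version and stream.

    Returns None when the two builds are not comparable, because release
    numbers from different streams (4.6 vs 4.8) say nothing about each other.
    """
    a, b = parse_build(installed), parse_build(fixed)
    if a is None or b is None or a.version != b.version or a.stream != b.stream:
        return None
    return a.release >= b.release


FIXED_46 = "runc-1.0.0-86.rhaos4.6.git23384e2"
NODE_48 = "runc-1.0.0-95.rhaos4.8.gitcd80260.el8.x86_64"

print(parse_build(NODE_48))
print(has_fix(NODE_48, FIXED_46))  # different streams, so not comparable
```

The sketch deliberately refuses to compare the 4.8 build from comment 14 with the 4.6 build named here, since this report does not state which 4.8 build carries the additional fix.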

Comment 21 errata-xmlrpc 2021-07-27 22:55:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

