Bug 1543575 - Container deployment fails with oci runtime error applying cgroup configuration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Daniel Walsh
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1523043
Reported: 2018-02-08 18:10 UTC by Alan Pevec
Modified: 2023-03-14 11:02 UTC

Fixed In Version: docker-1.13.1-52.gitce62987.el7_4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1523043
Environment:
Last Closed: 2018-03-07 09:51:51 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0436 | 0 | normal | SHIPPED_LIVE | docker bug fix and enhancement update | 2018-03-07 14:51:35 UTC

Description Alan Pevec 2018-02-08 18:10:15 UTC
After investigation back in December in bz 1523043, the failure was traced to the systemd/docker interaction.
We are now hitting this more frequently in the OpenStack upstream CI, and it is blocking the OSP 13 production chain.

+++ This bug was initially created as a clone of Bug #1523043 +++

TripleO CI is sometimes hitting errors when starting containers:

e.g.
 \"Error running ['docker', 'run', '--name', 'rabbitmq_image_tag', '--label', 'config_id=tripleo_step1', '--label', 'container_name=rabbitmq_image_tag', '--label', 'managed_by=paunch', '--label', 'config_data={\\"start_order\\": 1, \\"command\\": [\\"/bin/bash\\", \\"-c\\", \\"/usr/bin/docker tag \\'192.168.24.1:8787/rhosp12/openstack-rabbitmq:12.0-20171201.1\\' \\'192.168.24.1:8787/rhosp12/openstack-rabbitmq:pcmklatest\\'\\"], \\"user\\": \\"root\\", \\"volumes\\": [\\"/etc/hosts:/etc/hosts:ro\\", \\"/etc/localtime:/etc/localtime:ro\\", \\"/dev/shm:/dev/shm:rw\\", \\"/etc/sysconfig/docker:/etc/sysconfig/docker:ro\\", \\"/usr/bin:/usr/bin:ro\\", \\"/var/run/docker.sock:/var/run/docker.sock:rw\\"], \\"image\\": \\"192.168.24.1:8787/rhosp12/openstack-rabbitmq:12.0-20171201.1\\", \\"detach\\": false, \\"net\\": \\"host\\"}', '--net=host', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/dev/shm:/dev/shm:rw', '--volume=/etc/sysconfig/docker:/etc/sysconfig/docker:ro', '--volume=/usr/bin:/usr/bin:ro', '--volume=/var/run/docker.sock:/var/run/docker.sock:rw', '192.168.24.1:8787/rhosp12/openstack-rabbitmq:12.0-20171201.1', '/bin/bash', '-c', \\"/usr/bin/docker tag '192.168.24.1:8787/rhosp12/openstack-rabbitmq:12.0-20171201.1' '192.168.24.1:8787/rhosp12/openstack-rabbitmq:pcmklatest'\\"]. [125]\", 
 \"/usr/bin/docker-current: Error response from daemon: invalid header field value \\"oci runtime error: container_linux.go:247: starting container process caused \\\\"process_linux.go:258: applying cgroup configuration for process caused \\\\\\\\"write /sys/fs/cgroup/pids/system.slice/docker-0642d71adf65f90fac83693d33be8857e9b1c4a5c69254357ea04fdeadf10c49.scope/cgroup.procs: no such device\\\\\\\\"\\\\"\\n\\".\", 
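For anyone triaging CI logs for this failure, the following is a minimal sketch of a matcher for the error signature quoted above. It is based only on the sample message in this report; the function name and the regular expression are illustrative assumptions, not part of docker, runc, or paunch.

```python
import re

# Signature of the failure quoted above: runc fails to write the container's
# PID into a cgroup.procs file and the kernel answers "no such device".
# The pattern is an assumption derived only from the sample in this report.
CGROUP_RACE_RE = re.compile(
    r"applying cgroup configuration for process caused .*?"
    r"write (?P<path>/sys/fs/cgroup/(?P<controller>[^/]+)/\S+?/cgroup\.procs): "
    r"no such device"
)


def find_cgroup_race(stderr_text):
    """Return (controller, path) if the text matches the signature, else None."""
    # Drop backslashes and double quotes so the nested escaping seen in the
    # CI output does not affect matching.
    flattened = stderr_text.replace("\\", "").replace('"', "")
    match = CGROUP_RACE_RE.search(flattened)
    if match is None:
        return None
    return match.group("controller"), match.group("path")
```

Passing the stderr of a failed `docker run` to `find_cgroup_race` returns the cgroup controller (`pids` in the example above) and the `cgroup.procs` path, or `None` for unrelated errors.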

--- Additional comment from Mark McLoughlin on 2017-12-13 09:11:44 EST ---

Looks similar to https://github.com/openshift/origin/issues/16246

--- Additional comment from Vikas Choudhary on 2018-01-07 16:10:19 EST ---

Here is the detailed analysis of this issue:

https://github.com/openshift/origin/issues/16246#issuecomment-355852817

--- Additional comment from Emilien Macchi on 2018-01-30 13:19:47 EST ---

We are seeing a similar, if not the same, situation in the TripleO gate at this time:
https://bugs.launchpad.net/tripleo/+bug/1746298

The issue is critical, as it makes our jobs fail randomly and blocks the OSP13 production chain at this time. Note that oci-register-machine is already disabled.

--- Additional comment from Vikas Choudhary on 2018-02-05 00:41:48 EST ---

There are two different races. The one related to the pids cgroup join, IMO, will not be worked around by disabling oci-register-machine.

--- Additional comment from Alan Pevec on 2018-02-06 18:53:47 EST ---

(In reply to Vikas Choudhary from comment #49)
> Verified from the logs that you shared, docker in use is 1.12.6 and that is
> using runc which is at this commit:
> https://github.com/projectatomic/runc/commit/c5d311627d39439c5b1cc35c67a51c9c6ccda648

I also checked the latest 7.4-extras-pending build, docker-1.13.1-48.gitec9911e.el7_4, and "Fix race against systemd" is not included.

> Fix from opencontainers/runc,
> https://github.com/opencontainers/runc/pull/1683, is not there. Therefore,
> as I said in the previous comment, to avoid this failure, the mentioned fix
> should be backported to projectatomic/runc

Where do we need to file an rhbz to get this into Extras quickly?
systemd bz 1532586 is approved for 7.4.z, but the next batch update is not until March 6th.

--- Additional comment from Daniel Walsh on 2018-02-07 10:02:18 EST ---

Antonio, can you see if we can get this patch backported to docker-runc?

Comment 8 Alan Pevec 2018-02-16 13:19:41 UTC
There seems to be an issue with the iptables rules added in Docker 1.13 when deploying containerized TripleO:

https://bugzilla.redhat.com/show_bug.cgi?id=1543580#c21

* Change the default `FORWARD` policy to `DROP` [#28257](https://github.com/docker/docker/pull/28257)
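To confirm whether a host is affected by the `FORWARD` policy change, one option is to inspect the chain policy. The sketch below parses text in the format printed by `iptables -S FORWARD`, where the policy appears as a line such as `-P FORWARD DROP`; the function name is illustrative, and capturing the command output (as root) is left to the caller.

```python
def forward_policy(iptables_s_output):
    """Return the default policy of the FORWARD chain, or None if absent.

    Expects text in the format printed by `iptables -S FORWARD`, where the
    chain policy is reported as a line like "-P FORWARD DROP".
    """
    for line in iptables_s_output.splitlines():
        parts = line.split()
        # A policy line has exactly three fields: -P <chain> <target>.
        if len(parts) == 3 and parts[0] == "-P" and parts[1] == "FORWARD":
            return parts[2]
    return None
```

A result of `DROP` on a host running Docker 1.13 would be consistent with the change listed above; `ACCEPT` would suggest the policy has been overridden or was not changed.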

Comment 11 errata-xmlrpc 2018-03-07 09:51:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0436
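To check whether a host already carries the build listed under "Fixed In Version", a rough comparison of the package string can help. This is a simplified sketch: it compares only the numeric version and the leading release number, ignores epochs, and does not implement RPM's full version comparison algorithm. The helper names are illustrative, and the input is assumed to look like the output of `rpm -q docker`.

```python
import re

# "Fixed In Version" value from this bug report.
FIXED_BUILD = "docker-1.13.1-52.gitce62987.el7_4"


def build_key(nvr):
    """Turn 'docker-1.13.1-52.gitce62987.el7_4' into ((1, 13, 1), 52)."""
    match = re.match(r"^docker-(\d+(?:\.\d+)*)-(\d+)\.", nvr)
    if match is None:
        raise ValueError("unrecognized docker build string: %r" % nvr)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


def has_fix(installed_nvr):
    """True if the installed build is at or beyond the fixed build."""
    return build_key(installed_nvr) >= build_key(FIXED_BUILD)
```

For example, the 7.4-extras-pending build mentioned earlier in this bug, docker-1.13.1-48.gitec9911e.el7_4, compares as older than the fixed build.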

