Bug 1046469

Summary: docker privileged mode with cmd /sbin/init - agetty & high cpu
Product: [Fedora] Fedora
Reporter: fortuik <fortuik>
Component: docker
Assignee: Daniel Walsh <dwalsh>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 22
CC: adimania, admiller, dwalsh, golang-updates, ichavero, jcajka, jchaloup, jeder, jkeck, jmario, lsm5, mattdm, mgoldman, miminar, skottler, systemd-maint, test1, vbatts
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-19 10:49:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description fortuik 2013-12-25 14:40:41 UTC

Description of problem:
docker containers in privileged mode interfere with the host agetty and produce very high CPU load

Version-Release number of selected component (if applicable):
Fedora 20
Docker 0.7.1, 0.7.2

How reproducible:
docker run -privileged -d mattdm/fedora:latest /sbin/init

Actual results:
docker container runs
CPU load is very high (agetty)
stopping the container and the docker daemon does not free the CPU
one has to kill the agetty process

Expected results:
docker container runs

Additional info:
In my real case I run RHEL containers with puppet & ssh daemons for a small puppet lab.

The reason I hit this bug is that running a docker container with the /sbin/init cmd under RHEL 6.5 requires privileged mode.

Comment 1 Daniel Walsh 2014-05-28 17:55:23 UTC

If you try this with the latest code, does it still happen?  Killing a container should stop all processes.

Closing as fixed in current release.  Reopen if it still happens.

Comment 2 test1 2015-02-18 15:44:02 UTC

Still see it in RHEL7.

To reproduce, you need to start two Fedora-based containers in privileged mode. If you start only one in privileged mode, there will be no problem.

Dockerfile

FROM centos:centos7
RUN yum -y swap -- remove fakesystemd -- install systemd systemd-libs
CMD ["/usr/sbin/init"]


Script

docker build -t test .
docker run -d --privileged test
docker run -d --privileged test

Comment 3 Joe Mario 2015-09-29 21:34:32 UTC

> Closing as fixed in current release.  Reopen if it still happens.

Re-opening.  It's still happening with a RHEL7.2 host, a RHEL7.2 container and a docker-1.8.2-2.el7.  The Dockerfile contains systemd, if that matters.

Steps to reproduce:
1) Create two containers using:
   docker run -d -v /sys/fs/cgroup:/sys/fs/cgroup:ro --privileged test /sbin/init
   docker run -d -v /sys/fs/cgroup:/sys/fs/cgroup:ro --privileged test /sbin/init

2) Open a shell in each of the two containers:
   docker exec -it <container-1-id> bash
   docker exec -it <container-2-id> bash

3) Run top on the bare metal host.  You should see an agetty process burning up one cpu:

  PID   USER  PR  NI    VIRT  RES  SHR  S  %CPU %MEM  TIME+    COMMAND
73529   root  20   0  110004  680  660  R  99.7  0.0  10:53.98  agetty 
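
The check in step 3 can also be scripted. The following is a minimal sketch, not part of the original report: `hot_agetty` is a hypothetical helper name and the 90% threshold is an arbitrary assumption. It reads `ps -eo pid,pcpu,comm` output on stdin and prints the PID and CPU share of any agetty process above the threshold. Note that `ps` reports CPU use averaged over the process lifetime, so the figure will not match top exactly.

```shell
# Hypothetical helper: print "PID %CPU" for each agetty above a CPU threshold.
# Expects `ps -eo pid,pcpu,comm` output on stdin; threshold defaults to 90.
hot_agetty() {
    awk -v limit="${1:-90}" '$3 == "agetty" && $2 + 0 > limit + 0 { print $1, $2 }'
}

# Usage on the host:
#   ps -eo pid,pcpu,comm | hot_agetty 90
```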

Note, this problem does not occur if, when creating the containers, I replace the "--privileged" flag with "--security-opt label:disable --cap-add SYS_ADMIN"
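
Spelled out, the substitution described in the note looks like the sketch below. The flags and the image name `test` are taken from the commands above; the `start_container` wrapper and the `DOCKER` override variable are hypothetical conveniences added here so the command line can be printed without running it.

```shell
# Hypothetical wrapper around the non-privileged invocation from the note.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the arguments instead.
start_container() {
    image="${1:-test}"
    "${DOCKER:-docker}" run -d \
        -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
        --security-opt label:disable \
        --cap-add SYS_ADMIN \
        "$image" /sbin/init
}

# Usage: start two containers, as in step 1.
#   start_container test
#   start_container test
```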

Joe

Comment 4 Daniel Walsh 2015-09-30 13:07:59 UTC

Not sure this can be fixed.  systemd would have to figure out it is running in a container and then do something different.

Comment 5 Daniel Walsh 2015-09-30 13:08:57 UTC

Potentially being caused by multiple udevs running?

Comment 6 Jeremy Eder 2015-09-30 19:49:40 UTC

I have some more tracing and debug available, but...the agetty process is just spinning in the read syscall:

# ps aux|egrep 'USER|19567'|grep -v grep
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     19567 96.8  0.0 110004   796 ?        Rs   15:37  11:08 /sbin/agetty --noclear tty1

# strace -c -p 19567
Process 19567 attached
^CProcess 19567 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.080810           0    168606           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.080810                168606           total

Comment 7 Daniel Walsh 2015-09-30 19:52:28 UTC

Is read failing?

Comment 8 Jeremy Eder 2015-09-30 20:03:14 UTC

No

...
read(0, "", 1)                          = 0
read(0, "", 1)                          = 0
read(0, "", 1)                          = 0
read(0, "", 1)                          = 0
...
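
For context on the trace above: a read that returns 0 is reporting end-of-file rather than an error, which is why the answer to "Is read failing?" is no. A caller that simply retries will get the same result immediately, without ever blocking. The sketch below illustrates only that return-value behavior using standard tools; it is not agetty's code, and `bytes_read` is a hypothetical helper name.

```shell
# Hypothetical helper: attempt a single 1-byte read from stdin and
# print how many bytes were actually obtained.
bytes_read() {
    dd bs=1 count=1 2>/dev/null | wc -c | tr -d ' '
}

# Usage:
#   printf 'x' | bytes_read    # one byte available: prints 1
#   bytes_read </dev/null      # stdin at end-of-file: prints 0
```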

Comment 9 Joe Mario 2015-10-01 01:03:08 UTC

Here's a little more info, though it likely doesn't help (it didn't help me).
I'm sharing it anyway.

The agetty process continuously loops through calls from main (agetty.c:372) to get_logname().

(gdb) bac
#0  get_logname (cp=<synthetic pointer>, tp=0x7fff062f3520, op=0x7fff062f3870) at term-utils/agetty.c:1553
#1  main (argc=<optimized out>, argv=<optimized out>) at term-utils/agetty.c:372

The read call at line 1533 is:
Breakpoint 1, get_logname (cp=<synthetic pointer>, tp=0x7fff062f3520, op=0x7fff062f3870) at term-utils/agetty.c:1533
1533				if (read(STDIN_FILENO, &c, 1) < 1) {

After it completes, errno is not set, and the value stored into the variable "c" is:
(gdb) p c
$15 = 3 '\003'

After a bunch of checks, it gets down to the switch statement at line 1574:
1573				/* Do erase, kill and end-of-line processing. */
1574				switch (key) {

Unfortunately the value of "key" is optimized away.

It falls through to the "default:" case at line 1604, and then executes line 1605:
1602				case CTL('D'):
1603					exit(EXIT_SUCCESS);
1604				default:
1605					if (!isascii(ascval) || !isprint(ascval))
1606						break;

On the call to isprint(), we get to the read() of 1 byte from __fd=0:
(gdb) bac
#0  0x0000000000401fd0 in read@plt ()
#1  0x0000000000403752 in read (__nbytes=1, __buf=0x7fff062f3510, __fd=0) at /usr/include/bits/unistd.h:44
#2  get_logname (cp=<synthetic pointer>, tp=0x7fff062f3520, op=0x7fff062f3870) at term-utils/agetty.c:1533
#3  main (argc=<optimized out>, argv=<optimized out>) at term-utils/agetty.c:372

Unfortunately, even though I'm stepping by instruction, in the optimized binary, I can't stop between the isprint() and the upper level read() at line 1533 (where we started above).

Joe

Comment 10 Joe Mario 2015-10-01 01:29:54 UTC

One more note, regarding the earlier comment about multiple udevs running:

During the moment that the two privileged containers are started, the number of systemd-udevd processes jumps from 2 to 146.  After about 5 seconds that number drops to 4.

If the same two container invocations are done again, but this time without --privileged, then only 15 systemd-udevd processes are created (again for about 4-5 seconds).
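
Counts like these can be sampled with a short loop. A minimal sketch follows; `count_comm` is a hypothetical helper, and the one-second interval and ten samples are arbitrary assumptions rather than the method used to obtain the numbers above.

```shell
# Hypothetical helper: count lines on stdin that exactly match a process name.
# Expects `ps -eo comm=` output (one command name per line) on stdin.
count_comm() {
    # grep -c exits non-zero when the count is 0; keep the exit status clean.
    grep -c -x -F -- "$1" || true
}

# Usage: sample the systemd-udevd count once a second while starting containers.
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#       ps -eo comm= | count_comm systemd-udevd
#       sleep 1
#   done
```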

Comment 11 Joe Mario 2015-10-01 17:16:48 UTC

Here's one workaround for this.  Sharing what I've learned.

Adding the following to the Dockerfile

   rm -f /lib/systemd/system/systemd*udev* ; \
   rm -f /lib/systemd/system/getty.target;

causes both the runaway agetty and the spike in systemd-udevd processes to go away.  I understand there is no need for udev or getty in containers.


I did also try:
   RUN systemctl disable getty.target
   RUN systemctl disable systemd-udevd.service

and although the Dockerfile built fine, all the getty and udev services were still running as they were previously.  So those two "systemctl disable" appear to be no-ops.  Perhaps they get run before /sbin/init (systemd) is invoked.

Comment 14 Mike McCune 2016-03-28 22:57:02 UTC

This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 15 Fedora Admin XMLRPC Client 2016-06-08 14:09:12 UTC

This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 16 Fedora End Of Life 2016-07-19 10:49:27 UTC

Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.