Bug 1038259 - systemd-journald uses a lot of CPU if several LXC containers are running
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-12-04 18:08 UTC by Enrique
Modified: 2014-12-09 02:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-09 02:55:20 UTC
Type: Bug



Description Enrique 2013-12-04 18:08:50 UTC
Description of problem:

 On F20 with a default installation, with LXC set up to run 10 containers at the same time, systemd-journald starts to use a lot of CPU (>60% of one CPU).
 I see 7 systemd-journald processes, 2 of which are very active, while the rest are quiet.

Version-Release number of selected component (if applicable):

systemd-208-6.fc20.x86_64

How reproducible:

Always, whenever the LXC containers are started.

Steps to Reproduce:
1. Create several LXC containers and start them
2. Connect to the containers and start executing CPU-intensive processes (in my case compilations).
3. Check systemd-journald CPU consumption.
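
Step 3 can be made repeatable with a small filter. This is only a sketch and not part of the original report: `busy_procs` is a hypothetical helper name and the threshold of 10 is arbitrary. It reads `PID %CPU` pairs on standard input, in the shape printed by procps `ps -o pid=,pcpu=`. Note that `ps` reports CPU use averaged over the process lifetime, so `top` may show a different figure.

```shell
#!/bin/sh
# busy_procs THRESHOLD
# Reads "PID %CPU" pairs on stdin and prints the pairs whose CPU
# percentage is strictly above THRESHOLD. The helper name is hypothetical.
busy_procs() {
    awk -v limit="$1" '$2 + 0 > limit + 0 { print $1, $2 }'
}

# Typical use on the host (assumes procps ps):
#   ps -C systemd-journald -o pid=,pcpu= | busy_procs 10
```

Any PID printed here is a candidate for a closer look with strace.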

Comment 1 Enrique 2013-12-04 18:16:43 UTC
 The journalctl command actually shows normal activity in the log.
 Running pstack on the systemd-journald processes shows that one process spends most of its time in twalk() and setspent(), and the other in clock_gettime().

Comment 2 Lennart Poettering 2014-06-18 09:31:10 UTC
Where do your /dev/kmsg and /dev/console devices point to?

Can you attach an strace of journald when this happens?

Most likely this is just a broken LXC configuration.

Comment 3 Enrique 2014-07-08 09:43:57 UTC
 Hi,
 sorry for the delay in answering, but it is a production system and it took a while until the problem appeared again.

 The strace output is the same for all three systemd-journald processes that are using the CPU heavily:

read(8, "", 8192)                       = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some m"..., 45}, {"\n", 1}], 2) = 46

 this is repeated endlessly.
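
To put a number on "endlessly", the trace can be saved for a few seconds and the overrun messages counted. A minimal sketch, assuming the trace was written to a file with `strace -o`; `count_overruns` is a hypothetical helper name and the file path in the usage comment is only an example.

```shell
#!/bin/sh
# count_overruns FILE
# Prints how many lines of a saved strace log carry the
# "/dev/kmsg buffer overrun" message. The helper name is hypothetical.
count_overruns() {
    grep -c '/dev/kmsg buffer overrun' "$1"
}

# Typical use (PID is whichever journald process is busy):
#   timeout 5 strace -p "$PID" -o /tmp/journald.strace
#   count_overruns /tmp/journald.strace
```

A count in the thousands over a few seconds would match the busy loop shown above, where read() returns 0 and epoll_wait() returns immediately.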

 The devices:
crw-------. 1 root root 5,  1 Jun 10 13:10 /dev/console
crw-r--r--. 1 root root 1, 11 Jun 10 13:10 /dev/kmsg
 

 Hope it helps.

Comment 4 Tethys 2014-07-28 12:57:46 UTC
I'm also seeing the same problem. I'm using CentOS 7 with systemd 208:

[root@mollyboxc1 ~]# ls -l /dev/kmsg /dev/console
lrwxrwxrwx 1 root root 11 Jul 19 10:20 /dev/console -> lxc/console
lrwxrwxrwx 1 root root  7 Jul 19 10:20 /dev/kmsg -> console
[root@mollyboxc1 ~]# strace -s1024 -fp 12 2>&1 | head
Process 12 attached
read(8, "", 8192)                       = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46
read(8, "", 8192)                       = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46
read(8, "", 8192)                       = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46

I wonder if it makes a difference that the underlying physical machine is
running Ubuntu, and thus we're using the standard Ubuntu kernel. Is there
some kernel support that systemd relies on that Fedora/RHEL/CentOS kernels
provide that Ubuntu's doesn't?

Comment 5 James Kyle 2014-07-28 23:53:50 UTC
We saw this as well.

A workaround is to set 

lxc.kmsg = 0

in your lxc config. This disables the symlink creation from kmsg -> console.
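
Whether a given container already has the workaround applied can be checked with a one-line grep. This is a sketch under assumptions: `kmsg_symlink_disabled` is a hypothetical helper name, the setting is expected in the `key = value` form that LXC config files use, and the path in the usage comment assumes the default /var/lib/lxc layout with a made-up container name.

```shell
#!/bin/sh
# kmsg_symlink_disabled CONFIG
# Succeeds if the LXC config file contains an uncommented "lxc.kmsg = 0".
# The helper name is hypothetical; lxc.include directives are not followed.
kmsg_symlink_disabled() {
    grep -Eq '^[[:space:]]*lxc\.kmsg[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$1"
}

# Typical use (path and container name are assumptions):
#   kmsg_symlink_disabled /var/lib/lxc/mycontainer/config \
#       && echo "kmsg symlink disabled"
```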

Comment 6 Arun S A G 2014-08-15 21:48:30 UTC
I could reproduce this with


$ rpm -q systemd
systemd-204-20.fc19.x86_64
[14:47] saga@gnubox ~ $ cat /etc/redhat-release 
Fedora release 19 (Schrödinger’s Cat)
[14:48] saga@gnubox ~ $ rpm -q lxc
lxc-1.0.5-2.fc19.x86_64

Comment 7 Lennart Poettering 2014-12-09 02:55:20 UTC
(In reply to Tethys from comment #4)
> I'm also seeing the same problem. I'm using CentOS 7 with systemd 208:
> 
> [root@mollyboxc1 ~]# ls -l /dev/kmsg /dev/console
> lrwxrwxrwx 1 root root 11 Jul 19 10:20 /dev/console -> lxc/console
> lrwxrwxrwx 1 root root  7 Jul 19 10:20 /dev/kmsg -> console


Yeah, this setup is broken. We read from /dev/kmsg and write to /dev/console. If you symlink them to each other, then we will run into a loop.

Such a setup is *really* broken. /dev/kmsg should really not be a symlink to /dev/console.
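
The broken layout described here can be detected from inside a container by comparing what the two device paths resolve to. A minimal sketch, assuming a `readlink` that supports `-f` (GNU coreutils or busybox); `kmsg_aliases_console` is a hypothetical helper name, and the optional directory argument exists only so the check can be tried outside /dev.

```shell
#!/bin/sh
# kmsg_aliases_console [DEVDIR]
# Succeeds if DEVDIR/kmsg is a symlink that resolves to the same file as
# DEVDIR/console. DEVDIR defaults to /dev. The helper name is hypothetical.
kmsg_aliases_console() {
    dev=${1:-/dev}
    [ -L "$dev/kmsg" ] || return 1
    [ -e "$dev/console" ] || return 1
    [ "$(readlink -f "$dev/kmsg")" = "$(readlink -f "$dev/console")" ]
}

# Typical use inside the container:
#   kmsg_aliases_console && echo "/dev/kmsg points at /dev/console"
```

With the symlinks from comment #4 the check succeeds, because both paths end up at lxc/console. With the character devices from comment #3 it fails, because /dev/kmsg is not a symlink there.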

