| Summary: | systemd-journald uses a lot of CPU if several LXC containers are running | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Enrique <cquike> |
| Component: | systemd | Assignee: | systemd-maint |
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 20 | CC: | bugs, cquike, johannbg, lnykryn, msekleta, plautrba, sagarun, systemd-maint, tethys, vpavlin, zbyszek |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-12-09 02:55:20 UTC | Type: | Bug |
Description
Enrique
2013-12-04 18:08:50 UTC
The journalctl command actually shows normal activity in the log. Running pstack on the systemd-journald processes shows that, most of the time, one process is in twalk() and setspent() and the other is in clock_gettime().

Where do your /dev/kmsg and /dev/console devices point to? Can you attach an strace of journald when this happens? Most likely this is just a broken LXC configuration.
Hi,
Sorry for the delay in answering, but it is a production system and it took a while until the problem appeared again.
The strace output is the same for all three systemd-journald processes that are heavily using the CPU:
read(8, "", 8192) = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some m"..., 45}, {"\n", 1}], 2) = 46
This is repeated endlessly.
The devices:
crw-------. 1 root root 5, 1 Jun 10 13:10 /dev/console
crw-r--r--. 1 root root 1, 11 Jun 10 13:10 /dev/kmsg
Hope it helps.
I'm also seeing the same problem. I'm using CentOS 7 with systemd 208:
[root@mollyboxc1 ~]# ls -l /dev/kmsg /dev/console
lrwxrwxrwx 1 root root 11 Jul 19 10:20 /dev/console -> lxc/console
lrwxrwxrwx 1 root root 7 Jul 19 10:20 /dev/kmsg -> console
[root@mollyboxc1 ~]# strace -s1024 -fp 12 2>&1 | head
Process 12 attached
read(8, "", 8192) = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46
read(8, "", 8192) = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46
read(8, "", 8192) = 0
epoll_wait(7, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=8, u64=8}}}, 1, -1) = 1
writev(2, [{"/dev/kmsg buffer overrun, some messages lost.", 45}, {"\n", 1}], 2) = 46
I wonder if it makes a difference that the underlying physical machine is
running Ubuntu, and thus we're using the standard Ubuntu kernel. Is there
some kernel support that systemd relies on that Fedora/RHEL/CentOS kernels
provide that Ubuntu's doesn't?
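The listing above shows /dev/kmsg resolving, through /dev/console, to the same target as the console. A minimal sketch of how that condition could be checked from inside a container follows. `same_target` is a hypothetical helper name, not part of systemd or LXC, and the sketch assumes a `readlink` that supports `-f` (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd or LXC).
# Succeeds (exit 0) when both paths resolve to the same file after
# following all symlinks; returns 2 if either path cannot be resolved.
same_target() {
    a=$(readlink -f -- "$1") || return 2
    b=$(readlink -f -- "$2") || return 2
    [ "$a" = "$b" ]
}

# Example use inside a container:
#   if same_target /dev/kmsg /dev/console; then
#       echo "/dev/kmsg and /dev/console resolve to the same device" >&2
#   fi
```

If the check succeeds, the container has the kmsg -> console arrangement shown in the `ls -l` output above.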
We saw this as well. A workaround is to set lxc.kmsg 0 in your lxc config. This disables the symlink creation from kmsg -> console.

I could reproduce this with
$ rpm -q systemd
systemd-204-20.fc19.x86_64
[14:47] saga@gnubox ~ $ cat /etc/redhat-release
Fedora release 19 (Schrödinger’s Cat)
[14:48] saga@gnubox ~ $ rpm -q lxc
lxc-1.0.5-2.fc19.x86_64

(In reply to Tethys from comment #4)
> I'm also seeing the same problem. I'm using CentOS 7 with systemd 208:
>
> [root@mollyboxc1 ~]# ls -l /dev/kmsg /dev/console
> lrwxrwxrwx 1 root root 11 Jul 19 10:20 /dev/console -> lxc/console
> lrwxrwxrwx 1 root root 7 Jul 19 10:20 /dev/kmsg -> console

Yeah, this setup is broken. We read from /dev/kmsg and write to /dev/console. If you symlink them to each other, then we will run into a loop.

Such a setup is *really* broken. /dev/kmsg should really not be a symlink to /dev/console.
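For the lxc.kmsg workaround mentioned in this thread, a rough sketch of checking whether a container's config already disables the symlink might look like the following. `kmsg_symlink_disabled` is a hypothetical helper name, and the sketch assumes the config file uses the usual `key = value` syntax, so the setting would appear as a line reading `lxc.kmsg = 0`. The config file path is left to the caller because it depends on the local LXC setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXC).
# Succeeds (exit 0) when the given LXC config file contains a line that
# sets lxc.kmsg to 0, assuming "key = value" syntax. It only looks for
# such a line; it does not evaluate includes or later overrides.
kmsg_symlink_disabled() {
    grep -Eq '^[[:space:]]*lxc\.kmsg[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$1"
}

# Example use on the host (the path is a placeholder for your container's config):
#   kmsg_symlink_disabled /path/to/container/config \
#       || echo "lxc.kmsg is not set to 0 in this config" >&2
```

A change to the config only takes effect for containers started afterwards, so a running container would still need a restart.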