Bug 1250339 - Systemd resets cgroup limits set by libvirt-lxc when a new container is created
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Hardware: x86_64 Linux
Priority: unspecified, Severity: high
Assigned To: systemd-maint
Reported: 2015-08-05 04:04 EDT by Sergei Turchanov
Modified: 2015-08-06 04:22 EDT
Doc Type: Bug Fix
Last Closed: 2015-08-06 04:22:34 EDT
Type: Bug

Attachments
strace of systemd (153.73 KB, text/plain)
2015-08-05 04:04 EDT, Sergei Turchanov

Description Sergei Turchanov 2015-08-05 04:04:18 EDT
Created attachment 1059360
strace of systemd

Description of problem:
My host runs a libvirt-lxc container that has a memory limit set. This can be verified on the host via virsh:

[vhost]# virsh -c lxc:// memtune search-node21
hard_limit     : 16777216
soft_limit     : unlimited
swap_hard_limit: unlimited

and from search-node21:
[search-node21]$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
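
When comparing the two readings, keep in mind that they use different units. The following is a minimal sketch, assuming that virsh memtune prints its limits in KiB while memory.limit_in_bytes holds bytes; the helper name memtune_to_bytes is hypothetical and not part of libvirt.

```python
KIB = 1024  # bytes per kibibyte (assumption: virsh memtune reports KiB)


def memtune_to_bytes(value):
    """Convert one virsh memtune field to bytes.

    Returns None when the field reads 'unlimited'.
    """
    value = value.strip()
    if value == "unlimited":
        return None
    return int(value) * KIB


# hard_limit from the report above: 16777216 KiB
print(memtune_to_bytes("16777216"))  # 17179869184 bytes, i.e. 16 GiB
```

Under that assumption, the hard_limit shown above corresponds to 16 GiB.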

When I start another libvirt-lxc container (search-node22), the memory limit on search-node21 is reset to "unlimited":
[vhost]# virsh -c lxc:// start search-node22
[vhost]# virsh -c lxc:// memtune search-node21
hard_limit     : unlimited
soft_limit     : unlimited
swap_hard_limit: unlimited

and from search-node21:
[search-node21]$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes

I ran strace on systemd on the virtualization host (vhost), and it shows that systemd itself resets the limit (see attachment). Here are the relevant lines:
1     1438760993.795229 open("/sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dsearch\\x2dnode21.scope/memory.limit_in_bytes", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 22 <0.000011>
1     1438760993.795255 fstat(22, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000006>
1     1438760993.795277 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa670069000 <0.000008>
1     1438760993.795307 write(22, "-1\n", 3) = 3 <0.000008>
1     1438760993.795330 close(22)       = 0 <0.000007>
1     1438760993.795350 munmap(0x7fa670069000, 4096) = 0 <0.000008>

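From the strace, it seems that cgroup_context_apply is called for every container registered with systemd-machined, not just the new one. Writing "-1" to memory.limit_in_bytes removes the limit, which matches the "unlimited" value reported by virsh.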
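
For a large trace such as the attached one, the same pattern can be searched for mechanically. The sketch below is only an illustration: it assumes the trace has the line format shown above, the function name find_limit_resets is hypothetical, and it ignores the leading PID column, so file descriptors from different processes are not told apart.

```python
import re

# open(".../memory.limit_in_bytes", ...) = <fd>
OPEN_RE = re.compile(
    r'open\("(?P<path>[^"]*memory\.limit_in_bytes)",.*\)\s*=\s*(?P<fd>\d+)'
)
# write(<fd>, "<data>", ...)
WRITE_RE = re.compile(r'write\((?P<fd>\d+), "(?P<data>[^"]*)"')
# close(<fd>)
CLOSE_RE = re.compile(r'close\((?P<fd>\d+)\)')


def find_limit_resets(lines):
    """Return the paths of memory.limit_in_bytes files that had "-1" written to them."""
    open_fds = {}  # fd -> path of a currently open limit file
    resets = []
    for line in lines:
        m = OPEN_RE.search(line)
        if m:
            open_fds[m["fd"]] = m["path"]
            continue
        m = WRITE_RE.search(line)
        # strace prints the newline as the two characters backslash + n
        if m and m["fd"] in open_fds and m["data"] == r"-1\n":
            resets.append(open_fds[m["fd"]])
            continue
        m = CLOSE_RE.search(line)
        if m:
            open_fds.pop(m["fd"], None)
    return resets


# Lines copied from the excerpt above
SAMPLE = [
    r'1     1438760993.795229 open("/sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dsearch\\x2dnode21.scope/memory.limit_in_bytes", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 22 <0.000011>',
    r'1     1438760993.795307 write(22, "-1\n", 3) = 3 <0.000008>',
    r'1     1438760993.795330 close(22)       = 0 <0.000007>',
]

for path in find_limit_resets(SAMPLE):
    print("limit reset:", path)
```

To scan a saved trace, pass an open file object instead of SAMPLE, since the function only iterates over lines.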

Version-Release number of selected component (if applicable):

Comment 2 Lukáš Nykrýn 2015-08-05 07:32:10 EDT
This should work with 7.2 systemd. Can you please try this test build? https://copr.fedoraproject.org/coprs/lnykryn/systemd/

Comment 3 Sergei Turchanov 2015-08-05 21:38:20 EDT
Yes, the test build systemd-219-9.el7.centos.x86_64 fixes my problem.

Comment 4 Lukáš Nykrýn 2015-08-06 04:22:34 EDT
Thanks for testing! So this will be fixed in 7.2.
