Bug 1250339

Summary: Systemd resets cgroup limits set by libvirt-lxc when a new container is created
Product: Red Hat Enterprise Linux 7 Reporter: Sergei Turchanov <the_plumber>
Component: systemd Assignee: systemd-maint
Status: CLOSED CURRENTRELEASE QA Contact: qe-baseos-daemons
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.1 CC: lnykryn, systemd-maint-list
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-08-06 08:22:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
strace of systemd none

Description Sergei Turchanov 2015-08-05 08:04:18 UTC
Created attachment 1059360
strace of systemd

Description of problem:
My host runs a libvirt-lxc container with a memory limit set on it. This can be verified on the host via virsh:

[vhost]# virsh -c lxc:// memtune search-node21
hard_limit     : 16777216
soft_limit     : unlimited
swap_hard_limit: unlimited

and from search-node21:
[search-node21]$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
17179869184
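
The two numbers above agree once units are taken into account. A minimal sketch of the check, assuming virsh memtune prints its values in KiB (its documented default) while memory.limit_in_bytes is in bytes; the helper name kib_to_bytes is made up for illustration:

```python
KIB = 1024  # bytes per KiB


def kib_to_bytes(kib: int) -> int:
    """Convert a KiB value (as printed by virsh memtune) to bytes."""
    return kib * KIB


hard_limit_kib = 16777216         # hard_limit from `virsh memtune search-node21`
cgroup_limit_bytes = 17179869184  # memory.limit_in_bytes read inside search-node21

# Both values describe the same 16 GiB limit.
assert kib_to_bytes(hard_limit_kib) == cgroup_limit_bytes
print(kib_to_bytes(hard_limit_kib) // 2**30, "GiB")
```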

When I start another libvirt-lxc container (search-node22), the memory limit on search-node21 is reset to "unlimited":
[vhost]# virsh -c lxc:// start search-node22
[vhost]# virsh -c lxc:// memtune search-node21
hard_limit     : unlimited
soft_limit     : unlimited
swap_hard_limit: unlimited

and from search-node21:
[search-node21]$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854775807
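
The value 9223372036854775807 is 2^63 - 1, which is what this kernel reports when no limit is set. A small sketch of how a monitoring script might interpret the file, assuming that exact sentinel (other kernel versions may report a slightly different maximum); parse_limit and UNLIMITED_SENTINEL are hypothetical names:

```python
from typing import Optional

# Assumption: "no limit" is reported as 2**63 - 1, as in the output above.
UNLIMITED_SENTINEL = 2**63 - 1


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None if the cgroup is unlimited."""
    value = int(raw.strip())
    return None if value == UNLIMITED_SENTINEL else value


print(parse_limit("17179869184\n"))          # limit still in place
print(parse_limit("9223372036854775807\n"))  # limit has been reset
```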

I ran strace on systemd on the virtualization host (vhost), and it shows that systemd itself resets the limit (see attachment). Here are the relevant lines:
...
1     1438760993.795229 open("/sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dsearch\\x2dnode21.scope/memory.limit_in_bytes", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 22 <0.000011>
1     1438760993.795255 fstat(22, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000006>
1     1438760993.795277 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa670069000 <0.000008>
1     1438760993.795307 write(22, "-1\n", 3) = 3 <0.000008>
1     1438760993.795330 close(22)       = 0 <0.000007>
1     1438760993.795350 munmap(0x7fa670069000, 4096) = 0 <0.000008>
....

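From the strace it seems that cgroup_context_apply is called for all containers registered by systemd-machined, not only the new one, and that it writes -1 (no limit) to memory.limit_in_bytes of the container that is already running.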
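
To find every such reset in a long trace, the open/write pairs can be matched mechanically. The following is only a sketch, assuming the trace has the same layout as the excerpt above (PID, timestamp, syscall); find_limit_resets and the regular expressions are illustrative and not part of any tool:

```python
import re
from typing import Dict, Iterable, List, Tuple

# Patterns for lines shaped like "PID TIMESTAMP syscall(...)".
OPEN_RE = re.compile(
    r'^(\d+)\s+\S+\s+open\("([^"]*memory\.limit_in_bytes)",.*\)\s*=\s*(\d+)')
WRITE_RE = re.compile(r'^(\d+)\s+\S+\s+write\((\d+), "((?:[^"\\]|\\.)*)"')
CLOSE_RE = re.compile(r'^(\d+)\s+\S+\s+close\((\d+)\)')


def find_limit_resets(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (pid, path) for each write of "-1" to a memory.limit_in_bytes file."""
    open_fds: Dict[Tuple[str, str], str] = {}  # (pid, fd) -> path as printed by strace
    resets: List[Tuple[str, str]] = []
    for line in lines:
        m = OPEN_RE.match(line)
        if m:
            open_fds[(m.group(1), m.group(3))] = m.group(2)
            continue
        m = WRITE_RE.match(line)
        if m:
            key = (m.group(1), m.group(2))
            # strace prints the newline as the two characters "\" and "n".
            if key in open_fds and m.group(3) == r"-1\n":
                resets.append((m.group(1), open_fds[key]))
            continue
        m = CLOSE_RE.match(line)
        if m:
            open_fds.pop((m.group(1), m.group(2)), None)
    return resets


# Three of the lines quoted above, used as sample input.
SAMPLE = r'''
1     1438760993.795229 open("/sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dsearch\\x2dnode21.scope/memory.limit_in_bytes", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 22 <0.000011>
1     1438760993.795307 write(22, "-1\n", 3) = 3 <0.000008>
1     1438760993.795330 close(22)       = 0 <0.000007>
'''

resets = find_limit_resets(SAMPLE.splitlines())
for pid, path in resets:
    print(f"pid {pid} reset {path}")
```

Paths are reported exactly as strace printed them, so the escaped form (\\x2d) is kept rather than decoded.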


Version-Release number of selected component (if applicable):
systemd-208-20.0.1.el7_1.5.x86_64
libvirt-1.2.9-1.el7.x86_64

Comment 2 Lukáš Nykrýn 2015-08-05 11:32:10 UTC
This should work with the systemd in 7.2. Could you please try this test build? https://copr.fedoraproject.org/coprs/lnykryn/systemd/

Comment 3 Sergei Turchanov 2015-08-06 01:38:20 UTC
Yes, the test build systemd-219-9.el7.centos.x86_64 fixes my problem.
Thanks!

Comment 4 Lukáš Nykrýn 2015-08-06 08:22:34 UTC
Thanks for testing! So this will be fixed in 7.2.