Bug 678555

Summary: systemd should not purge application created cgroups, even if they contain no processes
Product: [Fedora] Fedora
Reporter: Matěj Cepl <mcepl>
Component: systemd
Assignee: Lennart Poettering <lpoetter>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 15
CC: anross, aquini, awilliam, bart, berrange, byount, clalance, crobinso, darrellpf, dennisgdaniels, dmaley, dpierce, dwmw2, gholms, hhorak, ipilcher, itamar, jason.dobies, javilinux, jforbes, johannbg, joshua, jp, laine, lists.kho, lists, lmacken, lpoetter, mailings, metherid, mfranc, mrsam, mschmidt, notting, nphilipp, pcfe, plautrba, pmrpla, pslama, shenson, tflink, veillard, virt-maint, vwfoxguru
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: RejectedBlocker, AcceptedNTH
Fixed In Version: systemd-26-8.fc15
Doc Type: Bug Fix
Last Closed: 2011-07-14 21:23:32 EDT
Bug Blocks: 657621    

Description Matěj Cepl 2011-02-18 07:10:37 EST
Description of problem:
jakoubek:~ $ sudo -i
[sudo] password for matej: 
jakoubek:~# systemctl status libvirtd.service
libvirtd.service - LSB: daemon for libvirt virtualization API
	  Loaded: loaded (/etc/rc.d/init.d/libvirtd)
	  Active: active (running) since Thu, 17 Feb 2011 08:44:54 +0100; 24h ago
	Main PID: 3669 (libvirtd)
	  CGroup: name=systemd:/system/libvirtd.service
		  ├ 1403 /usr/sbin/dnsmasq --strict-order --bind-interfaces...
		  └ 3669 libvirtd --daemon
jakoubek:~# virsh list --all
 Id Name               Status
----------------------------------
  - rawhide              switched-off
  - santiago             switched-off
  - tikanga              switched-off

jakoubek:~# virsh start santiago
error: I couldn't start a domain santiago
error: I cannot create a cgroup for santiago: Directory or file doesn't exist

jakoubek:~# systemctl restart libvirtd.service
jakoubek:~# virsh start santiago
Domain santiago started

jakoubek:~# 


Version-Release number of selected component (if applicable):
selinux-policy-3.9.15-1.fc16.noarch
systemd-18-1.fc16.x86_64
libvirt-0.8.8-1.fc16.x86_64


How reproducible:
100% (given a long enough delay between starting the service and attempting to start a virtual machine)

Steps to Reproduce:
1. start machine or restart libvirtd service
2. wait for long time (couple of hours?)
3. virsh start <domain>
  
Actual results:
error as shown above

Expected results:
running domain
Comment 1 Daniel Berrange 2011-04-01 06:09:41 EDT
> jakoubek:~# virsh start santiago
> error: I couldn't start a domain santiago
> error: I cannot create a cgroup for santiago: Directory or file doesn't exist

This indicates that libvirtd's cgroups do not exist, either because libvirtd was started *before* the cgconfig service ran, or because something deleted libvirtd's cgroups.


> Steps to Reproduce:
> 1. start machine or restart libvirtd service
> 2. wait for long time (couple of hours?)
> 3. virsh start <domain>

If you immediately try to start a guest, after a restart of libvirtd, does it work ?  If so, this suggests that something is deleting libvirtd's cgroups in that 'step 2 wait for long time'.

I wonder if systemd periodically deletes empty cgroups ? Or is there some other cgroup related daemon running ?
Comment 2 Lennart Poettering 2011-04-01 16:43:02 EDT
(In reply to comment #1)

> I wonder if systemd periodically deletes empty cgroups ? Or is there some other
> cgroup related daemon running ?

Yupp, from time to time we go through the tree and kill empty cgroups. I figure you want us to stop doing that?
Comment 3 Daniel Berrange 2011-04-04 05:27:54 EDT
Yes, purging libvirtd's cgroups is rather unkind !  Even if a cgroup does not have any processes in it, it is still useful for it to exist, because it will have an impact on future child cgroups which will contain processes. libvirt creates a 3 level hierarchy, starting from the location in which libvirtd itself is placed.

  [cgroup where libvirtd process is placed by systemd/init]
   |
   +- libvirt
        |
        +- qemu
        |   |
        |   +- qemuguest1
        |   +- qemuguest1
        |   +- qemuguest1
        |   +- qemuguest1
        |
        +- lxc
            |
            +- lxcguest1

Only the leaf nodes actually contain processes. The first two levels are just there to allow admins to set limits that will apply to later child processes.
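
The layout above can be mimicked on an ordinary directory tree to see which levels never hold a process of their own. This is a minimal sketch, not libvirt code: the function name make_libvirt_tree, the guest names passed to it, and the use of a scratch directory in place of a mounted cgroup controller are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: build the three-level layout described above under a base
# directory. On a real cgroup mount each mkdir would create a cgroup; here
# any writable directory works. make_libvirt_tree is a hypothetical name.
make_libvirt_tree() {
    base=$1
    shift
    # Levels 1 and 2: never hold processes themselves, only limits.
    mkdir -p "$base/libvirt/qemu" "$base/libvirt/lxc" || return 1
    # Level 3: one leaf per guest; only these would ever hold processes.
    for guest in "$@"; do
        mkdir -p "$base/libvirt/qemu/$guest" || return 1
    done
    return 0
}
```

For example, make_libvirt_tree "$(mktemp -d)" guest1 guest2 creates libvirt/qemu/guest1 and libvirt/qemu/guest2 next to an empty libvirt/lxc. Before any guest is started, everything below the base directory is empty, which is exactly the state in which a trimmer would consider it removable.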
Comment 4 Lennart Poettering 2011-04-05 17:57:50 EDT
Hmm, so we "trim" the cgroup hierarchies at four places:

- When a user session ends we trim his entire hierarchy

- When all processes of a service exited we trim the service's hierarchy

- When a service entered "dead" or "failed" mode (i.e. is stopped) we trim the service's hierarchy. (This is often the same as the previous case)

- Before we start a service we trim its hierarchy

Now, this basically boils down to the fact that in real life we should never conflict with libvirt: we never interfere with the tree as long as the daemon is still running. We only trim before and after it is running.

So, unless there's a bug lurking here I don't think systemd is at fault.
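
What "trimming" does to a tree of empty groups can be illustrated on a scratch directory. This is a simulation under stated assumptions, not systemd's cg_trim(): the name trim_empty is hypothetical, a plain file stands in for "this cgroup still has a process", and -mindepth assumes GNU find.

```shell
#!/bin/sh
# Sketch only, meant for a scratch directory: remove every directory below
# $1 that is empty, deepest first, so a parent emptied by the removal of
# its children is removed as well. trim_empty is a hypothetical name.
# A plain file inside a directory stands in for "this group has a process":
# rmdir refuses to remove a non-empty directory, and that error is ignored.
trim_empty() {
    find "$1" -mindepth 1 -depth -type d -exec rmdir {} \; 2>/dev/null
    return 0
}
```

Run against a tree that contains an empty libvirt/qemu and libvirt/lxc, this removes the whole libvirt subtree, while any branch that still holds a marker file survives. A trim that runs while the daemon is alive but has no guests started therefore takes away the intermediate levels the daemon expects to find later.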
Comment 5 Michal Schmidt 2011-04-14 04:44:34 EDT
There is a bug lurking here.
You missed one place where the cgroup hierarchies are trimmed - when reloading the daemon:
manager_reload()
-> manager_clear_jobs_and_units()
  -> unit_free()
    -> cgroup_bonding_free_list()
      -> cgroup_bonding_free()
        -> cg_trim()

Steps to reproduce:
1. service libvirtd restart
2. find /sys/fs/cgroup -path '*libvirt*' -type d > cg-list-1
3. systemctl daemon-reload
4. find /sys/fs/cgroup -path '*libvirt*' -type d > cg-list-2
5. diff -Nu cg-list-*

Actual result:
--- cg-list-1	2011-04-14 10:38:03.138854891 +0200
+++ cg-list-2	2011-04-14 10:38:09.644854896 +0200
@@ -2,7 +2,4 @@
 /sys/fs/cgroup/blkio/libvirt/lxc
 /sys/fs/cgroup/blkio/libvirt/qemu
 /sys/fs/cgroup/cpu/system/libvirtd.service
-/sys/fs/cgroup/cpu/system/libvirtd.service/libvirt
-/sys/fs/cgroup/cpu/system/libvirtd.service/libvirt/lxc
-/sys/fs/cgroup/cpu/system/libvirtd.service/libvirt/qemu
 /sys/fs/cgroup/systemd/system/libvirtd.service

A workaround is to set "DefaultControllers=" in /etc/systemd/system.conf
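
The reproduction steps above can be wrapped in two small helpers so that the before/after comparison is repeatable. This is a sketch: the find expression is the one from the steps, while the function names snapshot_cgroups and lost_cgroups are hypothetical, and the sorting and the use of comm are additions for a stable comparison.

```shell
#!/bin/sh
# List every directory under a cgroup mount point whose path mentions
# libvirt, sorted so that two listings can be compared line by line.
# snapshot_cgroups is a hypothetical helper name.
snapshot_cgroups() {
    find "$1" -path '*libvirt*' -type d | LC_ALL=C sort
}

# Print the directories present in listing file $1 but missing from $2.
# comm needs sorted input, which snapshot_cgroups provides.
lost_cgroups() {
    LC_ALL=C comm -23 "$1" "$2"
}
```

On an affected system, as root, the sequence would be: snapshot_cgroups /sys/fs/cgroup > cg-list-1, then systemctl daemon-reload, then snapshot_cgroups /sys/fs/cgroup > cg-list-2, and finally lost_cgroups cg-list-1 cg-list-2. Any line printed is a cgroup directory that disappeared across the reload.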
Comment 6 Daniel Berrange 2011-04-14 06:04:56 EDT
> - When a user session ends we trim his entire hierarchy

Could you clarify what you mean by 'user session' here ?  eg does it mean the hierarchy is trimmed when the user logs out of X ?

As well as the privileged, per-host libvirtd, there is an unprivileged libvirtd daemon that is run per user ID. This isn't tied to the user X session - it is just spawned on demand from any application, whether logged in via X, or ssh, or cron, etc. Thus we don't necessarily want to kill libvirtd & its VMs when the user logs out of X, and thus wouldn't really want its cgroups trimmed either.

> - When all processes of a service exited we trim the service's hierarchy

> - When a service entered "dead" or "failed" mode (i.e. is stopped) we trim the
> service's hierarchy. (This is often the same as the previous case)

> - Before we start a service we trim its hierarchy

These three cases should all be fine, because libvirtd will re-create anything it needs at startup, and any VMs still running when libvirtd is stopped, will mean the cgroups are not empty & thus not trimmable.

- Daemon reload

This sounds like the main problem people are hitting in this bug.
Comment 7 Dennis Gilmore 2011-04-18 16:14:37 EDT
I'm seeing this also, and I feel it's a release blocker under the proposed criterion:

The release must boot successfully as a virtual guest in a situation where the virtual host is running the same release (using Fedora's current preferred virtualization technology)

We can't guarantee that.
Comment 8 Lennart Poettering 2011-04-19 21:54:07 EDT
(In reply to comment #5)
> There is a bug lurking here.
> You missed one place where the cgroup hierarchies are trimmed - when reloading
> the daemon:
> manager_reload()
> -> manager_clear_jobs_and_units()
>   -> unit_free()
>     -> cgroup_bonding_free_list()
>       -> cgroup_bonding_free()
>         -> cg_trim()

Ah, indeed.

Fixed now in git. I hope there's not another cg_trim() call lurking somewhere.

> Could you clarify what you mean by 'user session' here ?  eg does it mean the
> hierarchy is trimmed when the user logs out of X ?

Yes, this is what happens.
Comment 9 Cole Robinson 2011-04-20 10:08:52 EDT
*** Bug 698027 has been marked as a duplicate of this bug. ***
Comment 10 Fedora Update System 2011-04-20 22:00:23 EDT
systemd-25-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/systemd-25-1.fc15
Comment 11 Fedora Update System 2011-04-20 23:01:48 EDT
Package systemd-25-1.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-25-1.fc15'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/systemd-25-1.fc15
then log in and leave karma (feedback).
Comment 12 Adam Williamson 2011-04-21 14:44:30 EDT
is https://bugzilla.redhat.com/show_bug.cgi?id=666130 the same as this?
Comment 13 Tim Flink 2011-04-21 14:52:18 EDT
Discussed in the 2011-04-21 blocker bug review meeting. This does come close to the alpha release criteria

"When booting a system installed without a graphical environment, or when using a correct configuration setting to cause an installed system to boot in non-graphical mode, the system should boot to a state where it is possible to log in through at least one of the default virtual consoles"

However, it is fixable with an update and doesn't happen every time to every user, so it was rejected as a release blocker. Since it is a major issue, it was accepted as NTH for final.
Comment 14 Cole Robinson 2011-04-27 09:01:25 EDT
*** Bug 699886 has been marked as a duplicate of this bug. ***
Comment 15 Cole Robinson 2011-04-27 09:01:44 EDT
*** Bug 666130 has been marked as a duplicate of this bug. ***
Comment 16 Cole Robinson 2011-04-27 09:03:10 EDT
*** Bug 699932 has been marked as a duplicate of this bug. ***
Comment 17 Fedora Update System 2011-04-30 23:22:47 EDT
systemd-25-1.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 18 Daniel Rowe 2011-05-28 01:56:50 EDT
The release version of Fedora 15 seems to be doing this still.
Comment 19 Jay Dobies 2011-05-28 14:10:47 EDT
I still see it too in Fedora 15 gold, using systemd-26-1.fc15.x86_64.
Comment 20 Michal Schmidt 2011-05-30 06:44:31 EDT
Reopening. It's been fixed for "systemctl daemon-reload", but it is still reproducible using "systemctl daemon-reexec".
It's a less common operation than daemon-reload. It is used when systemd or glibc packages are updated.
Comment 21 Cole Robinson 2011-06-10 12:37:48 EDT
*** Bug 711703 has been marked as a duplicate of this bug. ***
Comment 22 Cole Robinson 2011-06-13 11:13:20 EDT
*** Bug 696218 has been marked as a duplicate of this bug. ***
Comment 23 Bryan Yount 2011-06-13 16:30:43 EDT
Seems to be fixed with the latest round of updates in Fedora 15 but I'm not
sure which update did it.  Now I don't have to restart libvirtd every time I
want to run a new virtual machine.
Comment 24 Michal Schmidt 2011-06-14 03:49:02 EDT
(In reply to comment #23)
> Seems to be fixed with the latest round of updates in Fedora 15

I don't think so.

> Now I don't have to restart libvirtd every time I want to run a new
> virtual machine.

"every time"? That was never the case. See comment #20. It used to break if something (usually an RPM scriptlet) ran 'systemctl daemon-reload' (which a lot of them do). This is fixed, but the 'daemon-reexec' case remains.
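
While the 'daemon-reexec' case remains, a script can check that libvirt's intermediate cgroup directories still exist before starting a guest, and fall back to the libvirtd restart used as the workaround in the original report. This is only a sketch: libvirt_cgroups_present is a hypothetical helper, and the directory names it checks are taken from the listing in comment #5, so they may differ on other setups.

```shell
#!/bin/sh
# Return 0 when the libvirt, libvirt/qemu and libvirt/lxc directories all
# exist below the base directory $1, and 1 as soon as one is missing.
# libvirt_cgroups_present is a hypothetical helper name.
libvirt_cgroups_present() {
    base=$1
    for sub in libvirt libvirt/qemu libvirt/lxc; do
        [ -d "$base/$sub" ] || return 1
    done
    return 0
}
```

A possible use, as root and assuming the cpu controller path shown in comment #5: libvirt_cgroups_present /sys/fs/cgroup/cpu/system/libvirtd.service || systemctl restart libvirtd.service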
Comment 25 Bryan Yount 2011-06-14 20:39:09 EDT
Sorry my comment was a bit out of order there.  I was confirming the fix for the "systemctl daemon-reload" case.  The "every time" comment was how it appeared to me in passing observation but I did not do extensive testing.  Carry on ;)
Comment 26 Ian Pilcher 2011-06-23 11:15:23 EDT
I am still getting this error just about every time I try to start a VM.

  systemd-26-4.fc15.x86_64
Comment 27 Scott Williams 2011-06-28 23:00:12 EDT
I'm also seeing this and so far have been unable to create a VM at all in Fedora 15.  Even being careful to start the cgconfig then libvirt, I still get this error whenever attempting to create a new guest in virt-manager.

systemd-26-5.fc15.x86_64
libvirt-0.8.8-4.fc15.x86_64
Comment 28 Scott Williams 2011-06-28 23:04:17 EDT
Please ignore last comment.  Restarting cgconfig works.  Helps to make sure the terminal window used to restart the service isn't actually an ssh session on another machine.  Doh!
Comment 29 Cole Robinson 2011-06-29 11:55:38 EDT
*** Bug 716436 has been marked as a duplicate of this bug. ***
Comment 30 Cole Robinson 2011-07-01 10:29:19 EDT
*** Bug 714407 has been marked as a duplicate of this bug. ***
Comment 32 Fedora Update System 2011-07-06 05:33:39 EDT
systemd-26-7.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/systemd-26-7.fc15
Comment 33 Fedora Update System 2011-07-06 17:39:41 EDT
Package systemd-26-7.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-26-7.fc15'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/systemd-26-7.fc15
then log in and leave karma (feedback).
Comment 34 Cole Robinson 2011-07-11 12:58:48 EDT
*** Bug 709076 has been marked as a duplicate of this bug. ***
Comment 35 Fedora Update System 2011-07-14 21:22:53 EDT
systemd-26-8.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.