Description of problem:
When running libvirt-daemon-1.2.9-3 and systemd-216-5 together on Fedora 21, libvirt will attempt to modify cgroup data under /sys/fs/cgroup/devices/ rather than /sys/fs/cgroup/systemd/machine.slice/machine-qemu<INSTANCE_UUID>.scope/ when attaching a volume with nova-volume (OpenStack).

Version-Release number of selected component (if applicable):
libvirt-daemon-1.2.9-3

Steps to Reproduce:
1. Install Fedora 21 and ensure libvirt-daemon-1.2.9-3 and systemd-216-5 are installed.
2. Build an instance with OpenStack nova.
3. Attach a volume with cinder/nova-volume.

Actual results:
The volume attach fails because libvirt attempts to modify:
/sys/fs/cgroup/devices/machine.slice/machine-qemu<INSTANCE_UUID>.scope/devices.allow

Expected results:
Libvirt should modify this path instead:
/sys/fs/cgroup/systemd/machine.slice/machine-qemu<INSTANCE_UUID>.scope/devices.allow

Additional info:
If I restart libvirtd after the volume attachment failure and retry the attachment, it succeeds. It seems like a libvirtd restart is required for it to realize that the /sys/fs/cgroup/systemd/ path exists. Should libvirt realize this change when it registers the virtual machine with systemd-machined?
Or perhaps this is a systemd bug. Should systemd be working with the cgroups under /sys/fs/cgroup/devices for qemu instances?
Additional context: When everything works, I find the machine-qemu cgroups in the usual locations (/sys/fs/cgroup/{devices,perf_event,...}) as well as the systemd location (/sys/fs/cgroup/systemd/). Problems occur with libvirt when the cgroups for the qemu instance appear ONLY in the systemd cgroup location. Restarting libvirtd cleans up the bad situation and puts all of the cgroups in the right place. This is happening with roughly 5% of builds. I'm working to reproduce it on a node with libvirtd debug logging enabled.
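For anyone trying to spot the bad state on their own node, here is a rough sketch of a check for it. It is not something from libvirt or systemd; the function names, the default root, and the machine.slice default are my own assumptions based on the paths described above. It just lists which hierarchies under /sys/fs/cgroup contain a given scope directory and flags the case where only the systemd hierarchy has it.

```python
import os


def hierarchies_containing_scope(scope_name,
                                 cgroup_root="/sys/fs/cgroup",
                                 slice_name="machine.slice"):
    """Return sorted names of hierarchies under cgroup_root that contain
    slice_name/scope_name as a directory.

    Hypothetical helper: the layout (root/<hierarchy>/machine.slice/<scope>)
    is assumed from the paths quoted in this report.
    """
    found = []
    for entry in sorted(os.listdir(cgroup_root)):
        hierarchy = os.path.join(cgroup_root, entry)
        # Skip symlinked aliases and plain files; only real directories count.
        if os.path.islink(hierarchy) or not os.path.isdir(hierarchy):
            continue
        if os.path.isdir(os.path.join(hierarchy, slice_name, scope_name)):
            found.append(entry)
    return found


def only_in_systemd_hierarchy(scope_name, **kwargs):
    """True when the scope exists in the systemd hierarchy and nowhere else,
    which is the failing situation described above."""
    return hierarchies_containing_scope(scope_name, **kwargs) == ["systemd"]
```

Calling only_in_systemd_hierarchy() with the full scope directory name (as it appears on disk, escapes included) should return True on an affected instance and False on a healthy one, assuming the layout matches.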
Debug logs: https://gist.github.com/major/ba5191d7e995bfdbd188

My system is running in permissive mode, but here's the AVC logged:

type=AVC msg=audit(1414678555.120:169): avc: denied { read } for pid=4511 comm="qemu-system-x86" name="nova" dev="dm-0" ino=36334 scontext=system_u:system_r:svirt_t:s0:c279,c465 tcontext=system_u:object_r:nova_var_lib_t:s0 tclass=lnk_file permissive=1

And the audit2allow output:

#============= svirt_t ==============
allow svirt_t nova_var_lib_t:lnk_file read;

The SELinux AVC looks unrelated, but I'm not 100% sure.
Finally caught some activity via inotify events: https://gist.github.com/major/f5fb72aa09030ba68ea7

It looks like something is checking to ensure that /sys/fs/cgroup/devices/machine.slice exists and then it looks for the machine-qemu#####.scope directory a few times. Once all that's done, there's an inotify delete event for the scope directory:

<Event dir=True mask=0x40000200 maskname=IN_DELETE|IN_ISDIR name=machine-qemu\x2dinstance\x2d5d8aae0e\x2db66b\x2d4690\x2d8a42\x2def79160e5487.scope path=/sys/fs/cgroup/devices/machine.slice pathname=/sys/fs/cgroup/devices/machine.slice/machine-qemu\x2dinstance\x2d5d8aae0e\x2db66b\x2d4690\x2d8a42\x2def79160e5487.scope wd=3 >

I haven't been able to figure out if this activity is coming from systemd or libvirtd. Libvirtd claims that it is making the cgroups successfully: https://gist.github.com/major/ba5191d7e995bfdbd188
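The \x2d sequences in the scope name are escaped "-" characters (0x2d is the ASCII hyphen), which makes it awkward to match events against the instance name nova reports. A small sketch for decoding them when reading the inotify output; unescape_unit_name is a hypothetical helper of mine that only expands \xXX sequences and does not try to reproduce everything systemd's own unescaping does:

```python
import re

# Matches a backslash, "x", and exactly two hex digits, e.g. \x2d
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def unescape_unit_name(name):
    """Expand \\xXX escapes in a unit or scope name for readability.

    Hypothetical helper: handles only the \\xXX form seen in the event above.
    """
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)
```

Applied to the name in the delete event above, this should give machine-qemu-instance-5d8aae0e-b66b-4690-8a42-ef79160e5487.scope, which is easier to line up with the instance UUID.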
Changing component to systemd. It appears that systemd is aggressively removing the /sys/fs/cgroup/devices entries while the VM is running.
This is very similar to the issue brought up in bug 1139223.
Talked to folks in #systemd on freenode and an email to the systemd-devel ML was recommended: http://lists.freedesktop.org/archives/systemd-devel/2014-November/024886.html
Fixed in git. (0cd385d31814c8c1bc0c81d11ef321036b8b0921)