Bug 1760233 - Unable to complete install: 'Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory'
Status: CLOSED NEXTRELEASE
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Pavel Hrdina
 
Reported: 2019-10-10 08:51 UTC by Chris Marusich
Modified: 2019-11-15 14:48 UTC (History)

Fixed In Version: libvirt-5.10.0
Last Closed: 2019-11-15 14:48:06 UTC

Attachments:
libvirtd logs showing the error (186.99 KB, application/gzip)
2019-10-10 08:51 UTC, Chris Marusich
Systemd references in non-systemd configurations (1.68 KB, patch)
2019-10-28 11:45 UTC, Miguel Arruga Vivas
vircgroup: Ensure /machine group is associated with its parent. (1.56 KB, patch)
2019-11-01 16:50 UTC, Miguel Arruga Vivas

Description Chris Marusich 2019-10-10 08:51:05 UTC
Created attachment 1624193 [details]
libvirtd logs showing the error

The Guix project has discovered a possible bug in libvirt, as reported here:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=36634

Description of problem:

When creating a new domain using virt-manager, the following error occurs:

Unable to complete install: 'Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory'

Traceback (most recent call last):
  File "/gnu/store/ffbzcig3qdby93rsx5b43kscvp1k5pfh-virt-manager-2.1.0/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/gnu/store/ffbzcig3qdby93rsx5b43kscvp1k5pfh-virt-manager-2.1.0/share/virt-manager/virtManager/create.py", line 2122, in _do_async_install
    guest.installer_instance.start_install(guest, meter=meter)
  File "/gnu/store/ffbzcig3qdby93rsx5b43kscvp1k5pfh-virt-manager-2.1.0/share/virt-manager/virtinst/installer.py", line 415, in start_install
    doboot, transient)
  File "/gnu/store/ffbzcig3qdby93rsx5b43kscvp1k5pfh-virt-manager-2.1.0/share/virt-manager/virtinst/installer.py", line 358, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/gnu/store/1lwai2bkvkm0d5vvfqscpil2lwc2kqq5-python-libvirt-5.5.0/lib/python3.7/site-packages/libvirt.py", line 3840, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirt.libvirtError: Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory

Version-Release number of selected component (if applicable):

libvirt version 5.5.0

I have also verified that the problem occurs in libvirt 5.6.0, but the
logs etc. below are from a Guix system using libvirt 5.5.0.

How reproducible:

This behavior is consistently reproducible on Guix.  Here is all the information, including reproduction steps, that I have:

Neither /sys/fs/cgroup/machine/cgroup.controllers nor /sys/fs/cgroup/machine.slice/cgroup.controllers exist on my system:

guest@gnu ~$ ls /sys/fs/cgroup/machine/cgroup.controllers
ls: cannot access '/sys/fs/cgroup/machine/cgroup.controllers': No such file or directory
guest@gnu ~$ ls /sys/fs/cgroup/machine.slice/cgroup.controllers
ls: cannot access '/sys/fs/cgroup/machine.slice/cgroup.controllers': No such file or directory
guest@gnu ~$ 

Here is the output of mount and the contents of /proc/mounts:

guest@gnu ~$ mount
none on /proc type proc (rw,relatime)
none on /dev type devtmpfs (rw,relatime,size=736152k,nr_inodes=184038,mode=755)
none on /sys type sysfs (rw,relatime)
/dev/vda1 on / type ext4 (rw,relatime)
none on /dev/pts type devpts (rw,relatime,gid=996,mode=620,ptmxmode=000)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,relatime)
/dev/vda1 on /gnu/store type ext4 (ro,relatime)
none on /run/systemd type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
none on /run/user type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime)
cgroup on /sys/fs/cgroup/elogind type cgroup (rw,relatime,name=elogind)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=148776k,mode=700,uid=1000,gid=998)
guest@gnu ~$ cat /proc/mounts
none /proc proc rw,relatime 0 0
none /dev devtmpfs rw,relatime,size=736152k,nr_inodes=184038,mode=755 0 0
none /sys sysfs rw,relatime 0 0
/dev/vda1 / ext4 rw,relatime 0 0
none /dev/pts devpts rw,relatime,gid=996,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
/dev/vda1 /gnu/store ext4 ro,relatime 0 0
none /run/systemd tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0
cgroup /sys/fs/cgroup tmpfs rw,relatime 0 0
cgroup /sys/fs/cgroup/elogind cgroup rw,relatime,name=elogind 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=148776k,mode=700,uid=1000,gid=998 0 0
guest@gnu ~$ 
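
The check above can also be scripted. The following is only a minimal sketch, not part of libvirt: it scans /proc/mounts-style text for cgroup2 mount points and reports which "machine" partition controller files are absent. The function names and the SAMPLE text (three lines copied from the output above) are illustrative assumptions.

```python
import os


def find_cgroup2_mounts(mounts_text):
    """Return the mount point of every cgroup2 filesystem in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "cgroup2":
            points.append(fields[1])
    return points


def missing_controllers_files(mount_points, partition="machine"):
    """List the <mount>/<partition>/cgroup.controllers paths that do not exist."""
    paths = [os.path.join(p, partition, "cgroup.controllers") for p in mount_points]
    return [p for p in paths if not os.path.exists(p)]


# Three lines copied from the /proc/mounts output in this report.
SAMPLE = """\
cgroup /sys/fs/cgroup tmpfs rw,relatime 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
"""

if __name__ == "__main__":
    mounts = find_cgroup2_mounts(SAMPLE)
    print("cgroup2 mount points:", mounts)
    print("missing:", missing_controllers_files(mounts))
```

On a live system, the text would be read from /proc/mounts instead of SAMPLE.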

These are the directories and files under /sys/fs/cgroup:

guest@gnu ~$ find /sys/fs/cgroup
/sys/fs/cgroup
/sys/fs/cgroup/unified
/sys/fs/cgroup/unified/cgroup.procs
/sys/fs/cgroup/unified/cgroup.max.descendants
/sys/fs/cgroup/unified/cgroup.stat
/sys/fs/cgroup/unified/cgroup.threads
/sys/fs/cgroup/unified/cgroup.controllers
/sys/fs/cgroup/unified/cgroup.subtree_control
/sys/fs/cgroup/unified/cgroup.max.depth
/sys/fs/cgroup/perf_event
/sys/fs/cgroup/perf_event/cgroup.procs
/sys/fs/cgroup/perf_event/cgroup.sane_behavior
/sys/fs/cgroup/perf_event/tasks
/sys/fs/cgroup/perf_event/notify_on_release
/sys/fs/cgroup/perf_event/release_agent
/sys/fs/cgroup/perf_event/cgroup.clone_children
/sys/fs/cgroup/blkio
/sys/fs/cgroup/blkio/cgroup.procs
/sys/fs/cgroup/blkio/blkio.throttle.read_iops_device
/sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes
/sys/fs/cgroup/blkio/cgroup.sane_behavior
/sys/fs/cgroup/blkio/blkio.throttle.write_iops_device
/sys/fs/cgroup/blkio/blkio.reset_stats
/sys/fs/cgroup/blkio/blkio.throttle.read_bps_device
/sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
/sys/fs/cgroup/blkio/tasks
/sys/fs/cgroup/blkio/notify_on_release
/sys/fs/cgroup/blkio/release_agent
/sys/fs/cgroup/blkio/cgroup.clone_children
/sys/fs/cgroup/blkio/blkio.throttle.io_serviced
/sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes_recursive
/sys/fs/cgroup/blkio/blkio.throttle.io_serviced_recursive
/sys/fs/cgroup/freezer
/sys/fs/cgroup/freezer/cgroup.procs
/sys/fs/cgroup/freezer/cgroup.sane_behavior
/sys/fs/cgroup/freezer/tasks
/sys/fs/cgroup/freezer/notify_on_release
/sys/fs/cgroup/freezer/release_agent
/sys/fs/cgroup/freezer/cgroup.clone_children
/sys/fs/cgroup/devices
/sys/fs/cgroup/devices/cgroup.procs
/sys/fs/cgroup/devices/devices.deny
/sys/fs/cgroup/devices/cgroup.sane_behavior
/sys/fs/cgroup/devices/devices.list
/sys/fs/cgroup/devices/devices.allow
/sys/fs/cgroup/devices/tasks
/sys/fs/cgroup/devices/notify_on_release
/sys/fs/cgroup/devices/release_agent
/sys/fs/cgroup/devices/cgroup.clone_children
/sys/fs/cgroup/memory
/sys/fs/cgroup/memory/cgroup.procs
/sys/fs/cgroup/memory/memory.use_hierarchy
/sys/fs/cgroup/memory/memory.kmem.tcp.usage_in_bytes
/sys/fs/cgroup/memory/memory.soft_limit_in_bytes
/sys/fs/cgroup/memory/cgroup.sane_behavior
/sys/fs/cgroup/memory/memory.force_empty
/sys/fs/cgroup/memory/memory.pressure_level
/sys/fs/cgroup/memory/memory.move_charge_at_immigrate
/sys/fs/cgroup/memory/memory.kmem.tcp.max_usage_in_bytes
/sys/fs/cgroup/memory/memory.max_usage_in_bytes
/sys/fs/cgroup/memory/memory.oom_control
/sys/fs/cgroup/memory/memory.stat
/sys/fs/cgroup/memory/memory.kmem.slabinfo
/sys/fs/cgroup/memory/memory.limit_in_bytes
/sys/fs/cgroup/memory/memory.swappiness
/sys/fs/cgroup/memory/memory.numa_stat
/sys/fs/cgroup/memory/memory.kmem.failcnt
/sys/fs/cgroup/memory/memory.kmem.max_usage_in_bytes
/sys/fs/cgroup/memory/memory.usage_in_bytes
/sys/fs/cgroup/memory/tasks
/sys/fs/cgroup/memory/memory.failcnt
/sys/fs/cgroup/memory/cgroup.event_control
/sys/fs/cgroup/memory/memory.kmem.tcp.failcnt
/sys/fs/cgroup/memory/memory.kmem.limit_in_bytes
/sys/fs/cgroup/memory/notify_on_release
/sys/fs/cgroup/memory/release_agent
/sys/fs/cgroup/memory/memory.kmem.usage_in_bytes
/sys/fs/cgroup/memory/memory.kmem.tcp.limit_in_bytes
/sys/fs/cgroup/memory/cgroup.clone_children
/sys/fs/cgroup/cpuacct
/sys/fs/cgroup/cpuacct/cgroup.procs
/sys/fs/cgroup/cpuacct/cgroup.sane_behavior
/sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys
/sys/fs/cgroup/cpuacct/cpuacct.usage_percpu
/sys/fs/cgroup/cpuacct/cpuacct.stat
/sys/fs/cgroup/cpuacct/cpuacct.usage
/sys/fs/cgroup/cpuacct/tasks
/sys/fs/cgroup/cpuacct/cpuacct.usage_sys
/sys/fs/cgroup/cpuacct/cpuacct.usage_all
/sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user
/sys/fs/cgroup/cpuacct/notify_on_release
/sys/fs/cgroup/cpuacct/release_agent
/sys/fs/cgroup/cpuacct/cgroup.clone_children
/sys/fs/cgroup/cpuacct/cpuacct.usage_user
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpu/cgroup.procs
/sys/fs/cgroup/cpu/cpu.cfs_period_us
/sys/fs/cgroup/cpu/cgroup.sane_behavior
/sys/fs/cgroup/cpu/cpu.stat
/sys/fs/cgroup/cpu/cpu.shares
/sys/fs/cgroup/cpu/cpu.cfs_quota_us
/sys/fs/cgroup/cpu/tasks
/sys/fs/cgroup/cpu/notify_on_release
/sys/fs/cgroup/cpu/release_agent
/sys/fs/cgroup/cpu/cgroup.clone_children
/sys/fs/cgroup/cpuset
/sys/fs/cgroup/cpuset/cgroup.procs
/sys/fs/cgroup/cpuset/cgroup.sane_behavior
/sys/fs/cgroup/cpuset/cpuset.memory_pressure
/sys/fs/cgroup/cpuset/cpuset.memory_migrate
/sys/fs/cgroup/cpuset/cpuset.memory_pressure_enabled
/sys/fs/cgroup/cpuset/cpuset.mem_exclusive
/sys/fs/cgroup/cpuset/cpuset.memory_spread_slab
/sys/fs/cgroup/cpuset/cpuset.cpu_exclusive
/sys/fs/cgroup/cpuset/tasks
/sys/fs/cgroup/cpuset/cpuset.effective_mems
/sys/fs/cgroup/cpuset/cpuset.effective_cpus
/sys/fs/cgroup/cpuset/notify_on_release
/sys/fs/cgroup/cpuset/release_agent
/sys/fs/cgroup/cpuset/cpuset.sched_load_balance
/sys/fs/cgroup/cpuset/cpuset.mems
/sys/fs/cgroup/cpuset/cpuset.mem_hardwall
/sys/fs/cgroup/cpuset/cpuset.sched_relax_domain_level
/sys/fs/cgroup/cpuset/cpuset.cpus
/sys/fs/cgroup/cpuset/cgroup.clone_children
/sys/fs/cgroup/cpuset/cpuset.memory_spread_page
/sys/fs/cgroup/elogind
/sys/fs/cgroup/elogind/cgroup.procs
/sys/fs/cgroup/elogind/cgroup.sane_behavior
/sys/fs/cgroup/elogind/c1
/sys/fs/cgroup/elogind/c1/cgroup.procs
/sys/fs/cgroup/elogind/c1/tasks
/sys/fs/cgroup/elogind/c1/notify_on_release
/sys/fs/cgroup/elogind/c1/cgroup.clone_children
/sys/fs/cgroup/elogind/tasks
/sys/fs/cgroup/elogind/notify_on_release
/sys/fs/cgroup/elogind/release_agent
/sys/fs/cgroup/elogind/cgroup.clone_children
guest@gnu ~$ 

I enabled debug logging as described in:

https://wiki.libvirt.org/page/DebugLogs

And here are what I believe to be the relevant parts of the logs.

Daemon logs (see attached file libvirtd.log.gz for full contents - note that I ran through the reproduction steps multiple times, so there is more than one occurrence of the problem in the logs):

2019-10-10 06:56:02.818+0000: 332: error : virFileReadAll:1431 : Failed to open file '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory
2019-10-10 06:56:02.818+0000: 332: error : virCgroupV2ParseControllersFile:260 : Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory
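
Because the attached log is large, a small filter helps to pull out only the error records. This is a rough sketch that assumes every line follows the layout seen above (timestamp, a numeric ID that is presumably the thread ID, level, function:line, message); the regular expression and function names are my own and not a libvirt API.

```python
import re

# Assumed layout: "<date> <time>: <id>: <level> : <function>:<line> : <message>"
LOG_LINE_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+): "
    r"(?P<thread_id>\d+): "
    r"(?P<level>\w+) : "
    r"(?P<function>\w+):(?P<line>\d+) : "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["thread_id"] = int(record["thread_id"])
    record["line"] = int(record["line"])
    return record


def errors_only(lines):
    """Keep only the parsed records whose level is 'error'."""
    parsed = (parse_log_line(entry) for entry in lines)
    return [r for r in parsed if r is not None and r["level"] == "error"]


# Lines copied from the logs quoted in this report.
SAMPLE_LINES = [
    "2019-10-10 06:45:05.116+0000: 335: debug : virCgroupNewMachineManual:1201 : Fallback to non-systemd setup",
    "2019-10-10 06:56:02.818+0000: 332: error : virFileReadAll:1431 : Failed to open file '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory",
    "2019-10-10 06:56:02.818+0000: 332: error : virCgroupV2ParseControllersFile:260 : Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory",
]

if __name__ == "__main__":
    for record in errors_only(SAMPLE_LINES):
        print(record["function"], record["line"], record["message"])
```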

If you would like to try reproducing this error yourself in a Guix system, you can do so by using the following VM image:

https://media.marusich.info/libvirt-bug-repro.tar.gz

That tarball is about 1 GB in size, and its SHA-512 hash is:

3691e6bfe9f1dae0a7c501772eff3b2525eaff6f929dc21a755171d444c07b3d9b026d443cb46446f455755227c75c75b37d909bd358f1c498b0604c3be2c61c
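
To check a downloaded copy against that hash, something like the following sketch should work. It uses only Python's standard hashlib module; the local file name is an assumption based on the URL above.

```python
import hashlib

# SHA-512 hash quoted in this report.
EXPECTED_SHA512 = (
    "3691e6bfe9f1dae0a7c501772eff3b2525eaff6f929dc21a755171d444c07b3d"
    "9b026d443cb46446f455755227c75c75b37d909bd358f1c498b0604c3be2c61c"
)


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so a ~1 GB tarball is never loaded into memory at once."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA512):
    """Return True when the file's SHA-512 digest equals the expected hex string."""
    return sha512_of_file(path) == expected.lower()


if __name__ == "__main__":
    import sys

    # Assumed file name; pass a different path as the first argument if needed.
    target = sys.argv[1] if len(sys.argv) > 1 else "libvirt-bug-repro.tar.gz"
    try:
        print("OK" if matches_expected(target) else "MISMATCH")
    except FileNotFoundError:
        print("file not found:", target)
```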

Alternatively, you can create your own copy of the same VM image using
Guix by following these steps:

- Install Guix, or boot up the vanilla pre-built QEMU image in a VM: https://guix.gnu.org/
- Put the following into a file named config.scm (this fully describes the OS to be built):

;; This is an operating system configuration for a VM image.
;; Modify it as you see fit and instantiate the changes by running:
;;
;;   guix system reconfigure /etc/config.scm
;;

(use-modules (gnu) (guix) (srfi srfi-1))
(use-service-modules virtualization desktop networking ssh xorg)
(use-package-modules virtualization bootloaders certs fonts nvi
                     package-management wget xorg)

(define vm-image-motd (plain-file "motd" "
\x1b[1;37mThis is the GNU system.  Welcome!\x1b[0m

This instance of Guix is a template for virtualized environments.
You can reconfigure the whole system by adjusting /etc/config.scm
and running:

  guix system reconfigure /etc/config.scm

Run '\x1b[1;37minfo guix\x1b[0m' to browse documentation.

\x1b[1;33mConsider setting a password for the 'root' and 'guest' \
accounts.\x1b[0m
"))

(define this-file
  (local-file (basename (assoc-ref (current-source-location) 'filename))
              "config.scm"))


(operating-system
 (host-name "gnu")
 (timezone "Etc/UTC")
 (locale "en_US.utf8")
 (keyboard-layout (keyboard-layout "us" "altgr-intl"))

 ;; Label for the GRUB boot menu.
 (label (string-append "GNU Guix " (package-version guix)))

 (firmware '())

 ;; Below we assume /dev/vda is the VM's hard disk.
 ;; Adjust as needed.
 (bootloader (bootloader-configuration
              (bootloader grub-bootloader)
              (target "/dev/vda")
              (terminal-outputs '(console))))
 (file-systems (cons (file-system
                      (mount-point "/")
                      (device "/dev/vda1")
                      (type "ext4"))
                     %base-file-systems))

 (users (cons (user-account
               (name "guest")
               (comment "GNU Guix Live")
               (password "")            ;no password
               (group "users")
               (supplementary-groups '("wheel" "netdev"
                                       "audio" "video"
                                       "libvirt")))
              %base-user-accounts))

 ;; Our /etc/sudoers file.  Since 'guest' initially has an empty password,
 ;; allow for password-less sudo.
 (sudoers-file (plain-file "sudoers" "\
root ALL=(ALL) ALL
%wheel ALL=NOPASSWD: ALL\n"))

 (packages (append (list virt-manager font-bitstream-vera nss-certs nvi
                         wget)
                   %base-packages))

 (services
  (append (list (service xfce-desktop-service-type)

                ;; Copy this file to /etc/config.scm in the OS.
                (simple-service 'config-file etc-service-type
                                `(("config.scm" ,this-file)))

                ;; Choose SLiM, which is lighter than the default GDM.
                (service slim-service-type
                         (slim-configuration
                          (auto-login? #t)
                          (default-user "guest")
                          (xorg-configuration
                           (xorg-configuration
                            (keyboard-layout keyboard-layout)))))

                ;; Add an SSH server that permits empty passwords.
                (service openssh-service-type
                         (openssh-configuration
                          (allow-empty-passwords? #t)))

                ;; Use the DHCP client service rather than NetworkManager.
                (service dhcp-client-service-type))

          ;; Remove GDM, ModemManager, NetworkManager, and wpa-supplicant,
          ;; which don't make sense in a VM.
          (append
           (list (service libvirt-service-type
                          (libvirt-configuration
                           (unix-sock-group "libvirt")
                           (log-filters
                            "3:remote 4:event 3:util.json 3:rpc 1:*")
                           (log-outputs
                            "1:file:/var/run/libvirt/libvirtd.log")))
                 (service virtlog-service-type))
           (remove (lambda (service)
                     (let ((type (service-kind service)))
                       (or (memq type
                                 (list gdm-service-type
                                       wpa-supplicant-service-type
                                       cups-pk-helper-service-type
                                       network-manager-service-type
                                       modem-manager-service-type))
                           (eq? 'network-manager-applet
                                (service-type-name type)))))
                   (modify-services %desktop-services
                                    (login-service-type config =>
                                                        (login-configuration
                                                         (inherit config)
                                                         (motd vm-image-motd))))))))

 ;; Allow resolution of '.local' host names with mDNS.
 (name-service-switch %mdns-host-lookup-nss))

- Upgrade Guix to a recent version (this is the one I'm using, which causes libvirt 5.5.0 to be used):
    guix pull --commit=458fe419232844d2021608d20dcd8f6e095eb2b4
- Build a VM image (this will take a long time):
    cp $(guix system vm-image --image-size=10GiB config.scm) qemu-image
- Make the file usable:
    sudo chown $(whoami) qemu-image && chmod 644 qemu-image
- Launch the VM:
    qemu-system-x86_64 \
        -net user,hostfwd=tcp:127.0.0.1:2222-:22 \
        -net nic,model=virtio \
        -enable-kvm -m 1500 \
        -smp 1 \
        -device virtio-blk,drive=myhd \
        -drive if=none,file=qemu-image,id=myhd
- Start virt-manager.
- Create a new connection (File > Add Connection) with the default values; you won't be able to create a new domain unless you do this first.
- Try to create a new domain using any random installer ISO from the Internet.

At this point, it should fail with the messages shown above.

Actual results:

The error message shown above is printed, and the domain does not get created successfully.

Expected results:

The error does not occur and the domain gets created successfully.

Additional info:

The problem does not occur when using libvirt version 5.4.0.  To see this, you can build a VM and follow the reproduction steps above, replacing Guix commit 458fe419232844d2021608d20dcd8f6e095eb2b4 with commit 03b6c474454c1f90466435e872a005e296ddcbd0 (which will cause libvirt 5.4.0 to be used).  You will find that the error does not occur when you try to create a domain from within that newly built VM.

From a shell, on a "bad" Guix system where the problem occurs, the content of /proc/self/cgroup is:

9:perf_event:/
8:blkio:/
7:freezer:/
6:devices:/
5:memory:/
4:cpuacct:/
3:cpu:/
2:cpuset:/
1:name=elogind:/c2
0::/

And on a "good" Guix system where the problem does not occur, the content of /proc/self/cgroup is the same:

9:perf_event:/
8:blkio:/
7:freezer:/
6:devices:/
5:memory:/
4:cpuacct:/
3:cpu:/
2:cpuset:/
1:name=elogind:/c2
0::/
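
For readers unfamiliar with this file, each line has the form hierarchy-ID:controller-list:path, and the cgroup v2 (unified) entry is the one with hierarchy ID 0 and an empty controller list. A small parsing sketch, with illustrative function names of my own:

```python
def parse_proc_cgroup(text):
    """Parse /proc/<pid>/cgroup text into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only, since the path may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            {
                "hierarchy_id": int(hierarchy_id),
                "controllers": controllers.split(",") if controllers else [],
                "path": path,
            }
        )
    return entries


def unified_path(entries):
    """Return the path of the cgroup v2 entry (ID 0, no controllers), or None."""
    for entry in entries:
        if entry["hierarchy_id"] == 0 and not entry["controllers"]:
            return entry["path"]
    return None


# Lines copied from the /proc/self/cgroup output in this report.
SAMPLE = """\
3:cpu:/
2:cpuset:/
1:name=elogind:/c2
0::/
"""

if __name__ == "__main__":
    parsed = parse_proc_cgroup(SAMPLE)
    print(parsed)
    print("unified path:", unified_path(parsed))
```

Since the parsed result is identical on the good and bad systems, the process's own cgroup membership does not distinguish them.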

Finally, I should also note that Guix uses elogind, rather than logind.  Guix does not use systemd.

Thank you for your help!  If you need more information, please don't hesitate to ask.

Comment 1 Cole Robinson 2019-10-10 10:25:14 UTC
CCing phrdina. This looks like a different root issue than the one we fixed in Fedora.

Comment 2 Miguel Arruga Vivas 2019-10-28 11:43:26 UTC
As you can read here [1], the core issue seems to be that the generic layout without systemd is exactly the same as the layout with systemd, minus the .slice part (and, as far as I understand from the code, minus the delegation on slice creation).  In virCgroupV2MakeGroup (src/util/vircgroupv2.c:429 [2]) a mkdir is performed on the new sub-group, but I cannot find any check for the partition itself in virCgroupV2Available; that is the error reproducible with the Guix image provided.  The function virCgroupSetPartitionSuffix (src/util/vircgroup.c:789 [3]) refers to these directories as fixed, even though only systemd creates them.

As far as I understand, the manual should not refer to user and system cgroups on non-systemd systems; the fact that it does is clearly a bug.  I'm attaching a patch for that.

On the other hand, the libvirt maintainers need to clarify one point: either libvirt creates the /machine group when not running on systemd, or the manual should specify that creating it is a configuration step that system administrators and/or distributions must perform before starting libvirtd, as is already specified for custom partitions.

[1] https://libvirt.org/cgroups.html#currentLayoutGeneric
[2] https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircgroupv2.c;h=e0362990ab3ff669a38df0988f509359bb0cbbe6;hb=HEAD#l429
[3] https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircgroup.c;h=b46f20abfda626956b8a8c96d99c4c8c45800ddc;hb=HEAD#l789

Comment 3 Miguel Arruga Vivas 2019-10-28 11:45:57 UTC
Created attachment 1629745 [details]
Systemd references in non-systemd configurations

Comment 4 Chris Marusich 2019-10-30 07:48:30 UTC
Hi Miguel,

Thank you for the analysis!  I just walked through the code myself, using the debug output I attached earlier as a guide.  I examined the following version of the source (commit bafb3d1fbef9eac49230015b2fdbe60ceb1673b8, which was tagged with v5.6.0):

https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircgroup.c;h=825f62a97b9cf637cf3664fcbce0522ddf15ef31;hb=bafb3d1fbef9eac49230015b2fdbe60ceb1673b8#l827
https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/vircgroupv2.c;h=e36c36685b6d0332d6310fe633fcfb4fbfd2f1de;hb=bafb3d1fbef9eac49230015b2fdbe60ceb1673b8

It looks to me like your assessment is correct: even though the directory /sys/fs/cgroup/unified/machine does not exist, libvirt does not create it.  The debug output I attached earlier clearly shows the following functions being executed:

2019-10-10 06:45:05.116+0000: 335: debug : virCgroupNewMachineManual:1201 : Fallback to non-systemd setup
2019-10-10 06:45:05.116+0000: 335: debug : virCgroupNewPartition:849 : path=/machine create=1 controllers=ffffffff
2019-10-10 06:45:05.116+0000: 335: debug : virCgroupNew:678 : pid=-1 path=/machine parent=(nil) controllers=-1 group=0x7f28680c77e0
[... some lines omitted ...]
2019-10-10 06:45:05.118+0000: 335: error : virFileReadAll:1431 : Failed to open file '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory
2019-10-10 06:45:05.118+0000: 335: error : virCgroupV2ParseControllersFile:260 : Unable to read from '/sys/fs/cgroup/unified/machine/cgroup.controllers': No such file or directory

In fact, there IS a potential call path from virCgroupNewPartition to virCgroupMakeGroup to virCgroupV2MakeGroup, where mkdir is actually called.  It looks like that mkdir invocation is intended to create the /sys/fs/cgroup/unified/machine directory, but I'm not 100% sure.  In any case, libvirt never reaches that mkdir call when this problem occurs, since (in the virCgroupNewPartition function) the first call to virCgroupNew results in the error above.

The fact that virCgroupNewPartition has code to call virCgroupMakeGroup (and thus mkdir) makes me think that maybe libvirt intends to create the missing /sys/fs/cgroup/unified/machine directory but fails to do so.  In any case, I agree that it would be helpful if the libvirt maintainers would clarify the intended behavior.  If libvirt can be fixed to create the expected cgroups automatically, that would be great; however, if GNU/Linux distributions like Guix are expected to create the necessary cgroups before running libvirt, then we need clear guidance as to what needs to be created in this situation.

-- 
Chris
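
The ordering problem described in comment 4 can be modelled in a few lines. This is a deliberately simplified sketch, not libvirt's code: a temporary directory stands in for /sys/fs/cgroup/unified, and all function names are hypothetical. It only shows why reading the new group's cgroup.controllers before any mkdir fails, whereas consulting the parent group (which already exists) first does not.

```python
import os
import tempfile


def read_controllers(group_dir):
    """Read the space-separated controller list of one group directory."""
    with open(os.path.join(group_dir, "cgroup.controllers")) as handle:
        return handle.read().split()


def new_partition_detect_first(root, name):
    """Model of the failing order: inspect the new group before creating it."""
    group_dir = os.path.join(root, name)
    controllers = read_controllers(group_dir)  # raises FileNotFoundError if missing
    os.makedirs(group_dir, exist_ok=True)      # never reached in that case
    return controllers


def new_partition_parent_first(root, name):
    """Model of an alternative order: consult the existing parent, then mkdir."""
    controllers = read_controllers(root)
    os.makedirs(os.path.join(root, name), exist_ok=True)
    return controllers


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Stand-in for the root cgroup.controllers file, which exists on the bad system.
        with open(os.path.join(root, "cgroup.controllers"), "w") as handle:
            handle.write("cpu memory\n")
        try:
            new_partition_detect_first(root, "machine")
        except FileNotFoundError as err:
            print("detect-first fails:", err.strerror)
        print("parent-first:", new_partition_parent_first(root, "machine"))
```

On a real cgroup v2 filesystem the kernel populates cgroup.controllers in a newly created directory, which this model does not attempt to reproduce.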

Comment 5 Miguel Arruga Vivas 2019-11-01 16:50:04 UTC
Created attachment 1631560 [details]
vircgroup: Ensure /machine group is associated with its parent.

This patch fixes the problem by providing the parent to the group creation.  I've tested on the master branch.  What do you think?

Comment 6 Miguel Arruga Vivas 2019-11-01 16:57:10 UTC
(In reply to Chris Marusich from comment #4)
> (...)
> The fact that virCgroupNewPartition has code to call virCgroupMakeGroup (and
> thus mkdir) makes me think that maybe libvirt intends to create the missing
> /sys/fs/cgroup/unified/machine directory but fails to do so.  In any case, I
> agree that it would be helpful if the libvirt maintainers would clarify the
> intended behavior.  If libvirt can be fixed to create the expected cgroups
> automatically, that would be great; however, if GNU/Linux distributions like
> Guix are expected to create the necessary cgroups before running libvirt,
> then we need clear guidance as to what needs to be created in this situation.

Thank you very much for your analysis too.  The intent is clear, as the create parameter is provided by the expression STREQ(partition, "/machine") in virCgroupNewMachineManual.  The simplest way to solve it I've found was the one in the patch.

Best regards,
Miguel

Comment 7 Pavel Hrdina 2019-11-04 13:11:35 UTC
(In reply to Miguel Arruga Vivas from comment #5)
> Created attachment 1631560 [details]
> vircgroup: Ensure /machine group is associated with its parent.
> 
> This patch fixes the problem by providing the parent to the group creation. 
> I've tested on the master branch.  What do you think?

Thanks for the patch; I'll check whether it fixes this issue.
However, the libvirt project accepts patches that are sent to libvir-list,
and we have some guidelines on how to submit a patch [1].

Can you please post the patch to the mailing list?  A few notes: we require
the Developer Certificate of Origin [2], so please add a Signed-off-by line
with your real name to the commit message; also, the subject of the commit
message should not end with a dot.

Thanks,
Pavel


[1] <https://libvirt.org/hacking.html>
[2] <https://developercertificate.org/>

Comment 8 Michal Privoznik 2019-11-15 14:48:06 UTC
Pushed to master as:

a74df786a2 vircgroup: Ensure /machine group is associated with its parent
ddcb33bdc0 doc: cgroups: Remove unwanted references to systemd

v5.9.0-251-ga74df786a2
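
The last line above appears to be in "git describe" format: the nearest tag, the number of commits since that tag, and "g" followed by the abbreviated commit hash. A small sketch for splitting it apart (the function name is my own):

```python
import re

# "<tag>-<commits since tag>-g<abbreviated hash>", as printed by "git describe".
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<commit>[0-9a-f]+)$")


def parse_git_describe(text):
    """Split a git-describe style string into its parts, or return None."""
    match = DESCRIBE_RE.match(text.strip())
    if match is None:
        return None
    return {
        "tag": match.group("tag"),
        "commits_after_tag": int(match.group("count")),
        "abbrev_commit": match.group("commit"),
    }


if __name__ == "__main__":
    print(parse_git_describe("v5.9.0-251-ga74df786a2"))
```

Read this way, the fix landed 251 commits after the v5.9.0 tag, which is consistent with the "Fixed In Version: libvirt-5.10.0" field.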

