Bug 2228376
| Summary: | "virsh cpu-stats" command fails after tripleo_nova_libvirt service is restarted | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alex Stupnikov <astupnik> | ||||
| Component: | openstack-tripleo-heat-templates | Assignee: | OSP Team <rhos-maint> | ||||
| Status: | NEW --- | QA Contact: | Joe H. Rahme <jhakimra> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 16.2 (Train) | CC: | kchamart, mburns, mprivozn, smooney, tkajinam | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | Bug | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description (Alex Stupnikov, 2023-08-02 08:49:44 UTC)
I have managed to work around this issue by appending the following lines to /etc/systemd/system/tripleo_nova_libvirt.service and running systemctl daemon-reload; new VMs were no longer affected by libvirt restarts:

[Service]
Delegate=yes

I am not sure if this should be used as an official workaround until a fix is delivered.

One of the libvirt developers pointed me to this upstream libvirt fix:
https://gitlab.com/libvirt/libvirt/-/commit/8865c42771600a4 -- qemu:
Provide virDomainGetCPUStats() implementation for session connection
That said, the above fix is for the libvirt "session" connection (i.e.
non-root), and the libvirt commit message says "system" connections
(i.e. root) work fine.
The nova_libvirt container (which is running the libvirt daemon and QEMU
instances) runs with a "system" connection (i.e. privileged). So I'm not
sure the above libvirt fix is the solution here.
@mprivozn: Hi, Michal; does the above make sense to you? (Including
Alex's workaround in comment #2.)
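For reference, Alex's Delegate=yes workaround can also be expressed as a systemd drop-in instead of editing the unit file directly. This is only a sketch: the drop-in is written to a temporary directory for demonstration, whereas on a real compute node the target would be /etc/systemd/system/tripleo_nova_libvirt.service.d.

```shell
# Sketch only: apply the Delegate=yes workaround as a systemd drop-in.
# On a real node the directory would be:
#   /etc/systemd/system/tripleo_nova_libvirt.service.d
# Here we write to a temporary directory purely for demonstration.
unit_dir=$(mktemp -d)
cat > "$unit_dir/delegate.conf" <<'EOF'
[Service]
Delegate=yes
EOF
cat "$unit_dir/delegate.conf"
# On a real node, follow with:
#   systemctl daemon-reload && systemctl restart tripleo_nova_libvirt
```

Delegate=yes tells systemd to hand control of the unit's cgroup subtree to the service itself, which is why it plausibly interacts with libvirt's cgroup placement here.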
Meanwhile, I learnt something interesting about libvirt 'system'
connections inside a container from Andrea Bolognani of the libvirt
team:
libvirt's 'system' connection inside a container can sometimes
behave somewhat more similar to 'session' connections _outside_ the
container. That's because, regardless of the shenanigans the
container runtime will perform to make processes believe that they
are running with elevated privileges, a bunch of the actual
capabilities will be missing.
We're not saying this is the case here, just a potential angle to
consider.
(For those following along, here's a quick primer on system vs session
operating modes: https://libvirt.org/daemons.html#operating-modes.)
(In reply to Kashyap Chamarthy from comment #3)
> One of the libvirt developers pointed me to this upstream libvirt fix:
>
> https://gitlab.com/libvirt/libvirt/-/commit/8865c42771600a4 -- qemu:
> Provide virDomainGetCPUStats() implementation for session connection

This is unrelated, because OSP runs privileged libvirt, i.e. it connects to
qemu:///system. The commit above was made specifically for KubeVirt's needs,
as they connect to qemu:///session. Though it may help, because the same code
is used for both connections.

> That said, the above fix is for the libvirt "session" connection (i.e.
> non-root), and the libvirt commit message says "system" connections
> (i.e. root) work fine.
>
> The nova_libvirt container (which is running the libvirt daemon and QEMU
> instances) runs with "system" connection (i.e. privileged). So I'm not
> sure if it's the above libvirt fix is the solution here.

You are correct, and the commit is unrelated. The real problem here is why
libvirt fails to detect cgroups after it was restarted. For that I'd need to
see debug logs.

(In reply to Michal Privoznik from comment #5)
> This is unrelated, because OSP runs privileged libvirt, i.e. it connects to
> qemu:///system. The commit above was made specifically for KubeVirt's needs,
> as they connect to qemu:///session. Though it may help, because the same
> code is used for both connections.

What I forgot to mention is: a plain 'virsh cpu-stats $domain' won't work for
qemu:///session, because by default it tries to get per-host-CPU stats for the
guest (i.e. how much time the guest consumed on host CPU#0, CPU#1, etc.), and
without the cpu accounting controller there is (possibly) no way to get that
information.
BUT, what is possible is to get overall statistics (i.e. how much time the
guest consumed on all host CPUs combined). Long story short, use
'virsh cpu-stats --total $domain'. But as I said earlier, this is just a
workaround; the real reason is that libvirt failed to detect the cgroup after
it was restarted.

Created attachment 1983541 [details]
libvirtd and instance logs
Here are libvirt logs and instance logs from my lab. The logs cover the following calls:

podman exec -it nova_libvirt virsh cpu-stats instance-0000002a
systemctl restart tripleo_nova_libvirt
podman exec -it nova_libvirt virsh cpu-stats instance-0000002a
podman restart nova_libvirt
podman exec -it nova_libvirt virsh cpu-stats instance-0000002a
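As background for reading the logs that follow: libvirt discovers a process's cgroup placement essentially by parsing /proc/PID/cgroup (hierarchy-id:controllers:path). A sketch of inspecting that placement manually, shown for the current shell as a stand-in (on the compute node you would use the QEMU pid of the instance, e.g. found via pgrep):

```shell
# Sketch: show where each cgroup controller places a process, the same
# information libvirt's placement detection reads. Each line of
# /proc/<pid>/cgroup is "hierarchy-id:controllers:path" (on cgroup v2
# there is a single "0::/path" line). Using the current shell's pid as
# a stand-in for the instance's QEMU pid.
pid=$$
awk -F: '{printf "controllers=%s path=%s\n", $2, $3}' "/proc/$pid/cgroup"
```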
I was a bit wrong about the reproducer: on a clean compute node, restarting the tripleo_nova_libvirt systemd unit alone doesn't reproduce the issue; the nova_libvirt container has to be restarted with the podman command first. (It looks like the original compute node wasn't clean, which is why the problem reproduced even though I hadn't used the podman command.)

After another try on a fresh compute node, I figured out that the first "systemctl restart tripleo_nova_libvirt" doesn't break things, but the second one does:
[root@compute-2 ~]# podman exec -it nova_libvirt virsh list
Id Name State
-----------------------------------
3 instance-00000028 running
[root@compute-2 ~]# podman exec -it nova_libvirt virsh cpu-stats instance-00000028
CPU0:
cpu_time 0.631494967 seconds
vcpu_time 0.000000000 seconds
[root@compute-2 ~]# systemctl restart tripleo_nova_libvirt
[root@compute-2 ~]# podman exec -it nova_libvirt virsh cpu-stats instance-00000028
CPU0:
cpu_time 0.650575681 seconds
vcpu_time 0.000000000 seconds
[root@compute-2 ~]# systemctl restart tripleo_nova_libvirt
[root@compute-2 ~]# podman exec -it nova_libvirt virsh cpu-stats instance-00000028
error: Failed to retrieve CPU statistics for domain 'instance-00000028'
error: Requested operation is not valid: cgroup CPUACCT controller is not mounted
Please let me know if anything else is needed.

So this is cgroups detection after the first restart:

2023-08-16 09:06:48.920+0000: 986222: debug : virGDBusIsServiceEnabled:387 : Service org.freedesktop.machine1 is unavailable
2023-08-16 09:06:48.920+0000: 986222: debug : virCgroupNewDetect:1152 : pid=982717 controllers=-1 group=0x7fb32fffe828
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupDetectPlacement:349 : Detecting placement for pid 982717 path
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 0:cpu at /sys/fs/cgroup/cpu,cpuacct in /machine/qemu-1-instance-0000002a.libvirt-qemu/emulator for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpu,cpuacct in /machine/qemu-1-instance-0000002a.libvirt-qemu/emulator for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in /machine/qemu-1-instance-0000002a.libvirt-qemu/emulator for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 3:memory at /sys/fs/cgroup/memory in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 4:devices at /sys/fs/cgroup/devices in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 5:freezer at /sys/fs/cgroup/freezer in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 6:blkio at /sys/fs/cgroup/blkio in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 7:net_cls at /sys/fs/cgroup/net_cls,net_prio in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 8:perf_event at /sys/fs/cgroup/perf_event in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 9:name=systemd at /sys/fs/cgroup/systemd in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:493 : Auto-detecting controllers
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'cpu' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'cpuacct' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'cpuset' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'memory' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'devices' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'freezer' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'blkio' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'net_cls' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'perf_event' present=yes
2023-08-16 09:06:48.921+0000: 986222: debug : virCgroupV1DetectControllers:503 : Controller 'name=systemd' present=yes

Please note the placement of controllers: /sys/fs/cgroup/$controller/machine/qemu-1-instance-0000002a.libvirt-qemu.

Now, this is cgroups detection after the second restart:

2023-08-16 09:07:19.041+0000: 986726: debug : virGDBusIsServiceEnabled:387 : Service org.freedesktop.machine1 is unavailable
2023-08-16 09:07:19.041+0000: 986726: debug : virCgroupNewDetect:1152 : pid=982717 controllers=-1 group=0x7f22f1137828
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupDetectPlacement:349 : Detecting placement for pid 982717 path
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 0:cpu at /sys/fs/cgroup/cpu,cpuacct in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpu,cpuacct in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in /machine/qemu-1-instance-0000002a.libvirt-qemu/emulator for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 3:memory at /sys/fs/cgroup/memory in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 4:devices at /sys/fs/cgroup/devices in /system.slice for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 5:freezer at /sys/fs/cgroup/freezer in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 6:blkio at /sys/fs/cgroup/blkio in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 7:net_cls at /sys/fs/cgroup/net_cls,net_prio in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 8:perf_event at /sys/fs/cgroup/perf_event in /machine/qemu-1-instance-0000002a.libvirt-qemu for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidatePlacement:423 : Detected mount/mapping 9:name=systemd at /sys/fs/cgroup/systemd in /system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope for pid 982717
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:493 : Auto-detecting controllers
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'cpu' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'cpuacct' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'cpuset' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'memory' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'devices' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'freezer' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'blkio' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'net_cls' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'perf_event' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1DetectControllers:503 : Controller 'name=systemd' present=yes
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupV1ValidateMachineGroup:161 : Name 'run-r22a2e1da04734a7aa857f08f8b7c4284.scope' for controller 'cpu' does not match 'instance-0000002a', 'qemu-1-instance-0000002a', 'instance-0000002a.libvirt-qemu', 'machine-qemu\x2dinstance\x2d0000002a.scope' or 'machine-qemu\x2d1\x2dinstance\x2d0000002a.scope'
2023-08-16 09:07:19.042+0000: 986726: debug : virCgroupNewDetectMachine:1203 : Failed to validate machine name for 'instance-0000002a' driver 'qemu'

Now, do you notice the change? SOME controllers are now in the unexpected /sys/fs/cgroup/$controller/system.slice/run-r22a2e1da04734a7aa857f08f8b7c4284.scope. That fails libvirt's validation, which in turn leads libvirt to think cgroups were mangled behind its back, so it refuses to work with such controller(s). Unfortunately, the cpuacct controller is one of those affected. I have no idea who mangled the placement or what is going on, but this is NOT a libvirt bug. If something is messing up cgroup placement, it needs to stop doing that.

Thank you very much, Michal! After discussion with Kashyap, we have decided to ask Michal why the Delegate flag changes systemd/cgroup behavior.

I have decided to take a second look at the lab to confirm that there are no possible disparities. It turns out that there are: as mentioned in comment #10, it takes two libvirt restarts using the "systemctl restart tripleo_nova_libvirt" command to reproduce the issue for a new VM, but originally I believed that only one was needed. So I got a false-positive outcome when testing the workaround: after a single restart the VM's CPU stats were still printed, while in reality the behavior was unchanged. As a result, we need to figure out why some controllers end up in the unexpected /sys/fs/cgroup/$controller/system.slice/run-*.scope after the second restart. I will try to ask support experts to help us understand this, but engineering engagement is possibly needed here.
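For reference, the name check that fails in virCgroupV1ValidateMachineGroup can be approximated in shell: libvirt compares the cgroup leaf name against the expected machine names for the domain (the candidate list below is taken verbatim from the debug log above), and the systemd-created 'run-*.scope' leaf matches none of them. This is only a rough sketch of the comparison, not libvirt's actual implementation.

```shell
# Sketch: approximate the machine-name validation that fails in the
# logs above. The leaf name and candidate names are copied from the
# debug output; the real libvirt check is in C and more involved.
leaf='run-r22a2e1da04734a7aa857f08f8b7c4284.scope'
matched=no
for expected in \
    'instance-0000002a' \
    'qemu-1-instance-0000002a' \
    'instance-0000002a.libvirt-qemu' \
    'machine-qemu\x2dinstance\x2d0000002a.scope' \
    'machine-qemu\x2d1\x2dinstance\x2d0000002a.scope'
do
    [ "$leaf" = "$expected" ] && matched=yes
done
if [ "$matched" = yes ]; then
    echo "placement valid"
else
    echo "placement invalid: '$leaf' matches no expected machine name"
fi
# → placement invalid: 'run-r22a2e1da04734a7aa857f08f8b7c4284.scope' matches no expected machine name
```

This mirrors why libvirt concludes the cgroup was changed behind its back: the leaf it finds simply is not one of the names it created for the domain.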