Bug 2143235
| Summary: | memory utilization of virtnodedevd.service is constantly growing | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Jaroslav Pulchart <jaroslav.pulchart> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| libvirt sub component: | General | QA Contact: | yalzhang <yalzhang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | berrange, bstinson, chhu, jdenemar, jsuchane, jwboyer, lhuang, lmen, mprivozn, mzhan, pkrempa, virt-maint, yafu |
| Version: | CentOS Stream | Keywords: | Triaged, Upstream |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-8.10.0-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-05-09 07:27:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 8.10.0 |
| Embargoed: | |||
I set a MemoryMax limit of 100MB for the virtnodedevd.service:

# cat /etc/systemd/system/virtnodedevd.service.d/service.conf
[Service]
MemoryMax=100M

My expectation was "the process will be OOM-killed in approx. 24h"; however, it balances at the 100MB memory usage instead, without issue:

# systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/virtnodedevd.service.d
└─service.conf
Active: active (running) since Fri 2022-11-18 19:28:09 CET; 2 days ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd-ro.socket
● virtnodedevd.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 624476 (virtnodedevd)
Tasks: 19 (limit: 206089)
Memory: 93.2M (max: 100.0M available: 6.7M)
CPU: 41min 31.152s
CGroup: /system.slice/virtnodedevd.service
└─624476 /usr/sbin/virtnodedevd --timeout 20

# systemctl status virtnodedevd.service | grep Memory:
Memory: 93.3M (max: 100.0M available: 6.6M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 93.3M (max: 100.0M available: 6.6M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.0M (max: 100.0M available: 5.9M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.5M (max: 100.0M available: 5.4M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.2M (max: 100.0M available: 5.7M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.2M (max: 100.0M available: 5.7M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.2M (max: 100.0M available: 5.7M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.6M (max: 100.0M available: 5.3M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 94.9M (max: 100.0M available: 5.0M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 96.1M (max: 100.0M available: 3.8M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 95.8M (max: 100.0M available: 4.1M)
# systemctl status virtnodedevd.service | grep Memory:
Memory: 95.7M (max: 100.0M available: 4.2M)

Does anybody know why the process consumes all available memory (GBs) when no memory limit is set, yet works without any issue and keeps balanced memory usage under a low MemoryMax limit?

Hi Jaroslav, could you please help to collect some logs about it? For example, run "journalctl -u virtnodedevd" and check if there is any clue. And is there any VM running on the system, or any heavy workload on this system? The virtnodedevd service has been running continuously for more than 2 weeks, but it should time out and become inactive if no related function is called for 120s.

> Could you please help to collect some logs about it?

Logs are "empty"; we can only see starting/stopping as I was restarting the service:

# journalctl -u virtnodedevd
Nov 16 15:41:36 cmp0096.na3.pcigdc.com systemd[1]: Starting Virtualization nodedev daemon...
Nov 16 15:41:36 cmp0096.na3.pcigdc.com systemd[1]: Started Virtualization nodedev daemon.
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: Stopping Virtualization nodedev daemon...
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: virtnodedevd.service: Deactivated successfully.
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: Stopped Virtualization nodedev daemon.
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: virtnodedevd.service: Consumed 1min 50.893s CPU time.
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: Starting Virtualization nodedev daemon...
Nov 16 20:49:12 cmp0096.na3.pcigdc.com systemd[1]: Started Virtualization nodedev daemon.
Nov 16 20:51:12 cmp0096.na3.pcigdc.com systemd[1]: virtnodedevd.service: Deactivated successfully.
Nov 18 19:00:45 cmp0096.na3.pcigdc.com systemd[1]: Starting Virtualization nodedev daemon...
Nov 18 19:00:45 cmp0096.na3.pcigdc.com systemd[1]: Started Virtualization nodedev daemon.

> And is there any vm running on the system? Or is there any heavy workload on this system?

The situation does not depend on VM deployment. It is observed on a host which was empty (no VMs) and has 0% utilization.

> The virtnodedevd service will timeout and be inactive if any related function is not called during 120s.

It is not deactivated; the OpenStack Nova Compute service keeps a connection to it. I tried to lower the timeout to 20s to ensure it would be deactivated, without any luck. I added some extra debug logs into OpenStack Nova and saw that Nova periodically, approx. every 30s, runs functions on libvirt's connection, such as "conn.listAllDevices()", but not only that. See: https://github.com/openstack/nova/blob/stable/yoga/nova/virt/libvirt/host.py#L1520

I have tested on libvirt-8.9.0-2.el9.x86_64 with the scenarios below:
1. Run "virsh nodedev-list" 1000 times and check the memory occupied by the virtnodedevd service; it increased from 13.9M to 24.0M after 8 min.
2. Test for memory leaks with valgrind.
Details:
1. Start virtnodedevd and check the memory: it's 13.9M. Then run "virsh nodedev-list" 1000 times and check the memory occupied by the virtnodedevd service again.
# cat test.sh
#!/bin/sh
systemctl start virtnodedevd
systemctl status virtnodedevd
i=0
while [ $i -ne 1000 ]
do
virsh nodedev-list
i=$(($i+1))
echo "$i"
done
systemctl status virtnodedevd
# sh test.sh
......
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2022-11-24 22:36:12 EST; 7s ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd.socket
● virtnodedevd-ro.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 3547 (virtnodedevd)
Tasks: 19 (limit: 407705)
Memory: 13.9M
CPU: 253ms
CGroup: /system.slice/virtnodedevd.service
└─3547 /usr/sbin/virtnodedevd --timeout 120
......after 1000 times run "virsh nodedev-list"......
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2022-11-24 22:52:22 EST; 8min ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd.socket
● virtnodedevd-ro.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 9631 (virtnodedevd)
Tasks: 19 (limit: 407705)
Memory: 24.0M
CPU: 1min 2.165s
CGroup: /system.slice/virtnodedevd.service
└─9631 /usr/sbin/virtnodedevd --timeout 120
Nov 24 22:52:22 dell-per740xd-19.lab.eng.pek2.redhat.com systemd[1]: Starting Virtualization nodedev daemon...
Nov 24 22:52:22 dell-per740xd-19.lab.eng.pek2.redhat.com systemd[1]: Started Virtualization nodedev daemon.
Test with valgrind:
1. stop the virtnodedevd service and sockets:
# systemctl status virtnodedevd
○ virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2022-11-24 23:03:03 EST; 16s ago
Duration: 10min 40.890s
TriggeredBy: ○ virtnodedevd-admin.socket
○ virtnodedevd.socket
○ virtnodedevd-ro.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Process: 9631 ExecStart=/usr/sbin/virtnodedevd $VIRTNODEDEVD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 9631 (code=exited, status=0/SUCCESS)
CPU: 1min 2.371s
2. Run valgrind in one terminal:
# valgrind --leak-check=full virtnodedevd
==15745== Memcheck, a memory error detector
==15745== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==15745== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==15745== Command: virtnodedevd
==15745==
3. Run "nodedev-list" in another terminal:
# virsh nodedev-list
4. Check the info in the first terminal; there is a memory leak (full log attached):
==15745== LEAK SUMMARY:
==15745== definitely lost: 384 bytes in 12 blocks
==15745== indirectly lost: 4,563 bytes in 174 blocks
==15745== possibly lost: 896 bytes in 2 blocks
==15745== still reachable: 1,059,261 bytes in 13,765 blocks
==15745== suppressed: 0 bytes in 0 blocks
==15745== Reachable blocks (those to which a pointer was found) are not shown.
==15745== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15745==
==15745== For lists of detected and suppressed errors, rerun with: -s
==15745== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)
The only real leak from the attached log seems to be:

==15745== 4,684 (192 direct, 4,492 indirect) bytes in 8 blocks are definitely lost in loss record 2,395 of 2,403
==15745== at 0x4849464: calloc (vg_replace_malloc.c:1328)
==15745== by 0x4D39320: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6800.4)
==15745== by 0x4999762: virPCIVPDParse (virpcivpd.c:656)
==15745== by 0x497A7F8: virPCIDeviceGetVPD (virpci.c:2691)
==15745== by 0x4A23DB7: UnknownInlinedFun (node_device_conf.c:3084)
==15745== by 0x4A23DB7: virNodeDeviceGetPCIDynamicCaps (node_device_conf.c:3117)
==15745== by 0x1900CABF: UnknownInlinedFun (node_device_udev.c:415)
==15745== by 0x1900CABF: UnknownInlinedFun (node_device_udev.c:1399)
==15745== by 0x1900CABF: udevAddOneDevice (node_device_udev.c:1564)
==15745== by 0x1900DFDD: UnknownInlinedFun (node_device_udev.c:1638)
==15745== by 0x1900DFDD: UnknownInlinedFun (node_device_udev.c:1692)
==15745== by 0x1900DFDD: nodeStateInitializeEnumerate (node_device_udev.c:2017)
==15745== by 0x4991F08: virThreadHelper (virthread.c:256)
==15745== by 0x5136801: start_thread (in /usr/lib64/libc.so.6)
==15745== by 0x50D6313: clone (in /usr/lib64/libc.so.6)

The others are single-shot allocations via virOnce/pthread_once. The leak itself (~4kiB) doesn't explain the ~10MiB increase in consumed memory as accounted by systemd, though.

My guess is that the issue is not a memory leak. The reason (from my point of view): when I set MemoryMax:

# cat /etc/systemd/system/virtnodedevd.service.d/service.conf
[Service]
MemoryMax=100M

then it uses up to MemoryMax RAM only, keeping the usage balanced (freeing something, then allocating) in a controlled way around the MemoryMax size. That would not be possible with a memory leak, as the process would not be able to free memory and systemd would kill it as soon as it hit MemoryMax. That is not observed. If I'm correct.

Hi Jaroslav, which OpenStack are you using? It's not Red Hat OpenStack 17.0, right? In RHOSP 17.0, it's using kolla: tripleo_nova_virtnodedevd.service starts the nova_virtnodedevd container, and kolla_start runs "command": "/usr/sbin/virtnodedevd --config /etc/libvirt/virtnodedevd.conf" to start the virtnodedevd.

I would like to avoid discussion about OpenStack versions and distributions. This report is about libvirt's virtnodedevd memory utilization growing when any kind of service uses it for a long period of time.

Hi Peter, could you please help to check the attachment in comment 12? The file can be checked with massif-visualizer or ms_print. It was caught on one of my systems. Just running "virsh nodedev-list" continuously can reproduce the memory growth issue. Many thanks to Luyao for helping to debug the issue and catch the log file. And Luyao said that something needs to be fixed in virPCIVPDResourceCustomUpsertValue, based on the log file.

(In reply to yalzhang from comment #13)
> Hi Peter, could you please help to check the attachment in comment 12?

If it is VPD related, that could explain why only some people see a leak - VPD information depends on the hardware present - certain NICs will have it, IIRC.

(In reply to Daniel Berrangé from comment #14)
> If it is VPD related, that could explain why only some people see a leak -
> VPD information depends on the hardware present - certain NICs will have it
> IIRC.

Yes, I cannot reproduce the "memory grow" issue on an old desktop (checked just now; there is no VPD info for the NIC). But it can be reproduced on a beaker server with modern NICs.

I am able to reproduce on a machine with a VPD PCI device and got these stack traces:

==62886== 479 (24 direct, 455 indirect) bytes in 1 blocks are definitely lost in loss record 2,119 of 2,164
==62886== at 0x486D0CC: calloc (vg_replace_malloc.c:1328)
==62886== by 0x4E4B047: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6800.4)
==62886== by 0x49DA84B: virPCIVPDParse (virpcivpd.c:656)
==62886== by 0x49AC5C3: virPCIDeviceGetVPD (virpci.c:2691)
==62886== by 0x4A96F23: virNodeDeviceGetPCIVPDDynamicCap (node_device_conf.c:3081)
==62886== by 0x4A97083: virNodeDeviceGetPCIDynamicCaps (node_device_conf.c:3114)
==62886== by 0x4A95C6B: virNodeDeviceUpdateCaps (node_device_conf.c:2681)
==62886== by 0xC1D887F: nodeDeviceGetXMLDesc (node_device_driver.c:355)
==62886== by 0x4C52093: virNodeDeviceGetXMLDesc (libvirt-nodedev.c:287)
==62886== by 0x154693: remoteDispatchNodeDeviceGetXMLDesc (remote_daemon_dispatch_stubs.h:15681)
==62886== by 0x1545FB: remoteDispatchNodeDeviceGetXMLDescHelper (remote_daemon_dispatch_stubs.h:15658)
==62886== by 0x4ACECC3: virNetServerProgramDispatchCall (virnetserverprogram.c:428)
==62886==
==62886== 958 (48 direct, 910 indirect) bytes in 2 blocks are definitely lost in loss record 2,135 of 2,164
==62886== at 0x486D0CC: calloc (vg_replace_malloc.c:1328)
==62886== by 0x4E4B047: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6800.4)
==62886== by 0x49DA84B: virPCIVPDParse (virpcivpd.c:656)
==62886== by 0x49AC5C3: virPCIDeviceGetVPD (virpci.c:2691)
==62886== by 0x4A96F23: virNodeDeviceGetPCIVPDDynamicCap (node_device_conf.c:3081)
==62886== by 0x4A97083: virNodeDeviceGetPCIDynamicCaps (node_device_conf.c:3114)
==62886== by 0xC1DDBB7: udevProcessPCI (node_device_udev.c:415)
==62886== by 0xC1E0463: udevGetDeviceDetails (node_device_udev.c:1399)
==62886== by 0xC1E09BB: udevAddOneDevice (node_device_udev.c:1564)
==62886== by 0xC1E0CA7: udevProcessDeviceListEntry (node_device_udev.c:1638)
==62886== by 0xC1E0E47: udevEnumerateDevices (node_device_udev.c:1692)
==62886== by 0xC1E17EB: nodeStateInitializeEnumerate (node_device_udev.c:2019)
==62886==
==62886== 2,874 (144 direct, 2,730 indirect) bytes in 6 blocks are definitely lost in loss record 2,152 of 2,164
==62886== at 0x486D0CC: calloc (vg_replace_malloc.c:1328)
==62886== by 0x4E4B047: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6800.4)
==62886== by 0x49DA84B: virPCIVPDParse (virpcivpd.c:656)
==62886== by 0x49AC5C3: virPCIDeviceGetVPD (virpci.c:2691)
==62886== by 0x4A96F23: virNodeDeviceGetPCIVPDDynamicCap (node_device_conf.c:3081)
==62886== by 0x4A97083: virNodeDeviceGetPCIDynamicCaps (node_device_conf.c:3114)
==62886== by 0x4A95C6B: virNodeDeviceUpdateCaps (node_device_conf.c:2681)
==62886== by 0x4A98EEB: virNodeDeviceObjMatch (virnodedeviceobj.c:877)
==62886== by 0x4A9943F: virNodeDeviceObjListExportCallback (virnodedeviceobj.c:948)
==62886== by 0x496F303: virHashForEach (virhash.c:367)
==62886== by 0x4A9959B: virNodeDeviceObjListExport (virnodedeviceobj.c:982)

Alright, I have a fix for the VPD problem. However, I'm not sure whether that's the one causing this bug. Jaroslav, could you confirm that 'virsh nodedev-list --cap vpd' prints something out? Alternatively, I can provide a build with my fix if you want to test that.
Michal, the output of 'virsh nodedev-list --cap vpd' is:

pci_0000_41_00_0
pci_0000_41_00_1
pci_0000_63_00_0
pci_0000_63_00_1

Perfect, so that's very likely it. If you want to test my fix, I've made a scratch build here: https://mprivozn.fedorapeople.org/rpms/nodedev/

Thanks Michal,
I took your src package, built it in our koji, and installed it on one of the spare servers:
# rpm -qa | grep libvirt
python3-libvirt-8.7.0-1.el9.x86_64
libvirt-libs-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-core-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-network-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-nwfilter-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-config-nwfilter-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-config-network-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-disk-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-iscsi-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-logical-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-mpath-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-rbd-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-scsi-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-storage-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-interface-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-nodedev-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-qemu-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-daemon-driver-secret-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-client-8.9.0-3.el9_rc.8f4280bca8.x86_64
libvirt-8.9.0-3.el9_rc.8f4280bca8.x86_64
I removed the MemoryMax systemd service limit and restarted the service. Current situation after 33 minutes of running:
# systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/virtnodedevd.service.d
└─service.conf
Active: active (running) since Wed 2022-11-30 17:02:39 CET; 33min ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd-ro.socket
● virtnodedevd.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 6031 (virtnodedevd)
Tasks: 19 (limit: 206089)
Memory: 21.8M
CPU: 26.408s
CGroup: /system.slice/virtnodedevd.service
└─6031 /usr/sbin/virtnodedevd --timeout 120
So let's see tomorrow how it looks.
I have tried it on one of my systems with VPD NICs which can reproduce the issue, and the scratch build works well. After 9 hours of running "virsh nodedev-list" (more than 60000 times), the occupied memory is around 20M.
# systemctl status virtnodedevd
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2022-11-30 10:49:49 EST; 9h ago
TriggeredBy: ● virtnodedevd.socket
● virtnodedevd-ro.socket
● virtnodedevd-admin.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 141029 (virtnodedevd)
Tasks: 19 (limit: 407718)
Memory: 19.4M
......
So far so good. 14h of running and we are oscillating around 22.5MB:
# systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/virtnodedevd.service.d
└─service.conf
Active: active (running) since Wed 2022-11-30 17:02:39 CET; 14h ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd-ro.socket
● virtnodedevd.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 6031 (virtnodedevd)
Tasks: 19 (limit: 206089)
Memory: 22.3M
CPU: 10min 14.499s
CGroup: /system.slice/virtnodedevd.service
└─6031 /usr/sbin/virtnodedevd --timeout 120
Perfect! I've merged the patch as:
commit 64d32118540aca3d42bc5ee21c8b780cafe04bfa
Author: Michal Prívozník <mprivozn>
AuthorDate: Wed Nov 30 14:53:21 2022 +0100
Commit: Michal Prívozník <mprivozn>
CommitDate: Thu Dec 1 08:38:01 2022 +0100
node_device_conf: Avoid memleak in virNodeDeviceGetPCIVPDDynamicCap()
The virNodeDeviceGetPCIVPDDynamicCap() function is called from
virNodeDeviceGetPCIDynamicCaps() and therefore has to be a wee
bit more clever about adding VPD capability. Namely, it has to
remove the old one before adding a new one. This is how other
functions called from virNodeDeviceGetPCIDynamicCaps() behave
as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2143235
Signed-off-by: Michal Privoznik <mprivozn>
Reviewed-by: Peter Krempa <pkrempa>
v8.10.0-rc2-8-g64d3211854
Tested with libvirt-8.10.0-1.el9.x86_64; the issue is fixed.

I can confirm that 8.10.0-1 is OK. The virtnodedevd.service consumes 18.2M of RAM after 17h of running (no growth).

Moving the bug to verified based on the above verification.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171
Description of problem:
Memory utilization of virtnodedevd.service is constantly growing. After two weeks it uses 1.2GB of RAM, where the initial amount (after a fresh start) is just 16MB.

Version-Release number of selected component (if applicable):
libvirt-*-8.7.0-1.el9

How reproducible:
1/ virtnodedevd.service is just running for several days
2/ observe the memory utilization grow

Steps to Reproduce:
1. Enable the virtnodedevd service or socket
2. We use openstack-nova-compute, which keeps a socket connection to virtnodedevd; as a result it is never stopped by the timeout. However, there is no communication between them.

Actual results:
Service memory utilization is growing, e.g. "Memory: 1.2G" after 2 weeks and 1 day:

# systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2022-11-01 08:30:26 CET; 2 weeks 1 day ago
TriggeredBy: ● virtnodedevd-ro.socket
● virtnodedevd-admin.socket
● virtnodedevd.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 6223 (virtnodedevd)
Tasks: 19 (limit: 127531)
Memory: 1.2G
CPU: 2h 35min 27.934s
CGroup: /system.slice/virtnodedevd.service
└─6223 /usr/sbin/virtnodedevd --timeout 120

Expected results:
Memory utilization is low and constant (like immediately after a service restart):

# systemctl restart virtnodedevd.service
# systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2022-11-16 13:16:47 CET; 1s ago
TriggeredBy: ● virtnodedevd-admin.socket
● virtnodedevd-ro.socket
● virtnodedevd.socket
Docs: man:virtnodedevd(8)
https://libvirt.org
Main PID: 3077832 (virtnodedevd)
Tasks: 19 (limit: 127531)
Memory: 15.9M
CPU: 166ms
CGroup: /system.slice/virtnodedevd.service
└─3077832 /usr/sbin/virtnodedevd --timeout 120
Additional info: n/a