Bug 2218987 - virtnetworkd.service is not triggered by socket after first deactivation
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 38
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-30 19:45 UTC by sid
Modified: 2024-05-28 13:17 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-05-28 13:17:44 UTC
Type: ---
Embargoed:



Description sid 2023-06-30 19:45:01 UTC
virtnetworkd.service is triggered by the following sockets:

virtnetworkd.socket
virtnetworkd-admin.socket
virtnetworkd-ro.socket

When a libvirt client (say, virsh or GNOME Boxes) tries to connect to qemu:///system, the libvirt QEMU daemon sends the appropriate messages to the virtnetworkd sockets listed above, which then triggers virtnetworkd.service to start.
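To check which sockets systemd considers triggers for the service, the TriggeredBy property can be read with "systemctl show". The helper below is only a sketch; the function name list_triggers is made up for illustration, and it parses KEY=VALUE text on stdin so it can be tried without a systemd host.

```shell
#!/usr/bin/env bash
# list_triggers (hypothetical helper): read "systemctl show" style
# KEY=VALUE lines on stdin and print each unit named in TriggeredBy=
# on its own line.
list_triggers() {
    local line value unit
    while IFS= read -r line; do
        case "$line" in
            TriggeredBy=*)
                # The value is a space-separated list of unit names.
                value=${line#TriggeredBy=}
                for unit in $value; do
                    printf '%s\n' "$unit"
                done
                ;;
        esac
    done
}

# Example on a systemd host:
#   systemctl show -p TriggeredBy virtnetworkd.service | list_triggers
```

Each unit printed this way can then be inspected with "systemctl status <unit>" to confirm that the socket itself is still listening.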

A working virtnetworkd.service is shown below:

# systemctl status virtnetworkd.service
● virtnetworkd.service - Virtualization network daemon
     Loaded: loaded (/usr/lib/systemd/system/virtnetworkd.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Fri 2023-06-30 19:07:32 UTC; 2min 8s ago
TriggeredBy: ● virtnetworkd.socket
             ● virtnetworkd-admin.socket
             ● virtnetworkd-ro.socket
       Docs: man:virtnetworkd(8)
             https://libvirt.org
   Main PID: 5986 (virtnetworkd)
      Tasks: 21 (limit: 14182)
     Memory: 4.0M
        CPU: 343ms
     CGroup: /system.slice/virtnetworkd.service
             ├─5986 /usr/sbin/virtnetworkd --timeout 120
             ├─6057 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
             └─6058 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper

In the above case, the following command succeeds as shown below:

$ virsh -c qemu:///system net-list --all 
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

As shown by the "--timeout 120" argument above, the virtnetworkd daemon has an idle timeout of 2 minutes. If the daemon is inactive for more than 2 minutes, the primary daemon process (virtnetworkd, PID 5986) exits. This is to preserve system resources.
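The 2-minute figure comes from the "--timeout 120" argument in the process list. As a rough sketch, the value can be pulled out of a command line like this; the function name idle_timeout_from_cmdline is an assumption for illustration, not something libvirt ships.

```shell
#!/usr/bin/env bash
# idle_timeout_from_cmdline (hypothetical helper): print the word that
# follows "--timeout" in a command line passed as a single string.
# Returns 1 and prints nothing if the flag is absent.
idle_timeout_from_cmdline() {
    local prev="" word
    for word in $1; do
        if [ "$prev" = "--timeout" ]; then
            printf '%s\n' "$word"
            return 0
        fi
        prev=$word
    done
    return 1
}

# Example with the command line from the status output:
#   idle_timeout_from_cmdline "/usr/sbin/virtnetworkd --timeout 120"
# Example against a live process (PID taken from "Main PID:"):
#   idle_timeout_from_cmdline "$(tr '\0' ' ' < /proc/5986/cmdline)"
```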

An exited virtnetworkd daemon is shown below:

# systemctl status virtnetworkd.service
● virtnetworkd.service - Virtualization network daemon
     Loaded: loaded (/usr/lib/systemd/system/virtnetworkd.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Fri 2023-06-30 19:07:32 UTC; 2min 11s ago
TriggeredBy: ● virtnetworkd.socket
             ● virtnetworkd-admin.socket
             ● virtnetworkd-ro.socket
       Docs: man:virtnetworkd(8)
             https://libvirt.org
    Process: 5986 ExecStart=/usr/sbin/virtnetworkd $VIRTNETWORKD_ARGS (code=exited, status=0/SUCCESS)
   Main PID: 5986 (code=exited, status=0/SUCCESS)
      Tasks: 2 (limit: 14182)
     Memory: 740.0K
        CPU: 349ms
     CGroup: /system.slice/virtnetworkd.service
             ├─6057 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
             └─6058 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper

[A] - If a client wants to connect to virtnetworkd.service, it should communicate with one of the sockets above, either directly or via the libvirt QEMU daemon, and this should trigger virtnetworkd.service to start again.

However, [A] doesn't work as expected.

In the above state, the following command hangs and has to be terminated as shown below:

$ time virsh -c qemu:///system net-list --all 
^C

real	0m18.874s
user	0m0.021s
sys	0m0.022s

Explanation:
------------

As observed above, virtnetworkd.service is still in the "active (running)" state even after the primary daemon process has exited, so [A] doesn't actually work. Clients then wait indefinitely for the virtnetworkd daemon to start, which is why the "virsh" command above has to be terminated with Ctrl+C.

I assume that systemd reports virtnetworkd.service as "active (running)" because the two dnsmasq processes, which belong to the same cgroup, are still running. This causes the trigger sockets to treat virtnetworkd.service as still active and serving clients.
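The state described above (service reported active, main process gone) could be detected from "systemctl show" output. This is only a sketch: the function name looks_stuck is made up, and it assumes systemd reports MainPID=0 once the main process has exited, which I have not verified for this exact case.

```shell
#!/usr/bin/env bash
# looks_stuck (hypothetical helper): read "systemctl show" style
# KEY=VALUE lines on stdin and succeed if the unit is reported as
# active while having no main process.
# Assumption: MainPID=0 means the main process has exited.
looks_stuck() {
    local line active="" main_pid=""
    while IFS= read -r line; do
        case "$line" in
            ActiveState=*) active=${line#ActiveState=} ;;
            MainPID=*)     main_pid=${line#MainPID=} ;;
        esac
    done
    [ "$active" = "active" ] && [ "$main_pid" = "0" ]
}

# Example on a systemd host:
#   if systemctl show -p ActiveState -p MainPID virtnetworkd.service | looks_stuck; then
#       echo "virtnetworkd.service is active but has no main process"
#   fi
```

Lines such as ExecMainPID= are ignored because the patterns only match keys at the start of the line.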

Note: These "dnsmasq" processes will never be stopped when the service is stopped ( so running VMs will not be interrupted ).

Workaround:
-----------

# systemctl stop virtnetworkd.service

This stops virtnetworkd.service, so the trigger sockets can start the service correctly on the next client connection.

Example shown below:

[root@fedora libvirt] # systemctl stop virtnetworkd.service
Warning: Stopping virtnetworkd.service, but it can still be activated by:
  virtnetworkd.socket
  virtnetworkd-admin.socket
  virtnetworkd-ro.socket

[root@fedora libvirt] # systemctl status virtnetworkd.service
○ virtnetworkd.service - Virtualization network daemon
     Loaded: loaded (/usr/lib/systemd/system/virtnetworkd.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead) since Fri 2023-06-30 19:37:50 UTC; 1s ago
   Duration: 9min 15.479s
TriggeredBy: ● virtnetworkd.socket
             ● virtnetworkd-admin.socket
             ● virtnetworkd-ro.socket
       Docs: man:virtnetworkd(8)
             https://libvirt.org
    Process: 8045 ExecStart=/usr/sbin/virtnetworkd $VIRTNETWORKD_ARGS (code=exited, status=0/SUCCESS)
   Main PID: 8045 (code=exited, status=0/SUCCESS)
      Tasks: 2 (limit: 14182)
     Memory: 748.0K
        CPU: 296ms
     CGroup: /system.slice/virtnetworkd.service
             ├─6057 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
             └─6058 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper

Jun 30 19:28:35 fedora dnsmasq-dhcp[6057]: read /var/lib/libvirt/dnsmasq/default.hostsfile

Now, "virsh" command works again.

$ virsh -c qemu:///system net-list --all 
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes
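Until the underlying issue is fixed, the workaround could be wrapped so that the stop is only issued when a command actually times out. The following is a sketch, not a tested fix: retry_after_reset is a made-up name, the 20-second limit in the example is arbitrary, and it relies on coreutils "timeout" exiting with status 124 when the limit is hit.

```shell
#!/usr/bin/env bash
# retry_after_reset (hypothetical helper): run a command with a time
# limit; if it times out, run a reset command once and retry.
# Usage: retry_after_reset SECONDS "RESET COMMAND" COMMAND [ARGS...]
# Returns the exit status of the last attempt.
retry_after_reset() {
    local limit=$1 reset=$2 status=0
    shift 2
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # 124 is what coreutils "timeout" returns when the limit was hit.
        $reset
        status=0
        timeout "$limit" "$@" || status=$?
    fi
    return "$status"
}

# Example for this bug (run as root):
#   retry_after_reset 20 "systemctl stop virtnetworkd.service" \
#       virsh -c qemu:///system net-list --all
```

The reset command is passed as a single string and split on whitespace, so it should not contain arguments that need quoting.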


Reproducible: Always

Steps to Reproduce:
1. Run the following command as a normal user (requires a password):

$ virsh -c qemu:///system net-list --all 

2. If the above command succeeds (i.e. prints output within a couple of seconds), exit all libvirt clients such as GNOME Boxes. Do not run any virsh commands.

3. Wait for 2 minutes.

4. Re-run the command from step 1.

5. The command hangs.
Actual Results:  
Command hangs indefinitely and needs to be terminated with Ctrl+C.

$ virsh -c qemu:///system net-list --all 
^C
$
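To reproduce this without pressing Ctrl+C by hand, the hang can be detected with a time limit. A minimal sketch, assuming coreutils "timeout" is available; hangs_after is a made-up helper name, and the 130 and 20 second values are arbitrary choices above the 120-second idle timeout and the observed hang duration.

```shell
#!/usr/bin/env bash
# hangs_after (hypothetical helper): succeed if the command is still
# running after SECONDS and had to be killed, fail if it finished in time.
# Usage: hangs_after SECONDS COMMAND [ARGS...]
hangs_after() {
    local limit=$1 status=0
    shift
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    # coreutils "timeout" exits with 124 when it had to stop the command.
    [ "$status" -eq 124 ]
}

# Reproduction outline (needs libvirt, takes over 2 minutes; run as a
# user who is not asked for a password interactively):
#   virsh -c qemu:///system net-list --all      # first run, should succeed
#   sleep 130                                    # let the daemon idle out
#   if hangs_after 20 virsh -c qemu:///system net-list --all; then
#       echo "reproduced: virsh did not return within 20 seconds"
#   fi
```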


Expected Results:  
Command should print some output like:

$ virsh -c qemu:///system net-list --all 
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes

I think this is a regression due to:

https://fedoraproject.org/wiki/Changes/LibvirtModularDaemons

Comment 1 sid 2023-07-22 20:56:31 UTC
I guess this is a systemd bug - https://bugzilla.redhat.com/show_bug.cgi?id=2213660

Comment 2 Aoife Moloney 2024-05-28 13:17:44 UTC
Fedora Linux 38 entered end-of-life (EOL) status on 2024-05-21.

Fedora Linux 38 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

