Bug 1964855 - 'systemctl stop libvirt-guests' hangs after libvirtd timeout
Summary: 'systemctl stop libvirt-guests' hangs after libvirtd timeout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: beta
Target Release: ---
Assignee: Martin Kletzander
QA Contact: Yanqiu Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1964861
Reported: 2021-05-26 08:20 UTC by Yanqiu Zhang
Modified: 2023-05-09 08:07 UTC (History)

Fixed In Version: libvirt-8.9.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1964861 (view as bug list)
Environment:
Last Closed: 2023-05-09 07:26:10 UTC
Type: Bug
Target Upstream Version: 8.9.0
Embargoed:


Attachments
libvirt-guests_bt.txt (5.29 KB, text/plain), 2021-05-26 08:20 UTC, Yanqiu Zhang


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   LIBVIRTAT-11070  0        None      None    None     2022-11-23 13:36:19 UTC
Red Hat Product Errata  RHBA-2023:2171   0        None      None    None     2023-05-09 07:26:53 UTC

Description Yanqiu Zhang 2021-05-26 08:20:47 UTC
Created attachment 1787148 [details]
libvirt-guests_bt.txt

Description of problem:
'systemctl stop libvirt-guests' hangs after libvirtd timeout

Version-Release number of selected component (if applicable):
libvirt-client-7.0.0-6.el9.x86_64
qemu-kvm-6.0.0-2.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. systemctl start libvirt-guests
   (this also starts libvirtd)

2. Wait for 'libvirtd --timeout 120' to time out.

3. systemctl stop libvirt-guests


Actual results:
Step 3 hangs; the process is stuck at 'virsh connect' inside libvirt-guests.sh:
# ps aux|grep 'libvirt|virsh' -E
...
root       54720  0.0  0.0  20424  8540 pts/3    S+   09:07   0:00 systemctl stop libvirt-guests
root       54722  0.0  0.0   7696  4340 ?        Ss   09:07   0:00 /usr/bin/sh /usr/libexec/libvirt-guests.sh stop
root       54728  0.0  0.0 108572 16136 ?        Sl   09:07   0:00 virsh connect
...

test_connect()
{
    local uri="$1"

    if run_virsh "$uri" connect 2>/dev/null; then
        return 0;
    else
        eval_gettext "Can't connect to \$uri. Skipping."
        return 1
    fi
}

Expected results:


Additional info:
1. Does not reproduce with "start libvirtd -> libvirtd timeout -> start libvirt-guests"; libvirtd is started automatically in that case.
2. Cannot try "start libvirtd -> start libvirt-guests -> stop libvirtd -> stop libvirt-guests", because stopping libvirtd also stops libvirt-guests due to Bug 1946697 (fixed in libvirt-7.0.0-13).
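
The hang in step 3 comes from the unbounded 'virsh connect' call in the test_connect function quoted above. As a minimal sketch only, and not the fix that was eventually applied in libvirt, such a probe could be bounded with the coreutils `timeout` command. The helper name `probe_with_timeout` and the 10-second limit in the usage comment are hypothetical and are not part of libvirt-guests.sh.

```shell
#!/bin/sh
# Hypothetical helper, not part of libvirt-guests.sh:
# run a probe command, but give up after LIMIT seconds.
probe_with_timeout() {
    limit="$1"
    shift
    # coreutils timeout exits with status 124 when the limit is reached.
    if timeout "$limit" "$@" 2>/dev/null; then
        return 0
    fi
    echo "Probe failed or did not finish within ${limit}s. Skipping." >&2
    return 1
}

# Example (assumes virsh is installed; the limit is an arbitrary choice):
#   probe_with_timeout 10 virsh connect
```

This only limits how long the stop script can block; it does not make systemd start the daemon, so it would hide the symptom rather than address the cause.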

Comment 1 Yanqiu Zhang 2021-05-26 08:52:58 UTC
Reproduces on latest build:
libvirt-client-7.3.0-1.el9.x86_64

It also reproduces with: start libvirt-guests -> stop libvirtd -> stop libvirt-guests.

Comment 2 Michal Privoznik 2021-06-23 13:16:31 UTC
Honestly, this smells like a systemd bug to me. There's still libvirtd.socket which is supposed to spawn libvirtd upon connection. And we can see virsh is stuck in connecting to libvirtd. But it's never spawned to process the connection request. Let me switch over to systemd team to get their expertise.

Comment 3 David Tardon 2021-06-24 12:00:03 UTC
What does "systemctl status libvirtd.socket" say when this happens? Does "nc -zU /run/libvirt/libvirt-sock" start libvirtd.service?

Comment 4 Yanqiu Zhang 2021-06-24 12:11:40 UTC
(In reply to David Tardon from comment #3)
> What does "systemctl status libvirtd.socket" say when this happens? Does "nc
> -zU /run/libvirt/libvirt-sock" start libvirtd.service?

# systemctl status libvirtd.socket -l
● libvirtd.socket - Libvirt local socket
     Loaded: loaded (/usr/lib/systemd/system/libvirtd.socket; enabled; vendor preset: disabled)
     Active: active (running) since Wed 2021-05-19 00:57:18 EDT; 1 months 5 days ago
   Triggers: ● libvirtd.service
     Listen: /run/libvirt/libvirt-sock (Stream)
      Tasks: 0 (limit: 38031)
     Memory: 0B
        CPU: 0
     CGroup: /system.slice/libvirtd.socket

May 19 00:57:18 dell-* systemd[1]: Listening on Libvirt local socket.

# nc -zU /run/libvirt/libvirt-sock
# echo $?
0
# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
     Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Thu 2021-06-24 08:04:40 EDT; 2min 55s ago
TriggeredBy: ● libvirtd-admin.socket
             ● libvirtd.socket
             ● libvirtd-ro.socket
       Docs: man:libvirtd(8)
             https://libvirt.org
    Process: 457048 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=exited, status=0/SUCCESS)
   Main PID: 457048 (code=exited, status=0/SUCCESS)
      Tasks: 2 (limit: 32768)
     Memory: 33.2M
        CPU: 547ms
     CGroup: /system.slice/libvirtd.service
             ├─3947419 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/li>
             └─3947420 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/li>

Jun 24 08:04:23 dell-* libvirtd[457067]: libcap-ng used by "/usr/sbin/libvirtd" failed due to not havin>
Jun 24 08:04:23 dell-* libvirtd[457068]: libcap-ng used by "/usr/sbin/libvirtd" failed due to not havin>
Jun 24 08:04:23 dell-* dnsmasq[3947419]: read /etc/hosts - 2 addresses
Jun 24 08:04:23 dell-* dnsmasq[3947419]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Jun 24 08:04:23 dell-* dnsmasq-dhcp[3947419]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Jun 24 08:04:40 dell-* systemd[1]: Stopping Virtualization daemon...
Jun 24 08:04:40 dell-* systemd[1]: libvirtd.service: Succeeded.
Jun 24 08:04:40 dell-* systemd[1]: libvirtd.service: Unit process 3947419 (dnsmasq) remains running aft>
Jun 24 08:04:40 dell-* systemd[1]: libvirtd.service: Unit process 3947420 (dnsmasq) remains running aft>
Jun 24 08:04:40 dell-* systemd[1]: Stopped Virtualization daemon.

And 'libvirt-guests stop' still hangs.

Comment 5 Martin Kletzander 2022-02-24 08:50:12 UTC
This might be related to a stuck libvirt-guests script which is left there even if you killed the "systemctl stop libvirt-guests".  It is running "virsh connect" which succeeded at connecting to the socket which was not picked up by a started daemon due to misconfiguration fixed in 1868537.  After I killed processes connected to the socket(s) everything started to work again.

Could you check that there is no virsh, libvirt-guests or anything else running that could be connected to the socket?

Anyway this could be fixed if systemd closed the connection on the socket if it failed to start the triggered service, maybe once it stops trying.

Comment 6 Yanqiu Zhang 2022-02-24 10:01:39 UTC
According to bz1868537#c20, if 'stop libvirt-guests' was never executed before, 'start virtqemud' or 'virsh list' does not hang. Also, killing the 'virsh connect' process left behind by libvirt-guests makes everything work again. So I think it is better for libvirt-guests to work well by itself.

virtqemud was inactive, so there were no virsh processes or running VMs. As for libvirt-guests, yes: its 'stop' action called 'virsh connect'. I don't know where to look for other things that could be connected to the socket.

Comment 7 Michal Sekletar 2022-07-21 15:37:36 UTC
The problem is the After= dependency between libvirt-guests.service (G) and virtqemud.service (Q). The stop job on G causes a start of Q. However, the start job for Q is not runnable because of the dependency. Generally, whenever B.service is ordered after A.service and there is a stop job for B and a start job for A, systemd always tries to run the stop job first and then the start job. This is done to achieve more natural restart behaviour. It might seem unrelated, but in systemd a restart is just a stop job that is executed and then turned into a start job.

The solution is to drop the After= dependency on virtqemud.service, as it is not needed. Ideally, you should use After= on the socket or on the service, but not on both at the same time.
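
The advice to order after the socket or the service, but not both, can be checked mechanically. The following is a rough sketch under stated assumptions: the function name `find_double_ordering` is made up for illustration, and it only looks at After= lines of a single unit file fed on stdin. Ordering declared from the other side with Before= in another unit file is not seen by this check.

```shell
#!/bin/sh
# Hypothetical checker: read a unit file on stdin and print every daemon
# name that appears in After= both as NAME.socket and as NAME.service.
find_double_ordering() {
    awk -F= '
        /^After=/ {
            # After= may list several units separated by whitespace.
            n = split($2, units, " ")
            for (i = 1; i <= n; i++) {
                u = units[i]
                if (u ~ /\.socket$/) {
                    sub(/\.socket$/, "", u)
                    sock[u] = 1
                } else if (u ~ /\.service$/) {
                    sub(/\.service$/, "", u)
                    serv[u] = 1
                }
            }
        }
        END {
            for (name in sock) if (name in serv) print name
        }
    '
}

# Example:
#   find_double_ordering < /usr/lib/systemd/system/libvirt-guests.service
```

An empty result means no daemon is ordered after both its socket and its service in that one file.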

Comment 9 Martin Kletzander 2022-09-26 11:11:06 UTC
Fixed upstream with v8.7.0-137-g59d30adacd1d:

commit 59d30adacd1d5a4de8c25200ebfc666bd180fe1e
Author: Martin Kletzander <mkletzan>
Date:   Tue Aug 30 08:29:53 2022 +0200

    libvirt-guests: Fix dependency ordering in service file

Comment 10 Yanqiu Zhang 2022-10-15 03:12:06 UTC
Tested on rhel9.2 with:
libvirt-8.8.0-1.el9.x86_64
qemu-kvm-7.1.0-3.el9.x86_64

Steps:
1. man virtqemud
       If using libvirt-guests service then the ordering for that service needs to be adapted so that it
       is  ordered  after  the service unit instead of the socket unit.  Since dependencies and ordering
       cannot be changed with drop-in overrides, the whole libvirt-guests unit file needs to be changed.
       In  order  to preserve such change copy the installed /usr/lib/systemd/system/libvirt-guests.ser‐
       vice to /etc/systemd/system/libvirt-guests.service and make the change there,  specifically  make
       sure the After= ordering mentions virtqemud.service and not virtqemud.socket:

          [Unit]
          After=virtqemud.service

2. # cat /usr/lib/systemd/system/libvirt-guests.service 
[Unit]
Description=Suspend/Resume Running libvirt Guests
Requires=virt-guest-shutdown.target
After=network.target
After=time-sync.target
After=libvirtd.socket
After=virtqemud.socket
After=virtlxcd.socket
After=virtvboxd.socket
After=virtvzd.socket
After=virtxend.socket
After=virt-guest-shutdown.target
Documentation=man:libvirt-guests(8)
Documentation=https://libvirt.org

[Service]
EnvironmentFile=-/etc/sysconfig/libvirt-guests
# Hack just call traditional service until we factor
# out the code
ExecStart=/usr/libexec/libvirt-guests.sh start
ExecStop=/usr/libexec/libvirt-guests.sh stop
Type=oneshot
RemainAfterExit=yes
StandardOutput=journal+console
TimeoutStopSec=0

[Install]
WantedBy=multi-user.target

3. Start&destroy libvirt-guests when virtqemud.service stopped but virtqemud.socket active, with default service file:
# systemctl start virtqemud
# ps aux|grep virtqemud
root       82007  2.3  0.0 1581368 33416 ?       Ssl  22:28   0:00 /usr/sbin/virtqemud --timeout 120
(Wait for it timeout)
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive
# systemctl start libvirt-guests
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive
# systemctl stop libvirt-guests
(hangs here)
# ps aux|grep virsh
root       82062  0.5  0.0 325904 16036 ?        Sl   22:32   0:00 virsh connect
# kill -9 82062
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
active
# systemctl is-active libvirt-guests.service 
inactive

4.Start&destroy libvirt-guests when both virtqemud.service and virtqemud.socket inactive, with default service file:
# systemctl start libvirt-guests
# systemctl is-active virtqemud.socket
inactive
# systemctl is-active virtqemud.service 
inactive
# systemctl is-active libvirt-guests.service
active
# systemctl stop libvirt-guests
#

5.Modify service file as guided, 
# cp /usr/lib/systemd/system/libvirt-guests.service /etc/systemd/system/libvirt-guests.service
# vim /etc/systemd/system/libvirt-guests.service
(add After=virtqemud.service under [unit], and remove 'After=virtqemud.socket')
# cat /etc/systemd/system/libvirt-guests.service|grep virtqemud
After=virtqemud.service
# systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive
# systemctl stop libvirt-guests
(still hangs)

6.Modify /usr/lib/systemd/system/libvirt-guests.service directly:
# cat /usr/lib/systemd/system/libvirt-guests.service|grep virtqemud
After=virtqemud.service
# systemctl daemon-reload
# systemctl start libvirt-guests
(hangs)
(manually kill virsh connect) and # systemctl start virtqemud
# systemctl start libvirt-guests
# systemctl stop virtqemud
# systemctl stop libvirt-guests
(hangs)

Hi Martin and Peter,
In the latest test, 'stop libvirt-guests' still hangs when virtqemud is inactive and virtqemud.socket is active. Could you check the above tests and results, please? Is there anything I missed or did wrong?
Thanks!

Comment 11 Martin Kletzander 2022-10-17 08:40:09 UTC
Did you do systemctl daemon-reload after changing the service file?  Not sure this needs to be done without libvirt-guests, whether it registers the change for the already running service, but that could be the only last missing piece, hopefully.

Comment 12 Yanqiu Zhang 2022-10-17 12:30:30 UTC
Retested:
# cp /usr/lib/systemd/system/libvirt-guests.service /etc/systemd/system/libvirt-guests.service
# vim /etc/systemd/system/libvirt-guests.service
#  cat /etc/systemd/system/libvirt-guests.service|grep virtqemud
After=virtqemud.service
#  systemctl daemon-reload
# systemctl start libvirt-guests
# systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive
# systemctl is-active libvirt-guests
active
# systemctl stop libvirt-guests
hangs
# ps aux|grep virsh
root       90132  0.2  0.0 325904 16000 ?        Sl   08:13   0:00 virsh connect
# systemctl status libvirt-guests
● libvirt-guests.service - Suspend/Resume Running libvirt Guests
     Loaded: loaded (/etc/systemd/system/libvirt-guests.service; disabled; vendor preset: disabled)
     Active: deactivating (stop) since Mon 2022-10-17 08:13:30 EDT; 4min 59s ago
...

Hi Martin,
The test result is still the same: it hangs. Please help confirm. Thank you!
Would it be possible to skip test_connect when virtqemud.service is inactive while stopping libvirt-guests?

Comment 13 Martin Kletzander 2022-10-17 13:36:05 UTC
Oh, I see, I missed what you are trying to do there.  The paragraph in the man page only relates to the masking of the sockets and switching to traditional service.  I guess that could be more specific in the man page, it's also missing the fact that with traditional service users should remove the --timeout option from the service files.

Basically you should only do the change if switching to traditional non-socket activated services.

Comment 14 Yanqiu Zhang 2022-10-18 02:25:51 UTC
(In reply to Martin Kletzander from comment #13)
But by default rhel9 is using socket activated services. Do you mean libvirt-guests can not be used in current mode?

#  cat /usr/lib/systemd/system/virtqemud.service|grep timeout
#Environment=VIRTQEMUD_ARGS="--timeout 120"  <---commented
#  systemctl daemon-reload

1.In current mode(socket activation):
(1)if virtqemud.socket active and virtqemud.service inactive:
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive
# systemctl start libvirt-guests
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
inactive    <---start libvirt-guests didn't invoke virtqemud.service
# systemctl stop libvirt-guests.service 
(still hangs)

(2)if both virtqemud.socket and virtqemud.service active:
#  systemctl is-active virtqemud.socket; systemctl is-active virtqemud.service
active
active
# systemctl start libvirt-guests
# ps aux|grep virtqemud
root       91343  0.4  0.0 1581368 29396 ?       Ssl  21:54   0:00 /usr/sbin/virtqemud
# systemctl stop libvirt-guests
# (succeed)

2. try to disable socket activation:
# for drv in qemu interface network nodedev nwfilter secret storage proxy; do systemctl stop virt${drv}d{,-ro,-admin}.socket; systemctl mask virt${drv}d{,-ro,-admin}.socket;done
# systemctl start virtqemud
Failed to start virtqemud.service: Unit virtqemud-admin.socket is masked.
--->So virtqemud does not support non-socket activation since it uses 'Requires=virtqemud.socket' instead of 'wants'.


Could you give a complete set of steps for the current socket-activation mode, please?
And does this bug need to be set back to 'assigned'?
Thank you!

Comment 15 Martin Kletzander 2022-10-19 14:19:45 UTC
For socket activation mode there is nothing you need to do (with the patch included); that should just work.

For traditional mode (no socket activation) there are (were) steps to be taken, but we decided that supporting non-socket activation for anything other than the monolithic libvirt daemon is not necessary, so we removed the docs on how to revert it (since it could never have worked since the introduction of modular daemons anyway).

I think this BZ is done as it is, since the only issue is not related to libvirt-guests and is not supported now.

Comment 16 Yanqiu Zhang 2022-10-20 01:57:59 UTC
(In reply to Martin Kletzander from comment #15)

So the conclusion is:
1. For socket activation mode there is nothing that needs to be done, since the unit file already uses 'After=*.socket' instead of *.service.

2. For non-socket activation mode, *.socket must be manually replaced with *.service, since the libvirt-guests.service unit file does not handle every mode now.

3. That modular daemons (like virtqemud) only work with socket activation mode, while the monolithic daemon works with both socket and non-socket activation modes, should be well known to users and is not a concern of this fix.

4. The libvirt-guests hang itself is not fixed; users just need to ensure their virtqemud/libvirtd service does not time out (remove the --timeout option), or start it manually to keep it alive when using libvirt-guests.service.

Are the above correct?
Thanks!

Comment 17 Martin Kletzander 2022-10-20 07:55:28 UTC
1-3 are correct

libvirt-guests should work correctly and not get stuck, even with the non-socket activated monolithic daemon

What is the issue with the current state of libvirt-guests then?

Comment 18 Yanqiu Zhang 2022-10-20 08:24:40 UTC
1. monolithic daemon with non-socket, works well.
# systemctl status libvirtd.socket
○ libvirtd.socket
     Loaded: masked (Reason: Unit libvirtd.socket is masked.)
#  systemctl status libvirt-guests
○ libvirt-guests.service - Suspend/Resume Running libvirt Guests
     Loaded: loaded (/etc/systemd/system/libvirt-guests.service; disabled; vendor preset: disabl>
     Active: inactive (dead) since Thu 2022-10-20 04:05:18 EDT; 2min 25s ago
# cat /etc/systemd/system/libvirt-guests.service|grep libvirtd
After=libvirtd.service
# systemctl is-active libvirt-guests;systemctl is-active libvirtd
inactive
inactive
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd
active
inactive
# systemctl stop libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd
inactive
inactive

2.monolithic daemon with socket activation:
# systemctl status libvirt-guests
○ libvirt-guests.service - Suspend/Resume Running libvirt Guests
     Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; disabled; vendor preset: di>
# cat /usr/lib/systemd/system/libvirt-guests.service|grep libvirtd
After=libvirtd.socket
(1)start and stop libvirt-guests when libvirtd inactive, stop will still hang.
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
inactive
inactive
active
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
inactive
active
# systemctl stop libvirt-guests
(hangs)
# ps aux|grep virsh
root      145478  0.2  0.0 325904 15860 ?        Sl   04:16   0:00 virsh connect
(2)stop libvirt-guests when libvirtd.service running, works well.
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
inactive
active
active
# systemctl start libvirt-guests
# systemctl stop libvirt-guests
#(succeed)
(3)stop libvirt-guests after libvirtd.service times out, will still hang.
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
active
active
# ps aux|grep libvirtd
root      145499  0.5  0.0 1892384 52600 ?       Ssl  04:17   0:00 /usr/sbin/libvirtd --timeout 120

(wait for it timeout)
# ps aux|grep libvirtd
(no libvirtd running)
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
inactive
active
# systemctl stop libvirt-guests
(still hangs)
# ps aux|grep virsh
root      145673  0.1  0.0 325904 16052 ?        Sl   04:22   0:00 virsh connect

Comment 19 Yanqiu Zhang 2022-10-20 09:02:39 UTC
3. modular daemon with socket activation:
# systemctl status libvirt-guests
○ libvirt-guests.service - Suspend/Resume Running libvirt Guests
     Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; disabled; vendor preset: disabled)
# cat /usr/lib/systemd/system/libvirt-guests.service|grep virtqemud
After=virtqemud.socket
(1)start and stop libvirt-guests when virtqemud inactive, stop will still hang.
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
inactive
active
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
inactive
active
# systemctl stop libvirt-guests
(still hangs)
# ps aux|grep virsh
root      146343  0.0  0.0 325904 16092 ?        Sl   04:37   0:00 virsh connect
# pkill virsh
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
active
active
(2)stop libvirt-guests when virtqemud.service running, works well
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
active
active
# systemctl stop libvirt-guests
# (succeed)
(3)stop libvirt-guests after virtqemud.service times out, will still hang.
# systemctl start libvirt-guests
# ps aux|grep virtqemud
root      146355  0.1  0.0 1581368 31140 ?       Ssl  04:42   0:00 /usr/sbin/virtqemud --timeout 120
(wait for it timeout)
# ps aux|grep virtqemud
(no virtqemud service running)
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
inactive
active
# systemctl stop libvirt-guests
(still hangs)
# ps aux|grep virsh
root      146343  0.0  0.0 325904 16092 ?        Sl   04:37   0:00 virsh connect


From scenarios 2(1)(3) and 3(1)(3), the same issue as in comment 0 still exists.
The issue is:
Stopping libvirt-guests still hangs when the libvirtd/virtqemud service is inactive, even though the libvirt-guests.service unit file only uses 'After=libvirtd.socket/virtqemud.socket' instead of 'After=libvirtd.service/virtqemud.service'. The libvirtd/virtqemud service can only be started (automatically) after 'stop libvirt-guests' exits via '# pkill virsh'.
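
The transcripts above repeatedly print the socket and service states before each stop attempt, and the hanging scenarios all share the "socket active, service inactive" pair. A small illustrative helper can label such a pair. The function name `describe_activation_state` and the wording of the labels are assumptions made for this sketch; the only real interface used is the output of "systemctl is-active".

```shell
#!/bin/sh
# Hypothetical classifier for the socket/service state pairs seen in this report.
# Arguments are the strings printed by "systemctl is-active".
describe_activation_state() {
    socket_state="$1"
    service_state="$2"
    case "$socket_state/$service_state" in
        active/active)   echo "daemon running" ;;
        active/inactive) echo "daemon idle, socket waiting to activate it" ;;
        inactive/*)      echo "socket not listening" ;;
        *)               echo "unrecognized state" ;;
    esac
}

# Example:
#   describe_activation_state "$(systemctl is-active virtqemud.socket)" \
#                             "$(systemctl is-active virtqemud.service)"
```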

Comment 20 Yanqiu Zhang 2022-10-24 03:29:52 UTC
Hi,

I found that if 'Before=libvirt-guests.service' is removed from the virtqemud.service unit file, the hang disappears.
Still with the modular daemon and socket activation:
# vim /usr/lib/systemd/system/virtqemud.service
# systemctl daemon-reload
# cat /usr/lib/systemd/system/virtqemud.service|grep libvirt-guests
#Before=libvirt-guests.service  <==comment it(equals to remove)
Start libvirt-guests even when virtqemud not active:
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
inactive
active
# systemctl stop libvirt-guests
(succeed)
# systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
active
active

Was this missed in the patch?
I'm not sure about side effects on other scenarios, though.
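
To see which installed units still declare this ordering, the unit directory can be searched. This is a sketch rather than an official tool: the function name `units_ordered_before_guests` is hypothetical, and the default directory is the vendor unit path that appears in this report.

```shell
#!/bin/sh
# Hypothetical scan: list unit files under DIR whose Before= line names
# libvirt-guests.service. DIR defaults to the vendor unit directory.
units_ordered_before_guests() {
    dir="${1:-/usr/lib/systemd/system}"
    # Anchoring at line start skips commented-out "#Before=" lines.
    grep -l -E '^Before=.*libvirt-guests\.service' "$dir"/*.service 2>/dev/null
}

# Example:
#   units_ordered_before_guests
#   units_ordered_before_guests /etc/systemd/system
```

The function prints nothing and returns a non-zero status when no unit file matches.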

Comment 21 Martin Kletzander 2022-10-26 14:36:42 UTC
That was definitely missed, there is no need for virtqemud.service (or the other services) to have Before=libvirt-guests.  I am not sure how I tested it that it worked, maybe I had a modification on the other service file.  Thank you for also figuring out the cause.  I sent a patch upstream, will update this BZ once something changes.  Feel free to move this to assigned.

Comment 22 Yanqiu Zhang 2022-10-26 15:18:05 UTC
Thanks for the confirmation, Martin.

Comment 28 Yanqiu Zhang 2022-11-01 09:51:18 UTC
Tested with libvirt-v8.9.0-rc2-1-g72d4709ab9:

1.default modular daemon with socket activation:
(1) start&destroy libvirt-guests while virtqemud.service inactive:
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
inactive
active
# systemctl start libvirt-guests
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
inactive
active
# systemctl stop libvirt-guests
(succeed)
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
active
active

(2)destroy libvirt-guests when virtqemud.service timeout
# systemctl start libvirt-guests
# ps aux|grep virtqemud
root      205266  0.0  0.0 1367228 21720 ?       Ssl  04:08   0:00 /usr/sbin/virtqemud --timeout 120
(wait for it timeout)
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
inactive
active
# systemctl stop libvirt-guests
(succeed)
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
active
active

(3)stop libvirt-guests when virtqemud.service running with vm, then start it when virtqemud timeout:
# virsh start avocado-vt-vm1
Domain 'avocado-vt-vm1' started
# virsh console avocado-vt-vm1
…localhost login: root
Password: 
[root@localhost ~]#
# systemctl start libvirt-guests
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
active
active
# systemctl stop libvirt-guests
(succeed)
# virsh list --all --managed-save
 Id   Name             State
------------------------------
 -    avocado-vt-vm1   saved
# ps aux|grep virtqemud
root      206508  0.1  0.1 1530700 36164 ?       Ssl  04:21   0:00 /usr/sbin/virtqemud --timeout 120
(wait for it timeout)
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
inactive
inactive
active
# systemctl start libvirt-guests
#  systemctl is-active libvirt-guests;systemctl is-active virtqemud; systemctl is-active virtqemud.socket
active
active
active
# virsh list --all --managed-save
 Id   Name             State
--------------------------------
 1    avocado-vt-vm1   running

# virsh console avocado-vt-vm1
…[root@localhost ~]# 
(vm restored)


----------------
2.monolithic daemon with socket activation:

(1)start&stop libvirt-guests when libvirtd.service inactive 
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
inactive
active
# systemctl stop libvirt-guests
(succeed)
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
inactive
active
active

(2)stop libvirt-guests when libvirtd.service timeout
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
active
active
# ps aux|grep libvirtd
root      210615  7.0  0.1 1559484 50236 ?       Ssl  05:34   0:00 /usr/sbin/libvirtd --timeout 120
(wait for it timeout)
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
inactive
active
# systemctl stop libvirt-guests
(succeed)
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
inactive
active
active

(3)stop libvirt-guests when libvirtd.service running with vm, then start it when libvirtd timeout:
# virsh start avocado-vt-vm1
Domain 'avocado-vt-vm1' started

# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
active
active
# systemctl stop libvirt-guests
(succeed)
# virsh list --all --managed-save
 Id   Name             State
------------------------------
 -    avocado-vt-vm1   saved

# ps aux|grep libvirtd
root      210792  0.3  0.1 1641452 52416 ?       Ssl  05:36   0:00 /usr/sbin/libvirtd --timeout 120
(wait for it timeout)
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
inactive
inactive
active
# systemctl start libvirt-guests
# systemctl is-active libvirt-guests;systemctl is-active libvirtd;systemctl is-active libvirtd.socket
active
active
active
# virsh list --all
 Id   Name             State
--------------------------------
 1    avocado-vt-vm1   running


PASS for these two scenarios.
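
The "(wait for it timeout)" steps in these scenarios were performed by hand. If the runs were scripted, the wait could be a bounded poll. The helper below is a sketch under assumptions: the name `wait_until` and the 180-second limit in the usage comment are illustrative, chosen only to exceed the daemon's '--timeout 120'.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per second until it succeeds
# or until LIMIT attempts have been made.
wait_until() {
    limit="$1"
    shift
    elapsed=0
    while [ "$elapsed" -lt "$limit" ]; do
        if "$@"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example: wait for virtqemud to exit after its idle timeout.
#   wait_until 180 sh -c '! systemctl is-active --quiet virtqemud.service'
```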

Comment 30 Yanqiu Zhang 2022-11-01 12:18:35 UTC
4. Supplementary docs checking:
(1) Check the removal of 'Before=libvirt-guests.service':
# grep libvirt-guests /usr/lib/systemd/system/virt*.service
(nothing output)

# grep libvirt-guests /usr/lib/systemd/system/libvirt*.service
/usr/lib/systemd/system/libvirt-guests.service:Documentation=man:libvirt-guests(8)
/usr/lib/systemd/system/libvirt-guests.service:EnvironmentFile=-/etc/sysconfig/libvirt-guests
/usr/lib/systemd/system/libvirt-guests.service:ExecStart=/usr/libexec/libvirt-guests.sh start
/usr/lib/systemd/system/libvirt-guests.service:ExecStop=/usr/libexec/libvirt-guests.sh stop

(no Before=libvirt-guests.service anymore)

# grep libvirt-guests /usr/lib/systemd/system/*.service
/usr/lib/systemd/system/libvirt-guests.service:Documentation=man:libvirt-guests(8)
/usr/lib/systemd/system/libvirt-guests.service:EnvironmentFile=-/etc/sysconfig/libvirt-guests
/usr/lib/systemd/system/libvirt-guests.service:ExecStart=/usr/libexec/libvirt-guests.sh start
/usr/lib/systemd/system/libvirt-guests.service:ExecStop=/usr/libexec/libvirt-guests.sh stop
/usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service:Before=xendomains.service libvirtd.service libvirt-guests.service  <---still exists
/usr/lib/systemd/system/xenstored.service:Before=libvirtd.service libvirt-guests.service  <---still exists

(two corner cases left)

(2)split daemon service manpage:(per commit f53988d657c6aafd7aa4f2070df01c43f1a936ab:“docs: Do not support non-socket activated modular daemons with systemd”)
# man virtqemud
...
DAEMON STARTUP MODES
       The virtqemud daemon is capable of starting in two modes.

   Socket activation mode
       On  hosts  with systemd it is started in socket activation mode and it will rely on systemd to create and listen on the UNIX
       sockets and pass them as pre-opened file descriptors. In this mode most of the socket related config  options  in  /etc/lib‐
       virt/virtqemud.conf will no longer have any effect.

   Traditional service mode
       On hosts without systemd, it will create and listen on UNIX sockets itself.
...
(no "revert to the traditional" texts anymore)

# grep 'revert to the traditional'  libvirt/docs/manpages/ -Rin
libvirt/docs/manpages/libvirtd.rst:74:OS that uses systemd. To revert to the traditional mode, all the socket

(only libvirtd has the words.)

Almost PASS.

Comment 32 Yanqiu Zhang 2022-11-01 12:27:56 UTC
Hi martin,

There're still two 'Before=libvirt-guests' left in 2 files per comment30 scenario4(1). But they don't affect our current test scope. Could you please modify them later? Thanks!

Comment 33 Martin Kletzander 2022-11-02 13:53:19 UTC
(In reply to yanqzhan from comment #32)
> Hi martin,
> 
> There're still two 'Before=libvirt-guests' left in 2 files per comment30
> scenario4(1). But they don't affect our current test scope. Could you please
> modify them later? Thanks!

I'm not sure what package is the owner of those files, but to my knowledge that is not libvirt.  Can you check it with

rpm -qf /usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service /usr/lib/systemd/system/xenstored.service

please?  I can't seem to even install those.

Comment 34 Yanqiu Zhang 2022-11-03 03:26:49 UTC
(In reply to Martin Kletzander from comment #33)
Yes.
# rpm -qf /usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service /usr/lib/systemd/system/xenstored.service
xen-runtime-4.16.2-2.fc37.x86_64
xen-runtime-4.16.2-2.fc37.x86_64

Maybe they were installed on my host as a dependency when compiling libvirt source.
Need they be modified or maybe not?

Comment 37 Martin Kletzander 2022-11-04 14:31:24 UTC
(In reply to yanqzhan from comment #34)
> (In reply to Martin Kletzander from comment #33)
> Yes.
> # rpm -qf /usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service
> /usr/lib/systemd/system/xenstored.service
> xen-runtime-4.16.2-2.fc37.x86_64
> xen-runtime-4.16.2-2.fc37.x86_64
> 
> Maybe they were installed on my host as a dependency when compiling libvirt
> source.
> Need they be modified or maybe not?

So I took a look at that and those services do not have any socket-activated options and we cannot depend on them, so I think it is correct the way it is.

Comment 38 Yanqiu Zhang 2022-11-23 09:46:58 UTC
Tested with:
libvirt-8.9.0-2.el9.x86_64
qemu-kvm-7.1.0-5.el9.x86_64

PASS with same steps and results as comment28, comment29, comment30.

Comment 40 errata-xmlrpc 2023-05-09 07:26:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171

