Bug 1878724 - vdsm-tool configure is failing with error "dependency job for libvirtd.service failed"
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.4.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.8
Target Release: ---
Assignee: Marcin Sobczyk
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On: 1889363
Blocks:
Reported: 2020-09-14 12:01 UTC by nijin ashok
Modified: 2021-06-30 19:03 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 5392541 | 0 | None | None | None | 2020-09-22 06:28:39 UTC
oVirt gerrit | 111585 | 0 | master | MERGED | tool: libvirt: Stop libvirt sockets on reconfiguration | 2021-02-12 04:57:36 UTC

Description nijin ashok 2020-09-14 12:01:03 UTC
Description of problem:

The TLS socket for libvirtd (libvirtd-tls.socket) is not enabled by default; it is enabled when the host is added to the manager. If a user starts any service that requires libvirtd (for example, virt-who) before that, the libvirtd process is spawned as shown below.

===
# systemctl start virt-who

# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/libvirtd.service.d
           └─unlimited-core.conf
   Active: active (running) since Mon 2020-09-14 10:31:30 UTC; 22s ago

# ps aux|grep libvirtd
root        2234  0.7  1.1 1818924 44472 ?       Ssl  10:31   0:00 /usr/sbin/libvirtd --timeout 120

# systemctl is-enabled libvirtd-tls.socket
disabled

# systemctl status libvirtd-tls.socket
● libvirtd-tls.socket - Libvirt TLS IP socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd-tls.socket; disabled; vendor preset: disabled)
   Active: inactive (dead)
   Listen: [::]:16514 (Stream)
===

During the vdsm-tool configuration phase, we stop the libvirtd service, add libvirtd-tls.socket as a required unit, and then start the libvirtd service again.

This fails when libvirtd tries to start libvirtd-tls.socket.

===
2020-09-14 16:23:40 IST - TASK [ovirt-host-deploy-vdsm : Reconfigure vdsm tool] **************************

"stderr_lines" : [ "Error:  ServiceOperationError: _systemctlStart failed", "b\"A dependency job for libvirtd.service failed. See 'journalctl -xe' for details.\\n\" " ],

# systemctl status libvirtd-tls.socket
● libvirtd-tls.socket - Libvirt TLS IP socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd-tls.socket; enabled; vendor preset: disabled)
   Active: inactive (dead)
   Listen: [::]:16514 (Stream)

Sep 14 10:54:02 vm249-58.gsslab.pnq2.redhat.com systemd[1]: libvirtd-tls.socket: Socket service libvirtd.service already active, refusing.
Sep 14 10:54:02 vm249-58.gsslab.pnq2.redhat.com systemd[1]: Failed to listen on Libvirt TLS IP socket.
===
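
The refusal in the journal excerpt above can be detected mechanically. The following is a minimal sketch, not part of vdsm; the function name `find_refused_sockets` and the regular expression are assumptions built only from the message format shown in this report.

```python
import re

# Matches the systemd message quoted in the journal excerpt of this report.
REFUSAL = re.compile(
    r"systemd\[\d+\]: (?P<socket>\S+\.socket): "
    r"Socket service (?P<service>\S+) already active, refusing\."
)


def find_refused_sockets(journal_text):
    """Return (socket, service) pairs for every refusal found in the text."""
    return [(m.group("socket"), m.group("service"))
            for m in REFUSAL.finditer(journal_text)]


# The two journal lines quoted above, used as sample input.
sample = (
    "Sep 14 10:54:02 vm249-58.gsslab.pnq2.redhat.com systemd[1]: "
    "libvirtd-tls.socket: Socket service libvirtd.service already active, "
    "refusing.\n"
    "Sep 14 10:54:02 vm249-58.gsslab.pnq2.redhat.com systemd[1]: "
    "Failed to listen on Libvirt TLS IP socket.\n"
)
print(find_refused_sockets(sample))
```

A non-empty result means a socket unit was asked to start while the service it activates was already running, which is the situation described in this bug.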

The installation succeeds if the user simply runs Reinstall again from the portal.

I was able to reproduce this issue by manually doing what vdsm-tool does. Once vdsm-tool stops libvirtd, it is automatically started again by libvirtd.socket because virt-who is running. Then, when we try to start libvirtd again with libvirtd-tls.socket as a requirement, it fails with the error above because libvirtd is already active.

- The libvirtd socket unit is active after installing the host, but libvirtd itself is inactive.

# systemctl status libvirtd.socket
● libvirtd.socket - Libvirt local socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.socket; enabled; vendor preset: disabled)
   Active: active (listening) since Mon 2020-09-14 11:39:38 UTC; 2min 36s ago

# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/libvirtd.service.d
           └─unlimited-core.conf
   Active: inactive (dead) since Mon 2020-09-14 11:41:50 UTC; 34s ago


- Started the virt-who service, which started libvirtd.

# systemctl start virt-who

# ps aux|grep libvirtd
root        3093  2.8  1.1 1818924 46160 ?       Ssl  11:42   0:00 /usr/sbin/libvirtd --timeout 120

- Stopped the libvirtd service, but the socket started it again.

# systemctl stop libvirtd
Warning: Stopping libvirtd.service, but it can still be activated by:
  libvirtd-ro.socket
  libvirtd.socket
  libvirtd-admin.socket

# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/libvirtd.service.d
           └─unlimited-core.conf
   Active: active (running) since Mon 2020-09-14 11:43:10 UTC; 4s ago


- Enabled the TLS socket and started libvirtd, which failed with the error mentioned above.

# ln -s /usr/lib/systemd/system/libvirtd-tls.socket /etc/systemd/system/libvirtd.service.requires/libvirtd-tls.socket ;systemctl daemon-reload

# systemctl start libvirtd
A dependency job for libvirtd.service failed. See 'journalctl -xe' for details.

Sep 14 11:43:31 vm249-58.gsslab.pnq2.redhat.com systemd[1]: Reloading.
Sep 14 11:44:18 vm249-58.gsslab.pnq2.redhat.com systemd[1]: Reloading.
Sep 14 11:44:45 vm249-58.gsslab.pnq2.redhat.com systemd[1]: libvirtd-tls.socket: Socket service libvirtd.service already active, refusing.
Sep 14 11:44:45 vm249-58.gsslab.pnq2.redhat.com systemd[1]: Failed to listen on Libvirt TLS IP socket.


I think that, for a clean shutdown of libvirtd during the vdsm-tool configuration phase, we should also stop the libvirtd.socket unit.
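
A rough sketch of that idea follows. It is not the vdsm implementation: the helper name `stop_libvirtd_cleanly` is hypothetical, and the list of socket units is an assumption taken from the `systemctl stop libvirtd` warning and the journal output in this report. The `run` parameter exists only so the command sequence can be checked without touching systemd.

```python
import subprocess

# Socket units named in this report; an assumed list, not vdsm's own.
LIBVIRTD_SOCKETS = (
    "libvirtd.socket",
    "libvirtd-ro.socket",
    "libvirtd-admin.socket",
    "libvirtd-tls.socket",
)


def stop_libvirtd_cleanly(run=subprocess.run):
    """Stop the activating sockets first, then the service itself.

    With the sockets stopped, a client such as virt-who can no longer
    trigger socket activation between the stop and the next start.
    """
    for unit in LIBVIRTD_SOCKETS + ("libvirtd.service",):
        # check=False: failures of individual stop commands are ignored
        # in this sketch instead of raising CalledProcessError.
        run(["systemctl", "stop", unit], check=False)
```

Calling `stop_libvirtd_cleanly()` as root would issue five `systemctl stop` commands, ending with `libvirtd.service`. Whether the subsequent start then succeeds depends on how the clients of those sockets behave.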


Version-Release number of selected component (if applicable):

vdsm-4.40.22-1.el8ev.x86_64
libvirt-daemon-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
Red Hat Virtualization Host 4.4.1 (el8.2)

How reproducible:

100%

Steps to Reproduce:

1. On a freshly deployed host, start the virt-who service before adding the host to the manager.

Actual results:


vdsm-tool configure is failing with error "dependency job for libvirtd.service failed"

Expected results:

vdsm-tool configure should work.

Additional info:

Comment 2 Petr Matyáš 2020-10-14 13:57:06 UTC
Using vdsm-4.40.33-1.el8ev.x86_64, this still fails the first time I try to install the host (reinstall passes, just as the description says).
The change linked in this bug is apparently present, judging by the changed file on the host.
I installed RHEL 8.3, then ovirt-host and virt-who, started virt-who as well as libvirt, and then tried to install the host in an engine, which failed on:

    "stdout" : "fatal: [10.37.138.41]: FAILED! => {\"changed\": true, \"cmd\": [\"vdsm-tool\", \"configure\", \"--force\"], \"delta\": \"0:00:46.909863\", \"end\": \"2020-10-14 15:19:29.123406\", \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2020-10-14 15:18:42.213543\", \"stderr\": \"Error:  ServiceOperationError: _systemctlStart failed\\nb'Job for libvirtd.socket failed.\\\\nSee \\\"systemctl status libvirtd.socket\\\" and \\\"journalctl -xe\\\" for details.\\\\n' \", \"stderr_lines\": [\"Error:  ServiceOperationError: _systemctlStart failed\", \"b'Job for libvirtd.socket failed.\\\\nSee \\\"systemctl status libvirtd.socket\\\" and \\\"journalctl -xe\\\" for details.\\\\n' \"], \"stdout\": \"\\nChecking configuration status...\\n\\nWARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration\\nlvm requires configuration\\nlibvirt is not configured for vdsm yet\\nlibvirtd.service doesn't have requirement on libvirtd-tls.socket unit\\nDB file /var/lib/vdsm/storage/managedvolume.db doesn't exists\\nManaged volume database requires configuration\\nabrt is not configured for vdsm\\nmultipath requires configuration\\n\\nRunning configure...\\nReconfiguration of sanlock is done.\\nReconfiguration of passwd is done.\\nReconfiguration of certificates is done.\\nWARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration\\nBacking up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.202010141519\\nInstalling /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf\\nReconfiguration of lvm is done.\\nReconfiguration of libvirt is done.\\nDB file /var/lib/vdsm/storage/managedvolume.db doesn't exists\\nCreating managed volumes database at /var/lib/vdsm/storage/managedvolume.db\\nSetting up ownership of database file to vdsm:kvm\\nReconfiguration of managedvolumedb is done.\\nReconfiguration of bond_defaults is done.\\nReconfiguration of abrt is done.\\nReconfiguration of sebool is 
done.\\nReconfiguration of multipath is done.\", \"stdout_lines\": [\"\", \"Checking configuration status...\", \"\", \"WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration\", \"lvm requires configuration\", \"libvirt is not configured for vdsm yet\", \"libvirtd.service doesn't have requirement on libvirtd-tls.socket unit\", \"DB file /var/lib/vdsm/storage/managedvolume.db doesn't exists\", \"Managed volume database requires configuration\", \"abrt is not configured for vdsm\", \"multipath requires configuration\", \"\", \"Running configure...\", \"Reconfiguration of sanlock is done.\", \"Reconfiguration of passwd is done.\", \"Reconfiguration of certificates is done.\", \"WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration\", \"Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.202010141519\", \"Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf\", \"Reconfiguration of lvm is done.\", \"Reconfiguration of libvirt is done.\", \"DB file /var/lib/vdsm/storage/managedvolume.db doesn't exists\", \"Creating managed volumes database at /var/lib/vdsm/storage/managedvolume.db\", \"Setting up ownership of database file to vdsm:kvm\", \"Reconfiguration of managedvolumedb is done.\", \"Reconfiguration of bond_defaults is done.\", \"Reconfiguration of abrt is done.\", \"Reconfiguration of sebool is done.\", \"Reconfiguration of multipath is done.\"]}",
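
For readability, a failure line like the one above can be split into the host and the JSON result. This is a minimal sketch: `parse_ansible_failure` is a hypothetical helper, the sample is a shortened stand-in for the quoted line, and it assumes the outer "stdout" string has already been unescaped once.

```python
import json


def parse_ansible_failure(line):
    """Split 'fatal: [host]: FAILED! => {json}' into host and result dict."""
    header, _, payload = line.partition(" => ")
    host = header[header.index("[") + 1:header.index("]")]
    return host, json.loads(payload)


# Shortened stand-in for the line quoted above (most keys omitted).
sample = (
    'fatal: [10.37.138.41]: FAILED! => {"rc": 1, "cmd": ["vdsm-tool", '
    '"configure", "--force"], "stderr_lines": ["Error:  '
    'ServiceOperationError: _systemctlStart failed"]}'
)
host, result = parse_ansible_failure(sample)
print(host, result["rc"], result["stderr_lines"][0])
```

Printing `result["stderr_lines"]` line by line makes the actual error ("Job for libvirtd.socket failed") much easier to spot than in the escaped blob.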

Comment 3 Marcin Sobczyk 2020-10-19 10:34:35 UTC
Right, so it turns out that even though virt-who uses 'libvirtd-ro.socket' [1],
it doesn't require it at the systemd unit level [2]. That means that even if we stop 'libvirtd-ro.socket',
'virt-who.service' will still be running and, depending on the implementation, anything can really happen.
This has to be fixed on the virt-who side first.

Given that, the fact that we also dynamically depend on either 'libvirtd-tcp.socket' or 'libvirtd-tls.socket'
(so we cannot prevent a similar scenario if someone uses one of these),
and the gentle nature of socket activation, I would prefer to revert the patch and leave things as they are.

[1] https://github.com/candlepin/virt-who/blob/4c7fdb032a66e2fe3324cc2d7579101c699e3b00/virtwho/virt/libvirtd/libvirtd.py#L282
[2] https://github.com/candlepin/virt-who/blob/master/virt-who.service
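
Whether a unit declares such a requirement can be checked from its unit file text. The sketch below is an illustration only: `unit_dependencies` is a hypothetical helper, the sample text is invented and is not the real virt-who.service, and the parser ignores drop-ins and line continuations.

```python
def unit_dependencies(unit_text, keys=("Requires", "Wants", "BindsTo")):
    """Collect unit names listed under the given keys of the [Unit] section."""
    deps, section = [], None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]  # remember the current section name
            continue
        key, sep, value = line.partition("=")
        if section == "Unit" and sep and key.strip() in keys:
            deps.extend(value.split())  # values are space-separated units
    return deps


# Hypothetical unit text for illustration; NOT the real virt-who.service.
sample = """\
[Unit]
Description=Example daemon
After=libvirtd.service network.target

[Service]
ExecStart=/usr/bin/example
"""
print("libvirtd-ro.socket" in unit_dependencies(sample))
```

A unit that only orders itself with `After=` and never lists the socket under `Requires=`, `Wants=` or `BindsTo=` keeps running when that socket is stopped, which is the situation described above.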

