Bug 1750340 - New libvirtd uses systemd socket activation by default, which is incompatible with --listen flag usage in /etc/sysconfig/libvirtd
Summary: New libvirtd uses systemd socket activation by default, which is incompatible...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-4.4.0
Target Release: ---
Assignee: Marcin Sobczyk
QA Contact: Petr Kubica
URL:
Whiteboard:
Duplicates: 1750279
Depends On:
Blocks:
Reported: 2019-09-09 11:04 UTC by Daniel Berrangé
Modified: 2020-08-04 13:27 UTC (History)
15 users (show)

Fixed In Version: rhv-4.4.0-29
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 13:27:17 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2020:3246 | 0 | None | None | None | 2020-08-04 13:27:55 UTC
oVirt gerrit | 103325 | 0 | 'None' | MERGED | tool: Adjust to libvirt's systemd socket activation | 2021-02-11 12:40:09 UTC
oVirt gerrit | 103390 | 0 | 'None' | MERGED | systemctl: Add 'enable' method | 2021-02-11 12:40:09 UTC
oVirt gerrit | 103391 | 0 | 'None' | MERGED | configurators: Extract reading libvirt connection config | 2021-02-11 12:40:09 UTC

Description Daniel Berrangé 2019-09-09 11:04:57 UTC
Description of problem:

Historically libvirtd was responsible for creating the sockets that it listens on. By default it would create UNIX sockets for local access, and if --listen is given, it would create either TCP or TLS sockets depending on the 'listen_tls' and 'listen_tcp' parameters in the /etc/libvirt/libvirtd.conf file.

In libvirtd 5.6.0 and later, the libvirtd daemon prefers to use systemd socket activation. This means that systemd creates the UNIX sockets and passes pre-opened FDs into libvirtd when it starts. The --listen parameter is *NOT* honoured when socket activation is used. Instead the admin must tell systemd to enable the TCP or TLS socket unit file for libvirtd. Use of --listen will cause libvirtd to fail to start.

In addition, the libvirtd daemon is set to automatically shut down after 120 seconds, *IF* no VMs are running AND no client apps are connected to its socket(s). This behaviour can be disabled by changing /etc/sysconfig/libvirtd to remove the '--timeout 120' arg.
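
As an illustration of the last point, the following is a minimal sketch of removing the timeout argument from the sysconfig text. It assumes the argument lives in a LIBVIRTD_ARGS="..." line and only handles the long '--timeout' spelling; the function name drop_timeout is hypothetical and not part of libvirt or vdsm.

```python
import re


def drop_timeout(sysconfig_text):
    """Remove '--timeout N' (or '--timeout=N') from LIBVIRTD_ARGS lines.

    Assumption: the daemon arguments are stored as LIBVIRTD_ARGS="...".
    Commented-out lines are left untouched.
    """
    result = []
    for line in sysconfig_text.splitlines():
        match = re.match(r"^(\s*LIBVIRTD_ARGS=)(.*)$", line)
        if match:
            # Strip the surrounding quotes, then split into individual args.
            args = match.group(2).strip().strip("\"'").split()
            kept = []
            skip_next = False
            for arg in args:
                if skip_next:
                    skip_next = False  # this is the value after '--timeout'
                elif arg == "--timeout":
                    skip_next = True
                elif not arg.startswith("--timeout="):
                    kept.append(arg)
            line = '{}"{}"'.format(match.group(1), " ".join(kept))
        result.append(line)
    return "\n".join(result)
```

The function works on text only; writing the result back to /etc/sysconfig/libvirtd and restarting the service is left to the caller.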

The full list of systemd units for libvirtd is:

  libvirtd.service        (active by default)
  libvirtd.socket         (active by default)
  libvirtd-ro.socket      (active by default)
  libvirtd-admin.socket   (active by default)
  libvirtd-tcp.socket     (NOT active by default)
  libvirtd-tls.socket     (NOT active by default)


If a host has old libvirtd < 5.6.0 installed AND has "--listen" set in /etc/sysconfig/libvirtd AND the admin does an in-place 'yum upgrade', then the RPM %post script will automatically disable systemd socket activation. This is done by calling:

   systemctl mask libvirtd.socket libvirtd-ro.socket libvirtd-admin.socket libvirtd-tcp.socket libvirtd-tls.socket

This ensures that in-place host upgrades don't suffer any change in behaviour that could break the existing running service/mgmt app dependent on --listen usage in /etc/sysconfig/libvirtd.
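
The upgrade decision described above can be sketched roughly as follows. This is not the actual RPM scriptlet; it is a Python illustration that assumes the flag appears in an uncommented LIBVIRTD_ARGS line and checks only the long '--listen' spelling. The names uses_listen and upgrade_mask_command are hypothetical.

```python
SOCKET_UNITS = [
    "libvirtd.socket",
    "libvirtd-ro.socket",
    "libvirtd-admin.socket",
    "libvirtd-tcp.socket",
    "libvirtd-tls.socket",
]


def uses_listen(sysconfig_text):
    """Return True if an active LIBVIRTD_ARGS line contains --listen."""
    for line in sysconfig_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("LIBVIRTD_ARGS="):
            continue  # skips comments and unrelated settings
        value = stripped.split("=", 1)[1].strip().strip("\"'")
        if "--listen" in value.split():
            return True
    return False


def upgrade_mask_command(sysconfig_text):
    """Return the 'systemctl mask' argv for a --listen host, else None."""
    if uses_listen(sysconfig_text):
        return ["systemctl", "mask"] + SOCKET_UNITS
    return None
```

A host without --listen gets None back, meaning the socket units are left alone and socket activation stays in effect.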

Clean host installs (no existing libvirt installed) will get systemd socket activation by default which *is* a change in behaviour wrt /etc/sysconfig/libvirtd and --listen.


Thus if the mgmt app / admin wants to use TCP/TLS sockets they have two choices:

  - To continue the old approach (setting --listen in /etc/sysconfig/libvirtd), they MUST use 'systemctl mask ...' for all the socket units listed above, before libvirtd.service is started.

  - To adapt to the new approach, don't touch /etc/sysconfig/libvirtd at all. Instead use 'systemctl enable libvirtd-tls.socket && systemctl start libvirtd-tls.socket' (assuming TLS is desired).
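
The two choices can be summarised as a small helper that returns the systemctl invocations for each one. This is only a sketch: the function name and the "legacy" / "socket-activation" labels are made up for illustration, and the helper merely builds argument lists rather than running anything.

```python
ALL_SOCKET_UNITS = [
    "libvirtd.socket",
    "libvirtd-ro.socket",
    "libvirtd-admin.socket",
    "libvirtd-tcp.socket",
    "libvirtd-tls.socket",
]


def socket_setup_commands(approach, transport="tls"):
    """Return the systemctl argv lists for the chosen approach.

    'legacy' keeps --listen in /etc/sysconfig/libvirtd and masks every
    socket unit; 'socket-activation' enables and starts one socket unit.
    """
    if approach == "legacy":
        return [["systemctl", "mask"] + ALL_SOCKET_UNITS]
    if approach == "socket-activation":
        if transport not in ("tcp", "tls"):
            raise ValueError("transport must be 'tcp' or 'tls'")
        unit = "libvirtd-{}.socket".format(transport)
        return [["systemctl", "enable", unit], ["systemctl", "start", unit]]
    raise ValueError("unknown approach: {!r}".format(approach))
```

Each returned list could be passed to subprocess.run() by a caller with root privileges; for the legacy approach this has to happen before libvirtd.service is started.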



Version-Release number of selected component (if applicable):
libvirt-5.6.0-4.el8

Comment 2 Martin Perina 2019-09-10 08:03:01 UTC
*** Bug 1750279 has been marked as a duplicate of this bug. ***

Comment 3 Nir Soffer 2019-09-11 12:51:34 UTC
Daniel, what are the advantages of the new socket activation way?

Currently vdsm requires libvirtd.service, so if the service is stopped or killed vdsm
is stopped/restarted by systemd.

Vdsm maintains one connection to libvirtd, and if the connection fails vdsm panics and
restarts itself. Will this work with socket activation?

For example:
https://github.com/oVirt/vdsm/blob/b137b5f8a7706203764e21d93d0ddccdb52b31b7/lib/vdsm/common/libvirtconnection.py#L118

Comment 4 Daniel Berrangé 2019-09-11 12:59:25 UTC
Essentially this came about because there's a general desire not to have services starting unconditionally on boot in Fedora/RHEL and instead have everything start when the first client connects. 

The reason we haven't done this long ago when first adding systemd support is that there's a complication with libvirtd. It must do auto-start of VMs & other resources on boot up, so we can't purely rely on socket activation.

Thus what happens is that we have libvirtd start on boot, but with a 120 second timeout set. Thus if libvirtd does not autostart any VMs, *AND* no client app connects within 120 seconds, libvirtd will shut down again. An app connecting later will cause it to start up again via socket activation.

Given that VDSM starts unconditionally on boot, and connects to libvirtd, libvirtd will start at boot & stay running as long as VDSM is running. If VDSM stops and is not started again within 120 seconds, and no VMs are running, libvirtd will stop too. It will automatically start again if VDSM is started again.
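
The shutdown rule described here can be written down as a tiny model. This is a simplified illustration of the conditions stated in this comment, not libvirtd's actual implementation; the function and parameter names are hypothetical.

```python
def libvirtd_should_exit(idle_seconds, running_vms, connected_clients,
                         timeout=120):
    """Model of the idle-timeout rule: exit only when nothing needs libvirtd.

    idle_seconds counts the time since the last VM or client went away.
    """
    if running_vms > 0 or connected_clients > 0:
        return False
    return idle_seconds >= timeout
```

With VDSM holding one connection open, connected_clients stays at 1 and the function never returns True, which matches the statement that libvirtd stays running as long as VDSM is running.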

So overall I don't think there's any notable advantage from VDSM's POV - both the old & new approaches should work fine for VDSM.

Comment 5 Nir Soffer 2019-09-11 13:09:57 UTC
Thanks Daniel!

The current state is that adding a host with Fedora 29/30 fails because libvirt
will not start with the --listen argument added when running "vdsm-tool configure"

Milan, we need to resolve this issue now, since we use Fedora to work on
incremental backup. Can you handle this soon?

Comment 6 Daniel Berrangé 2019-09-11 13:12:44 UTC
NB This shouldn't affect Fedora 29/30 unless the user has manually built or pulled in a newer libvirt version from non-standard locations (eg virt-preview). Fedora 31 is the first release where this is officially included.

Comment 7 Milan Zamazal 2019-09-11 13:45:11 UTC
(In reply to Nir Soffer from comment #5)

> Milan, we need to resolve this issue now, since we use Fedora to work on
> incremental backup. Can you handle this soon?

Sorry, I can't but I hope Marcin can if needed.

Comment 8 Nir Soffer 2019-09-11 15:40:44 UTC
(In reply to Daniel Berrangé from comment #6)

We add the virt-preview repo on Fedora in ovirt-release*.rpm[1], and require
libvirt and qemu from virt-preview, so we are affected now. It is of course
not a libvirt problem that we consume rawhide packages on Fedora.

[1] https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm
[2] http://resources.ovirt.org/pub/yum-repo/ovirt-release-master.rpm

Comment 9 Marcin Sobczyk 2019-09-13 09:33:54 UTC
(In reply to Daniel Berrangé from comment #0)

I'm working on patches for this right now [1].

> If a host has old libvirtd < 5.6.0 installed AND has "--listen" set in
> /etc/sysconfig/libvirtd AND the admin does an in-place 'yum upgrade', then the
> RPM %post script will automatically disable systemd socket activation. This
> is done by calling
> 
>    systemctl mask libvirtd.socket libvirtd-ro.socket libvirtd-admin.socket
> libvirtd-tcp.socket libvirtd-tls.socket

AFAICT this is only true for libvirt >= 5.7.0 [2]; libvirt == 5.6.0 doesn't have that feature [3].
That means that vdsm versions without socket activation detection will never be able to work with libvirt 5.6.0.
Vdsm with the patches should be fine with libvirt >= 5.6.0.

There's one problematic thing with libvirt's current approach - it never unmasks its own socket units.
Masking is hard-core - nothing except running 'systemctl unmask' can recover the units from that state.

Consider this path:
- both vdsm and libvirt are old and there's no socket activation
- libvirt is upgraded to >= 5.7.0 (as mentioned earlier 5.6.0 simply won't work)
- [2] sees that there's already a '--listen' flag in args, so it masks its socket units and everything works "the old way"
- vdsm gets upgraded and now is aware of the socket activation - the config is rewritten and the '--listen' flag removed
- there's nothing to unmask the sockets - we try to adapt to new mechanism, but libvirt is stuck with the old one

The only way we can make it work is to unmask these units by ourselves. Am I correct here?

[1] https://gerrit.ovirt.org/#/q/topic:libvirt-socket-activation+(status:open+OR+status:merged)
[2] https://github.com/libvirt/libvirt/blob/ca33d1747251e61a281f0535b6b2fd556be1f121/libvirt.spec.in#L1386
[3] https://github.com/libvirt/libvirt/blob/bafb3d1fbef9eac49230015b2fdbe60ceb1673b8/libvirt.spec.in#L1379
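
For the upgrade path above, a management tool would first need to find out which socket units are masked. The following is a rough sketch that assumes the caller has already collected the output of 'systemctl is-enabled UNIT' for each unit into a dict; the function name unmask_command is hypothetical and this is not the code from the vdsm patches.

```python
def unmask_command(is_enabled_output):
    """Build a 'systemctl unmask' argv for every unit reported as masked.

    is_enabled_output maps a unit name to the text printed by
    'systemctl is-enabled <unit>' (for example 'enabled' or 'masked').
    Returns None when nothing is masked.
    """
    masked = sorted(
        unit
        for unit, state in is_enabled_output.items()
        if state.strip() in ("masked", "masked-runtime")
    )
    if not masked:
        return None
    return ["systemctl", "unmask"] + masked
```

After unmasking, the '--listen' flag would still have to be removed and the desired socket unit enabled and started before libvirtd comes back up in socket-activation mode.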

Comment 10 Daniel Berrangé 2019-09-13 09:38:25 UTC
(In reply to Marcin Sobczyk from comment #9)
> (In reply to Daniel Berrangé from comment #0)
> 
> I'm working on patches for this right now [1].
> 
> > If a host has old libvirtd < 5.6.0 installed AND has "--listen" set in
> > /etc/sysconfig/libvirtd AND the admin does an in-place 'yum upgrade', then the
> > RPM %post script will automatically disable systemd socket activation. This
> > is done by calling
> > 
> >    systemctl mask libvirtd.socket libvirtd-ro.socket libvirtd-admin.socket
> > libvirtd-tcp.socket libvirtd-tls.socket
> 
> AFAICT this is only true for libvirt >= 5.7.0 [2], libvirt == 5.6.0 doesn't
> have that feature [3].
> That means, that vdsm versions without socket activation detection will
> never be able to work with libvirt 5.6.0.
> Vdsm with the patches should be fine with libvirt >= 5.6.0.
> 
> There's one problematic thing with libvirt's current approach - it never
> unmasks its own socket units.

That's correct - in an upgrade scenario we wish to preserve existing behaviour, so there's no desire for us to ever unmask in this scenario.

> Masking is hard-core - nothing except running 'systemctl unmask' can
> recover the units from that state.

Yes, but it's the only way to prevent the sockets being used. A simple 'systemctl disable' is not sufficient, as systemd will auto-start the sockets when libvirtd.service is started.

> Consider this path:
> - both vdsm and libvirt are old and there's no socket activation
> - libvirt is upgraded to >= 5.7.0 (as mentioned earlier 5.6.0 simply won't
> work)
> - [2] sees that there's already a '--listen' flag in args, so it masks its
> socket units and everything works "the old way"
> - vdsm gets upgraded and now is aware of the socket activation - the config
> is rewritten and the '--listen' flag removed

VDSM doesn't need to change the existing config of libvirtd upon upgrade. If you have an existing install of libvirtd with --listen present, it will still work with VDSM after upgrading VDSM. 

> - there's nothing to unmask the sockets - we try to adapt to new mechanism,
> but libvirt is stuck with the old one
> 
> The only way we can make it work is to unmask these units by ourselves. Am I
> correct here?

Yes, that's correct, but there's no need to change the config of your existing host.

Comment 11 RHV bug bot 2019-10-22 17:26:01 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 12 RHV bug bot 2019-10-22 17:39:17 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 13 RHV bug bot 2019-10-22 17:46:31 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 14 RHV bug bot 2019-10-22 18:02:19 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 15 Johnny Westerlund 2019-11-15 07:42:13 UTC
I just upgraded from Fedora 30 to 31 and I had (and still have) the --listen arg in sysconfig/libvirtd.
However, after the upgrade I do not see libvirt listening.

[root@localhost ~]# ss -plnt
State     Recv-Q    Send-Q       Local Address:Port        Peer Address:Port                                                                                    
LISTEN    0         1                127.0.0.1:5900             0.0.0.0:*        users:(("qemu-system-x86",pid=6680,fd=27))                                     
LISTEN    0         32               127.0.0.1:53               0.0.0.0:*        users:(("dnsmasq",pid=1704,fd=5))                                              
LISTEN    0         32            192.168.42.1:53               0.0.0.0:*        users:(("dnsmasq",pid=1359,fd=6))                                              
LISTEN    0         32           192.168.130.1:53               0.0.0.0:*        users:(("dnsmasq",pid=1327,fd=6))                                              
LISTEN    0         32           192.168.122.1:53               0.0.0.0:*        users:(("dnsmasq",pid=1287,fd=6))                                              
LISTEN    0         5                127.0.0.1:631              0.0.0.0:*        users:(("cupsd",pid=1088,fd=10))                                               
LISTEN    0         50               127.0.0.1:36421            0.0.0.0:*        users:(("SpiderOakGroups",pid=2589,fd=17))                                     
LISTEN    0         128              127.0.0.1:42823            0.0.0.0:*        users:(("crc-driver-libv",pid=7440,fd=3))                                      
LISTEN    0         5                    [::1]:631                 [::]:*        users:(("cupsd",pid=1088,fd=9))                                                
LISTEN    0         128                      *:9090                   *:*        users:(("systemd",pid=1,fd=54))                                                
[root@localhost ~]# 


snippet from /etc/libvirt/libvirtd.conf

listen_tcp = 1

# Override the port for accepting secure TLS connections
# This can be a port number, or service name
#
#tls_port = "16514"

# Override the port for accepting insecure TCP connections
# This can be a port number, or service name
#
tcp_port = "16509"


[root@localhost ~]# systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-11-15 08:16:46 CET; 24min ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 6059 (libvirtd)
    Tasks: 24 (limit: 32768)
   Memory: 104.2M
      CPU: 996ms
   CGroup: /system.slice/libvirtd.service
           ├─1287 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─1288 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─1327 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/crc.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─1328 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/crc.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─1359 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/docker-machines.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─1360 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/docker-machines.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─6059 /usr/sbin/libvirtd --listen --config /etc/libvirt/libvirtd.conf
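
A check like the one done by eye above can be automated. The sketch below assumes 'ss -plnt' output in the column layout shown in this comment (state, queues, then local address:port) and extracts the listening ports; the function name listening_ports is hypothetical.

```python
def listening_ports(ss_output):
    """Return the set of TCP ports in LISTEN state from 'ss -plnt' output."""
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State Recv-Q Send-Q Local-Address:Port Peer ...
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        # rsplit copes with IPv6 forms such as '[::1]:631' and with '*:9090'.
        port = fields[3].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports
```

Applied to the output in this comment, the returned set would not contain 16509, the tcp_port configured in libvirtd.conf, which is the symptom being reported.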

Comment 16 RHV bug bot 2019-11-19 11:53:29 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 17 RHV bug bot 2019-11-19 12:03:30 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 19 RHV bug bot 2019-12-13 13:17:01 UTC
WARN: Bug status (ON_QA) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 20 RHV bug bot 2019-12-20 17:46:19 UTC
WARN: Bug status (ON_QA) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 21 RHV bug bot 2020-01-08 14:50:35 UTC
WARN: Bug status (ON_QA) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 22 RHV bug bot 2020-01-08 15:19:23 UTC
WARN: Bug status (ON_QA) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 23 RHV bug bot 2020-01-24 19:52:15 UTC
WARN: Bug status (ON_QA) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 25 Petr Kubica 2020-04-24 01:07:18 UTC
Verified in rhv-4.4.0-31

Checked RHEL and RHV-H el8-based hosts for properly configured socket activation without the --listen argument.

Comment 27 errata-xmlrpc 2020-08-04 13:27:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3246

