Bug 1980166

Summary: Too many libvirt connections from Satellite due to ssh connection leaks
Product: Red Hat Satellite Reporter: matt jia <mjia>
Component: Compute Resources - libvirt Assignee: Lukas Zapletal <lzap>
Status: CLOSED ERRATA QA Contact: Lukáš Hellebrandt <lhellebr>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.9.0 CC: ahumbe, arahaman, fdelorey, fhirtz, imomin, ktordeur, lzap, oezr, pcreech, tbrisker, yyadav
Target Milestone: 6.11.0 Keywords: Regression, Triaged
Target Release: Unused   
Hardware: All   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-07-05 14:29:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description matt jia 2021-07-08 01:20:53 UTC
Our customers are still impacted by this connection error between Satellite and libvirt:

Call to virConnectOpen failed: End of file while reading data: Ncat: Connection reset by peer.: Input/output error


Thus, opening a BZ for this known upstream issue:

https://projects.theforeman.org/issues/14854

Comment 2 Lukas Zapletal 2021-07-08 12:14:17 UTC
Please increase the following values

#max_anonymous_clients = 20
#max_workers = 20

in /etc/libvirt/libvirtd.conf and restart the libvirtd daemon. The libvirt daemon in RHEL is not configured for heavy concurrent client use; we generally recommend oVirt or Red Hat Enterprise Virtualization for enterprise workloads.
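
A minimal sketch of applying this workaround from a shell, assuming GNU sed is available. The function name `raise_libvirt_limits` and the value 100 are illustrative assumptions only; this report does not state which values to use. The function only rewrites lines that already exist in the file (commented out or not) and does not add missing keys.

```shell
#!/bin/sh
# Sketch: uncomment and raise max_anonymous_clients and max_workers in a
# libvirtd.conf file. The default of 100 is a hypothetical example value.
raise_libvirt_limits() {
    conf="$1"          # path to libvirtd.conf (or a copy of it)
    value="${2:-100}"  # new limit; illustrative default, not a recommendation
    for key in max_anonymous_clients max_workers; do
        # Match the key whether or not it is commented out, then rewrite the line.
        sed -i -E "s/^#?[[:space:]]*${key}[[:space:]]*=.*/${key} = ${value}/" "$conf"
    done
}

# Example (run manually): edit a copy, review it, install it, restart libvirtd.
# cp /etc/libvirt/libvirtd.conf /tmp/libvirtd.conf
# raise_libvirt_limits /tmp/libvirtd.conf 100
# cp /tmp/libvirtd.conf /etc/libvirt/libvirtd.conf && systemctl restart libvirtd
```

Working on a copy first keeps the original file intact until the result has been reviewed.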

Comment 3 Lukas Zapletal 2021-07-08 14:06:49 UTC
An upstream patch to properly close tcp+ssh connections is pending review: https://projects.theforeman.org/issues/14854

Use the workaround from comment 2 as a temporary solution.

Comment 12 Bryan Kearney 2021-08-12 12:05:21 UTC
Upstream bug assigned to lzap

Comment 13 Bryan Kearney 2021-08-12 12:05:23 UTC
Upstream bug assigned to lzap

Comment 15 Lukas Zapletal 2021-09-10 11:56:29 UTC
The patch is still pending review; from my side, it is ready to be merged. I poked my colleague once again to take a look.

Subscribe here for immediate updates: https://github.com/theforeman/foreman/pull/8652

Comment 19 Lukas Zapletal 2022-02-07 10:03:39 UTC
For the record, this bug was fixed and approved for the 7.0 testing phase.

Comment 20 Lukáš Hellebrandt 2022-02-07 13:33:08 UTC
Reproducer (tried on my 6.10.2):

Have a Satellite and Libvirt configured on the same machine such that Libvirt can be added to the Satellite as a CR ( http://thomasmullaly.com/2014/08/15/connect-foreman-to-kvm-host-machine/ ).

In `/etc/libvirt/libvirtd.conf`, set `max_clients = 20`.
In `/etc/ssh/sshd_config`, set `ClientAliveInterval 1000`.
In `/etc/foreman/settings.yaml`, set
```
:logging:
  :level: debug
```
# systemctl restart libvirtd
# systemctl restart sshd
# foreman-maintain service restart

Run `foreman-tail`.

In WebUI:
1) Add the Libvirt to the CR using qemu+ssh scheme
2) Go to Infrastructure -> Compute Resources -> <libvirt> -> Virtual Machines
3) Repeat step 2)

=> The first time, the VMs load. The second time, the VMs don't load and, after some time, the following error message appears: `There was an error listing VMs: 502 Proxy Error`. The tailed log does NOT contain the following messages (their absence is expected on an affected version):
```
Created libvirt connection <id>: qemu+ssh://root@<fqdn>/system
Terminating libvirt connection <id>: libvirt_connection_qemu+ssh://root@<fqdn>/system
```

-------------------------------------
Verified with Sat 7.0 snap 7.0 on RHEL7.

Using the above reproducer, the VM list always loads no matter how many times I try, and the described messages are shown in the log.
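
The log check from the reproducer can be sketched as a small shell helper that counts the two messages in a log file. The function name `libvirt_connection_log_counts` is hypothetical, and the log path in the usage comment is an assumption; point it at whichever file `foreman-tail` is following. Per the reproducer, an affected version shows neither message, while a fixed version should show both after each VM listing.

```shell
#!/bin/sh
# Sketch: count "Created" and "Terminating" libvirt connection messages in a log.
libvirt_connection_log_counts() {
    log="$1"  # log file to inspect; must exist and be readable
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    created=$(grep -c 'Created libvirt connection' "$log" || true)
    terminated=$(grep -c 'Terminating libvirt connection' "$log" || true)
    echo "created=${created:-0} terminated=${terminated:-0}"
}

# Example (run manually; the path is an assumption):
# libvirt_connection_log_counts /var/log/foreman/production.log
```

Both messages are logged at debug level in the reproducer's setup, so the counts are only meaningful with `:level: debug` set as described above.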

Comment 21 Brad Buckingham 2022-06-23 10:23:32 UTC
*** Bug 2090963 has been marked as a duplicate of this bug. ***

Comment 24 errata-xmlrpc 2022-07-05 14:29:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498