Bug 2184996

Summary: Katello Agent / Goferd Service CLOSE_WAIT Connections on RHEL8 Clients
Product: Red Hat Satellite
Reporter: myoder
Component: katello-agent
Assignee: Ian Ballou <iballou>
Status: CLOSED ERRATA
QA Contact: Cole Higgins <chiggins>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.12.2
CC: ahumbe, harliu, iballou, jlenz, juwatts, lufu, osousa, rlavi, satellite6-bugs, zhunting
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qpid-proton-c-0.37.0-1.el8.x86_64, python3-qpid-proton-0.37.0-1.el8.x86_64
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-10-20 22:30:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description myoder 2023-04-06 12:30:56 UTC
Description of problem:

The katello-agent / goferd service leaves connections in the CLOSE_WAIT state on RHEL8 clients.  Katello-agent installs packages on the RHEL8 client successfully, but the CLOSE_WAIT connections it leaves behind never close.  Restarting the goferd service cleans them up.

This issue does not occur on RHEL7 or RHEL9 clients.

Latest packages installed on RHEL8 system (which are the same versions that the RHEL9 client is using, but the RHEL9 client doesn't experience the same issue):

~~~
gofer-2.12.5-8.el8sat.noarch
python3-gofer-proton-2.12.5-8.el8sat.noarch
python3-gofer-2.12.5-8.el8sat.noarch
katello-agent-3.5.7-3.el8sat.noarch
~~~


Satellite connections to the RHEL8 client (192.168.0.133). After a package is installed with katello-agent, connections are left in FIN_WAIT2:

~~~
[root ~]# netstat -neopa | grep '192.168.0.133' 
tcp        0      0 192.168.0.44:5647       192.168.0.133:45132     ESTABLISHED 987        3631475    68157/qdrouterd      off (0.00/0/0)
tcp        0      0 192.168.0.44:5647       192.168.0.133:49166     FIN_WAIT2   0          0          -                    timewait (46.95/0/0)
tcp        0      0 192.168.0.44:5647       192.168.0.133:49148     FIN_WAIT2   0          0          -                    timewait (24.44/0/0)
~~~


For the RHEL9 client (192.168.0.37), the Satellite-side connection goes to TIME_WAIT instead, and no FIN_WAIT2 appears:

~~~
[root ~]# netstat -neopa | grep '192.168.0.37' 
tcp        0      0 192.168.0.44:5647       192.168.0.37:45770      TIME_WAIT   0          0          -                    timewait (54.11/0/0)
tcp        0      0 192.168.0.44:5647       192.168.0.37:45766      ESTABLISHED 987        6191364    68157/qdrouterd      off (0.00/0/0)
~~~


RHEL8 client connections to Satellite port 5647 left in CLOSE_WAIT:

~~~
tcp       61      0 192.168.0.133:49168     192.168.0.44:5647       CLOSE_WAIT  0          3290121    319765/python3       off (0.00/0/0)
tcp       31      0 192.168.0.133:45056     192.168.0.44:5647       CLOSE_WAIT  0          2440525    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:45040     192.168.0.44:5647       CLOSE_WAIT  0          2437841    319765/python3       off (0.00/0/0)
tcp       31      0 192.168.0.133:45078     192.168.0.44:5647       CLOSE_WAIT  0          2442474    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:49166     192.168.0.44:5647       CLOSE_WAIT  0          3283590    319765/python3       off (0.00/0/0)
tcp       31      0 192.168.0.133:45022     192.168.0.44:5647       CLOSE_WAIT  0          2435007    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:45096     192.168.0.44:5647       CLOSE_WAIT  0          2443436    319765/python3       off (0.00/0/0)
tcp        0      0 192.168.0.133:45132     192.168.0.44:5647       ESTABLISHED 0          2446878    319765/python3       off (0.00/0/0)
tcp       61      0 192.168.0.133:45000     192.168.0.44:5647       CLOSE_WAIT  0          2430922    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:45074     192.168.0.44:5647       CLOSE_WAIT  0          2441191    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:45018     192.168.0.44:5647       CLOSE_WAIT  0          2433281    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:44994     192.168.0.44:5647       CLOSE_WAIT  0          2428425    319765/python3       off (0.00/0/0)
tcp       31      0 192.168.0.133:44996     192.168.0.44:5647       CLOSE_WAIT  0          2429482    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:49186     192.168.0.44:5647       CLOSE_WAIT  0          3290489    319765/python3       off (0.00/0/0)
tcp       61      0 192.168.0.133:45120     192.168.0.44:5647       CLOSE_WAIT  0          2443837    319765/python3       off (0.00/0/0)
tcp       91      0 192.168.0.133:49148     192.168.0.44:5647       CLOSE_WAIT  0          3281779    319765/python3       off (0.00/0/0)
tcp       31      0 192.168.0.133:45100     192.168.0.44:5647       CLOSE_WAIT  0          2443651    319765/python3       off (0.00/0/0)
tcp        1      0 192.168.0.133:45118     192.168.0.44:5647       CLOSE_WAIT  0          2443071    319765/python3       off (0.00/0/0)
~~~
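
Note that client ports 49166 and 49148 show up on both sides: FIN_WAIT2 on the Satellite and CLOSE_WAIT on the client. That pairing is what TCP produces when one peer has closed its end and the other has received the FIN but has not yet closed its own socket. The sketch below is only an illustration of that state using plain Python sockets on localhost. It is not gofer or qpid-proton code, and `half_close_demo` is a hypothetical name; the comparison to the Satellite side is an assumption based on the netstat output above.

```python
import socket


def half_close_demo():
    """Return what the 'client' reads after the 'server' closes its end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()

    # The peer closes first and sends a FIN (assumed here to play the role
    # of the Satellite side, which is the side seen in FIN_WAIT2).
    server_side.close()

    # An empty read means EOF: the FIN has reached the client.
    data = client.recv(1024)

    # From the moment the FIN arrives until client.close() runs, the kernel
    # reports the client socket as CLOSE_WAIT. A process that never reaches
    # this close() keeps the socket in that state for as long as it lives.
    client.close()
    listener.close()
    return data
```

Restarting the owning process releases such sockets because the kernel closes every file descriptor the process held, which is consistent with the goferd restart clearing the connections.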


RHEL9 client connections to Satellite on port 5647 (only 1 ESTABLISHED, no CLOSE_WAIT):

~~~
tcp        0      0 192.168.0.37:45766      192.168.0.44:5647       ESTABLISHED 0          174410     43476/python3        off (0.00/0/0)
~~~


Version-Release number of selected component (if applicable):

Satellite 6.11 and Satellite 6.12

How reproducible:
Always

Steps to Reproduce:
1. Configure Satellite to use katello-agent:

  satellite-installer --foreman-proxy-content-enable-katello-agent=true

2. Install the katello-agent and gofer packages on the RHEL8 client

3. From the Satellite UI, install a package on the RHEL8 client using katello-agent


Actual results:
RHEL8 clients leave connections to port 5647 on Satellite in "CLOSE_WAIT".


Expected results:
RHEL8 clients should only have one connection to Satellite over port 5647, and no connections left in "CLOSE_WAIT".
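
The expected result can be checked mechanically on a client. Below is a minimal sketch in Python, assuming netstat output in the column layout shown in this report (protocol, Recv-Q, Send-Q, local address, foreign address, state). `count_states` and `meets_expected_result` are hypothetical helper names, not part of katello-agent or gofer, and the sample text is two lines copied from the RHEL8 client output above.

```python
from collections import Counter


def count_states(netstat_output, remote_port=5647):
    """Count TCP states for connections whose foreign address uses remote_port."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()          # handles both tabs and spaces
        # Expected columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[4].rsplit(":", 1)[-1] == str(remote_port):
            states[fields[5]] += 1
    return states


def meets_expected_result(states):
    """True when there is exactly one ESTABLISHED connection and no CLOSE_WAIT."""
    return states["ESTABLISHED"] == 1 and states["CLOSE_WAIT"] == 0


# Two lines taken from the RHEL8 client output in this report.
SAMPLE = (
    "tcp 61 0 192.168.0.133:49168 192.168.0.44:5647 CLOSE_WAIT 0 3290121 319765/python3 off (0.00/0/0)\n"
    "tcp 0 0 192.168.0.133:45132 192.168.0.44:5647 ESTABLISHED 0 2446878 319765/python3 off (0.00/0/0)\n"
)
sample_states = count_states(SAMPLE)
```

On a client, the input would be the captured output of `netstat -neopa`. For the sample, `meets_expected_result(sample_states)` is false because one CLOSE_WAIT connection is present.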

Additional info:

Comment 3 Tim Pass 2023-04-24 15:49:27 UTC
I received a customer escalation on this case:

I am having an issue with the Katello agent and goferd.  Jon has done an excellent job with this case.  I am very happy with Jon.  My issue is with engineering.  I am having an issue with the Katello agent and goferd.  My ticket has been open a month.  Engineering just responded the Katello agent is deprecated and I need to move to remote execution.  One month to come back with nothing.  I am running Satellite 6.11 which is still under support.  The specific wording from the Satellite 6.11 upgrade documentation is "The Katello agent is deprecated and will be removed in a future Satellite version."  Do you not have to support it until you do?

FYI remote execution is hot garbage.  By default remote execution uses root to connect from Satellite to clients.  It is rather shocking that Red Hat is pushing a tool that requires you to enable root login for SSH so that they can connect as root.  That goes against everything security and something you should not be promoting.  I will NOT be using remote execution.  Jon mentioned 6.12 has an option to use MQTT instead of goferd.  That I am interested in, but I am not ready to move to 6.12.

Comment 5 jnikolak 2023-05-10 10:58:46 UTC
Are there any updates on this?

Comment 25 errata-xmlrpc 2023-10-20 22:30:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Satellite Client security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:5982