Bug 1205840

Summary: Capsule/Pulp: many "Resetting dropped connection" messages on sync
Product: Red Hat Satellite
Component: Installation
Reporter: Corey Welton <cwelton>
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED WONTFIX
QA Contact: Katello QA List <katello-qa-list>
Severity: high
Priority: unspecified
Version: Unspecified
CC: akrzos, andrew.schofield, bkearney, cpaquin, cwelton, cyril.cordouibarzi, egolov, ehelms, fcami, ktordeur, mhrivnak, mjahangi, mmccune, parmstro, sauchter, stbenjam, tcarlin
Target Milestone: Unspecified
Keywords: ReleaseNotes, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2018-09-04 18:01:49 UTC
Bug Blocks: 1190823

Description Corey Welton 2015-03-25 18:12:50 UTC
Description of problem:

A very large number of "Resetting dropped connection" messages sometimes appear when trying to sync a capsule.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.  Configure and register a capsule to a Satellite.
2.  Initiate a capsule sync (one way to do this with hammer is sketched below).
3.  Watch /var/log/messages on the capsule.
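
For step 2, one way to trigger the sync from the Satellite side, assuming the hammer CLI is configured (the capsule ID shown is hypothetical):

    # Find the capsule's ID, then trigger a content synchronization.
    hammer capsule list
    hammer capsule content synchronize --id 1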

Actual results:

Mar 25 19:09:32 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:32 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:33 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:33 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:37 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:37 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com
Mar 25 19:09:39 ibm-x3550m3-11 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: rhsm-qe-3.rhq.lab.eng.bos.redhat.com


[root@ibm-x3550m3-11 ~]# grep 'resetting dropped connection' -nriI /var/log/messages | wc -l
2405


Expected results:

Fast, clean sync

Additional info:

Syncs do eventually complete. However, it also appears, though this is not yet confirmed, that the repeated reconnects slow down the overall sync process.

Comment 1 RHEL Program Management 2015-03-25 18:13:13 UTC
Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 3 Corey Welton 2015-03-25 18:17:27 UTC
Found in version Satellite-6.1.0-RHEL-7-20150320.1

Comment 4 Mike McCune 2015-06-02 04:10:01 UTC
It appears that setting MaxRequestsPerChild to 0 in /etc/httpd/conf.d/prefork.conf:

 MaxRequestsPerChild 0

makes this problem go away.

The theory is that we are maxing out the number of connections to the Satellite when we execute a node sync; setting this value to 0 essentially removes the per-child request limit.

Comment 5 Mike McCune 2015-06-02 15:01:12 UTC
To clarify, the MaxRequestsPerChild is set to 0 on the *Satellite*, not the Capsule
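
A minimal sketch of the workaround, assuming a stock prefork configuration on the Satellite:

    # /etc/httpd/conf.d/prefork.conf on the *Satellite* (not the Capsule).
    # 0 means a child process is never recycled after a fixed number of
    # requests, so connections are not dropped for that reason.
    MaxRequestsPerChild 0

    # then apply it:
    systemctl restart httpd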

Comment 6 Alex Krzos 2015-06-02 18:18:32 UTC
It is unclear to me whether there is any real harm here other than possibly higher CPU usage. If KeepAlive is turned off, you will see this message rapidly: every download opens a new connection, which costs more CPU than reusing connections with KeepAlive. Perhaps the build tested in this BZ did not have KeepAlive turned on yet. With KeepAlive On you will see this message less often, depending on the capacity of your Apache server. Your mileage will vary with hardware and with the number of capsules/clients making requests against the Satellite 6.1 server.

Pulp is tuned by default to 5 threads downloading content while synchronizing, so the more capsules/clients you have downloading content, the greater the potential to saturate the maximum number of processes Apache is configured to allow for serving content. (The default config has ServerLimit/MaxClients set to 256; everything beyond that is queued on ListenBacklog.)

There are several Apache Tunables which come into play here:

/etc/httpd/conf/httpd.conf:

KeepAlive
MaxKeepAliveRequests
KeepAliveTimeout
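
For reference, the upstream Apache 2.4 defaults for these three directives are as follows (these are stock Apache values, not Satellite-specific tuning):

    KeepAlive On
    MaxKeepAliveRequests 100
    KeepAliveTimeout 5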

/etc/httpd/conf.d/prefork.conf:

StartServers        8
MinSpareServers     5
MaxSpareServers     20
ServerLimit         256
MaxClients          256
MaxRequestsPerChild 4000 # renamed MaxConnectionsPerChild in Apache 2.4 (both names are accepted)

Is there a specific issue we are seeing with syncing in addition to the log line dumped to /var/log/messages?

Comment 8 Corey Welton 2015-08-13 17:48:29 UTC
>Is there a specific issue we are seeing with syncing in addition to the log line dumped to /var/log/messages?

I don't know, other than that it *appears* to do this for a long time before the sync commences. But I don't know for sure.

Comment 13 Thom Carlin 2016-08-10 11:50:14 UTC
Also seeing this with Satellite 6.2 GA in QCI.

Comment 15 Selim Jahangir 2016-08-23 07:25:27 UTC
Case #01687634 reports the same error message while syncing with a Satellite 6.2 server.

Error:

Aug 23 16:48:31 capsule server pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: satellite  server

Comment 16 Michael Hrivnak 2016-09-22 19:54:09 UTC
I know that message looks bad, but I think it's actually normal and can safely be ignored. It is not an error.

A polite http client will maintain a connection to a web server and make multiple requests over the same connection. Depending on the server's configuration, it may not want to allow a connection to stay open indefinitely. Servers are commonly configured to close a connection in one or more of these circumstances:

- the connection has been idle for some period of time
- the connection has already been used to handle some configurable maximum number of requests
- the process handling the connection has already handled some configurable maximum number of requests
- the server process is being gracefully restarted

When the connection gets closed, that's no problem. The client expects this to happen and just makes a new connection for the next request.

I think all of the sightings of the "dropped connection" message cited here can be explained by this perfectly normal behavior. The message we are seeing is just a Python library noting that a connection was dropped, and that its connection pool is responding accordingly. I definitely see how the wording of the message appears problematic, but it can be ignored.
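
To illustrate, here is a minimal self-contained sketch of where the line comes from: an INFO-level log emitted by the (vendored) urllib3 connection pool when it transparently replaces a connection the server has closed. The URL is a placeholder; any server that closes idle keep-alive connections will do.

    import logging
    import requests

    # Surface urllib3's INFO-level connection-pool messages, including
    # "Resetting dropped connection: <host>".
    logging.basicConfig(level=logging.INFO)

    session = requests.Session()  # pools and reuses one keep-alive connection
    for _ in range(3):
        # If the server closed the idle connection between iterations,
        # urllib3 logs the INFO line, reconnects, and the request still
        # succeeds -- no error is raised.
        session.get("https://example.com/")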

Comment 17 Bryan Kearney 2016-10-04 13:59:10 UTC
Michael, whose client reports these connections? I could move this to be an RFE in that project.

Comment 18 Michael Hrivnak 2016-10-05 15:18:18 UTC
The log statements come from the python "urllib3" library.

https://github.com/shazow/urllib3

Comment 19 Brad Buckingham 2016-10-14 14:31:19 UTC
Since the discussion has indicated a possible change to Satellite httpd configuration, I am going to move this bugzilla over to Installer to investigate.

Comment 20 Stephen Benjamin 2016-10-14 16:10:07 UTC
Created redmine issue http://projects.theforeman.org/issues/16954 from this bug

Comment 22 Bryan Kearney 2018-09-04 18:01:49 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.

Comment 23 Red Hat Bugzilla 2023-09-14 02:57:08 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days