Bug 2050234 - pulp_streamer runs out of file descriptors when upstream server is unavailable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.9.7
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: 6.13.0
Assignee: satellite6-bugs
QA Contact: Vladimír Sedmík
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-02-03 14:09 UTC by Julio Entrena Perez
Modified: 2023-05-03 13:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-03 13:21:03 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   SAT-14690       0        None      None    None     2023-01-05 07:13:39 UTC
Red Hat Product Errata  RHSA-2023:2097  0        None      None    None     2023-05-03 13:21:16 UTC

Description Julio Entrena Perez 2022-02-03 14:09:56 UTC
Description of problem:
pulp_streamer runs out of file descriptors when a package requested by clients is missing and the upstream server is unavailable

Version-Release number of selected component (if applicable):
python-pulp-streamer-2.21.5.3-1.el7sat.noarch
satellite-6.9.7-1.el7sat.noarch

How reproducible:
Unknown, suspected always

Steps to Reproduce:
1. Configure a custom repository pointing to another server with on-demand policy
2. Synchronise the repo and make it available to many hosts
3. Shut down the upstream server and have many hosts try to download packages from the custom repository

Actual results:
pulp_streamer runs out of file descriptors and starts erroring with "Too many open files":

Jan 31 02:30:37 ecpvm003101 pulp_streamer: urllib3.connectionpool:WARNING: Retrying (Retry(total=1, connect=5, read=1, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', error(24, 'Too many open files'))': /rhel8/<repo>/<package>.rpm
Jan 31 02:30:37 ecpvm003101 pulp_streamer: urllib3.connectionpool:INFO: Starting new HTTP connection (5): reposerver.example.com
Jan 31 02:30:38 ecpvm003101 pulp_streamer: urllib3.connectionpool:WARNING: Retrying (Retry(total=2, connect=5, read=2, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', error(24, 'Too many open files'))': /rhel8/<repo>/<package>.rpm
Jan 31 02:30:38 ecpvm003101 pulp_streamer: urllib3.connectionpool:INFO: Starting new HTTP connection (4): reposerver.example.com
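The error(24, ...) in the log lines above is errno 24, which is EMFILE on Linux: the process has reached its per-process open file limit. As a minimal sketch (not part of the original report), the soft limit of a running process can be read from /proc/<pid>/limits. The function name and the sample text below are hypothetical and for illustration only; the report does not state the limit configured for pulp_streamer.

```python
import os


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is absent or the value is not a number.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: Max, open, files, <soft>, <hard>, <units>
            fields = line.split()
            soft = fields[3]
            return int(soft) if soft.isdigit() else None
    return None


# Illustrative sample only; NOT taken from the affected system.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""

if __name__ == "__main__" and os.path.exists("/proc/self/limits"):
    # On Linux, inspect the current process; replace "self" with a PID
    # (for example the pulp_streamer PID) to inspect another process.
    with open("/proc/self/limits") as handle:
        print("soft open-file limit:", parse_soft_nofile(handle.read()))
```

Reading the limits of a process owned by another user typically requires root.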

lsof shows that most file descriptors are consumed by connections in CLOSE_WAIT:

COMMAND     PID     USER   FD      TYPE             DEVICE    SIZE/OFF       NODE NAME
[...]
pulp_stre  5273       48 1019u     IPv4          457251852         0t0        TCP 127.0.0.1:8751->127.0.0.1:41998 (CLOSE_WAIT)
pulp_stre  5273       48 1020u     IPv4          456965314         0t0        TCP 127.0.0.1:8751->127.0.0.1:40808 (CLOSE_WAIT)
pulp_stre  5273       48 1021u     IPv4          456965280         0t0        TCP 127.0.0.1:8751->127.0.0.1:40800 (CLOSE_WAIT)
pulp_stre  5273       48 1022u     IPv4          456679856         0t0        TCP 127.0.0.1:8751->127.0.0.1:39244 (CLOSE_WAIT)
pulp_stre  5273       48 1023u     IPv4          457094161         0t0        TCP 127.0.0.1:8751->127.0.0.1:41836 (CLOSE_WAIT)
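To quantify how many descriptors are held by sockets in each TCP state, the lsof output can be tallied. The following is a minimal sketch, assuming the default lsof column layout shown above, where TCP sockets end with the state in parentheses. The function name is hypothetical; the sample lines are the five shown in this report.

```python
import re
from collections import Counter

# Matches the trailing "(STATE)" that lsof prints for TCP sockets.
_STATE_RE = re.compile(r"\((?P<state>[A-Z_0-9]+)\)\s*$")


def count_tcp_states(lsof_text):
    """Count TCP sockets per connection state in lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        if " TCP " not in line:
            continue  # skip the header, elisions, and non-TCP descriptors
        match = _STATE_RE.search(line)
        if match:
            counts[match.group("state")] += 1
    return counts


SAMPLE = """\
COMMAND     PID     USER   FD      TYPE             DEVICE    SIZE/OFF       NODE NAME
[...]
pulp_stre  5273       48 1019u     IPv4          457251852         0t0        TCP 127.0.0.1:8751->127.0.0.1:41998 (CLOSE_WAIT)
pulp_stre  5273       48 1020u     IPv4          456965314         0t0        TCP 127.0.0.1:8751->127.0.0.1:40808 (CLOSE_WAIT)
pulp_stre  5273       48 1021u     IPv4          456965280         0t0        TCP 127.0.0.1:8751->127.0.0.1:40800 (CLOSE_WAIT)
pulp_stre  5273       48 1022u     IPv4          456679856         0t0        TCP 127.0.0.1:8751->127.0.0.1:39244 (CLOSE_WAIT)
pulp_stre  5273       48 1023u     IPv4          457094161         0t0        TCP 127.0.0.1:8751->127.0.0.1:41836 (CLOSE_WAIT)
"""

if __name__ == "__main__":
    for state, total in count_tcp_states(SAMPLE).most_common():
        print(state, total)
```

A socket in CLOSE_WAIT means the peer has closed its side and the local process has not yet closed its own descriptor, so a growing CLOSE_WAIT count points at descriptors that the local process is holding open.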

foreman-ssl_access_ssl.log shows that HTTP 502 is being returned (see bug 1432985):

1.2.3.4 - - [31/Jan/2022:02:29:39 +0000] "GET /streamer/var/lib/pulp/content/units/rpm/5a/a3140f62c74a19c00df434d3b278d57d9cfcf6ad742e60a05fff635e5bbeee/<package>.rpm?policy=eyJleHRlbnNpb25zIjogeyJyZW1vdGVfaXAiOiAiMTAuMTM1LjY2LjEzMyJ9LCAicmVzb3VyY2UiOiAiL3N0cmVhbWVyL3Zhci9saWIvcHVscC9jb250ZW50L3VuaXRzL3JwbS81YS9hMzE0MGY2MmM3NGExOWMwMGRmNDM0ZDNiMjc4ZDU3ZDljZmNmNmFkNzQyZTYwYTA1ZmZmNjM1ZTViYmVlZS9xdWFseXMtY2xvdWQtYWdlbnQueDg2XzY0LnJwbSIsICJleHBpcmF0aW9uIjogMTY0MzU5NjIxM30%3D;signature=oA6NDIG08afL1EfIILL00HEovuTDv4sKqesWGKmSkic5DBPn4HOWlQxPG4qpzsPlPrJKq4Nbn54_mkWTUyKdWaMdxOtnszXeLxnnxXxXYsGemrIfwGUkBFm8SKL_LjvBm-ji5Ln_o66u0I5XYSgkkuJ_857-J89Ol_Ij6T4bpxrvXo6HEFOdkF7UByMcxEhW0ehiItxG0m1uRkXDk2XnVV7zdUXJeWnPuB0edJh2Lw8qGnHukP2qMibfAAAPEhRDVX0SXB3Dw-SiidY73jT2GABOELTS7lyzpQBqno2ixezxrT-FLv4x02i6x7w6pX-K-PjCT8mgq6EiHasUDsDQpg%3D%3D HTTP/1.1" 502 649 "-" "pulp/2.21.5.3"

Expected results:
pulp_streamer does not run out of file descriptors when the upstream server is unavailable and a large number of hosts request a package that needs to be fetched from the unavailable server

Additional info:
While current versions default to Immediate for custom repositories, many customers that have upgraded from earlier versions retain the On-demand default.

Comment 3 Daniel Alley 2022-08-29 22:51:55 UTC
My gut feeling is that this only impacts Pulp 2 and therefore wouldn't impact 6.10+, but I am not sure that Satellite 6.10+ has been tested in this scenario.

Comment 4 Julio Entrena Perez 2022-08-30 08:51:55 UTC
It shouldn't be a difficult test: set up a new custom repo with a URL to sync from, sync the custom repo using on demand, disrupt connectivity to the URL of the custom repo, have a host try to download content from that custom repo, and verify that Pulp does not run out of FDs and take the entire Satellite down as a result.
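For the verification step, one way to check that descriptors are not leaking is to sample the open descriptor count of the process under test while hosts keep requesting content. This is a minimal sketch under stated assumptions, not a QA procedure from this bug: it is Linux-only, relies on /proc/<pid>/fd, and the function names are hypothetical. Which Pulp process to watch on 6.10+ is not stated in this report.

```python
import os
import time


def count_open_fds(pid="self"):
    """Count entries in /proc/<pid>/fd (Linux only).

    Inspecting a process owned by another user typically requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def watch_fds(sample, samples=5, interval=1.0, sleep=time.sleep):
    """Take `samples` readings from `sample()` spaced `interval` seconds apart.

    Returns (readings, growing), where `growing` is True only if every
    reading is strictly larger than the previous one.
    """
    readings = []
    for index in range(samples):
        readings.append(sample())
        if index < samples - 1:
            sleep(interval)
    growing = all(later > earlier
                  for earlier, later in zip(readings, readings[1:]))
    return readings, growing


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    # Replace "self" with the PID of the process under test.
    readings, growing = watch_fds(lambda: count_open_fds("self"), samples=3)
    print(readings, "steadily growing" if growing else "not steadily growing")
```

A count that keeps climbing while the upstream URL is unreachable would suggest the leak is still present; a count that plateaus or drops back would suggest connections are being released.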

Comment 6 Daniel Alley 2022-10-19 15:57:26 UTC
Moving to ON_QA, see comments 3 and 4

Comment 11 errata-xmlrpc 2023-05-03 13:21:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.13 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2097

