Bug 1271781

Summary: [RFE] issue retry logic from subscription-manager when encountering a server response (104, 'Connection reset by peer')
Product: Red Hat Enterprise Linux 7 Reporter: John Sefler <jsefler>
Component: subscription-manager Assignee: candlepin-bugs
Status: CLOSED WORKSFORME QA Contact: John Sefler <jsefler>
Severity: high Docs Contact:
Priority: low    
Version: 7.2 CC: akapse, csnyder, khowell, rcyriac, redakkan, rjerrido, sgraf, wpoteat
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 1302364 Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 1273444 Environment:
Last Closed: 2020-04-06 16:03:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1273444    

Description John Sefler 2015-10-14 17:33:24 UTC
Description of problem:
When subscription-manager accesses the entitlement server subscription.rhn.stage.redhat.com:443/subscription (and likely production too), it frequently encounters a (104, 'Connection reset by peer') error. Bugs have been opened against IT-Candlepin (see bug 1232141 and bug 1231308). Please use this bug to incorporate a workaround within subscription-manager that catches these exceptions and retries the request where possible.

Version-Release number of selected component (if applicable):


How reproducible:
Not reproducible on demand, but occurs frequently (especially during automated testing).

Steps to Reproduce:
In the comments that follow, I will paste a few automated log traces showing what was happening when a 'Connection reset by peer' was encountered.


Actual results:
Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

Expected results:
  I'd suggest attempting a few (3?) retries against the candlepin REST API with a small time delay (1 sec?) before finally failing with the same 104 'Connection reset by peer' exception.
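
A minimal sketch of the suggested behaviour, assuming the request can be wrapped in a plain callable. The names `call_with_retries` and `is_connection_reset` are hypothetical and are not part of subscription-manager or python-rhsm; the check relies only on the exception carrying `(104, 'Connection reset by peer')` as its arguments, as seen in the Stderr line above.

```python
import time

# Error code reported in this bug (ECONNRESET on Linux).
CONNECTION_RESET = 104


def is_connection_reset(exc):
    """Return True if exc carries (104, 'Connection reset by peer') as its args."""
    return bool(exc.args) and exc.args[0] == CONNECTION_RESET


def call_with_retries(request, retries=3, delay=1.0, sleep=time.sleep):
    """Call request(); on a 104 reset, retry up to `retries` more times.

    Any other exception, or a 104 raised by the last retry, propagates
    unchanged, so the caller still sees the original error.
    """
    attempt = 0
    while True:
        try:
            return request()
        except Exception as exc:
            if not is_connection_reset(exc) or attempt >= retries:
                raise
            attempt += 1
            sleep(delay)  # small fixed pause before the next attempt
```

With the defaults this makes at most four calls (one initial attempt plus three retries), pausing one second between them. Blindly retrying a non-idempotent request may not be safe, which is one reading of "where possible" in the description.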


Additional info:

Comment 2 John Sefler 2015-10-14 17:41:23 UTC
2015-10-13 11:39:33.987  FINE: ssh root.eng.bos.redhat.com subscription-manager register --username=stage_auto_testuser1 --password=REDACTED --autosubscribe --force
2015-10-13 11:39:35.849  FINE: Stdout: Registering to: subscription.rhn.stage.redhat.com:443/subscription

2015-10-13 11:39:35.850  FINE: Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

2015-10-13 11:39:35.850  FINE: ExitCode: 70
2015-10-13 11:39:35.850  FINE: ssh root.eng.bos.redhat.com LINE_NUMBER=$(grep --line-number 'Making request:' /var/log/rhsm/rhsm.log | tail --lines=1 | cut --delimiter=':' --field=1); if [ -n "$LINE_NUMBER" ]; then tail -n +$LINE_NUMBER /var/log/rhsm/rhsm.log; fi;
2015-10-13 11:39:36.006  WARNING: Last request from /var/log/rhsm/rhsm.log:
2015-10-13 11:39:36,591 [DEBUG] subscription-manager:20325 @connection.py:523 - Making request: GET /subscription/users/stage_auto_testuser1/owners
2015-10-13 11:39:37,028 [ERROR] subscription-manager:20325 @managercli.py:160 - Error during registration: (104, 'Connection reset by peer')
2015-10-13 11:39:37,028 [ERROR] subscription-manager:20325 @managercli.py:161 - (104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 1061, in _do_command
    owner_key = self._determine_owner_key(admin_cp)
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 1207, in _determine_owner_key
    owners = cp.getOwnerList(self.username)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 1063, in getOwnerList
    return self.conn.request_get(method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 644, in request_get
    return self._request("GET", method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 544, in _request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 228, in read
    return self._read_bio(size)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 213, in _read_bio
    return m2.ssl_read(self.ssl, size, self._timeout)
SSLError: (104, 'Connection reset by peer')
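
For readers who want to reproduce the log extraction used by the automation, here is a rough Python equivalent of the shell one-liner in the trace above (grep for the last 'Making request:' line, then print everything from that line onward). The function name `tail_from_last_request` is hypothetical; the log path in the usage comment is the one shown in the trace.

```python
def tail_from_last_request(lines, marker="Making request:"):
    """Return the lines from the last one containing `marker` to the end.

    Mirrors the shell pipeline: if no line contains the marker,
    nothing is returned.
    """
    last = None
    for index, line in enumerate(lines):
        if marker in line:
            last = index
    return [] if last is None else lines[last:]


# Usage (path taken from the trace above):
# with open("/var/log/rhsm/rhsm.log") as log:
#     print("".join(tail_from_last_request(log.readlines())))
```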

Comment 3 John Sefler 2015-10-14 17:43:33 UTC
2015-10-13 12:22:10.282  FINE: ssh root.eng.bos.redhat.com subscription-manager list --available
2015-10-13 12:22:14.486  FINE: Stdout:
2015-10-13 12:22:14.486  FINE: Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

2015-10-13 12:22:14.486  FINE: ExitCode: 70
2015-10-13 12:22:14.487  FINE: ssh root.eng.bos.redhat.com LINE_NUMBER=$(grep --line-number 'Making request:' /var/log/rhsm/rhsm.log | tail --lines=1 | cut --delimiter=':' --field=1); if [ -n "$LINE_NUMBER" ]; then tail -n +$LINE_NUMBER /var/log/rhsm/rhsm.log; fi;
2015-10-13 12:22:14.646  WARNING: Last request from /var/log/rhsm/rhsm.log:
2015-10-13 12:22:14,930 [DEBUG] subscription-manager:13723 @connection.py:523 - Making request: GET /subscription/owners/7628215/pools?consumer=2e4363ac-d162-4030-a897-ee20a806c0fd
2015-10-13 12:22:15,679 [ERROR] subscription-manager:13723 @managercli.py:160 - exception caught in subscription-manager
2015-10-13 12:22:15,679 [ERROR] subscription-manager:13723 @managercli.py:161 - (104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/usr/sbin/subscription-manager", line 86, in <module>
    sys.exit(abs(main() or 0))
  File "/usr/sbin/subscription-manager", line 77, in main
    return managercli.ManagerCLI().main()
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 2629, in main
    return CLI.main(self)
  File "/usr/share/rhsm/subscription_manager/cli.py", line 159, in main
    return cmd.main()
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 489, in main
    return_code = self._do_command()
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 2223, in _do_command
    filter_string=self.options.filter_string)
  File "/usr/share/rhsm/subscription_manager/managerlib.py", line 312, in get_available_entitlements
    overlapping, uninstalled, text, filter_string)
  File "/usr/share/rhsm/subscription_manager/managerlib.py", line 517, in get_filtered_pools_list
    self.identity.uuid, self.facts, active_on=active_on, filter_string=filter_string):
  File "/usr/share/rhsm/subscription_manager/managerlib.py", line 276, in list_pools
    active_on=active_on, owner=ownerid, filter_string=filter_string)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 1193, in getPoolsList
    results = self.conn.request_get(method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 644, in request_get
    return self._request("GET", method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 544, in _request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 228, in read
    return self._read_bio(size)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 213, in _read_bio
    return m2.ssl_read(self.ssl, size, self._timeout)
SSLError: (104, 'Connection reset by peer')

Comment 4 John Sefler 2015-10-14 17:44:47 UTC
2015-10-13 13:02:41.483  FINE: ssh root.eng.bos.redhat.com subscription-manager subscribe --pool=8a99f9874df893c8014e1cada9f3773e --quantity=3
2015-10-13 13:02:47.581  FINE: Stdout: Subscription 'Red Hat Enterprise Linux Server, Standard (Physical or Virtual Nodes)' must be attached using a quantity evenly divisible by 2

2015-10-13 13:02:47.581  FINE: Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

2015-10-13 13:02:47.581  FINE: ExitCode: 70
2015-10-13 13:02:47.581  FINE: ssh root.eng.bos.redhat.com LINE_NUMBER=$(grep --line-number 'Making request:' /var/log/rhsm/rhsm.log | tail --lines=1 | cut --delimiter=':' --field=1); if [ -n "$LINE_NUMBER" ]; then tail -n +$LINE_NUMBER /var/log/rhsm/rhsm.log; fi;
2015-10-13 13:02:47.741  WARNING: Last request from /var/log/rhsm/rhsm.log:
2015-10-13 13:02:48,423 [DEBUG] subscription-manager:29580 @connection.py:523 - Making request: GET /subscription/consumers/9fb8d5f5-a3fa-45a8-886f-71f4716344bd/certificates/serials
2015-10-13 13:02:48,775 [ERROR] subscription-manager:29580 @managercli.py:160 - Unable to attach: (104, 'Connection reset by peer')
2015-10-13 13:02:48,775 [ERROR] subscription-manager:29580 @managercli.py:161 - (104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 1541, in _do_command
    report = self.entcertlib.update()
  File "/usr/share/rhsm/subscription_manager/certlib.py", line 31, in update
    self.report = self.locker.run(self._do_update)
  File "/usr/share/rhsm/subscription_manager/certlib.py", line 17, in run
    return action()
  File "/usr/share/rhsm/subscription_manager/entcertlib.py", line 43, in _do_update
    return action.perform()
  File "/usr/share/rhsm/subscription_manager/entcertlib.py", line 119, in perform
    expected = self._get_expected_serials()
  File "/usr/share/rhsm/subscription_manager/entcertlib.py", line 254, in _get_expected_serials
    exp = self.get_certificate_serials_list()
  File "/usr/share/rhsm/subscription_manager/entcertlib.py", line 234, in get_certificate_serials_list
    reply = self.uep.getCertificateSerials(identity.uuid)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 1097, in getCertificateSerials
    return self.conn.request_get(method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 644, in request_get
    return self._request("GET", method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 544, in _request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 228, in read
    return self._read_bio(size)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 213, in _read_bio
    return m2.ssl_read(self.ssl, size, self._timeout)
SSLError: (104, 'Connection reset by peer')

Comment 5 John Sefler 2015-10-14 17:45:38 UTC
2015-10-13 13:10:55.935  FINE: ssh root.eng.bos.redhat.com subscription-manager service-level --show
2015-10-13 13:10:57.733  FINE: Stdout:
2015-10-13 13:10:57.733  FINE: Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

2015-10-13 13:10:57.734  FINE: ExitCode: 70
2015-10-13 13:10:57.735  FINE: ssh root.eng.bos.redhat.com LINE_NUMBER=$(grep --line-number 'Making request:' /var/log/rhsm/rhsm.log | tail --lines=1 | cut --delimiter=':' --field=1); if [ -n "$LINE_NUMBER" ]; then tail -n +$LINE_NUMBER /var/log/rhsm/rhsm.log; fi;
2015-10-13 13:10:57.892  WARNING: Last request from /var/log/rhsm/rhsm.log:
2015-10-13 13:10:58,561 [DEBUG] subscription-manager:3138 @connection.py:523 - Making request: GET /subscription/consumers/ed0c1966-6543-4070-8e49-649be5434264
2015-10-13 13:10:58,929 [ERROR] subscription-manager:3138 @managercli.py:160 - Error: Unable to retrieve service levels.
2015-10-13 13:10:58,930 [ERROR] subscription-manager:3138 @managercli.py:161 - (104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 880, in _do_command
    self.show_service_level()
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 904, in show_service_level
    consumer = self.cp.getConsumer(self.identity.uuid)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 1001, in getConsumer
    return self.conn.request_get(method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 644, in request_get
    return self._request("GET", method)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 544, in _request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 228, in read
    return self._read_bio(size)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 213, in _read_bio
    return m2.ssl_read(self.ssl, size, self._timeout)
SSLError: (104, 'Connection reset by peer')

Comment 6 John Sefler 2015-10-14 17:49:21 UTC
2015-10-13 13:11:28.854  FINE: ssh root.eng.bos.redhat.com subscription-manager register --username=stage_auto_testuser1 --password=redhat --name="SubscriptionServiceLevelConsumer" --force
2015-10-13 13:11:32.600  FINE: Stdout: Registering to: subscription.rhn.stage.redhat.com:443/subscription

2015-10-13 13:11:32.601  FINE: Stderr: Unable to verify server's identity: (104, 'Connection reset by peer')

2015-10-13 13:11:32.601  FINE: ExitCode: 70
2015-10-13 13:11:32.602  FINE: ssh root.eng.bos.redhat.com LINE_NUMBER=$(grep --line-number 'Making request:' /var/log/rhsm/rhsm.log | tail --lines=1 | cut --delimiter=':' --field=1); if [ -n "$LINE_NUMBER" ]; then tail -n +$LINE_NUMBER /var/log/rhsm/rhsm.log; fi;
2015-10-13 13:11:32.772  WARNING: Last request from /var/log/rhsm/rhsm.log:
2015-10-13 13:11:33,382 [DEBUG] subscription-manager:3702 @connection.py:523 - Making request: POST /subscription/consumers?owner=7628215
2015-10-13 13:11:33,796 [ERROR] subscription-manager:3702 @managercli.py:160 - Error during registration: (104, 'Connection reset by peer')
2015-10-13 13:11:33,796 [ERROR] subscription-manager:3702 @managercli.py:161 - (104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/usr/share/rhsm/subscription_manager/managercli.py", line 1071, in _do_command
    content_tags=self.installed_mgr.tags)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 876, in registerConsumer
    return self.conn.request_post(url, params)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 647, in request_post
    return self._request("POST", method, params)
  File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 544, in _request
    response = conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 228, in read
    return self._read_bio(size)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 213, in _read_bio
    return m2.ssl_read(self.ssl, size, self._timeout)
SSLError: (104, 'Connection reset by peer')

Comment 8 Kevin Howell 2017-08-07 14:45:34 UTC
Note to dev: a potential solution should target only 104 (Connection reset by peer). Also consider exponential backoff, etc.
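
One way the note above could be read, as a sketch only: retry solely when the exception's first argument is 104, and double the pause after each failed attempt. The names `backoff_delays` and `request_with_backoff`, and the default base, factor, and cap values, are assumptions chosen for illustration and are not taken from subscription-manager.

```python
import time

# Only this error code is retried; everything else propagates immediately.
CONNECTION_RESET = 104


def backoff_delays(retries=3, base=1.0, factor=2.0, cap=30.0):
    """Return the pause before each retry: base, base*factor, ... capped at cap."""
    return [min(cap, base * factor ** i) for i in range(retries)]


def request_with_backoff(request, retries=3, base=1.0, sleep=time.sleep):
    """Call request(), retrying only on 104 with exponentially growing pauses."""
    for delay in backoff_delays(retries, base):
        try:
            return request()
        except Exception as exc:
            if not (exc.args and exc.args[0] == CONNECTION_RESET):
                raise  # not a connection reset: do not retry
            sleep(delay)
    # Final attempt: any error, including another 104, reaches the caller.
    return request()
```

With the defaults the pauses are 1, 2, and 4 seconds. Adding random jitter to each pause is a common refinement when many clients retry at once; it is omitted here to keep the sketch short.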

Comment 9 Kevin Howell 2017-08-21 14:20:22 UTC
*** Bug 1273444 has been marked as a duplicate of this bug. ***

Comment 13 Rehana 2020-04-06 16:03:12 UTC
Based on the test run history for the past 3 months, the QE team has not come across this issue. Based on discussion with the team, closing this bug.