Description of problem: When I'm running concurrent registrations of 32 hosts, it causes that 1 of them fails with: "Unable to verify server's identity: ('The read operation timed out',)". IMO this is a regression, as I remember that earlier, with 6.2, we were able to register 50 clients concurrently without any tunings done on Satellite or client side (I can be wrong, but have strong feeling about that). Version-Release number of selected component (if applicable): satellite-6.2.11-2.0.el7sat.noarch How reproducible: 2 of 4 attempts (rate is bigger when registering more hosts concurrently) Steps to Reproduce: 1. Register 32 clients to the satellite in parallel with: # subscription-manager register --org Default_Organization \ --environment Library --username <user> --password <pass> --force Actual results: 1 registration fails Expected results: all registrations should pass
Example of one of the failed clients: b10-h01-r620container129.example.com - 172.18.0.129 From /var/log/rhsm/rhsm.log: 2017-09-09 08:32:59,368 [INFO] rhsmd:346:MainThread @rhsmd:263 - rhsmd started 2017-09-09 08:33:00,149 [INFO] subscription-manager:340:MainThread @managercli.py:518 - X-Correlation-ID: b7a518b3c9eb4dc7aab21b6a8190fc70 2017-09-09 08:33:00,149 [INFO] subscription-manager:340:MainThread @managercli.py:407 - Client Versions: {'python-rhsm': '1.19.9-1.el7', 'subscription-manager': '1.19.21-1.el7'} 2017-09-09 08:33:00,215 [INFO] subscription-manager:340:MainThread @managerlib.py:889 - Cleaned local data 2017-09-09 08:33:00,238 [ERROR] rhsmd:346:MainThread @cache.py:128 - Unable to read cache: /var/lib/rhsm/cache/rhsm_icon.json 2017-09-09 08:33:00,239 [ERROR] rhsmd:346:MainThread @cache.py:129 - [Errno 2] No such file or directory: '/var/lib/rhsm/cache/rhsm_icon.json' Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/subscription_manager/cache.py", line 123, in _read_cache f = open(self.CACHE_FILE) IOError: [Errno 2] No such file or directory: '/var/lib/rhsm/cache/rhsm_icon.json' 2017-09-09 08:33:10,501 [INFO] rhsmd:502:MainThread @rhsmd:263 - rhsmd started 2017-09-09 08:33:10,649 [INFO] subscription-manager:496:MainThread @managercli.py:518 - X-Correlation-ID: 6b91ae91eb154d92b0b76dd93642909a 2017-09-09 08:33:10,650 [INFO] subscription-manager:496:MainThread @managercli.py:407 - Client Versions: {'python-rhsm': '1.19.9-1.el7', 'subscription-manager': '1.19.21-1.el7'} 2017-09-09 08:33:16,232 [INFO] subscription-manager:578:MainThread @managercli.py:518 - X-Correlation-ID: 28825ea09ac84f84b64263aa7899309c 2017-09-09 08:33:16,233 [INFO] subscription-manager:578:MainThread @managercli.py:407 - Client Versions: {'python-rhsm': '1.19.9-1.el7', 'subscription-manager': '1.19.21-1.el7'} 2017-09-09 08:33:16,234 [INFO] subscription-manager:578:MainThread @connection.py:822 - Connection built: host=c08-h21-r630-vm2.rdu.openstack.engineering.redhat.com port=443 handler=/rhsm auth=identity_cert ca_dir=/etc/rhsm/ca/ insecure=False 2017-09-09 08:33:16,235 [INFO] subscription-manager:578:MainThread @connection.py:822 - Connection built: host=c08-h21-r630-vm2.rdu.openstack.engineering.redhat.com port=443 handler=/rhsm auth=none 2017-09-09 08:33:16,236 [INFO] subscription-manager:578:MainThread @managercli.py:407 - Client Versions: {'python-rhsm': '1.19.9-1.el7', 'subscription-manager': '1.19.21-1.el7'} 2017-09-09 08:33:16,236 [INFO] subscription-manager:578:MainThread @managercli.py:382 - Consumer Identity name=None uuid=None 2017-09-09 08:33:16,239 [INFO] subscription-manager:578:MainThread @managercli.py:382 - Consumer Identity name=None uuid=None 2017-09-09 08:33:16,241 [INFO] subscription-manager:578:MainThread @connection.py:822 - Connection built: host=c08-h21-r630-vm2.rdu.openstack.engineering.redhat.com port=443 handler=/rhsm auth=basic username=admin 2017-09-09 08:33:16,454 [INFO] subscription-manager:578:MainThread @dmiinfo.py:73 - Using dmidecode dump file: /dev/mem 2017-09-09 08:33:16,475 [WARNING] subscription-manager:578:MainThread @dmiinfo.py:127 - Error reading system DMI information: Failed to open memory buffer (/dev/mem): No such file or directory No SMBIOS nor DMI entry point found, sorry. Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) Permission denied to memory file/device (/dev/mem) 2017-09-09 08:33:16,692 [INFO] subscription-manager:578:MainThread @custom.py:85 - Loading custom facts from: /etc/rhsm/facts/katello.facts 2017-09-09 08:33:23,130 [INFO] subscription-manager:578:MainThread @connection.py:552 - Response: status=200, request="GET /rhsm/" 2017-09-09 08:33:24,955 [INFO] subscription-manager:578:MainThread @connection.py:552 - Response: status=200, request="GET /rhsm/owners/Default_Organization/environments?name=Library" 2017-09-09 08:36:25,069 [ERROR] subscription-manager:578:MainThread @managercli.py:177 - Error during registration: ('The read operation timed out',) 2017-09-09 08:36:25,070 [ERROR] subscription-manager:578:MainThread @managercli.py:178 - ('The read operation timed out',) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/subscription_manager/managercli.py", line 1194, in _do_command content_tags=self.installed_mgr.tags) File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 920, in registerConsumer return self.conn.request_post(url, params) File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 649, in request_post return self._request("POST", method, params, headers=headers) File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 672, in _request info=info, headers=headers) File "/usr/lib64/python2.7/site-packages/rhsm/connection.py", line 539, in _request response = conn.getresponse() File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse response.begin() File "/usr/lib64/python2.7/httplib.py", line 444, in begin version, status, reason = self._read_status() File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status line = self.fp.readline(_MAXLINE + 1) File "/usr/lib64/python2.7/socket.py", line 476, in readline data = self._sock.recv(self._rbufsize) File "/usr/lib64/python2.7/ssl.py", line 759, in recv return self.read(buflen) File "/usr/lib64/python2.7/ssl.py", line 653, in read v = self._sslobj.read(len or 1024) SSLError: ('The read operation timed out',) So to me it looks like client issued a POST call at 08:33:24, and after 3 minutes (default SSL timeout), at 08:36:25 it thrown that timeout exception.
Looking into Satellite logs: # grep -C 1 172\.18\.0\.129 /var/log/httpd/foreman-ssl_access_ssl.log 172.17.30.124 - - [09/Sep/2017:04:33:17 -0400] "GET /rhsm/ HTTP/1.1" 200 1746 "-" "RHSM/1.0 (cmd=subscription-manager)" 172.18.0.129 - - [09/Sep/2017:04:33:17 -0400] "GET /rhsm/ HTTP/1.1" 200 1746 "-" "RHSM/1.0 (cmd=subscription-manager)" 172.18.1.149 - - [09/Sep/2017:04:33:17 -0400] "GET /rhsm/ HTTP/1.1" 200 1746 "-" "RHSM/1.0 (cmd=subscription-manager)" -- 172.17.30.124 - - [09/Sep/2017:04:33:23 -0400] "GET /rhsm/owners/Default_Organization/environments?name=Library HTTP/1.1" 200 73 "-" "RHSM/1.0 (cmd=subscription-manager)" 172.18.0.129 - - [09/Sep/2017:04:33:23 -0400] "GET /rhsm/owners/Default_Organization/environments?name=Library HTTP/1.1" 200 73 "-" "RHSM/1.0 (cmd=subscription-manager)" 172.17.49.11 - - [09/Sep/2017:04:33:23 -0400] "GET /rhsm/status HTTP/1.1" 200 264 "-" "RHSM/1.0 (cmd=subscription-manager)" -- 172.17.0.2 - admin [09/Sep/2017:04:39:26 -0400] "GET /pulp/api/v2/tasks/d96adf20-5eed-464d-afed-f5146bce6ef4/ HTTP/1.1" 200 384 "-" "Ruby" 172.18.0.129 - - [09/Sep/2017:04:33:24 -0400] "POST /rhsm/environments/1/consumers HTTP/1.1" 200 10516 "-" "RHSM/1.0 (cmd=subscription-manager)" 172.17.0.2 - admin [09/Sep/2017:04:39:26 -0400] "POST /pulp/api/v2/consumers/d369dff7-9c9a-4dd5-a607-0e874cafc4b6/actions/content/regenerate_applicability/ HTTP/1.1" 202 172 "-" "Ruby" IMO it looks like first two GETs client did were processed normally, but POST request was obtained at (when moving time to UTC) 08:33:24. Judging by timestamps around it, it was responded 6 minutes later, at 08:39:26, so 3 minutes after client gave up. And its error code was 200, which is good, it was just slow.
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1490138#c5, I believe this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1483677
And to clarify, this is likely a regression due to https://bugzilla.redhat.com/show_bug.cgi?id=1434040 which moved where the package profile uploading was happening.
This will be resolved with the inclusion of 1514508
Verified in Satellite 6.2.15 I performed 1000 content hosts registrations with 50 concurrent at all times. On both RHEL 6 and RHEL 7 all 1000 hosts were registered in around 20 minutes. At no point did any host encounter the error identified in this bug report. I am attaching the registration logs for the 1000 hosts registered to the RHEL 7 Satellite 6.2.15 server.
Created attachment 1425724 [details] registration logs
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1673