Bug 1504737 - Upgraded capsule content sync fails with 'Pulp message bus connection issue'
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Upgrades
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: Unspecified
Assignee: Eric Helms
QA Contact: Lukas Pramuk
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-20 13:58 UTC by Lukas Pramuk
Modified: 2021-06-10 13:18 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-15 10:27:56 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1557786 | 0 | urgent | CLOSED | Add a Warning to Prevent Users From Trying to Upgrade a Satellite and Capsule(s) Using Cluster/Load Balancer/High Availa... | 2021-06-10 15:23:16 UTC
Red Hat Knowledge Base (Solution) | 3386051 | 0 | None | None | None | 2018-03-22 12:43:11 UTC

Internal Links: 1557786

Description Lukas Pramuk 2017-10-20 13:58:08 UTC
Description of problem:
Content sync to an upgraded capsule fails with 'Pulp message bus connection issue at https://cap.example.com/pulp/api/v2/.' This is truly an upgrade bug, since a fresh 6.3 capsule operates just fine.

Version-Release number of selected component (if applicable):
@satellite-capsule-6.3.0-20.0.beta.el7sat.noarch
satellite-installer-6.3.0.7-1.beta.el7sat.noarch
pulp-server-2.13.4.1-1.el7sat.noarch
qpid-cpp-client-1.36.0-9.el7.x86_64
qpid-cpp-server-1.36.0-9.el7.x86_64

How reproducible:
after upgrading a capsule from 6.2.12 to 6.3.0

Steps to Reproduce:
1. assign capsule to a lifecycle environment (LFE)
2. upgrade capsule to 6.3
3. trigger capsule sync either in UI or CLI
@SAT # hammer capsule content synchronize --id 2
Could not synchronize capsule content:
  Pulp message bus connection issue at https://cap.example.com/pulp/api/v2/.

Actual results:
upgraded capsule content sync is failing due to bus connection issue

Expected results:
upgraded capsule content sync is successful

Additional info:
@CAPSULE

# service httpd  status -l 
...
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 323, in open
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     self.attach(timeout=timeout)
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "<string>", line 6, in attach
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 341, in attach
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 274, in _ewait
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     self.check_error()
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 267, in check_error
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     raise e
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576) ConnectError: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",)

# service pulp_resource_manager status -l
...
Oct 20 09:28:42 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)

# service pulp_celerybeat status -l
...
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 323, in open
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     self.attach(timeout=timeout)
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "<string>", line 6, in attach
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 341, in attach
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 274, in _ewait
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     self.check_error()
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 267, in check_error
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     raise e
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856) ConnectError: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",)
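The ConnectError in the traces above comes down to a name comparison: Pulp connects to 'localhost', but the broker certificate only lists 'cap.example.com'. A minimal shell sketch of that comparison follows, assuming an exact, case-insensitive match with no wildcard handling. The function name host_in_cert_names is hypothetical and is not part of qpid or Pulp.

```shell
# Hypothetical helper: succeed if the connection hostname ($1) appears among
# the certificate names (remaining arguments). Exact, case-insensitive match
# only; this illustrates the check and is not the qpid.messaging code.
host_in_cert_names() {
    wanted=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    shift
    for name in "$@"; do
        candidate=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$candidate" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Hostname and certificate names as reported in the traceback.
if host_in_cert_names localhost cap.example.com cap.example.com; then
    echo "match"
else
    echo "mismatch: 'localhost' is not among the certificate names"
fi
```

With the values from the traceback the check fails, which is consistent with the error Pulp reports.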

Comment 1 Lukas Pramuk 2017-10-20 14:03:31 UTC
On the 6.2 capsule, qpid listens unrestricted on 0.0.0.0:5671,
while on the 6.3 capsule it is restricted to 127.0.0.1:5671.

So a cert migration/refresh step is probably missing during upgrades.

Comment 4 Lukas Pramuk 2017-10-20 15:21:29 UTC
# grep :5671 /etc/pulp/server.conf 
url: ssl://localhost:5671
broker_url: qpid://localhost:5671

# grep ^[^#] /etc/qpid/qpidd.conf
log-enable=error+
log-to-syslog=yes
auth=no
require-encryption=yes
ssl-require-client-authentication=yes
ssl-port=5671
ssl-cert-db=/etc/pki/katello/nssdb
ssl-cert-password-file=/etc/pki/katello/nssdb/nss_db_password-file
ssl-cert-name=broker
interface=lo
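
The two settings that matter here are the host in broker_url (what Pulp connects to) and interface= (where qpidd listens). A small sketch for pulling both out of the config text, assuming the file layouts shown in the grep output above. The helper names broker_host and qpidd_interface are hypothetical.

```shell
# Hypothetical helper: print the host part of the first "broker_url:" line
# read from stdin (server.conf-style input).
broker_host() {
    sed -n 's|^broker_url:[[:space:]]*[a-z]*://\([^:/]*\).*|\1|p' | head -n 1
}

# Hypothetical helper: print the value of the first uncommented "interface="
# line read from stdin (qpidd.conf-style input).
qpidd_interface() {
    sed -n 's|^interface=\(.*\)$|\1|p' | head -n 1
}

# Sample input copied from the grep output in this comment.
server_conf='url: ssl://localhost:5671
broker_url: qpid://localhost:5671'
qpidd_conf='ssl-port=5671
ssl-cert-name=broker
interface=lo'

echo "pulp connects to: $(printf '%s\n' "$server_conf" | broker_host)"
echo "qpidd listens on interface: $(printf '%s\n' "$qpidd_conf" | qpidd_interface)"
```

On a capsule the same helpers could be fed the real files, for example `broker_host < /etc/pulp/server.conf`.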

Comment 5 Lukas Pramuk 2017-10-20 15:33:20 UTC
The settings in comment#4 are the same on both the upgraded and the fresh 6.3 capsule.

Comment 6 Eric Helms 2017-10-23 23:55:20 UTC
I believe this is due to inaccurate upgrade commands for 6.2 to 6.3. Please test with:

Generating Certs

Add --certs-update-all to the capsule-certs-generate command, for example:

capsule-certs-generate --certs-tar ~/tmp/mycerts.tar.gz --foreman-proxy-fqdn pipeline-capsule-6-2-rhel7.woodford.example.com --certs-update-all


Upgrading Capsule

When upgrading the capsule with the updated certs bundle, we need to ensure the new certs are deployed and the nssdb is regenerated. Do this by including '--certs-regenerate true', '--certs-deploy true' and '--certs-update-all' in the upgrade command. For example:

satellite-installer --upgrade --foreman-proxy-content-certs-tar ~/mycerts.tar.gz --certs-update-all --certs-regenerate true --certs-deploy true
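
The two steps above can be sketched as a dry-run helper that only prints the commands for a given capsule, so they can be reviewed before running them by hand. The flags are the ones quoted in this comment; the function name print_capsule_upgrade_commands and the example arguments are hypothetical.

```shell
# Hypothetical dry-run helper: print (do not execute) the two commands from
# this comment for a capsule FQDN ($1) and a certs tarball path ($2).
print_capsule_upgrade_commands() {
    fqdn=$1
    tarball=$2
    # To be run on the Satellite: regenerate the capsule certs bundle.
    printf 'capsule-certs-generate --foreman-proxy-fqdn %s --certs-tar %s --certs-update-all\n' \
        "$fqdn" "$tarball"
    # To be run on the capsule: upgrade and force deployment of the new certs.
    printf 'satellite-installer --upgrade --foreman-proxy-content-certs-tar %s --certs-update-all --certs-regenerate true --certs-deploy true\n' \
        "$tarball"
}

print_capsule_upgrade_commands cap.example.com '~/mycerts.tar.gz'
```

The tarball generated on the Satellite also has to be copied to the capsule between the two steps; the sketch does not cover that.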

Comment 7 Lukas Pramuk 2017-11-01 10:57:08 UTC
VERIFIED.

@Satellite/Capsule 6.3.0 Snap22

by manual reproducer in comment#0:


1. assign capsule to LFE


2. upgrade capsule to 6.3

@SAT:

# capsule-certs-generate --foreman-proxy-fqdn cap.example.com --certs-tar ~/cap.example.com.tar --certs-update-all
# scp ~/cap.example.com.tar cap.example.com:

@CAPSULE:

# satellite-installer --upgrade --foreman-proxy-content-certs-tar ~/cap.example.com.tar --certs-update-all --certs-regenerate true --certs-deploy true
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-router-server for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-router-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-proxy for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-proxy-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-puppet-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-client-cert for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-broker for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-apache for update
...
Upgrade completed!


3. trigger capsule sync either in UI or CLI

@SAT:

# hammer capsule content synchronize --id 2
[...............................................................................................................] [100%]

@CAPSULE:

# service pulp_celerybeat status -l
...
Nov 01 09:45:59 cap.example.com pulp[23023]: kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS
Nov 01 09:55:59 cap.example.com pulp[23023]: celery.beat:INFO: Scheduler: Sending due task download_deferred_content (pulp.server.controllers.repository.queue_download_deferred)


>>> capsule sync was triggered successfully and pulp was able to connect to qpid message bus
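
The same check can be scripted as a rough sketch: read the service status output and report whether Pulp reached the broker. The two patterns are the messages quoted in this report; the helper name qpid_connection_state is hypothetical, and the last matching line wins.

```shell
# Hypothetical helper: read log/status lines on stdin and print
# "connected", "cert-mismatch" or "unknown" based on the last relevant line.
qpid_connection_state() {
    awk '
        /Connected to qpid/ { state = "connected" }
        /does not match names from peer certificate/ { state = "cert-mismatch" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Sample line copied from the pulp_celerybeat status output above.
printf '%s\n' 'kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS' \
    | qpid_connection_state
```

On a capsule the input could come from `service pulp_celerybeat status -l | qpid_connection_state`.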

Comment 9 Satellite Program 2018-02-21 16:54:37 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336

Comment 14 Mihir Lele 2018-04-15 10:27:56 UTC
Closing this bug, as the issue was related to a Satellite HA upgrade from 6.2 to 6.3, which is currently unsupported.

