Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets there. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry; the email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.

Bug 1504737

Summary: Upgraded capsule content sync fails with 'Pulp message bus connection issue'
Product: Red Hat Satellite
Reporter: Lukas Pramuk <lpramuk>
Component: Upgrades
Assignee: Eric Helms <ehelms>
Status: CLOSED WONTFIX
QA Contact: Lukas Pramuk <lpramuk>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.3.0
CC: bbuckingham, bjarolim, cmarinea, ehelms, inecas, ktordeur, mbacovsk, mlele, takirby
Target Milestone: Unspecified
Keywords: Regression, Reopened, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-15 10:27:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lukas Pramuk 2017-10-20 13:58:08 UTC
Description of problem:
Upgraded capsule content sync fails with 'Pulp message bus connection issue at https://cap.example.com/pulp/api/v2/.' This is genuinely an upgrade bug, since a freshly installed 6.3 capsule operates just fine.

Version-Release number of selected component (if applicable):
@satellite-capsule-6.3.0-20.0.beta.el7sat.noarch
satellite-installer-6.3.0.7-1.beta.el7sat.noarch
pulp-server-2.13.4.1-1.el7sat.noarch
qpid-cpp-client-1.36.0-9.el7.x86_64
qpid-cpp-server-1.36.0-9.el7.x86_64

How reproducible:
After a capsule upgrade from 6.2.12 to 6.3.0.

Steps to Reproduce:
1. assign the capsule to a lifecycle environment (LFE)
2. upgrade the capsule to 6.3
3. trigger a capsule sync in either the UI or the CLI
@SAT # hammer capsule content synchronize --id 2
Could not synchronize capsule content:
  Pulp message bus connection issue at https://cap.example.com/pulp/api/v2/.

Actual results:
Upgraded capsule content sync fails due to a message bus connection issue.

Expected results:
Upgraded capsule content sync succeeds.

Additional info:
@CAPSULE

# service httpd status -l
...
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 323, in open
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     self.attach(timeout=timeout)
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "<string>", line 6, in attach
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 341, in attach
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 274, in _ewait
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     self.check_error()
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 267, in check_error
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576)     raise e
Oct 20 09:21:23 cap.example.com pulp[24586]: pulp.server.managers.status:ERROR: (24586-64576) ConnectError: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",)

# service pulp_resource_manager status -l
...
Oct 20 09:28:42 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:29:14 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:29:46 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) consumer: Cannot connect to qpid://localhost:5671//: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",).
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672) Trying again in 32.00 seconds...
Oct 20 09:30:18 cap.example.com pulp[24433]: celery.worker.consumer:ERROR: (24433-80672)

# service pulp_celerybeat status -l
...
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 323, in open
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     self.attach(timeout=timeout)
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "<string>", line 6, in attach
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 341, in attach
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 274, in _ewait
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     self.check_error()
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 267, in check_error
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856)     raise e
Oct 19 14:21:14 cap.example.com pulp[24519]: celery.beat:CRITICAL: (24519-21856) ConnectError: ("Connection hostname 'localhost' does not match names from peer certificate: ['cap.example.com', u'cap.example.com']",)
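The ConnectError in these tracebacks is a TLS hostname-verification failure: Pulp connects to the broker as 'localhost' (per /etc/pulp/server.conf), but the broker certificate only names 'cap.example.com'. A minimal sketch of that check (a hypothetical helper for illustration, not the actual qpid.messaging code, which also handles wildcard matching):

```python
# Hypothetical illustration of the hostname check behind the ConnectError
# above; the real verification lives inside the qpid.messaging client.
def check_connect_hostname(connect_host, cert_names):
    """Raise if the hostname used to connect is not among the names
    presented in the broker's certificate (exact match only)."""
    if connect_host not in cert_names:
        raise ValueError(
            "Connection hostname %r does not match names from peer "
            "certificate: %r" % (connect_host, cert_names))

# The capsule's stale certificate names only the FQDN, so connecting
# via 'localhost' fails:
# check_connect_hostname('localhost', ['cap.example.com'])  # raises ValueError
```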

Comment 1 Lukas Pramuk 2017-10-20 14:03:31 UTC
On the 6.2 capsule, qpidd listens unrestricted on 0.0.0.0:5671,
while on the 6.3 capsule it is restricted to 127.0.0.1:5671.

So the certificate migration/refresh step is probably missing during upgrades.
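The binding difference above can be confirmed with a generic TCP reachability probe (a sketch, not Satellite tooling; the host names are the ones from this reproducer):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Best-effort check: does <host>:<port> accept a TCP connection?"""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
        sock.close()
        return True
    except OSError:
        return False

# On a 6.2 capsule both probes should succeed; on a 6.3 capsule
# (qpidd bound to the loopback interface) only the first one does:
# port_open('localhost', 5671)
# port_open('cap.example.com', 5671)
```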

Comment 4 Lukas Pramuk 2017-10-20 15:21:29 UTC
# grep :5671 /etc/pulp/server.conf 
url: ssl://localhost:5671
broker_url: qpid://localhost:5671

# grep ^[^#] /etc/qpid/qpidd.conf
log-enable=error+
log-to-syslog=yes
auth=no
require-encryption=yes
ssl-require-client-authentication=yes
ssl-port=5671
ssl-cert-db=/etc/pki/katello/nssdb
ssl-cert-password-file=/etc/pki/katello/nssdb/nss_db_password-file
ssl-cert-name=broker
interface=lo

Comment 5 Lukas Pramuk 2017-10-20 15:33:20 UTC
The settings in comment#4 are the same for both the upgraded and the fresh 6.3 capsule.

Comment 6 Eric Helms 2017-10-23 23:55:20 UTC
I believe this is due to inaccurate upgrade commands for 6.2 to 6.3. Please test with:

Generating Certs

Add --certs-update-all to the capsule-certs-generate command, for example:

capsule-certs-generate --certs-tar ~/tmp/mycerts.tar.gz --foreman-proxy-fqdn pipeline-capsule-6-2-rhel7.woodford.example.com --certs-update-all


Upgrading Capsule

When upgrading the capsule with our updated certs bundle, we need to ensure the new certs are deployed and the nssdb is regenerated. Do this by ensuring '--certs-regenerate true', '--certs-deploy true', and '--certs-update-all' are included in the upgrade command. For example:

satellite-installer --upgrade --foreman-proxy-content-certs-tar ~/mycerts.tar.gz --certs-update-all --certs-regenerate true --certs-deploy true

Comment 7 Lukas Pramuk 2017-11-01 10:57:08 UTC
VERIFIED.

@Satellite/Capsule 6.3.0 Snap22

by the manual reproducer in comment#0:


1. assign capsule to LFE


2. upgrade capsule to 6.3

@SAT:

# capsule-certs-generate --foreman-proxy-fqdn cap.example.com --certs-tar ~/cap.example.com.tar --certs-update-all
# scp ~/cap.example.com.tar cap.example.com:

@CAPSULE:

# satellite-installer --upgrade --foreman-proxy-content-certs-tar ~/cap.example.com.tar --certs-update-all --certs-regenerate true --certs-deploy true
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-router-server for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-router-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-proxy for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-proxy-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-foreman-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-puppet-client for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-client-cert for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-qpid-broker for update
Marking certificate /root/ssl-build/cap.example.com/cap.example.com-apache for update
...
Upgrade completed!


3. trigger capsule sync either in UI or CLI

@SAT:

# hammer capsule content synchronize --id 2
[...............................................................................................................] [100%]

@CAPSULE:

# service pulp_celerybeat status -l
...
Nov 01 09:45:59 cap.example.com pulp[23023]: kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS
Nov 01 09:55:59 cap.example.com pulp[23023]: celery.beat:INFO: Scheduler: Sending due task download_deferred_content (pulp.server.controllers.repository.queue_download_deferred)


>>> Capsule sync was triggered successfully, and Pulp was able to connect to the qpid message bus.

Comment 9 Satellite Program 2018-02-21 16:54:37 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336

Comment 14 Mihir Lele 2018-04-15 10:27:56 UTC
Closing this bug out, as the issue was related to the Satellite HA upgrade from 6.2 to 6.3, which is currently unsupported.