Bug 1420474

Summary: goferd consumes 100% cpu after losing the connection to qdrouterd
Product: Red Hat Satellite Reporter: Andrew Kofink <akofink>
Component: Qpid Assignee: Mike Cressman <mcressma>
Status: CLOSED CURRENTRELEASE QA Contact: Katello QA List <katello-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 6.2.6 CC: aperotti, bbuckingham, bkearney, daniele, jcallaha, mmccune, omaciel
Target Milestone: Unspecified Keywords: Performance, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-05-17 17:36:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Andrew Kofink 2017-02-08 19:20:53 UTC
Description of problem:
goferd continually connects to and disconnects from qdrouterd

Version-Release number of selected component (if applicable):
6.2.6

How reproducible:
Intermittent

Steps to Reproduce:
This has been reproduced during errata install via katello-agent with an isolated capsule.
It has also been reproduced with an internal capsule, as described in the attached Redmine issue. Restarting goferd on the client has been shown to temporarily fix the issue.

Actual results:
Feb 2 13:49:04 hostname goferd: [INFO][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:35 - retry in 10 seconds
Feb 2 13:49:14 hostname goferd: [INFO][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:28 - connecting: proton+amqps://centosrhelupdateserver.domain.tld:5647
Feb 2 13:49:14 hostname goferd: [INFO][pulp.agent.id-number-hidden] gofer.messaging.adapter.proton.connection:87 - open: URL: amqps://centosrhelupdateserver.domain.tld:5647|SSL: ca: /etc/rhsm/ca/katello-default-ca.pem|key: None|certificate: /etc/pki/consumer/bundle.pem|host-validation: None
Feb 2 13:49:14 hostname goferd: [INFO][pulp.agent.id-number-hidden] root:485 - connecting to centosrhelupdateserver.domain.tld:5647...
Feb 2 13:50:17 hostname goferd: [INFO][pulp.agent.id-number-hidden] root:525 - Disconnected
Feb 2 13:50:17 hostname goferd: [ERROR][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:33 - connect: proton+amqps://centosrhelupdateserver.domain.tld:5647, failed: Connection amqps://centosrhelupdateserver.domain.tld:5647 disconnected
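
To check whether a client is stuck in the same connect/disconnect cycle, the log lines above can be scanned for each "connecting:" entry and the "failed:" error that follows it. The following Python is only a rough sketch for triage, not part of gofer: the regular expression, the function names, and the assumed year (syslog lines carry no year; 2017 is taken from the report date) are all illustrative assumptions based on the line format shown in this excerpt.

```python
import re
from datetime import datetime

# Lines copied verbatim from the log excerpt in this report.
SAMPLE = """\
Feb 2 13:49:04 hostname goferd: [INFO][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:35 - retry in 10 seconds
Feb 2 13:49:14 hostname goferd: [INFO][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:28 - connecting: proton+amqps://centosrhelupdateserver.domain.tld:5647
Feb 2 13:50:17 hostname goferd: [INFO][pulp.agent.id-number-hidden] root:525 - Disconnected
Feb 2 13:50:17 hostname goferd: [ERROR][pulp.agent.id-number-hidden] gofer.messaging.adapter.connect:33 - connect: proton+amqps://centosrhelupdateserver.domain.tld:5647, failed: Connection amqps://centosrhelupdateserver.domain.tld:5647 disconnected
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> goferd: [LEVEL][agent] <source> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+goferd:\s+"
    r"\[(?P<level>\w+)\]\[[^\]]*\]\s+(?P<source>\S+)\s+-\s+(?P<msg>.*)$"
)


def parse_ts(text, year=2017):
    """Parse a syslog timestamp; the year is an assumption (not in the log)."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def failed_attempts(lines, year=2017):
    """Return (start, end, seconds) for each connect attempt that ended in an error."""
    attempts = []
    started = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not goferd entries in the assumed layout
        when = parse_ts(m["ts"], year)
        msg = m["msg"]
        if msg.startswith("connecting:"):
            started = when  # start of a new connection attempt
        elif m["level"] == "ERROR" and "failed:" in msg and started is not None:
            attempts.append((started, when, (when - started).total_seconds()))
            started = None
    return attempts


if __name__ == "__main__":
    for start, end, seconds in failed_attempts(SAMPLE.splitlines()):
        print(f"attempt at {start:%H:%M:%S} failed at {end:%H:%M:%S} after {seconds:.0f}s")
```

On the excerpt above this finds one failed attempt, started at 13:49:14 and failed at 13:50:17, i.e. 63 seconds later. Many such entries in a row, each followed by "retry in 10 seconds", would match the behaviour described in this report.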

Expected results:
goferd should be able to connect and communicate with pulp

Additional info:

Comment 3 Satellite Program 2017-02-12 09:07:08 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/18386 has been resolved.

Comment 4 Mike McCune 2017-05-17 17:36:57 UTC
We are currently shipping qpid-proton-0.9-16, which includes the above-mentioned fixes.

The main remaining issue around gofer stability we are still tracking is:

https://bugzilla.redhat.com/show_bug.cgi?id=1318015

Going to close this as currentrelease on 6.2.9 or later. Please re-open if this is still occurring on 6.2.9 or later releases.