Bug 1220771
| Summary: | On restarting the qpidd and other pulp services on Satellite-server raises some error related to qpid/pulp along with traceback | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Sachin Ghai <sghai> |
| Component: | Infrastructure | Assignee: | Stephen Benjamin <stbenjam> |
| Status: | CLOSED DUPLICATE | QA Contact: | Sachin Ghai <sghai> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.1.0 | CC: | bbuckingham, sghai |
| Target Milestone: | Unspecified | Keywords: | Triaged |
| Target Release: | Unused | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-10-15 13:40:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1195450 | | |
| Attachments: | | | |

With Satellite 6.1, the proper way to restart services on the capsule will be via 'katello-service restart'. Do you see the same errors when using that tool? If not, I'd recommend we close this one out as user could get errors restarting individual services.

I don't see this on the latest GA snap 7, it all seems to be working fine, can you check again if the problem's gone away?

I'll re-test with snap7. thanks.

I can see error related to qpid with sat6.1 ga snap7 --

Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7740-47776) consumer: Cannot connect to qpid://guest.lab.eng.bos.redhat.com:5671//: [Errno 104] Connection reset by peer.
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7743-02272) consumer: Cannot connect to qpid://guest.lab.eng.bos.redhat.com:5671//: [Errno 104] Connection reset by peer.
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7748-30816) consumer: Cannot connect to qpid://guest.lab.eng.bos.redhat.com:5671//: [Errno 104] Connection reset by peer.
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7743-02272) Trying again in 2.00 seconds...
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7740-47776) Trying again in 2.00 seconds...
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7748-30816) Trying again in 2.00 seconds...
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7743-02272)
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7740-47776)
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7748-30816)
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7679-66816) consumer: Cannot connect to qpid://guest.lab.eng.bos.redhat.com:5671//: [Errno 104] Connection reset by peer.
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7745-52384) consumer: Cannot connect to qpid://guest.lab.eng.bos.redhat.com:5671//: [Errno 104] Connection reset by peer.
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7745-52384) Trying again in 2.00 seconds...
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7745-52384)
Jun 9 02:02:19 cloud-qe-14 qdrouterd: Tue Jun 9 02:02:19 2015 ROUTER (info) Removing Prefix 'pulp.' for routed links to 'broker'
Jun 9 02:02:19 cloud-qe-14 qdrouterd: Tue Jun 9 02:02:19 2015 ROUTER (info) Removing Prefix 'qmf.' for routed links to 'broker'
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7679-66816) Trying again in 2.00 seconds...
Jun 9 02:02:19 cloud-qe-14 pulp: celery.worker.consumer:ERROR: (7679-66816)

Oh I see! Sorry, I misunderstood. This is expected. qpidd is restarted, so the workers lose their connection. You could avoid the errors by doing this in three steps:

1. for i in pulp_resource_manager pulp_workers pulp_celerybeat; do service $i stop; done
2. service qpidd restart
3. for i in pulp_resource_manager pulp_workers pulp_celerybeat; do service $i start; done

I'll leave this as a bug but for sat-future if that's OK. We could modify katello-service to do things in exactly this order, but the errors don't mean much, the workers lose their connections and then get restarted and reconnect and everything is healthy afterwards.

BZ1269352 should fix the ordering so stop/start/restart all work as expected

*** This bug has been marked as a duplicate of bug 1269352 ***

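For anyone wanting to script the three-step order suggested in the comments, here is a minimal sketch. It only wraps the same `service` commands quoted above in one function. The function name `restart_qpidd_in_order` and the `SERVICE_CMD` override are made up for illustration (they are not part of katello-service), and the sketch has not been checked against a real Satellite install.

```shell
#!/bin/sh
# Sketch: restart qpidd while the Pulp services are stopped, so the celery
# workers do not log "Cannot connect to qpid://..." during the restart.
# SERVICE_CMD is a hypothetical override used only to preview the order,
# e.g. SERVICE_CMD=echo; by default the real `service` command is used.
PULP_SERVICES="pulp_resource_manager pulp_workers pulp_celerybeat"

restart_qpidd_in_order() {
    svc="${SERVICE_CMD:-service}"

    # 1. Stop the Pulp services that hold connections to the broker.
    for i in $PULP_SERVICES; do
        $svc "$i" stop || return 1
    done

    # 2. Restart the broker while nothing is connected to it.
    $svc qpidd restart || return 1

    # 3. Start the Pulp services again so they connect to the fresh broker.
    for i in $PULP_SERVICES; do
        $svc "$i" start || return 1
    done
}
```

Running `SERVICE_CMD=echo restart_qpidd_in_order` prints the commands in order without touching any service; calling `restart_qpidd_in_order` as root performs the actual restart. The function stops at the first failing step, which is a design choice of this sketch rather than anything stated in the bug.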
Created attachment 1024548 [details]
error in /var/log/messages on restarting qpid and pulp services on sat6

Description of problem:
Restarting the qpidd and pulp services on the satellite6 server throws the following errors in /var/log/messages:

May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19753-25856) ConnectionError: connection aborted
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) consumer: Connection to broker lost. Trying to re-establish the connection...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) Traceback (most recent call last):
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/celery/worker/consumer.py", line 278, in start
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) blueprint.start(self)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) step.start(parent)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/celery/worker/consumer.py", line 821, in start
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) c.loop(*c.loop_args())
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) next(loop)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/kombu/async/hub.py", line 324, in create_loop
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) cb(*cbargs)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) File "/usr/lib/python2.7/site-packages/kombu/transport/qpid.py", line 1559, in on_readable
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) raise self.session.saved_exception
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:WARNING: (19678-84224) ConnectionError: connection aborted
May 12 16:50:58 dhcp207-123 pulp: pulp.server.async.scheduler:ERROR: connection aborted
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19747-23680) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19747-23680) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19747-23680)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19745-76192) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19745-76192) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19753-25856) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19745-76192)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19753-25856) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19753-25856)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19743-40512) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19743-40512) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19749-68192) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19678-84224) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19749-68192) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19751-89888) consumer: Cannot connect to qpid://guest.eng.pnq.redhat.com:5671//: [Errno 104] Connection reset by peer.
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19678-84224) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19749-68192)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19751-89888) Trying again in 2.00 seconds...
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19678-84224)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19751-89888)
May 12 16:50:58 dhcp207-123 pulp: celery.worker.consumer:ERROR: (19743-40512)
May 12 16:50:58 dhcp207-123 qdrouterd: Tue May 12 16:50:58 2015 ROUTER (info) Removing Prefix 'pulp.' for routed links to 'broker'
May 12 16:50:58 dhcp207-123 qdrouterd: Tue May 12 16:50:58 2015 ROUTER (info) Removing Prefix 'qmf.' for routed links to 'broker'
May 12 16:50:58 dhcp207-123 systemd: Starting An AMQP message broker daemon....
May 12 16:50:58 dhcp207-123 systemd: Started An AMQP message broker daemon..
May 12 16:50:58 dhcp207-123 systemd: Stopping Pulp Resource Manager...
May 12 16:50:58 dhcp207-123 celery: Please enter your password: Please enter your password: Please enter your password:
May 12 16:50:58 dhcp207-123 celery: worker: Warm shutdown (MainProcess)
May 12 16:50:59 dhcp207-123 celery: resource_manager.eng.pnq.redhat.com ready.

Version-Release number of selected component (if applicable):
sat6.1 GA snap3

How reproducible:
always

Steps to Reproduce:
1. for i in qpidd pulp_resource_manager pulp_workers pulp_celerybeat; do service $i restart; done

Actual results:
errors in /var/log/messages

Expected results:

Additional info:
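
One way to check whether a restart reproduced the problem is to count the celery "Cannot connect to qpid" errors in the log afterwards. The helper below is only a sketch: the name `count_qpid_connect_errors` is hypothetical, and the pattern is taken from the log lines quoted in this report, so it may need adjusting if the message format differs on other builds.

```shell
#!/bin/sh
# Sketch: count celery consumer errors about the qpid broker in a log file.
# $1 is the log file to scan; it defaults to /var/log/messages, the file
# named in this report. The pattern mirrors the log lines quoted above.
count_qpid_connect_errors() {
    grep -c 'celery.worker.consumer:ERROR: .*Cannot connect to qpid://' \
        "${1:-/var/log/messages}"
}
```

For example, `count_qpid_connect_errors /var/log/messages` could be run before and after the restart loop and the two counts compared. Note that `grep -c` exits with a non-zero status when the count is 0, even though it still prints `0`.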