Bug 1212200
| Summary: | [RHEL 6.7 beta] Myriad different errors appearing in /v/l/m following install; celery "ordereddict" abrts being sent. | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Corey Welton <cwelton> |
| Component: | Content Management | Assignee: | Partha Aji <paji> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Tazim Kolhar <tkolhar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | Unspecified | CC: | bbuckingham, bkearney, bmbouter, borgan, cwelton, daviddavis, dcaplan, dkliban, dkutalek, ggainey, ipanova, mhrivnak, mmccune, omaciel, pcreech, rchan, salmy, tkolhar, ttereshc, xdmoon |
| Target Milestone: | Unspecified | Keywords: | Triaged |
| Target Release: | Unused | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1212967 | Environment: | |
| Last Closed: | 2015-08-12 13:55:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1115190, 1212967 | | |
| Attachments: |
Created attachment 1014927 [details]
/var/log/messages from system outlining a bunch of issues
Created attachment 1014928 [details]
abrt report sent from celery
I noticed in my initial attachment that actually, celerybeat is stopped.
Also, restarting services, I see this (same as in abrt)
celery init v10.0.
Using config script: /etc/default/pulp_resource_manager
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict

celery init v10.0.
Using config script: /etc/default/pulp_workers
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict

celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Starting pulp_celerybeat...
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict
This has been reproduced in a separate QE environment.

What version of celery do you have installed?

"ordereddict" is new in python 2.7. I'm surprised anything is trying to use it, unless there is a new version of some dependency that is trying to use it. That would be a bug in that package if it's an el6 package.

I've identified the problem. It's unfortunately a little complicated.

TL;DR: a small packaging change [0] to python-libs in RHEL 6.7 caused a surprising backport into the python standard library (that was done in RHEL 6.5) to become a problem. We can work around it by modifying our python-kombu package.

Python 2.7 introduced OrderedDict (in the "collections" module). In python 2.6, you can install a package called "python-ordereddict" that is a backport. Starting in RHEL 6.5, Red Hat decided to backport "ordereddict" directly into the python standard library. [0] I'm not sure why it wasn't sufficient to keep using the python-ordereddict package like the rest of the python community would expect.

RPMs like python-kombu require the rpm "python-ordereddict" on el6. Also, at the python packaging level (setup.py), packages like kombu require the "ordereddict" python package on el6.

In RHEL 6.7, they added a "Provides: python-ordereddict" statement to python-libs. [1] With RHEL 6.5 and 6.6, the python-ordereddict RPM was still getting installed, along with its egg-info (python package metadata), because that Provides statement was missing. Now on 6.7, the python-ordereddict RPM does not get installed, because python-libs "Provides" it. But python-libs does not (and cannot) provide the egg-info. The kombu python package still has a runtime dependency on the "ordereddict" python package, which it cannot find, because there is no egg-info. Thus we see the traceback above.

The work-around is for us to remove the dependency on "ordereddict" from kombu's setup.py on el6.
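The distinction described above, a module that imports fine but has no package metadata behind it, can be illustrated with a small sketch. This is not the pkg_resources code from the traceback and not kombu's actual setup.py: it assumes a modern Python 3 interpreter and uses the standard library's importlib.metadata as a stand-in for the egg-info lookup. The helper names (is_importable, has_dist_metadata, prune_requirements) are hypothetical and chosen only for illustration.

```python
import importlib.metadata
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def has_dist_metadata(dist_name):
    """Return True if packaging metadata (egg-info/dist-info) exists for dist_name."""
    try:
        importlib.metadata.distribution(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return False
    return True


def prune_requirements(requirements, provided_by_stdlib):
    """Drop requirements that are importable but ship no metadata.

    provided_by_stdlib maps a requirement name to the module name that the
    platform already bundles (hypothetical mapping, supplied by the caller).
    Requirements not listed in the mapping are always kept.
    """
    kept = []
    for req in requirements:
        module = provided_by_stdlib.get(req)
        if module and is_importable(module) and not has_dist_metadata(req):
            # The code is present, but a metadata-based resolver would
            # still report it as missing, so do not declare it.
            continue
        kept.append(req)
    return kept
```

A standard library module such as colorsys plays the role that "ordereddict" plays on RHEL 6.7: is_importable("colorsys") is true, while has_dist_metadata("colorsys") is false because the standard library ships no per-module metadata. A resolver that consults only metadata reports the requirement as missing even though the import would succeed, which is why dropping the declared dependency is a reasonable work-around.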
I am also going to file a bug against RHEL 6.7 so they know about the issue, although they are unlikely to make any changes in response. This conflict between rpm packaging and python packaging was caused by backporting stuff into the standard library, and the only real fix I can think of would be to undo that.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=929258
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1199997

The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

The Pulp upstream bug priority is at Urgent. Updating the external tracker on this bug.

The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

VERIFIED:
# rpm -qa | grep foreman
foreman-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_discovery-2.0.0.13-1.el7sat.noarch
foreman-libvirt-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
foreman-postgresql-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el7sat.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-ovirt-1.7.2.21-1.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.11-1.el7sat.noarch
foreman-selinux-1.7.2.13-1.el7sat.noarch
foreman-gce-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.1.0-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.12.5-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.6-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.12-1.el7sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
foreman-proxy-1.7.2.4-1.el7sat.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-proxy-1.0-2.noarch
foreman-vmware-1.7.2.21-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
foreman-compute-1.7.2.21-1.el7sat.noarch
foreman-debug-1.7.2.21-1.el7sat.noarch

steps:
1. Install product; this should be done on a RHEL 6.7 beta compose.
2. Upload manifest that grants appropriate subscriptions for SCL and Sat6 beta.
3. Attempt to create and sync three repos:
* RH SCL repo for 6Server [red hat content]
* RH Capsule 6.1 beta [red hat content]
* Internal mirror of RHEL beta [custom repo]
4. Sync complete

The Pulp upstream bug status is at VERIFIED. Updating the external tracker on this bug.

This bug is slated to be released with Satellite 6.1.

This bug was fixed in version 6.1.1 of Satellite which was released on 12 August, 2015.

The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Description of problem:
After installing the beta release of Sat 6.1.0 against a RHEL 6.7 beta compose (details forthcoming), attempting to sync content seems to have hung indefinitely. A deeper look into /var/log/messages and abrt indicates a number of crashes and general strange behavior that may or may not be all related. Furthermore, it appears anyway that services are running as expected.

Version-Release number of selected component (if applicable):
6.1.0 beta (for RHEL 6)
RHEL 6.7 beta compose RHEL-6.7-20150401.0

How reproducible:
unsure...yet

Steps to Reproduce:
1. Install product; this should be done on a RHEL 6.7 beta compose.
2. Upload manifest that grants appropriate subscriptions for SCL and Sat6 beta.
3. Attempt to create and sync three repos:
* RH SCL repo for 6Server [red hat content]
* RH Capsule 6.1 beta [red hat content]
* Internal mirror of RHEL beta [custom repo]
4. View sync performance

Actual results:
* Sync seems to hang indefinitely ("Pending")
* A wide variety of issues appear in logs, e.g.:

Apr 15 12:50:55 cloud-qe-17 qpidd[28715]: 2015-04-15 12:50:55 [Security] error Rejected un-encrypted connection.
Apr 15 12:50:55 cloud-qe-17 qpidd[28715]: 2015-04-15 12:50:55 [Protocol] error Connection qpid.127.0.0.1:5672-127.0.0.1:58080 closed by error: connection-forced: Connection must be encrypted.(320)
Apr 15 13:14:02 cloud-qe-17 puppet-master[1727]: Failed to find cloud-qe-17.idmqe.lab.eng.bos.redhat.com via exec: Execution of '/etc/puppet/node.rb cloud-qe-17.idmqe.lab.eng.bos.redhat.com' returned 1:
Apr 15 13:14:02 cloud-qe-17 puppet-agent[1678]: Unable to fetch my node definition, but the agent run will continue:
Apr 15 13:14:02 cloud-qe-17 puppet-agent[1678]: Error 400 on SERVER: Failed to find cloud-qe-17.idmqe.lab.eng.bos.redhat.com via exec: Execution of '/etc/puppet/node.rb cloud-qe-17.idmqe.lab.eng.bos.redhat.com' returned 1:
Apr 15 12:44:54 cloud-qe-17 dbus: avc: received policyload notice (seqno=11)
Apr 15 12:44:54 cloud-qe-17 dbus: [system] Reloaded configuration
Apr 15 12:44:55 cloud-qe-17 dbus: avc: received policyload notice (seqno=12)
Apr 15 12:44:55 cloud-qe-17 dbus: [system] Reloaded configuration
Apr 15 12:44:56 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:56 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:56 cloud-qe-17 abrtd: Directory 'pyhook-2015-04-15-12:44:56-29599' creation detected
Apr 15 12:44:56 cloud-qe-17 abrt-server[29614]: Saved Python crash dump of pid 29599 to /var/spool/abrt/pyhook-2015-04-15-12:44:56-29599
Apr 15 12:44:57 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:57 cloud-qe-17 abrt-server[29704]: Not saving repeating crash in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:57 cloud-qe-17 abrt-server[29773]: Not saving repeating crash in '/usr/bin/celery'
Apr 15 12:44:58 cloud-qe-17 abrtd: New problem directory /var/spool/abrt/pyhook-2015-04-15-12:44:56-29599, processing
Apr 15 12:44:58 cloud-qe-17 abrtd: Sending an email...

More complete logs + an abrtd error thrown to root's email will be posted.

Expected results:

Additional info: