Bug 1212200

Summary: [RHEL 6.7 beta] Myriad different errors appearing in /v/l/m following install; celery "ordereddict" abrts being sent.
Product: Red Hat Satellite Reporter: Corey Welton <cwelton>
Component: Content Management Assignee: Partha Aji <paji>
Status: CLOSED CURRENTRELEASE QA Contact: Tazim Kolhar <tkolhar>
Severity: high Docs Contact:
Priority: unspecified    
Version: Unspecified CC: bbuckingham, bkearney, bmbouter, borgan, cwelton, daviddavis, dcaplan, dkliban, dkutalek, ggainey, ipanova, mhrivnak, mmccune, omaciel, pcreech, rchan, salmy, tkolhar, ttereshc, xdmoon
Target Milestone: Unspecified Keywords: Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1212967 Environment:
Last Closed: 2015-08-12 13:55:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1115190, 1212967    
Attachments:
Description Flags
/var/log/messages from system outlining a bunch of issues none
abrt report sent from celery none

Description Corey Welton 2015-04-15 19:34:33 UTC
Description of problem:
After installing beta release of sat 6.1.0 against a 6.7 beta compose (details forthcoming), attempting to sync content seems to have hung indefinitely.

A deeper look into /var/log/messages and abrt indicates a number of crashes and generally strange behavior that may or may not all be related.

Despite this, the services appear to be running as expected.

Version-Release number of selected component (if applicable):
6.1.0 beta (for RHEL 6)
RHEL 6.7 beta compose RHEL-6.7-20150401.0

How reproducible:
unsure...yet

Steps to Reproduce:
1.  Install the product; this should be done on a RHEL 6.7 beta compose.
2.  Upload manifest that grants appropriate subscriptions for SCL and Sat6 beta.
3.  Attempt to create and sync three repos:
  * RH SCL repo for 6Server [red hat content]
  * RH Capsule 6.1 beta  [red hat content]
  * Internal mirror of RHEL beta [custom repo]
4. View sync performance

Actual results:

* Sync seems to hang indefinitely ("Pending")
* A wide variety of issues appear in the logs, e.g.:

Apr 15 12:50:55 cloud-qe-17 qpidd[28715]: 2015-04-15 12:50:55 [Security] error Rejected un-encrypted connection.
Apr 15 12:50:55 cloud-qe-17 qpidd[28715]: 2015-04-15 12:50:55 [Protocol] error Connection qpid.127.0.0.1:5672-127.0.0.1:58080 closed by error: connection-forced: Connection must be encrypted.(320)


Apr 15 13:14:02 cloud-qe-17 puppet-master[1727]: Failed to find cloud-qe-17.idmqe.lab.eng.bos.redhat.com via exec: Execution of '/etc/puppet/node.rb cloud-qe-17.idmqe.lab.eng.bos.redhat.com' returned 1: 
Apr 15 13:14:02 cloud-qe-17 puppet-agent[1678]: Unable to fetch my node definition, but the agent run will continue:
Apr 15 13:14:02 cloud-qe-17 puppet-agent[1678]: Error 400 on SERVER: Failed to find cloud-qe-17.idmqe.lab.eng.bos.redhat.com via exec: Execution of '/etc/puppet/node.rb cloud-qe-17.idmqe.lab.eng.bos.redhat.com' returned 1: 

Apr 15 12:44:54 cloud-qe-17 dbus: avc:  received policyload notice (seqno=11)
Apr 15 12:44:54 cloud-qe-17 dbus: [system] Reloaded configuration
Apr 15 12:44:55 cloud-qe-17 dbus: avc:  received policyload notice (seqno=12)
Apr 15 12:44:55 cloud-qe-17 dbus: [system] Reloaded configuration
Apr 15 12:44:56 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:56 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:56 cloud-qe-17 abrtd: Directory 'pyhook-2015-04-15-12:44:56-29599' creation detected
Apr 15 12:44:56 cloud-qe-17 abrt-server[29614]: Saved Python crash dump of pid 29599 to /var/spool/abrt/pyhook-2015-04-15-12:44:56-29599
Apr 15 12:44:57 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:57 cloud-qe-17 abrt-server[29704]: Not saving repeating crash in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrt: detected unhandled Python exception in '/usr/bin/celery'
Apr 15 12:44:57 cloud-qe-17 abrtd: New client connected
Apr 15 12:44:57 cloud-qe-17 abrt-server[29773]: Not saving repeating crash in '/usr/bin/celery'
Apr 15 12:44:58 cloud-qe-17 abrtd: New problem directory /var/spool/abrt/pyhook-2015-04-15-12:44:56-29599, processing
Apr 15 12:44:58 cloud-qe-17 abrtd: Sending an email...

More complete logs, plus an abrtd error sent to root's email, will be posted.


Expected results:


Additional info:

Comment 1 Corey Welton 2015-04-15 19:37:35 UTC
Created attachment 1014927
/var/log/messages from system outlining a bunch of issues

Comment 2 Corey Welton 2015-04-15 19:43:04 UTC
Created attachment 1014928
abrt report sent from celery

Comment 4 Corey Welton 2015-04-16 19:55:22 UTC
I noticed in my initial attachment that celerybeat is actually stopped.

Also, when restarting services, I see this (same as in the abrt report):

celery init v10.0.
Using config script: /etc/default/pulp_resource_manager
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict
celery init v10.0.
Using config script: /etc/default/pulp_workers
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict
celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Starting pulp_celerybeat...
Traceback (most recent call last):
  File "/usr/bin/celery", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ordereddict

Comment 5 Corey Welton 2015-04-17 13:23:23 UTC
This has been reproduced in a separate QE environment.

Comment 6 Michael Hrivnak 2015-04-17 14:42:16 UTC
What version of celery do you have installed?

"ordereddict" is new in python 2.7. I'm surprised anything is trying to use it, unless there is a new version of some dependency that is trying to use it. That would be a bug in that package if it's an el6 package.

Comment 7 Michael Hrivnak 2015-04-17 18:27:38 UTC
I've identified the problem. It's unfortunately a little complicated.

TL;DR: a small packaging change [1] to python-libs in RHEL 6.7 caused a surprising backport into the python standard library (that was done in RHEL 6.5) to become a problem. We can work around it by modifying our python-kombu package.

Python 2.7 introduced OrderedDict (as collections.OrderedDict). In python 2.6, you can install a package called "python-ordereddict", a backport that provides it as the "ordereddict" module.

Starting in RHEL 6.5, Red Hat decided to backport "ordereddict" directly into the python standard library. [0] I'm not sure why it wasn't sufficient to keep using the python-ordereddict package like the rest of the python community would expect.

RPMs like python-kombu require the rpm "python-ordereddict" on el6. Also, at the python packaging level (setup.py), packages like kombu require the "ordereddict" python package on el6.

In RHEL 6.7, they added a "Provides: python-ordereddict" statement to python-libs. [1] With RHEL 6.5 and 6.6, the python-ordereddict RPM was still getting installed, along with its egg-info (python package metadata), because that Provides statement was missing.

Now on 6.7, the python-ordereddict RPM does not get installed, because python-libs "Provides" it. But it does not (and cannot) provide the egg-info. The kombu python package still has a runtime dependency on the "ordereddict" python package, which it cannot find, because there is no egg-info. Thus we see the traceback above.
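
The gap between "the code is importable" and "the packaging metadata is present" can be reproduced in isolation. The following is a minimal sketch, not the code path from the traceback: it uses the modern importlib.metadata module rather than the pkg_resources shipped on el6, and the module name demo_ordereddict, the version 1.1, and the helper names are all hypothetical. It builds two throwaway directories, one holding only a module and one holding the module plus an egg-info directory, and checks each.

```python
import importlib.machinery
import importlib.metadata
import os
import tempfile


def make_site(with_metadata):
    """Create a throwaway directory with a module and, optionally, its egg-info."""
    site = tempfile.mkdtemp()
    # The importable code itself. This stands in for OrderedDict being
    # available from python-libs without a separate package.
    with open(os.path.join(site, "demo_ordereddict.py"), "w") as handle:
        handle.write("class OrderedDict(dict):\n    pass\n")
    if with_metadata:
        # The egg-info is what a standalone package install adds on top.
        egg_info = os.path.join(site, "demo_ordereddict-1.1.egg-info")
        os.mkdir(egg_info)
        with open(os.path.join(egg_info, "PKG-INFO"), "w") as handle:
            handle.write(
                "Metadata-Version: 1.1\nName: demo_ordereddict\nVersion: 1.1\n"
            )
    return site


def is_importable(module_name, site):
    """True if a module with this name can be found under `site`."""
    spec = importlib.machinery.PathFinder.find_spec(module_name, [site])
    return spec is not None


def has_distribution(dist_name, site):
    """True if packaging metadata for this distribution exists under `site`."""
    return any(
        dist.metadata["Name"] == dist_name
        for dist in importlib.metadata.distributions(path=[site])
    )


if __name__ == "__main__":
    for label, site in (("module only", make_site(False)),
                        ("module + egg-info", make_site(True))):
        print(label,
              "importable:", is_importable("demo_ordereddict", site),
              "metadata:", has_distribution("demo_ordereddict", site))
```

In the "module only" directory the import lookup succeeds while the metadata lookup finds nothing, which is the same shape of failure as the DistributionNotFound error above: the requirement is checked against metadata, not against whether the code can be imported.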

The work-around is for us to remove the dependency on "ordereddict" from kombu's setup.py on el6. I am also going to file a bug against RHEL 6.7 so they know about the issue, although they are unlikely to make any changes in response. This conflict between rpm packaging and python packaging was caused by backporting stuff into the standard library, and the only real fix I can think of would be to undo that.
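
As an illustration of the kind of change involved, here is a hedged sketch of building a requirements list that only asks for the "ordereddict" backport when the standard library cannot supply OrderedDict. This is not the actual patch applied to python-kombu (the comment above describes simply removing the dependency on el6); the function names and the "example-dep" placeholder are assumptions made for the example.

```python
def stdlib_has_ordereddict():
    """Return True if collections.OrderedDict is importable."""
    try:
        from collections import OrderedDict  # noqa: F401
    except ImportError:
        return False
    return True


def build_install_requires(base_requires, stdlib_ok=None):
    """Return the requirement list, adding the backport only when needed.

    `stdlib_ok` can be passed explicitly to simulate either situation;
    when left as None it is detected from the running interpreter.
    """
    if stdlib_ok is None:
        stdlib_ok = stdlib_has_ordereddict()
    requires = list(base_requires)
    if not stdlib_ok:
        # Only a python without OrderedDict needs the separate package.
        requires.append("ordereddict")
    return requires


if __name__ == "__main__":
    # "example-dep" is a placeholder, not a real kombu requirement list.
    print(build_install_requires(["example-dep"], stdlib_ok=True))
    print(build_install_requires(["example-dep"], stdlib_ok=False))
```

A list built this way would go into install_requires in setup.py. On a python where OrderedDict is already in the standard library, the generated metadata would then carry no "ordereddict" requirement for the runtime check to fail on.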


[0] https://bugzilla.redhat.com/show_bug.cgi?id=929258
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1199997

Comment 8 pulp-infra@redhat.com 2015-04-20 19:29:54 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2015-04-20 19:29:56 UTC
The Pulp upstream bug priority is at Urgent. Updating the external tracker on this bug.

Comment 12 pulp-infra@redhat.com 2015-04-21 12:00:23 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 13 pulp-infra@redhat.com 2015-04-21 18:30:21 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 26 pulp-infra@redhat.com 2015-05-19 00:30:29 UTC
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

Comment 27 Tazim Kolhar 2015-05-20 10:25:33 UTC
VERIFIED:
# rpm -qa | grep foreman
foreman-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_discovery-2.0.0.13-1.el7sat.noarch
foreman-libvirt-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
foreman-postgresql-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el7sat.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-ovirt-1.7.2.21-1.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.11-1.el7sat.noarch
foreman-selinux-1.7.2.13-1.el7sat.noarch
foreman-gce-1.7.2.21-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.1.0-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.12.5-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.6-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.12-1.el7sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
foreman-proxy-1.7.2.4-1.el7sat.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
dell-pem710-01.rhts.eng.bos.redhat.com-foreman-proxy-1.0-2.noarch
foreman-vmware-1.7.2.21-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
foreman-compute-1.7.2.21-1.el7sat.noarch
foreman-debug-1.7.2.21-1.el7sat.noarch

steps:
1.  Install the product; this should be done on a RHEL 6.7 beta compose.
2.  Upload manifest that grants appropriate subscriptions for SCL and Sat6 beta.
3.  Attempt to create and sync three repos:
  * RH SCL repo for 6Server [red hat content]
  * RH Capsule 6.1 beta  [red hat content]
  * Internal mirror of RHEL beta [custom repo]
4. Sync completes.

Comment 28 pulp-infra@redhat.com 2015-05-21 15:00:21 UTC
The Pulp upstream bug status is at VERIFIED. Updating the external tracker on this bug.

Comment 29 Bryan Kearney 2015-08-11 13:18:16 UTC
This bug is slated to be released with Satellite 6.1.

Comment 30 Bryan Kearney 2015-08-12 13:55:17 UTC
This bug was fixed in version 6.1.1 of Satellite which was released on 12 August, 2015.

Comment 31 pulp-infra@redhat.com 2015-09-14 13:00:40 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.