996302 – Pulp is segfaulting, seems to be a threading issue

Summary: Pulp is segfaulting, seems to be a threading issue

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Pulp
Classification:	Retired
Component:	user-experience
Sub Component:
Version:	2.1.1
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	pulp-bugs
QA Contact:	Preethi Thomas
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-08-12 21:30 UTC by Jim
Modified:	2014-09-15 16:09 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-09-15 16:09:45 UTC
Embargoed:

Description Jim 2013-08-12 21:30:10 UTC

Description of problem:

We run pulp-admin from a script to upload RPMs as our CI system generates them. However, every couple of days the pulp-admin processes become zombies and Pulp's Apache vhost hangs. It also looks as though Pulp itself segfaults.

The admin client runs on the same host where the repositories live.

Here are the relevant bits of .pulp/admin.log: http://pastebin.com/20Y5fD9n and here are my Apache logs, which indicate that Pulp segfaults: http://pastebin.com/V80Fq7VL

This happens on multiple servers, including ones that do not receive RPM uploads directly but sync from the initial server as a feed.

Here is an excerpt of the Apache error log from one of the client-servers: http://pastebin.com/D1dSsTwn

Version-Release number of selected component (if applicable):


# rpm -qa|grep pulp
pulp-builtins-admin-extensions-2.1.3-1.el6.noarch
python-pulp-rpm-extension-2.1.3-1.el6.noarch
python-oauth2-1.5.170-3.pulp.el6.noarch
python-pulp-common-2.1.3-1.el6.noarch
python-pulp-puppet-common-2.1.3-1.el6.noarch
m2crypto-0.21.1.pulp-8.el6.x86_64
python-pulp-client-lib-2.1.3-1.el6.noarch
mod_wsgi-3.4-1.pulp.el6.x86_64
pulp-rpm-admin-extensions-2.1.3-1.el6.noarch
python-isodate-0.5.0-1.pulp.el6.noarch
python-rhsm-1.8.0-1.pulp.el6.x86_64
python-pulp-rpm-common-2.1.3-1.el6.noarch
pulp-nodes-common-2.1.3-1.el6.noarch
pulp-admin-client-2.1.3-1.el6.noarch
pulp-nodes-admin-extensions-2.1.3-1.el6.noarch
python-pulp-bindings-2.1.3-1.el6.noarch
pulp-puppet-admin-extensions-2.1.3-1.el6.noarch
pulp-server-2.1.3-1.el6.noarch
pulp-rpm-plugins-2.1.3-1.el6.noarch
pulp-nodes-parent-2.1.3-1.el6.noarch


How reproducible:

It happens regularly, as we continuously upload and sync down rpms.

I don't yet know a way to deterministically cause it, and as it looks like a threading issue, this might be difficult.


Steps to Reproduce:
1.
2.
3.

Actual results:

Pulp segfaults and stops working, and an admin needs to restart httpd.

Expected results:

No segfault, pulp keeps working, no admin intervention.


Additional info:

Comment 1 Jim 2013-08-12 21:50:16 UTC

This box is running:
# cat /etc/redhat-release 
CentOS release 6.2 (Final)

This is the upload script:
#!/bin/bash
# Login to pulp
/usr/bin/sudo pulp-admin login -u admin -p admin

# Remove any existing copies of the uploaded RPMs
if [ -e ./magic-$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.x86_64.rpm ]; then
  /usr/bin/sudo pulp-admin rpm repo remove rpm --repo-id=magic --match 'name=magic' --match 'version=$CI_GERRIT_CHANGE_NUMBER' --match 'release=$CI_GERRIT_PATCHSET_NUMBER'
  /usr/bin/sudo pulp-admin rpm repo publish run --repo-id magic
fi
if [ -e ./other_magic$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.noarch.rpm ]; then
  /usr/bin/sudo pulp-admin rpm repo remove rpm --repo-id=magic --match 'name=other_magic' --match 'version=$CI_GERRIT_CHANGE_NUMBER' --match 'release=$CI_GERRIT_PATCHSET_NUMBER'
  /usr/bin/sudo pulp-admin rpm repo publish run --repo-id magic
fi
/usr/bin/sudo pulp-admin orphan remove --all

# Upload packages
if [ -e ./magic-$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.x86_64.rpm ]; then
  /usr/bin/sudo /usr/bin/pulp-admin rpm repo uploads rpm  --repo-id magic --file=./magic-$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.x86_64.rpm 
fi
if [ -e ./other_magic$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.noarch.rpm ]; then
  /usr/bin/sudo /usr/bin/pulp-admin rpm repo uploads rpm  --repo-id magic --file=./other_magic$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.noarch.rpm
fi
/usr/bin/sudo pulp-admin rpm repo publish run --repo-id magic

# Remove extra files
if [ -e ./magic-$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.x86_64.rpm ]; then
  /usr/bin/sudo /bin/rm  ./magic-$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.x86_64.rpm 
fi
if [ -e ./other_magic$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.noarch.rpm ]; then
  /usr/bin/sudo /bin/rm  ./other_magic$CI_GERRIT_CHANGE_NUMBER-$CI_GERRIT_PATCHSET_NUMBER.noarch.rpm
fi
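
A note on the removal commands in the script above: the `--match` values are wrapped in single quotes, so bash passes the literal text `$CI_GERRIT_CHANGE_NUMBER` to pulp-admin instead of the variable's value, and the filters would likely match nothing. Whether this has any bearing on the segfault is not established in this report. A minimal sketch of the quoting difference, using made-up values for the two CI variables:

```shell
#!/bin/bash
# Hypothetical values for illustration only; the real ones come from the CI system.
CI_GERRIT_CHANGE_NUMBER=1234
CI_GERRIT_PATCHSET_NUMBER=5

# Single quotes: no expansion, the literal text "$CI_GERRIT_CHANGE_NUMBER" is kept.
single='version=$CI_GERRIT_CHANGE_NUMBER'

# Double quotes: the variable is expanded before the argument is passed on.
double="version=$CI_GERRIT_CHANGE_NUMBER"

echo "single-quoted: $single"
echo "double-quoted: $double"
```

Under that assumption, writing the filters as `--match "version=$CI_GERRIT_CHANGE_NUMBER"` and `--match "release=$CI_GERRIT_PATCHSET_NUMBER"` would hand pulp-admin the actual version and release.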


---------------- Here is the cron script on my client-servers (WTB rate-limited nodes...):

# cat /etc/cron.d/pulp_5min 
*/5 * * * * root /opt/share/synch_pulp_5min.sh

# cat /opt/share/synch_pulp_5min.sh 
#!/bin/bash
/usr/bin/pulp-admin login -u admin -p admin
/usr/bin/pulp-admin rpm repo sync run --repo-id=magic
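
Because the cron entry fires every five minutes, a sync that takes longer than that could overlap with the next one. Nothing in this report shows that overlapping syncs are involved in the segfault, but ruling them out is cheap. A minimal sketch, assuming `flock` from util-linux is installed; the lock file path and the `run_exclusive` helper are hypothetical names, not part of Pulp:

```shell
#!/bin/bash
# Hypothetical lock file location; any path writable by the cron user works.
LOCKFILE="${TMPDIR:-/tmp}/synch_pulp_5min.lock"

# Run a command only if no other instance currently holds the lock.
# Returns 1 without running the command when the lock is already taken.
run_exclusive() {
  (
    flock -n 9 || exit 1   # non-blocking: give up at once if locked
    "$@"
  ) 9>"$LOCKFILE"
}

# In the cron script this would wrap the sync, for example:
#   run_exclusive /usr/bin/pulp-admin rpm repo sync run --repo-id=magic
run_exclusive echo "sync would run here"
```

With this wrapper, a cron run that starts while the previous sync is still in progress exits immediately instead of starting a second sync against the same repository.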

Comment 2 Michael Hrivnak 2013-09-27 22:29:51 UTC

Please let us know if this is still an issue with pulp 2.2.

Comment 3 Randy Barlow 2014-01-03 21:22:32 UTC

Hi Jim! Can you comment on this bug about whether it is still an issue? We have also released 2.3.1 now.

Comment 4 Jim 2014-09-15 16:09:45 UTC

I have not seen this again since updating.