Bug 754787 - Disruption in Internet Connectivity leaves a large number of sleeping grinder processes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Pulp
Classification: Retired
Component: user-experience
Version: 1.0.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: John Matthews
QA Contact: Preethi Thomas
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-11-17 18:06 UTC by Ernest W. Durbin III
Modified: 2013-09-09 16:36 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-24 20:15:18 UTC


Attachments
graph of processes on pulp server at time of IP connection break ~2300 (27.72 KB, image/png)
2011-11-17 18:08 UTC, Ernest W. Durbin III
tarball of logfiles snipped around connection disruption ~2300 (8.94 KB, application/x-gzip)
2011-11-17 18:10 UTC, Ernest W. Durbin III

Description Ernest W. Durbin III 2011-11-17 18:06:46 UTC
Description of problem:

When a scheduled sync begins while the machine's internet connection is down, the grinder processes started by the sync appear to sleep. Even after the connection has been restored, these processes neither recover nor exit.

Version-Release number of selected component (if applicable):
Pulp community release 18
    pulp 0.0.244-5.fc15
    grinder 0.0.127-1.fc15
Fedora 15
    2.6.40.6-0.fc15.x86_64

How reproducible:
I'm uncertain about this; I only have one Pulp box running.

Steps to Reproduce:
1. Configure a repository for automated syncs. Schedule one in the future.
2. Before, or possibly during, a scheduled sync, tear down the internet connection.
3. Verify that a number of grinder processes are sleeping: `ps aux | grep grinder\/activeobject\.pyc`
  
Actual results:
The repository does not sync, and a number of grinder activeobject processes are left sleeping. Running `service pulp-server restart` removes them.

Expected results:
Graceful failure... wait for next sync?

Additional info:

other things to follow.

Comment 1 Ernest W. Durbin III 2011-11-17 18:08:55 UTC
Created attachment 534281
graph of processes on pulp server at time of IP connection break ~2300

Comment 2 Ernest W. Durbin III 2011-11-17 18:10:56 UTC
Created attachment 534282
tarball of logfiles snipped around connection disruption ~2300

Comment 3 John Matthews 2011-12-15 16:04:21 UTC
I was able to replicate this issue without using scheduled syncs

1) pulp-admin repo create --id bad_url --feed http://bad_url_should_fail_no_data.com

2) pulp-admin repo sync --id bad_url

Repeat the sync several times; after each attempt, more grinder activeobject processes are left running.

Example below:

$ ps auxf | grep grinder | wc -l
62
[jmatthews@jwm-devel pulp{master}$ sudo pulp-admin repo sync --id bad_url -F
Sync for repository bad_url started
Sync: Error

Item Details: 
error:  Exception: Traceback (most recent call last):

  File "/shared/repo/grinder/src/grinder/activeobject.py", line 429, in process
    retval = method(*args, **kwargs)

  File "/shared/repo/grinder/src/grinder/RepoFetch.py", line 51, in fetchItem
    verify_options=self.verify_options)

  File "/shared/repo/grinder/src/grinder/BaseFetch.py", line 328, in fetch
    checksum, headers, retryTimes, packages_location)

  File "/shared/repo/grinder/src/grinder/BaseFetch.py", line 328, in fetch
    checksum, headers, retryTimes, packages_location)

  File "/shared/repo/grinder/src/grinder/BaseFetch.py", line 270, in fetch
    curl.perform()

error: (6, 'Could not resolve host: bad_url_should_fail_no_data.com; Cannot allocate memory')


[jmatthews@jwm-devel pulp{master}$ ps auxf | grep grinder | wc -l
77
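
The `ps auxf | grep grinder | wc -l` pipeline counts every line that mentions "grinder", which can include the `grep` process itself. A minimal Python sketch of a stricter count follows. The function name `count_processes` and the `SAMPLE` text are hypothetical and made up for illustration; they are not captured from a real Pulp server.

```python
# Illustrative sample of `ps aux` output (made up, not from a real server).
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache    2001  0.0  0.1 100000  5000 ?        S    23:00   0:00 /usr/bin/python /usr/lib/python2.6/site-packages/grinder/activeobject.pyc
apache    2002  0.0  0.1 100000  5000 ?        S    23:00   0:00 /usr/bin/python /usr/lib/python2.6/site-packages/grinder/activeobject.pyc
root      2100  0.0  0.0   4000   800 pts/0    S+   23:05   0:00 grep grinder
"""


def count_processes(ps_output, needle):
    """Count `ps aux` lines whose command contains `needle`, skipping grep."""
    count = 0
    for line in ps_output.splitlines():
        # `ps aux` prints 10 fixed columns followed by the command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        # Drop the tree-drawing prefix that `ps auxf` puts before the command.
        command = fields[10].lstrip(" |\\_")
        if needle not in command:
            continue
        parts = command.split()
        if not parts:
            continue
        # Ignore the grep process that is itself searching for the needle.
        if parts[0].rsplit("/", 1)[-1] == "grep":
            continue
        count += 1
    return count
```

On a live system the input could come from `subprocess.check_output(["ps", "aux"])`, decoded to text before being passed in.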

Comment 4 John Matthews 2011-12-15 18:15:41 UTC
The issue was that we were not explicitly killing the activeobject processes when an exception occurred while fetching metadata.

Commit is here:
http://git.fedorahosted.org/git/?p=grinder.git;a=commitdiff;h=e9758fc8f07e7fda58da8dd021d7bb845b74c993
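
The general pattern behind this kind of fix can be sketched as below. This is not the code from the commit: `Worker`, `fetch_metadata`, and `sync` are hypothetical stand-ins, and the sketch only assumes that worker child processes must be killed whether or not the fetch raises.

```python
import subprocess
import sys


class Worker:
    """Hypothetical stand-in for a grinder activeobject child process."""

    def __init__(self):
        # The child only sleeps, mimicking an idle worker waiting for work.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    def alive(self):
        return self.proc.poll() is None

    def kill(self):
        if self.alive():
            self.proc.kill()
        # Reap the child so it does not linger as a zombie.
        self.proc.wait()


def fetch_metadata(url):
    # Stand-in for the real fetch; always fails, like an unresolvable host.
    raise IOError("Could not resolve host: %s" % url)


def sync(url, workers):
    try:
        fetch_metadata(url)
    finally:
        # Runs on success and on failure, so no worker outlives the sync.
        for worker in workers:
            worker.kill()
```

Putting the cleanup in `finally` rather than only on the success path is the design choice that matters here: the exception still propagates to the caller, but the child processes are gone by the time it does.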

Comment 5 Jeff Ortel 2011-12-15 20:18:09 UTC
build: 0.255

Comment 6 Preethi Thomas 2012-01-03 21:59:24 UTC
[root@katello-test ~]# rpm -q pulp
pulp-0.0.255-1.el6.noarch
[root@katello-test ~]# 
[root@katello-test ~]# pulp-admin -u admin -p admin repo create --id bad_url --feed http://bad_url_should_fail_no_data.com
Successfully created repository [ bad_url ]

[root@katello-test ~]# pulp-admin repo sync --id bad_url -F
error:  error: operation failed: sslv3 alert certificate expired
[root@katello-test ~]# 
[root@katello-test ~]# 
[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url -F
Sync for repository bad_url started
Sync: Error

Item Details: 
error:  Exception: Traceback (most recent call last):

  File "/usr/lib/python2.6/site-packages/grinder/activeobject.py", line 429, in process
    retval = method(*args, **kwargs)

  File "/usr/lib/python2.6/site-packages/grinder/RepoFetch.py", line 51, in fetchItem
    verify_options=self.verify_options)

  File "/usr/lib/python2.6/site-packages/grinder/BaseFetch.py", line 328, in fetch
    checksum, headers, retryTimes, packages_location)

  File "/usr/lib/python2.6/site-packages/grinder/BaseFetch.py", line 328, in fetch
    checksum, headers, retryTimes, packages_location)

  File "/usr/lib/python2.6/site-packages/grinder/BaseFetch.py", line 270, in fetch
    curl.perform()

error: (6, "Couldn't resolve host 'bad_url_should_fail_no_data.com'")


[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url
Sync for repository bad_url started
Use "repo status" to check on the progress
[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url
Sync for repository bad_url started
Use "repo status" to check on the progress
[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url
Sync for repository bad_url started
Use "repo status" to check on the progress
[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url
Sync for repository bad_url started
Use "repo status" to check on the progress
[root@katello-test ~]# pulp-admin -u admin -p admin repo sync --id bad_url
Sync for repository bad_url started
Use "repo status" to check on the progress
[root@katello-test ~]# ps auxf | grep grinder | wc -l
1
[root@katello-test ~]# rpm -q pulp
pulp-0.0.255-1.el6.noarch
[root@katello-test ~]#

Comment 7 Preethi Thomas 2012-02-24 20:15:18 UTC
Pulp v1.0 is released
Closed Current Release.

Comment 8 Preethi Thomas 2012-02-24 20:16:55 UTC
Pulp v1.0 is released.

