Bug 624580 - reposync core dumps on large downloads (likely: pycurl, Py_None DECREF issue)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: python-pycurl
Version: 13
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jeffrey C. Ollie
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-08-17 05:02 UTC by James Antill
Modified: 2014-01-21 06:19 UTC (History)
Fixed In Version: python-pycurl-7.19.0-7.fc13
Doc Type: Bug Fix
Doc Text:
Clone Of: 624559
Environment:
Last Closed: 2010-09-11 03:38:51 UTC
Type: ---
Embargoed:


Description James Antill 2010-08-17 05:02:43 UTC
+++ This bug was initially created as a clone of Bug #624559 +++

Description of problem:

When doing something like 

reposync -c fedora.repo --repoid=fedora-source-13 --source

After the last package downloads I get a supposed 'core dump'

[fedora-source-13: 9152  of 9152  ] Downloading zzuf-0.13-1.fc13.src.rpm
zzuf-0.13-1.fc13.src.rpm           | 456 kB     00:00     
Fatal Python error: deallocating None
Aborted (core dumped)

No core file seems to be dumped, though; not sure why.

Version-Release number of selected component (if applicable):
# rpm -qf /usr/bin/reposync /usr/bin/python
yum-utils-1.1.26-11.el6.noarch
python-2.6.5-3.el6.x86_64


How reproducible:
80% with EL6 beta2. Large downloads trigger it often. Short runs of 10-30 packages do not trigger it.

--- Additional comment from pm-rhel on 2010-08-16 19:03:12 EDT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from dmalcolm on 2010-08-16 19:10:20 EDT ---

Something is messing up the reference counts for the "None" singleton: either libpython, or an extension module; my guess is an extension module.

This is the tp_dealloc for PyNone_Type:

static void
none_dealloc(PyObject* ignore)
{
    /* This should never get called, but we also don't want to SEGV if
     * we accidentally decref None out of existence.
     */
    Py_FatalError("deallocating None");
}

--- Additional comment from pm-rhel on 2010-08-16 19:18:33 EDT ---

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

--- Additional comment from dmalcolm on 2010-08-16 19:19:32 EDT ---

The None singleton normally has a fairly large reference count.

For example, starting a python process under gdb shows this refcount at the first interactive prompt:
(gdb) p _Py_NoneStruct 
$2 = {ob_refcnt = 877, ob_type = 0x77f2a0}

What I suspect is happening is that something isn't doing a Py_INCREF when it should on None, and so when you have enough of these in a loop, None ends up with a lower refcount than it should.

At some point you're presumably issuing lots of Py_DECREF calls on the singleton, and this drives its refcount below one, hitting the Py_FatalError.

My hope is that a backtrace might suggest the rough location of the bug (by analogy between the cleanup vs the setup)
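
To make the arithmetic above concrete, here is a minimal sketch of the drain. It is a toy model, not CPython: ToyObject, toy_decref and drops_until_dealloc are hypothetical names invented for illustration, and the only figure taken from this report is the starting refcount of 877 from the gdb output.

```c
/* Toy stand-in for a reference-counted singleton; NOT CPython's PyObject. */
typedef struct {
    long refcnt;
} ToyObject;

/* Drop one reference. Returns 1 when the count reaches zero, which is the
 * point where a real interpreter would call the type's dealloc hook. */
int toy_decref(ToyObject *o)
{
    o->refcnt--;
    return o->refcnt == 0;
}

/* Count how many unbalanced drops (a decref with no matching incref)
 * it takes for a singleton starting at start_refcnt to reach zero. */
long drops_until_dealloc(long start_refcnt)
{
    ToyObject none = { start_refcnt };
    long drops = 0;

    while (none.refcnt > 0) {
        drops++;
        if (toy_decref(&none))
            break;  /* dealloc would fire here */
    }
    return drops;
}
```

In this model drops_until_dealloc(877) comes out to 877, so one unbalanced drop per loop iteration is enough to exhaust the count after a few hundred iterations. In a real process other code also takes and releases references to None, so the actual number of iterations will differ.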

--- Additional comment from james.antill on 2010-08-17 01:01:24 EDT ---

Having a look at the obvious candidate, pycurl, I see that the .reset() function ends with:


    /* Last, free the options */
    for (i = 0; i < OPTIONS_SIZE; i++) {
        if (self->options[i] != NULL) {
            free(self->options[i]);
            self->options[i] = NULL;
        }
    }

    return Py_None;
}

...unlike every other usage, which is the (I assume) correct:


    Py_INCREF(Py_None);
    return Py_None;

...urlgrabber calls .reset() for every download.
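
The difference between the two return patterns can be sketched with a toy model. This is not pycurl or CPython code: ToyObject, reset_missing_incref, reset_with_incref and refcnt_after_downloads are hypothetical names, and the caller is assumed to release the returned value after every call, as an interpreter does with a result it discards. In real C extension code the second pattern is the Py_INCREF(Py_None); return Py_None; pair quoted above, which CPython also provides as the Py_RETURN_NONE macro.

```c
/* Toy stand-in for a reference-counted singleton; NOT CPython's PyObject. */
typedef struct {
    long refcnt;
} ToyObject;

typedef ToyObject *(*reset_fn)(ToyObject *none);

/* Mirrors the reset() ending quoted above: hands out the singleton
 * without taking a reference on behalf of the caller. */
ToyObject *reset_missing_incref(ToyObject *none)
{
    return none;
}

/* Mirrors the pattern used everywhere else: take a reference, then return. */
ToyObject *reset_with_incref(ToyObject *none)
{
    none->refcnt++;
    return none;
}

/* Simulate `downloads` calls to reset(), with the caller dropping the
 * returned reference each time. Returns the singleton's final refcount. */
long refcnt_after_downloads(reset_fn reset, long start_refcnt, long downloads)
{
    ToyObject none = { start_refcnt };
    long i;

    for (i = 0; i < downloads; i++) {
        ToyObject *result = reset(&none);
        result->refcnt--;  /* caller releases what it believes it owns */
    }
    return none.refcnt;
}
```

Starting from 877, one hundred simulated downloads leave the count at 777 with reset_missing_incref and at 877 with reset_with_incref. That matches the symptom: short runs survive, long runs eventually drain the count.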

Comment 1 James Antill 2010-08-17 05:05:55 UTC
This really should be fixed before people start doing F-13 => F-14 yum updates. I assume ditto for the anaconda install (I'm shocked people haven't hit it there already, testing F-14).

Comment 2 Chris Lumens 2010-08-17 13:17:34 UTC
We're getting some weird FTP-related behavior with unexplained socket timeouts but I haven't heard of anything else going on that could be attributed to pycurl.

Comment 3 James Antill 2010-08-17 13:59:36 UTC
I thought about it more overnight, and my guess is that when running in anaconda there are enough more Py_None references in use that the reference count stays high enough to make this hard to trigger.

Comment 4 Jeffrey C. Ollie 2010-08-17 16:28:01 UTC
I checked upstream CVS and found that this patch was added a few months ago:

http://pycurl.cvs.sourceforge.net/viewvc/pycurl/pycurl/src/pycurl.c?r1=1.148&r2=1.149

Does this look like a reasonable fix?  If so I'll get new packages built ASAP.

Of course PyCURL looks like an almost dead project upstream...

Comment 5 Dave Malcolm 2010-08-17 16:35:05 UTC
(In reply to comment #4)
> I checked upstream CVS and found that this patch was added a few months ago:
> 
> http://pycurl.cvs.sourceforge.net/viewvc/pycurl/pycurl/src/pycurl.c?r1=1.148&r2=1.149
Sorry for the duplicated work; I've been looking at this from the RHEL side; see:
https://bugzilla.redhat.com/show_bug.cgi?id=624559#c8

You may want to add yourself to the CC on that bug.

> Does this look like a reasonable fix?  If so I'll get new packages built ASAP.
Yes.

(In the RHEL bug I'm advocating an absolutely minimal one-liner fix out of paranoia; Fedora probably should simply use the upstream fix)

> Of course PyCURL looks like an almost dead project upstream...

Comment 6 James Antill 2010-08-17 17:05:59 UTC
Yeh, the part inside the reset function looks safe ... but the minimal one liner is probably best for RHEL-6.0 :)

The very last change in the upstream patch looks weird though (adding a res global), I'm wondering if that's a copy and paste error.

Comment 7 seth vidal 2010-08-17 17:40:57 UTC
Jeff,
 You're wrong about pycurl being ALMOST dead. I'm reasonably certain it is actually dead.

The last guy who was doing any kind of maintenance work on it works on opensolaris pkg manager and I suspect he's going to be busy for a while.

Comment 8 Jeffrey C. Ollie 2010-08-17 18:32:12 UTC
(In reply to comment #6)
> Yeh, the part inside the reset function looks safe ... but the minimal one
> liner is probably best for RHEL-6.0 :)
>
> The very last change in the upstream patch looks weird though (adding a res
> global), I'm wondering if that's a copy and paste error.

Well I'll stick with the minimal patch that RHEL uses.  I'm a believer in KISS and I don't have a lot of time to investigate any problems that might be added by the patch.

I'll have new packages building soon.

Comment 9 Fedora Update System 2010-08-17 18:59:42 UTC
python-pycurl-7.19.0-7.fc14 has been submitted as an update for Fedora 14.
http://admin.fedoraproject.org/updates/python-pycurl-7.19.0-7.fc14

Comment 10 Fedora Update System 2010-08-17 19:00:23 UTC
python-pycurl-7.19.0-7.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/python-pycurl-7.19.0-7.fc13

Comment 11 Fedora Update System 2010-08-17 19:01:56 UTC
python-pycurl-7.19.0-7.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/python-pycurl-7.19.0-7.fc12

Comment 12 Daniel Stenberg 2010-08-17 21:37:23 UTC
I'm convinced the pycurl guys will all appreciate your help. It is still widely used, only undermanned.

Comment 13 Fedora Update System 2010-08-18 19:51:37 UTC
python-pycurl-7.19.0-7.fc14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update python-pycurl'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/python-pycurl-7.19.0-7.fc14

Comment 14 Fedora Update System 2010-09-11 03:38:45 UTC
python-pycurl-7.19.0-7.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2010-09-11 08:56:39 UTC
python-pycurl-7.19.0-7.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Fedora Update System 2010-09-11 09:06:24 UTC
python-pycurl-7.19.0-7.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

