Created attachment 373806
strace canto &>canto-strace

Description of problem:
Canto crashes with segmentation fault on startup.

Version-Release number of selected component (if applicable):
canto-0.7.4-1.fc12.x86_64

How reproducible:
100%

Steps to Reproduce:
1. canto
2. Application starts, loads feeds
3. Crashes at beginning main loop

Additional info:
- No change when run without existing ~/.canto
- Also crashes with self-built canto-0.7.5
- see attached strace

~/.canto/log contains:
    Canto v 0.7.4 (sh:)
    Time: Wed Nov 25 18:15:37 2009
    Config parsed successfully.
    Populating feeds...
    Precaching: []
    Curses initialized.
    GUI initialized.
    Signals set.
    Beginning main loop.
Created attachment 373823
abrt backtrace

Hello, thanks for reporting the bug. I forgot to push the canto 0.7.5 update this past weekend; still, as you already stated, the crash occurs here (i686) too. However, based on the backtraces, the crash appears to be caused by Python internals.

@dmalcolm: This is the backtrace on my system (i686, fc12) for canto-0.7.4. Unfortunately, abadger1999 was unable to help me.
Ref canto issue tracker: http://github.com/themoken/Canto/issues#issue/5
Sorry for the belated response.

This is possibly a symptom of bug 539917.

FWIW I'm seeing various "random" crashes inside Python running canto on F11 (canto-0.7.4-1.fc11.i586).

I reproduced the backtrace from comment #1; frame #2 (PyEval_EvalFrameEx) is at:
    /usr/lib/python2.6/site-packages/canto/interface_draw.py (436): status
which is here:
    def status(self, bar, height, width, str):
        self.simple_out([(str, u" ", u"")], 0, height, width, [bar])

frame #1 (call_function) is the call to "simple_out"

frame #0 (list_dealloc) is crashing, decreffing a PyListObject, "w" below:
    3738        while ((*pp_stack) > pfunc) {
    3739            w = EXT_POP(*pp_stack);
    3740            Py_DECREF(w);
    3741            PCALL(PCALL_POP);
    3742        }
    3743        return x;

    (gdb) pyo op
    object  : <refcnt 0 at 0xb704f12c>
    type    : list
    refcount: 0
    address : 0xb704f12c

and the list's ob_item seems to be corrupt.
I spent a little time trying to track this down.

It looks like the heap is getting corrupted either during the call to canto's canto/widecurse.c:mvw (implementation of widecurse.core), or shortly afterwards. I'm not sure where the specific problem is.

However, I did notice that in widecurse.c, various functions ("disable_color" etc) return Py_None without doing an INCREF on that object; this _is_ a bug, though I set breakpoints on these functions and they didn't seem to be called.

They need to have a:
    Py_INCREF(Py_None);
before the
    return Py_None;
or to replace it with this macro:
    return Py_RETURN_NONE;
which does the INCREF

(FWIW, at the time of the crash,
    (gdb) p _Py_NoneStruct
    $70 = {ob_refcnt = 5849, ob_type = 0x672c820}
which is much greater than 0, so it looks like None isn't getting freed, so this is a different bug)

Hope this is helpful.
> or to replace it with this macro:
>     return Py_RETURN_NONE;
> which does the INCREF

Sorry, this should simply read:
    Py_RETURN_NONE;
as the macro contains the "return" statement (it's in python's object.h)
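The reference-counting argument in the comments above can be illustrated with a toy model. This is only a minimal Python sketch, not CPython's actual machinery: ToyObject, incref, decref, buggy_return_none, fixed_return_none and call_and_discard are hypothetical names made up for illustration. The only thing it assumes is the convention described above, namely that the caller drops one reference to whatever a C function returns.

```python
"""Toy model of the missing-INCREF bug; not CPython's real machinery."""


class ToyObject:
    """Stand-in for a C-level object carrying a reference count."""

    def __init__(self, name, refcnt):
        self.name = name
        self.refcnt = refcnt
        self.freed = False


def incref(obj):
    obj.refcnt += 1


def decref(obj):
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.freed = True  # real C code would deallocate the object here


def buggy_return_none(none):
    # Mirrors "return Py_None;" with no INCREF: no new reference is created.
    return none


def fixed_return_none(none):
    # Mirrors "Py_INCREF(Py_None); return Py_None;", i.e. Py_RETURN_NONE.
    incref(none)
    return none


def call_and_discard(func, none, times):
    """Model a caller that drops the returned reference after every call."""
    for _ in range(times):
        result = func(none)
        decref(result)
    return none.refcnt


if __name__ == "__main__":
    leaky = ToyObject("None", refcnt=5849)  # count quoted from the gdb output
    print(call_and_discard(buggy_return_none, leaky, 100))  # count shrinks
    sound = ToyObject("None", refcnt=5849)
    print(call_and_discard(fixed_return_none, sound, 100))  # count unchanged
```

In this model each buggy call costs None one reference, so the count only reaches zero after many calls. That is consistent with the observation above that a count of 5849 at crash time points to a different bug.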
(In reply to comment #5)
> > or to replace it with this macro:
> >     return Py_RETURN_NONE;
> > which does the INCREF
> Sorry, this should simply read:
>     Py_RETURN_NONE;
> as the macro contains the "return" statement (it's in python's object.h)

Thanks a lot, Dave. I've fixed this in git. As you said, different bug though.

If there's any help I can offer wrt this bug, let me know (I'm the author of canto).
I'm reinstalling my primary machine; I hope to have another look at this when that's done.

This may well be simply a problem with Fedora's python curses module; see bug 539917.

If it isn't that, then I believe that something is corrupting an internal data structure, and the program later crashes when that structure is read (threads? heap corruption?). If that's the case, then I don't think the strace approach described in the upstream report is going to help, and it's going to be hard to track this down.

Some approaches for locating this:

(difficult) use gdb to try to track down the segfault. Invoke it thus:
    [david@brick ~]$ gdb --args python /usr/bin/canto
    (gdb) run
when it segfaults:
    (gdb) bt
though it's somewhat tricky dealing with this due to the way curses has reset the terminal.

(much more involved): rebuild python without memory arenas (i.e. configure --without-pymalloc), and run it under valgrind:
    valgrind python /usr/bin/canto
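Since the interactive gdb session above is awkward once curses has reset the terminal, one possible workaround is to run gdb non-interactively and have it log the backtrace to a file. The following is a minimal sketch under that assumption, not something anyone in this report actually ran: build_gdb_command and the log path are hypothetical, while --batch, -ex, --args and the "set logging" commands are standard gdb features. The sketch only builds and prints the command line; it does not launch gdb.

```python
"""Build a non-interactive gdb command line that logs a backtrace to a file."""
import shlex


def build_gdb_command(script, log_path, interpreter="python"):
    """Return an argv list: run `script` under gdb, then print a backtrace.

    `script` and `log_path` are supplied by the caller; nothing is executed.
    """
    return [
        "gdb",
        "--batch",                              # exit after the commands finish
        "-ex", "set logging file %s" % log_path,
        "-ex", "set logging on",                # copy gdb output into the log
        "-ex", "run",
        "-ex", "bt",                            # backtrace once the run stops
        "--args", interpreter, script,
    ]


if __name__ == "__main__":
    # The log path here is only an example value.
    cmd = build_gdb_command("/usr/bin/canto", "/tmp/canto-gdb.log")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell; the backtrace should then be readable in the log file even if the terminal is left in a bad state.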
Created attachment 375383
gdb backtrace

OK, this is a backtrace of canto from git (commit 0b2b790a9870ac8703eb006c8f43f01c2b574723) run with gdb. Hope it might help.
I should add that gdb requested PyXML-debuginfo, which wasn't in the repos, so I downloaded the fc12 updates-candidates from koji and installed them.
    PyXML-debuginfo-0.8.4-17.fc12.x86_64
    PyXML-0.8.4-17.fc12.x86_64
Hello, could I get an update on this? Apparently, the last activity on the github bug tracker was on 12-01 too. Unfortunately, the problem still persists.
I finally got around to debugging this myself in an FC12 VM. I can confirm that this is most likely a symptom of #539917. I messed around with the window object passed to the extension and its behavior is very strange. Removing the addch calls removes the segfault (unsurprisingly, considering they're the only calls made from the extension into curses), but the coordinates given to all of the calls are valid (inside the window's coordinates), so they shouldn't cause a problem (even though they do). Compiling and installing a fresh copy of Python 2.6.4 and ensuring that the _curses.so is linked against ncursesw solved the problem. As such, I'm willing to bet that closing #539917 will fix this.
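For anyone wanting to check the linkage mentioned above on their own system, here is a minimal sketch. It assumes a Linux machine where ldd is available and a current Python (subprocess.check_output is not in the 2.6 discussed here); ncurses_libs and links_wide_ncurses are hypothetical helper names, and the parsing simply assumes the usual "name => path (address)" layout of ldd output.

```python
"""Report which ncurses library the _curses extension module links against."""
import subprocess


def ncurses_libs(ldd_output):
    """Return the ncurses library names listed in `ldd` output text."""
    libs = []
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field of each ldd line is the library's soname.
        if fields and fields[0].startswith("libncurses"):
            libs.append(fields[0])
    return libs


def links_wide_ncurses(ldd_output):
    """True if any listed ncurses library is the wide-character build."""
    return any(name.startswith("libncursesw")
               for name in ncurses_libs(ldd_output))


if __name__ == "__main__":
    try:
        import _curses
        output = subprocess.check_output(["ldd", _curses.__file__]).decode()
    except (ImportError, AttributeError, OSError,
            subprocess.CalledProcessError) as exc:
        print("could not inspect _curses: %s" % exc)
    else:
        print(ncurses_libs(output), links_wide_ncurses(output))
```

If the list shows only a libncurses entry without the trailing "w", the module is built against the narrow library, which is the situation described in this comment.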
Thanks for checking into this; sorry about the lack of activity here. Marking bug 539917 as blocking this (also to generate a URL linking to that bug).
Thanks for your efforts, Jack. I hope with your information, we'll be able to solve this bug rather soon.
Created attachment 384817
patch to python-2.6.2-config.patch
Just to confirm, I've successfully rebuilt python with the above changes to the config patch (essentially the same as in bug 242583) and canto is once again working.
Thanks Jack and Mark! I'm writing up some more notes on this in bug 539917.
*** Bug 558301 has been marked as a duplicate of this bug. ***
*** Bug 558115 has been marked as a duplicate of this bug. ***
Hello everyone, what is the status of this bug? Given that the patch seems to be working, it'd be great if we could fix it soon, as I've got an increasing number of users reporting the bug :(
Sorry about the delay. I've submitted https://admin.fedoraproject.org/updates/F12/FEDORA-2010-0393 to "testing", which has a fix for this bug. See also bug 539917
python-2.6.2-4.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update python'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-0393
canto works with python-2.6.2-4 update
*** Bug 563028 has been marked as a duplicate of this bug. ***
python-2.6.2-4.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.