Bug 566055 - debug-gdb.py reports runtime error in gdb backtrace.
Summary: debug-gdb.py reports runtime error in gdb backtrace.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: python
Version: 13
Hardware: x86_64
OS: Linux
low
low
Target Milestone: ---
Assignee: Dave Malcolm
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-02-17 00:58 UTC by Steve Tyler
Modified: 2010-05-10 23:50 UTC (History)
4 users (show)

Fixed In Version: python3-3.1.2-4.fc13
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-04-09 04:21:57 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
gdb backtrace for all threads (100.78 KB, text/plain)
2010-02-17 12:08 UTC, Steve Tyler
no flags Details
gdb backtrace with large python data structures (147.35 KB, text/plain)
2010-02-18 14:15 UTC, Steve Tyler
no flags Details
gdb backtrace with version 2 of libpython.py (151.65 KB, text/plain)
2010-02-20 13:20 UTC, Steve Tyler
no flags Details
gdb backtrace with large data structures and version 2 of libpython.py (161.69 KB, text/plain)
2010-02-20 14:02 UTC, Steve Tyler
no flags Details

Description Steve Tyler 2010-02-17 00:58:53 UTC
Not sure if this is a bug or a feature. (I'm sort of a gdb/python noob.) Close if as expected.

(gdb) bt
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007ffff7b2afc8 in PyThread_acquire_lock (lock=0x7fffd40add10, waitflag=1) at Python/thread_pthread.h:349
#2  0x00007ffff7b2f1a4 in lock_PyThread_acquire_lock (self=0x1add7b0, args=Traceback (most recent call last):
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 437, in to_string
    proxyval = PyObjectPtr.from_pyobject_ptr(self.gdbval).proxyval()
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 149, in proxyval
    tp_name = self.type().field('tp_name').string()
RuntimeError: Cannot access memory at address 0x19
) at Modules/threadmodule.c:46
#3  0x00007ffff7b029d3 in call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:3764
...

Captured running rawhide: desktop-x86_64-20100216.03.iso
after pychess aborted with an assertion failure.

Comment 1 Dave Malcolm 2010-02-17 01:33:56 UTC
Thanks for filing this!  It definitely looks like a bug in my libpython-gdb.py code (as well as in the underlying process being debugged).

It looks like the data in the process being debugged has become corrupted somehow; I need to make the gdb.py code more resilient at dealing with arbitrarily-corrupt data.

If you type:
(gdb) frame 2
(gdb) print *args
it should take you to frame 2, and print the internals args without going through my code.  What does it say?

What does it say if you then try:
(gdb) print args->tp_type

Thanks!

Comment 2 Steve Tyler 2010-02-17 11:21:51 UTC
I'm seeing RuntimeError messages in several threads after the abort. Will attach a full backtrace separately.

(gdb) thread
[Current thread is 21 (Thread 0x7fffaedf1710 (LWP 2438))]
(gdb) frame 2
#2  0x00007ffff7b2f1a4 in lock_PyThread_acquire_lock (self=0x1ab18b8, args=Traceback (most recent call last):
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 437, in to_string
    proxyval = PyObjectPtr.from_pyobject_ptr(self.gdbval).proxyval()
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 149, in proxyval
    tp_name = self.type().field('tp_name').string()
RuntimeError: Cannot access memory at address 0x19
) at Modules/threadmodule.c:46
46		i = PyThread_acquire_lock(self->lock_lock, i);
(gdb) print *args
$4 = {ob_refcnt = 549755813888, ob_type = 0x1}
(gdb) print args->tp_type
There is no member named tp_type.
(gdb) thread 5
[Switching to thread 5 (Thread 0x7fffcf2d2710 (LWP 2416))]#0  tupledealloc (op=0x7fffb0038bd8) at Objects/tupleobject.c:185
185		Py_TRASHCAN_SAFE_END(op)
(gdb) frame 31
#31 0x00007fffee5e8a70 in _wrap_gdk_window_process_updates (self=0x7fffb005f640, args=Traceback (most recent call last):
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 437, in to_string
    proxyval = PyObjectPtr.from_pyobject_ptr(self.gdbval).proxyval()
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 149, in proxyval
    tp_name = self.type().field('tp_name').string()
RuntimeError: Error reading string from inferior: Input/output error
, kwargs=<\ at remote 0x7fffcf2cf690>) at gdk.c:10989
10989	    gdk_window_process_updates(GDK_WINDOW(self->obj), update_children);
(gdb) print *args
$5 = {ob_refcnt = 0, ob_type = 0x1c9d780}
(gdb) print args->tp_type
There is no member named tp_type.
(gdb) 

== version info ==
[liveuser@localhost ~]$ uname -a
Linux localhost.localdomain 2.6.33-0.44.rc8.git0.fc13.x86_64 #1 SMP Sat Feb 13 02:29:32 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
[liveuser@localhost ~]$ rpm -q python gdb pychess
python-2.6.4-20.fc13.x86_64
gdb-7.0.50.20100203-15.fc13.x86_64
pychess-0.10-0.3.20100214svn.fc13.noarch
[liveuser@localhost ~]$

Comment 3 Steve Tyler 2010-02-17 12:08:03 UTC
Created attachment 394718 [details]
gdb backtrace for all threads

info thread
t a a bt

[liveuser@localhost ~]$ grep -n RuntimeError gdb1.txt 
77:RuntimeError: Cannot access memory at address 0x19
115:RuntimeError: Cannot access memory at address 0x19
153:RuntimeError: Cannot access memory at address 0x19
217:RuntimeError: Cannot access memory at address 0x98a
263:RuntimeError: Cannot access memory at address 0x19
323:RuntimeError: Cannot access memory at address 0x987
369:RuntimeError: Cannot access memory at address 0x19
429:RuntimeError: Cannot access memory at address 0x984
475:RuntimeError: Cannot access memory at address 0x19
572:RuntimeError: Cannot access memory at address 0x19
636:RuntimeError: Cannot access memory at address 0x19
741:RuntimeError: Error reading string from inferior: Input/output error
810:RuntimeError: Cannot access memory at address 0x19
817:RuntimeError: Cannot access memory at address 0x19
863:RuntimeError: Cannot access memory at address 0x19
905:RuntimeError: Cannot access memory at address 0x19
[liveuser@localhost ~]$

Comment 4 Steve Tyler 2010-02-17 12:21:55 UTC
This seems to be a variant ...

(gdb) thread
[Current thread is 18 (Thread 0x7fffb61fc710 (LWP 2435))]
(gdb) frame
#1  0x00007ffff7b335ea in posix_waitpid (self=<value optimized out>, args=Traceback (most recent call last):
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 437, in to_string
    proxyval = PyObjectPtr.from_pyobject_ptr(self.gdbval).proxyval()
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 149, in proxyval
    tp_name = self.type().field('tp_name').string()
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 101, in field
    if self.is_null():
  File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 122, in is_null
    return 0 == long(self._gdbval)
RuntimeError: Cannot access memory at address 0x98a
) at Modules/posixmodule.c:5899
5899		pid = waitpid(pid, &status, options);
(gdb) print *args
Cannot access memory at address 0x982
(gdb) print args->tp_type
There is no member named tp_type.
(gdb)

Comment 5 Dave Malcolm 2010-02-17 21:15:28 UTC
(In reply to comment #2)
Thanks for the info!

> I'm seeing RuntimeError messages in several threads after the abort. Will
> attach a full backtrace separately.

(snip)

> (gdb) print *args
> $4 = {ob_refcnt = 549755813888, ob_type = 0x1}
OK: the data pointed to by "args" is completely corrupt.  My gdb code clearly needs to do a better job of handling this.

> (gdb) print args->tp_type
> There is no member named tp_type.
Oops, I meant to write "ob_type", but no need: I see the (bogus) value above.

(snip)

> (gdb) thread 5
> [Switching to thread 5 (Thread 0x7fffcf2d2710 (LWP 2416))]#0  tupledealloc
> (op=0x7fffb0038bd8) at Objects/tupleobject.c:185
> 185  Py_TRASHCAN_SAFE_END(op)
> (gdb) frame 31
> #31 0x00007fffee5e8a70 in _wrap_gdk_window_process_updates
> (self=0x7fffb005f640, args=Traceback (most recent call last):
>   File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 437,
> in to_string
>     proxyval = PyObjectPtr.from_pyobject_ptr(self.gdbval).proxyval()
>   File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 149,
> in proxyval
>     tp_name = self.type().field('tp_name').string()
> RuntimeError: Error reading string from inferior: Input/output error
> , kwargs=<\ at remote 0x7fffcf2cf690>) at gdk.c:10989
> 10989     gdk_window_process_updates(GDK_WINDOW(self->obj), update_children);
> (gdb) print *args
> $5 = {ob_refcnt = 0, ob_type = 0x1c9d780}

This is also bogus, in an interestingly different way...

> (gdb) print args->tp_type
> There is no member named tp_type.
Ditto from above; I meant "ob_type"; sorry about that.

Looks like ob_type is pointing to _something_, but not a valid PyTypeObject.

Thanks for reporting this; it's giving me plenty of good ideas on what I need to do to make this code more robust.

> == version info ==
> [liveuser@localhost ~]$ uname -a
> Linux localhost.localdomain 2.6.33-0.44.rc8.git0.fc13.x86_64 #1 SMP Sat Feb 13
> 02:29:32 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
> [liveuser@localhost ~]$ rpm -q python gdb pychess
> python-2.6.4-20.fc13.x86_64
> gdb-7.0.50.20100203-15.fc13.x86_64
> pychess-0.10-0.3.20100214svn.fc13.noarch

I'm afraid I can't help right now with the pychess issue; is that in bugzilla yet?

Comment 6 Dave Malcolm 2010-02-17 21:22:04 UTC
(In reply to comment #4)
(snip)
>     if self.is_null():
>   File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 122,
> in is_null
>     return 0 == long(self._gdbval)
> RuntimeError: Cannot access memory at address 0x98a
> ) at Modules/posixmodule.c:5899
> 5899  pid = waitpid(pid, &status, options);
> (gdb) print *args
> Cannot access memory at address 0x982

Thanks; looks like a slightly different bug in my code.

Looks like I need to be rather more aggressive in my attempts to break my code.

Comment 7 Steve Tyler 2010-02-17 23:48:37 UTC
(In reply to comment #5)
> (In reply to comment #2)
> Thanks for the info!

Glad to help. This is a very nice extension to gdb.

FYI, the reason installing libpython2.6.so.1.0-gdb.py on F12 didn't work earlier (per Bug 565920) is that I have a 64-bit system. Created a link and now have it working on F12 and can reproduce the RuntimeError messages from a core file on F12.

I have lots of core files ... would you like one as a test case? (<=280M)

[stephent@walnut ~]$ ls -l /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py
lrwxrwxrwx 1 root root 49 2010-02-17 14:51 /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py -> /usr/lib/debug/usr/lib/libpython2.6.so.1.0-gdb.py

...

> > (gdb) print *args
> > $4 = {ob_refcnt = 549755813888, ob_type = 0x1}
> OK: the data pointed to by "args" is completely corrupt.  My gdb code clearly
> needs to do a better job of handling this.

Thanks for confirming that there is corruption. No wonder pychess aborts! :-)

> > (gdb) print args->tp_type
> > There is no member named tp_type.
> Oops, I meant to write "ob_type", but no need: I see the (bogus) value above.

OK.

...
> > (gdb) print *args
> > $5 = {ob_refcnt = 0, ob_type = 0x1c9d780}
> 
> This is also bogus, in an interestingly different way...
> 
> > (gdb) print args->tp_type
> > There is no member named tp_type.
> Ditto from above; I meant "ob_type"; sorry about that.
> 
> Looks like ob_type is pointing to _something_, but not a valid PyTypeObject.

I tried dereferencing ob_type and got a lot of "stuff". As I recall, there were further RuntimeError messages. Not sure if can reproduce this case, but would it be of interest?

> Thanks for reporting this; it's giving me plenty of good ideas on what I need
> to do to make this code more robust.

Will you be able to make it so that corruption like this is still clearly identifiable in a backtrace?

> > == version info ==
...
> > pychess-0.10-0.3.20100214svn.fc13.noarch
> 
> I'm afraid I can't help right now with the pychess issue; is that in bugzilla
> yet?    

All the pychess devs. are aware of the problem -- Bug 562471 and upstream.

Comment 8 Steve Tyler 2010-02-17 23:56:29 UTC
(In reply to comment #6)
> (In reply to comment #4)
> (snip)
> >     if self.is_null():
> >   File "/usr/lib/debug/usr/lib64/libpython2.6.so.1.0.debug-gdb.py", line 122,
> > in is_null
> >     return 0 == long(self._gdbval)
> > RuntimeError: Cannot access memory at address 0x98a
> > ) at Modules/posixmodule.c:5899
> > 5899  pid = waitpid(pid, &status, options);
> > (gdb) print *args
> > Cannot access memory at address 0x982
> 
> Thanks; looks like a slightly different bug in my code.
> 
> Looks like I need to be rather more aggressive in my attempts to break my code.    

FYI, the pychess packager, Thomas Spura, referred me to your work on this feature (Bug 562471, Comment 19).
https://bugzilla.redhat.com/show_bug.cgi?id=562471#c19

Comment 9 Steve Tyler 2010-02-18 14:15:18 UTC
Created attachment 394908 [details]
gdb backtrace with large python data structures

This backtrace has some large python data structures in it. A search for Human.py will find the most extreme case. Is this what is expected?

This backtrace is from a core file saved after pychess aborted with an assertion error. (I was actually trying to see if removing gstreamer-python affected the behavior -- pychess has sound-effects.)

$ gdb python core.3125
info thread
t a a bt

==
[stephent@walnut ~]$ uname -a
Linux walnut 2.6.31.12-174.2.19.fc12.x86_64 #1 SMP Thu Feb 11 07:07:16 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
[stephent@walnut ~]$ rpm -q python gdb pychess
python-2.6.2-2.fc12.x86_64
gdb-7.0.1-31.fc12.x86_64
pychess-0.10-0.3.20100214svn.fc12.noarch

Comment 10 Steve Tyler 2010-02-18 14:53:45 UTC
(In reply to comment #9)
> Created an attachment (id=394908) [details]
> gdb backtrace with large python data structures

Clarification: They wrap in gdb and are unreadable. ("set width 0" doesn't seem to work -- not sure why.)

Comment 11 Dave Malcolm 2010-02-20 03:30:30 UTC
Thanks for all your testing on this - I hope that I've fixed the majority of these issues in the -gdb.py code (plus various other improvements)

See the git repository here:
http://fedorapeople.org/gitweb?p=dmalcolm/public_git/libpython.git;a=blob_plain;f=libpython.py;hb=HEAD

Could you try the latest version of the libpython.py file there, copying it into place as before?  If you do get time to try this out, can you please run "sha256sum" on the file and state the result, so that we can be sure exactly what version of the debug code you're testing?  (hopefully I'll be able to merge this into the core python upstream at some point)

I'm at a conference (PyCon Atlanta) with intermittent internet connectivity at best, so I may not be able to respond for a few days, alas.

Comment 12 Dave Malcolm 2010-02-20 03:32:26 UTC
(In reply to comment #11)
(snip)
> Could you try the latest version of the libpython.py file there, copying it
> into place as before?  If you do get time to try this out, can you please run
...the place in question being something like /usr/lib/debug/usr/lib(64)/libpython26.so.1-gdb.py (caveat: going from memory here)

Comment 13 Steve Tyler 2010-02-20 13:20:58 UTC
Created attachment 395255 [details]
gdb backtrace with version 2 of libpython.py

Thanks, Dave! This run of pychess aborted with a gdk/X BadIDChoice error. The backtrace does not show any RuntimeError msgs. It looks like "unknown" now tags bogus addresses. Not sure how to test the new range limiting (I looked at the diffs).

I removed the previous version (in both places) and copied in the new one as follows:

[stephent@walnut fedora]$ ls -l /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py
-rw-rw-r-- 1 stephent stephent 19517 2010-02-20 03:53 /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py
[stephent@walnut fedora]$ sha256sum /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py
d178552bae26cb68a97eb14da73592e955c8538c8ab74b3f7a4b95ce7623d672  /usr/lib/debug/usr/lib64/libpython2.6.so.1.0-gdb.py

(Haven't used git before -- it doesn't seem to have a concept of revision/version numbers, so I have assigned this version 2 and checked it into RCS.:-))

==
python-2.6.2-2.fc12.x86_64
pychess-0.10-0.3.20100214svn.fc12.noarch

Comment 14 Steve Tyler 2010-02-20 14:02:37 UTC
Created attachment 395260 [details]
gdb backtrace with large data structures and version 2 of libpython.py

This is from a core file dumped after pychess aborted with an assertion failure (xcb_io.c:176). There are some very large data structures that give gdb and my Sempron 140 fits (100% CPU for several seconds).

File /usr/lib/python2.6/site-packages/pychess/Players/Human.py, line 115 ...
File /usr/lib/python2.6/site-packages/pychess/widgets/BoardView.py, line 692 ...

Comment 15 Dave Malcolm 2010-02-20 19:24:10 UTC
(In reply to comment #13)
(snip)

> I removed the previous version (in both places) and copied in the new one as
> follows:
> 
Looks good; thanks for trying this out.

Comment 16 Bug Zapper 2010-03-15 15:08:30 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 17 Fedora Update System 2010-04-02 17:20:13 UTC
python-2.6.4-25.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/python-2.6.4-25.fc13

Comment 18 Fedora Update System 2010-04-02 17:20:27 UTC
python3-3.1.2-4.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/python3-3.1.2-4.fc13

Comment 19 Fedora Update System 2010-04-02 18:43:00 UTC
python3-3.1.2-4.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update python3'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/python3-3.1.2-4.fc13

Comment 20 Fedora Update System 2010-04-02 18:43:32 UTC
python-2.6.4-25.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update python'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/python-2.6.4-25.fc13

Comment 21 Fedora Update System 2010-04-09 04:21:51 UTC
python-2.6.4-25.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 22 Fedora Update System 2010-05-10 23:50:38 UTC
python3-3.1.2-4.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.