842659 – Valgrind is unable to execute current Xorg builds (unhandled dwarf2 abbrev form code)

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	valgrind
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jakub Jelinek
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	836746
Depends On:
Blocks:

Reported:	2012-07-24 11:49 UTC by Zdenek Kabelac
Modified:	2012-08-21 09:29 UTC
CC List:	8 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2012-08-19 15:53:51 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
Full valgrind trace of Xorg (5.91 KB, text/plain) 2012-07-24 11:49 UTC, Zdenek Kabelac
New valgrind trace with Xorg - evaluate_Dwarf3_Expr?? (20.60 KB, text/plain) 2012-08-01 14:09 UTC, Zdenek Kabelac
Still unusable with valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64 (4.58 KB, text/plain) 2012-08-09 09:20 UTC, Zdenek Kabelac

Links
System	ID	Private	Priority	Status	Summary	Last Updated
KDE Software Compilation	298864	0	None	None	None	2012-08-03 20:49:10 UTC
KDE Software Compilation	302901	0	None	None	None	2012-07-24 12:44:59 UTC

Description Zdenek Kabelac 2012-07-24 11:49:14 UTC

Created attachment 599989
Full valgrind trace of Xorg

Description of problem:

The current valgrind build in rawhide is not able to execute Xorg:

### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
--00:00:00:01.685 12281-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--00:00:00:01.685 12281-- si_code=1;  Faulting address: 0x411A000;  sp: 0x409634038


(N.B. Xorg.nos is just a copy of Xorg without the suid bit, executed as root.)


Version-Release number of selected component (if applicable):
valgrind-3.7.0-4.fc18.x86_64

How reproducible:


Steps to Reproduce:
1. Try to run Xorg in valgrind
  
Actual results:


Expected results:


Additional info:

Comment 1 Zdenek Kabelac 2012-07-24 11:50:03 UTC

Version of Xorg used for testing:

xorg-x11-server-Xorg-1.12.99.902-1.20120717.fc18.x86_64

Comment 2 Mark Wielaard 2012-07-24 12:44:59 UTC

These are DW_FORM_GNU_ref_alt and DW_FORM_GNU_strp_alt which are generated by dwz. Upstream has a patch.

r12742 | sewardj | 2012-07-14 11:59:01 +0200 (Sat, 14 Jul 2012) | 4 lines

Initial support for DWZ compressed debuginfo -- don't crash, at least,
when reading it.  Bug 302901 comment 3.  (Jakub Jelinek, jakub)
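
For anyone matching the hex codes in the description to these names, here is a small sketch. It assumes the numbering used by the GNU toolchain headers as I remember it (0x1f20 for DW_FORM_GNU_ref_alt, 0x1f21 for DW_FORM_GNU_strp_alt); please double-check against your own dwarf2.def. The function names form_name and summarize_forms are made up for this example.

```shell
#!/bin/sh
# Map a DWARF form code to its name. Only the two GNU extension codes
# seen in this report are covered (assumed values, see note above);
# anything else is reported as "unknown".
form_name() {
    case "$1" in
        0x1f20) echo "DW_FORM_GNU_ref_alt" ;;
        0x1f21) echo "DW_FORM_GNU_strp_alt" ;;
        *)      echo "unknown" ;;
    esac
}

# Read valgrind output on stdin and print one line per distinct
# unhandled form code: "<code> <name> <count>".
summarize_forms() {
    sed -n 's/.*unhandled dwarf2 abbrev form code \(0x[0-9a-fA-F]*\).*/\1/p' |
        sort | uniq -c |
        while read -r count code; do
            echo "$code $(form_name "$code") $count"
        done
}
```

With a saved log this could be run as `summarize_forms < /tmp/valglog`, which gives a quick view of which dwz forms the installed valgrind does not yet understand.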

Comment 3 Mark Wielaard 2012-07-26 15:20:06 UTC

*** Bug 836746 has been marked as a duplicate of this bug. ***

Comment 4 Zdenek Kabelac 2012-08-01 14:07:48 UTC

Hmm my up-to-date rawhide:

$ rpm -q --changelog  valgrind | head 
* Wed Jul 25 2012 Mark Wielaard <mjw> 3.7.0-6
- handle dwz DWARF compressor output (#842659, KDE#302901)
- allow glibc 2.16.

However, my Xorg is still not traceable, though the reported valgrind problem seems to be slightly different.

I'll attach the trace again.

Comment 5 Zdenek Kabelac 2012-08-01 14:09:44 UTC

Created attachment 601750
New valgrind trace with Xorg -  evaluate_Dwarf3_Expr??

Comment 6 Dave Airlie 2012-08-03 03:21:49 UTC

yeah I'm getting funky traces on Xorg as well

==4071== Invalid read of size 8
==4071==    at 0x4E841C: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x571381: DRI2UpdatePrime (in /usr/bin/Xorg.debug)
==4071==    by 0x5B9F837: radeon_dri2_copy_region2 (radeon_dri2.c:583)
==4071==    by 0x570D41: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x571FA5: DRI2SwapBuffers (in /usr/bin/Xorg.debug)
==4071==    by 0x573203: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x4395A9: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x428059: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x3180821A04: (below main) (in /usr/lib64/libc-2.16.so)

Comment 7 Mark Wielaard 2012-08-03 08:50:55 UTC

hmmm, might be that my backport to 3.7.0 went wrong. Upstream is going into code freeze for 3.8.0 this weekend. I'll concentrate on updating to 3.8.0 for rawhide.

Comment 8 Mark Wielaard 2012-08-03 21:04:46 UTC

3.8.0-SVN seems to work nicely on rawhide, but I don't have packages yet. I did find a missing commit for DW_FORM_ref_addr handling that I have backported to the 3.7.0 package. valgrind-3.7.0-7.fc18 is currently building: http://koji.fedoraproject.org/koji/taskinfo?taskID=4355880

Comment 9 Zdenek Kabelac 2012-08-09 09:20:01 UTC

Created attachment 603212
Still unusable with valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64

Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64  Xorg cannot be executed within valgrind environment.

Have you tested it with Xorg ?

Comment 10 Mark Wielaard 2012-08-09 09:32:24 UTC

(In reply to comment #9)
> Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64  Xorg
> cannot be executed within valgrind environment.

That is a bummer.

> Have you tested it with Xorg ?

No, but I clearly should.
What valgrind command line do you use?

Comment 11 Zdenek Kabelac 2012-08-09 09:49:59 UTC

(In reply to comment #10)
> (In reply to comment #9)
> > Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64  Xorg
> > cannot be executed within valgrind environment.
> 
> That is a bummer.
> 
> > Have you tested it with Xorg ?
> 
> No, but I clearly should.
> What valgrind command line do you use?

My simple way to do this is

mv /usr/bin/Xorg /usr/bin/Xorg.bin
cp /usr/bin/Xorg.bin /usr/bin/Xorg.nos
chmod -s /usr/bin/Xorg.nos

vi /usr/bin/Xorg.valg
------------
#!/bin/sh

exec /path/to/valgrind/script /usr/bin/Xorg.nos "$@"
-------------

vi /root/.xinitrc
-------------
#!/bin/sh

exec /usr/bin/xterm
-------------


ln -s /usr/bin/Xorg.valg /usr/bin/Xorg

+++++++
startx
+++++++

Now you should be able to easily switch between the original suid binary and valgrind execution as root (non-suid).
My valgrind script executes Xorg.nos with multiple extra arguments. I guess you have your own for this.

Options I'm using:
--show-reachable=yes 
--track-fds=yes
--max-stackframe=300000
--leak-check=full --track-origins=yes
--num-callers=40
--malloc-fill=aa
--free-fill=ee
--log-file=/tmp/valglog

To revert back to real X just ln -sf /usr/bin/Xorg.bin  /usr/bin/Xorg

NB: when a crash happens in valgrind, the screen remains black and unusable.
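
A minimal sketch of what such a wrapper script could look like, using the options listed above. The thread does not show the real script behind /path/to/valgrind/script, so the variable XORG_VALGRIND_OPTS and the function valgrind_cmdline are only illustrative names. The sketch just prints the command line so it is safe to try; the actual exec is left as a comment.

```shell
#!/bin/sh
# Hypothetical wrapper body; option values are the ones listed in the
# comment above, everything else is an assumption for illustration.
XORG_VALGRIND_OPTS="--show-reachable=yes --track-fds=yes --max-stackframe=300000 \
--leak-check=full --track-origins=yes --num-callers=40 \
--malloc-fill=aa --free-fill=ee --log-file=/tmp/valglog"

# Print the full command line for the given program and its arguments.
valgrind_cmdline() {
    printf 'valgrind %s' "$XORG_VALGRIND_OPTS"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Preview only. A real wrapper would instead end with something like:
#   exec valgrind $XORG_VALGRIND_OPTS "$@"
valgrind_cmdline /usr/bin/Xorg.nos :0
```

Keeping the options in one variable makes it easy to drop, say, --track-origins=yes when the server gets too slow to start, without touching the Xorg.valg stub.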

Comment 12 Mark Wielaard 2012-08-19 15:53:51 UTC

Just created valgrind-3.8.0-4.fc19
http://koji.fedoraproject.org/koji/buildinfo?buildID=349093
Which contains the final valgrind 3.8.0 upstream release which has all the new DWARF extension work plus an extra patch to work around bug #849435. With that I get good backtraces for issues like:

==2197== Conditional jump or move depends on uninitialised value(s)
==2197==    at 0xCF744B1: ps2SendPacket (pnp.c:617)
==2197==    by 0xCF71424: SetupMouse (mouse.c:2917)
==2197==    by 0xCF718F7: MouseProc (mouse.c:1745)
==2197==    by 0x431239: EnableDevice (devices.c:386)
==2197==    by 0x4981A0: xf86NewInputDevice (xf86Xinput.c:875)
==2197==    by 0xCD69F08: VMMousePreInit (vmmouse.c:280)
==2197==    by 0x497D90: xf86NewInputDevice (xf86Xinput.c:846)
==2197==    by 0x4AE2B5: device_added (udev.c:231)
==2197==    by 0x4AE912: config_udev_init (udev.c:386)
==2197==    by 0x4AD868: config_init (config.c:48)
==2197==    by 0x48B81D: InitInput (xf86Init.c:967)
==2197==    by 0x428018: main (main.c:265)
==2197==  Uninitialised value was created by a stack allocation
==2197==    at 0xCF74430: ps2SendPacket (pnp.c:582)

==2197== Conditional jump or move depends on uninitialised value(s)
==2197==    at 0x6DD8F34: fbBltOne (fbbltone.c:330)
==2197==    by 0x6DE0598: fbPushFill (fbpush.c:119)
==2197==    by 0x6DE083F: fbPushImage (fbpush.c:167)
==2197==    by 0x6DE0924: fbPushPixels (fbpush.c:187)
==2197==    by 0x52A5C8: damagePushPixels (damage.c:1498)
==2197==    by 0x58066D: miDCPutUpCursor (midispcur.c:326)
==2197==    by 0x590B9C: miSpriteRestoreCursor.part.15 (misprite.c:929)
==2197==    by 0x58B53F: miPointerUpdateSprite (mipointer.c:436)
==2197==    by 0x58B91C: miPointerDisplayCursor (mipointer.c:201)
==2197==    by 0x4DC02A: CursorDisplayCursor (cursor.c:166)
==2197==    by 0x525BF6: AnimCurDisplayCursor (animcur.c:224)
==2197==    by 0x442E08: UpdateSpriteForScreen (events.c:3261)
==2197==    by 0x483DF9: xf86WarpCursor (xf86Cursor.c:473)
==2197==    by 0x58B7C0: miPointerSetCursorPosition (mipointer.c:277)
==2197==    by 0x52630E: AnimCurSetCursorPosition (animcur.c:244)
==2197==    by 0x442A98: InitializeSprite (events.c:3177)
==2197==    by 0x431333: EnableDevice (devices.c:365)
==2197==    by 0x432EA4: InitCoreDevices (devices.c:690)
==2197==    by 0x42800A: main (main.c:264)
==2197==  Uninitialised value was created by a heap allocation
==2197==    at 0x4C2A6DC: malloc (vg_replace_malloc.c:270)
==2197==    by 0x454F27: AllocatePixmap (pixmap.c:117)
==2197==    by 0x6DDF5CF: fbCreatePixmapBpp (fbpixmap.c:53)
==2197==    by 0x451118: ServerBitsFromGlyph (glyphcurs.c:96)
==2197==    by 0x42DEA3: AllocGlyphCursor (cursor.c:352)
==2197==    by 0x42E149: CreateRootCursor (cursor.c:472)
==2197==    by 0x427F8F: main (main.c:241)

If there are other issues running Xorg under valgrind or bad diagnostics from valgrind for Xorg please open a new bug report.

Comment 13 Jan Kratochvil 2012-08-19 16:22:56 UTC

I had the problems with what is now F-18, so why do you push it only for F-19?

Comment 14 Mark Wielaard 2012-08-19 20:04:50 UTC

(In reply to comment #13)
> I had the problems with what is now F-18, so why do you push it only for F-19?

Patience :) f18 has now branched from rawhide/f19, so work needs to be done and tested twice. The f19 package should install fine on f18 if you want to test it. After some time an update will also hit f18:
https://admin.fedoraproject.org/updates/valgrind-3.8.0-4.fc18

Comment 15 Mark Wielaard 2012-08-21 09:29:49 UTC

Note that I closed this because all issues reported against Xorg that I could replicate have been resolved. But it might be that people are still seeing one other issue.

If you are seeing "valgrind: the 'impossible' happened" with a backtrace indicating read_debuginfo_dwarf3 was involved then please watch https://bugzilla.redhat.com/show_bug.cgi?id=849783 which has an easier reproducer.
