Created attachment 599989 [details]
Full valgrind trace of Xorg

Description of problem:
Current valgrind build in rawhide is not able to execute Xorg.

### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f21
### unhandled dwarf2 abbrev form code 0x1f20
--00:00:00:01.685 12281-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--00:00:00:01.685 12281-- si_code=1; Faulting address: 0x411A000; sp: 0x409634038

(N.B. Xorg.nos is just copy of Xorg without suid bit executed via root)

Version-Release number of selected component (if applicable):
valgrind-3.7.0-4.fc18.x86_64

How reproducible:

Steps to Reproduce:
1. try to run Xorg in valgrind

Actual results:

Expected results:

Additional info:
Version of Xorg used for testing: xorg-x11-server-Xorg-1.12.99.902-1.20120717.fc18.x86_64
These are DW_FORM_GNU_ref_alt and DW_FORM_GNU_strp_alt, which are generated by dwz. Upstream has a patch.

r12742 | sewardj | 2012-07-14 11:59:01 +0200 (Sat, 14 Jul 2012) | 4 lines

Initial support for DWZ compressed debuginfo -- don't crash, at least, when reading it. Bug 302901 comment 3. (Jakub Jelinek, jakub)
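For anyone matching the log lines against the names above, the two codes map as shown in this small sketch. The helper name dwarf_gnu_form_name is made up for illustration (it is not part of valgrind, dwz or binutils) and it covers only the two codes seen in this report.

```shell
#!/bin/sh
# Map the DWARF form codes from the valgrind log to their GNU extension names.
# dwarf_gnu_form_name is a hypothetical helper used only for this example.
dwarf_gnu_form_name() {
    case "$1" in
        0x1f20) echo "DW_FORM_GNU_ref_alt" ;;
        0x1f21) echo "DW_FORM_GNU_strp_alt" ;;
        *)      echo "unknown" ;;
    esac
}

# Example: decode the code that appears most often in the log.
dwarf_gnu_form_name 0x1f21
```

Any other code prints "unknown" rather than guessing, since this table is deliberately not a full list of DWARF forms.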
*** Bug 836746 has been marked as a duplicate of this bug. ***
Hmm, my up-to-date rawhide:

$ rpm -q --changelog valgrind | head
* St čec 25 2012 Mark Wielaard <mjw> 3.7.0-6
- handle dwz DWARF compressor output (#842659, KDE#302901)
- allow glibc 2.16.

However, my Xorg is still not traceable, though the reported valgrind problem seems to be slightly different. I'll attach the trace again.
Created attachment 601750 [details] New valgrind trace with Xorg - evaluate_Dwarf3_Expr??
yeah I'm getting funky traces on Xorg as well

==4071== Invalid read of size 8
==4071==    at 0x4E841C: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x571381: DRI2UpdatePrime (in /usr/bin/Xorg.debug)
==4071==    by 0x5B9F837: radeon_dri2_copy_region2 (radeon_dri2.c:583)
==4071==    by 0x570D41: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x571FA5: DRI2SwapBuffers (in /usr/bin/Xorg.debug)
==4071==    by 0x573203: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x4395A9: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x428059: ??? (in /usr/bin/Xorg.debug)
==4071==    by 0x3180821A04: (below main) (in /usr/lib64/libc-2.16.so)
hmmm, might be that my backport to 3.7.0 went wrong. Upstream is going into code freeze for 3.8.0 this weekend. I'll concentrate on updating to 3.8.0 for rawhide.
3.8.0-SVN seems to work nicely on rawhide, but I don't have packages yet. I did find a missing commit for DW_FORM_ref_addr handling that I have backported to the 3.7.0 package. valgrind-3.7.0-7.fc18 is currently building: http://koji.fedoraproject.org/koji/taskinfo?taskID=4355880
Created attachment 603212 [details]
Still unusable with valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64

Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64 Xorg cannot be executed within valgrind environment.

Have you tested it with Xorg ?
(In reply to comment #9)
> Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64 Xorg
> cannot be executed within valgrind environment.

That is a bummer.

> Have you tested it with Xorg ?

No, but I clearly should.
What valgrind command line do you use?
(In reply to comment #10)
> (In reply to comment #9)
> > Even with shiny new valgrind-3.8.0-0.1.TEST1.svn12858.fc18.x86_64 Xorg
> > cannot be executed within valgrind environment.
>
> That is a bummer.
>
> > Have you tested it with Xorg ?
>
> No, but I clearly should.
> What valgrind command line do you use?

My simple way to do this is

mv /usr/bin/Xorg /usr/bin/Xorg.bin
cp /usr/bin/Xorg.bin /usr/bin/Xorg.nos
chmod -s /usr/bin/Xorg.nos

vi /usr/bin/Xorg.valg
------------
#!/bin/sh
exec /path/to/valgrind/script /usr/bin/Xorg.nos "$@"
-------------

vi /root/.xinitrc
-------------
#!/bin/sh
exec /usr/bin/xterm
-------------

ln -s /usr/bin/Xorg.valg /usr/bin/Xorg

+++++++
startx
+++++++

Now you should be able to easily switch between the original suid binary and valgrind execution for root (non-suid).

My valgrind script executes Xorg.nos with multiple extra arguments - I guess you have your own for this. Options I'm using:

--show-reachable=yes --track-fds=yes --max-stackframe=300000 --leak-check=full --track-origins=yes --num-callers=40 --malloc-fill=aa --free-fill=ee --log-file=/tmp/valglog

To revert back to real X just

ln -sf /usr/bin/Xorg.bin /usr/bin/Xorg

NB: when a crash happens in valgrind, the screen remains black and unusable.
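The comment leaves /path/to/valgrind/script unspecified. As a rough sketch of what such a wrapper could look like, the script below combines the listed options into one command line. The function name xorg_valgrind and the VALGRIND and RUN variables are assumptions made for this example, not the reporter's actual script. By default it only prints the command line so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical stand-in for the valgrind wrapper script mentioned above.
# The option list is taken from the comment; the function name and the
# VALGRIND/RUN variables are assumptions made for this sketch.
VALGRIND=${VALGRIND:-valgrind}

xorg_valgrind() {
    # Prepend valgrind and its options to the caller's arguments.
    set -- "$VALGRIND" \
        --show-reachable=yes --track-fds=yes --max-stackframe=300000 \
        --leak-check=full --track-origins=yes --num-callers=40 \
        --malloc-fill=aa --free-fill=ee --log-file=/tmp/valglog \
        "$@"
    if [ "${RUN:-0}" = 1 ]; then
        exec "$@"              # replace the shell with valgrind
    else
        printf '%s\n' "$@"     # dry run: one argument per line
    fi
}

xorg_valgrind /usr/bin/Xorg.nos "$@"
```

Running it with RUN=1 in the environment execs valgrind instead of printing. Keeping the options in one place means the Xorg.valg stub from the comment stays a one-liner.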
Just created valgrind-3.8.0-4.fc19
http://koji.fedoraproject.org/koji/buildinfo?buildID=349093

Which contains the final valgrind 3.8.0 upstream release which has all the new DWARF extension work plus an extra patch to work around bug #849435.

With that I get good backtraces for issues like:

==2197== Conditional jump or move depends on uninitialised value(s)
==2197==    at 0xCF744B1: ps2SendPacket (pnp.c:617)
==2197==    by 0xCF71424: SetupMouse (mouse.c:2917)
==2197==    by 0xCF718F7: MouseProc (mouse.c:1745)
==2197==    by 0x431239: EnableDevice (devices.c:386)
==2197==    by 0x4981A0: xf86NewInputDevice (xf86Xinput.c:875)
==2197==    by 0xCD69F08: VMMousePreInit (vmmouse.c:280)
==2197==    by 0x497D90: xf86NewInputDevice (xf86Xinput.c:846)
==2197==    by 0x4AE2B5: device_added (udev.c:231)
==2197==    by 0x4AE912: config_udev_init (udev.c:386)
==2197==    by 0x4AD868: config_init (config.c:48)
==2197==    by 0x48B81D: InitInput (xf86Init.c:967)
==2197==    by 0x428018: main (main.c:265)
==2197==  Uninitialised value was created by a stack allocation
==2197==    at 0xCF74430: ps2SendPacket (pnp.c:582)

==2197== Conditional jump or move depends on uninitialised value(s)
==2197==    at 0x6DD8F34: fbBltOne (fbbltone.c:330)
==2197==    by 0x6DE0598: fbPushFill (fbpush.c:119)
==2197==    by 0x6DE083F: fbPushImage (fbpush.c:167)
==2197==    by 0x6DE0924: fbPushPixels (fbpush.c:187)
==2197==    by 0x52A5C8: damagePushPixels (damage.c:1498)
==2197==    by 0x58066D: miDCPutUpCursor (midispcur.c:326)
==2197==    by 0x590B9C: miSpriteRestoreCursor.part.15 (misprite.c:929)
==2197==    by 0x58B53F: miPointerUpdateSprite (mipointer.c:436)
==2197==    by 0x58B91C: miPointerDisplayCursor (mipointer.c:201)
==2197==    by 0x4DC02A: CursorDisplayCursor (cursor.c:166)
==2197==    by 0x525BF6: AnimCurDisplayCursor (animcur.c:224)
==2197==    by 0x442E08: UpdateSpriteForScreen (events.c:3261)
==2197==    by 0x483DF9: xf86WarpCursor (xf86Cursor.c:473)
==2197==    by 0x58B7C0: miPointerSetCursorPosition (mipointer.c:277)
==2197==    by 0x52630E: AnimCurSetCursorPosition (animcur.c:244)
==2197==    by 0x442A98: InitializeSprite (events.c:3177)
==2197==    by 0x431333: EnableDevice (devices.c:365)
==2197==    by 0x432EA4: InitCoreDevices (devices.c:690)
==2197==    by 0x42800A: main (main.c:264)
==2197==  Uninitialised value was created by a heap allocation
==2197==    at 0x4C2A6DC: malloc (vg_replace_malloc.c:270)
==2197==    by 0x454F27: AllocatePixmap (pixmap.c:117)
==2197==    by 0x6DDF5CF: fbCreatePixmapBpp (fbpixmap.c:53)
==2197==    by 0x451118: ServerBitsFromGlyph (glyphcurs.c:96)
==2197==    by 0x42DEA3: AllocGlyphCursor (cursor.c:352)
==2197==    by 0x42E149: CreateRootCursor (cursor.c:472)
==2197==    by 0x427F8F: main (main.c:241)

If there are other issues running Xorg under valgrind or bad diagnostics from valgrind for Xorg please open a new bug report.
I had the problems with what is now F-18, why you push it only for F-19?
(In reply to comment #13)
> I had the problems with what is now F-18, why you push it only for F-19?

Patience :) f18 has now branched from rawhide/f19, so work needs to be done and tested twice. The f19 package should install fine on f18 if you want to test it. After some time an update will also hit f18: https://admin.fedoraproject.org/updates/valgrind-3.8.0-4.fc18
Note that I closed this because all issues reported against Xorg that I could replicate have been resolved. But it might be that people are still seeing one other issue. If you are seeing "valgrind: the 'impossible' happened" with a backtrace indicating read_debuginfo_dwarf3 was involved then please watch https://bugzilla.redhat.com/show_bug.cgi?id=849783 which has an easier reproducer.
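To check whether a saved log shows that particular signature, something like the following could be used. It is only a sketch: the function name has_debuginfo_reader_crash is invented here, and it assumes the internal error text ended up in the log file (for example the /tmp/valglog used earlier in this report) rather than only on the terminal.

```shell
#!/bin/sh
# has_debuginfo_reader_crash LOGFILE
# Succeeds only if the log contains both the "'impossible' happened" message
# and a mention of read_debuginfo_dwarf3. Hypothetical helper for this example.
has_debuginfo_reader_crash() {
    grep -q "the 'impossible' happened" "$1" &&
        grep -q "read_debuginfo_dwarf3" "$1"
}

# Example use; the log path is an assumption and is skipped if absent.
if [ -f /tmp/valglog ] && has_debuginfo_reader_crash /tmp/valglog; then
    echo "log matches the signature tracked in bug 849783"
fi
```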