Bug 593926

Summary: looped solist
Product: [Fedora] Fedora
Reporter: sjoerd <bugzilla001>
Component: gdb
Assignee: Jan Kratochvil <jan.kratochvil>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: anton, bugzilla001, dvlasenk, iprikryl, jan.kratochvil, jmoskovc, kevin, kklic, mnowak, npajkovs, pmuldoon, swagiaal
Hardware: All   
OS: Linux   
Fixed In Version: gdb-7.0.1-46.fc12
Doc Type: Bug Fix
Last Closed: 2010-05-25 18:38:06 UTC
Bug Blocks: 578136    
Attachments:
Description Flags
Coredump none

Description sjoerd 2010-05-20 04:18:42 UTC
Description of problem:

Start compiz-fusion and change the window manager from Metacity to Compiz.

How reproducible:
Always
  
Actual results:

An abrt bug report appears.

Expected results:

Apart from the abrt bug report, everything seems to work.

Additional info:

Tried to report the bug through abrt, but it doesn't work :(
The first try was missing all 35 debuginfo files, so I manually ran debuginfo-install xfdesktop and refreshed the abrt report.
It then told me that 12 of the debuginfo files were missing.

Still can't report it.

It gets weirder....

While abrt was getting the debug info, gdb went to 100% CPU and started eating my available RAM.

  PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
31326 sjoerd 20  0 6298m 6.1g 6544 R 100.2 51.3 6:42.12 gdb

After more than 6 minutes it had grown to more than 50% of my actual RAM, so I had to stop it with kill -15 <pid>.

Comment 1 Kevin Fenzi 2010-05-20 05:21:19 UTC
I'm a bit confused here. You are running Xfce, but using metacity and compiz?

What is the crash that abrt is trying to report? 

Perhaps this should be a bug against abrt? 

Where does xfdesktop enter into things?

Comment 2 sjoerd 2010-05-20 14:51:25 UTC
I can switch between different window managers, such as:
compiz
blackbox
kwin
openbox
icewm
fvwm
metacity
xfwm4

Only when switching to compiz do I get this crash in xfdesktop:

Package:    	xfdesktop-4.6.1-3.fc12
Latest Crash:	Thu 20 May 2010 16:33:30
Command:    	xfdesktop --display :0.0 --sm-client-id 2bfcc1a99-b8ab-4687-8916-c4d5ee1c25fb
Reason:     	Process /usr/bin/xfdesktop was killed by signal 11 (SIGSEGV)
Comment:    	None
Bug Reports:	

And the gdb command line is:

gdb -batch -ex set debug-file-directory /usr/lib/debug:/var/cache/abrt-di/usr/lib/debug -ex file /usr/bin/xfdesktop -ex core-file /var/cache/abrt/ccpp-1274366010-17004/coredump -ex thread apply all backtrace 2048 full -ex info sharedlib -ex print (char*)__abort_msg -ex print (char*)__glib_assert_msg -ex info registers -ex disassemble

After I kill gdb, abrt carries on and tells me that some debug info is absent:

Debuginfo absent: 0afdfc236fadadc174323855991e9afa63360400
Debuginfo absent: 1ac7a3286afd0de56f0058a15381dd04f19fc637
Debuginfo absent: 2b03aa03331551a7e2d7fd2470b99af863e7c9e1
Debuginfo absent: 2e68954d07473d030484df60148109e3b80b8599
Debuginfo absent: 37c77837c31b54148c07786c65de59f6de739680
Debuginfo absent: 4c75954cd91375d7352db96ab5fecd71ad4cf10d
Debuginfo absent: 8ae7bb8df77cf7fa71cbcd68cf6abca1b1bd37d4
Debuginfo absent: 9b1838e94352154046ccb8e95f396253514a11fb
Debuginfo absent: a9794c12389c0e5731b26e33ef8ee92e5b8015fd
Debuginfo absent: ced98f19673e83ee5e2eae7b2c7d517f010ce796
Debuginfo absent: d4d63cfe79cdc769150ef3cc5ccbe49c7bb8e1a2
Debuginfo absent: e1fa559bb294c88aba88704ea01f926c942a15e5
Debuginfo absent: fc651a62f9d1f56f2ac50d96384164a1ed17cbbb

* Reporting disabled because the backtrace is unusable.

Comment 3 Kevin Fenzi 2010-05-21 15:38:26 UTC
Well, I guess this is an abrt bug first of all... reassigning to them to see if they can figure this out. Once that's solved, we should have a trace we can use to see what the real crash is.

Of course, Xfce doesn't claim to support using compiz as your window manager, so I could imagine all kinds of issues being possible.

Comment 4 Denys Vlasenko 2010-05-21 16:16:21 UTC
sjoerd, can you attach your /var/cache/abrt/ccpp-1274366010-17004/coredump (bzipped) to this bug?

Does the gdb command also eat CPU if you run it by hand?
You'll need to re-add some quoting for it to work:

gdb -batch \
 -ex "set debug-file-directory /usr/lib/debug:/var/cache/abrt-di/usr/lib/debug" \
 -ex "file /usr/bin/xfdesktop" \
 -ex "core-file /var/cache/abrt/ccpp-1274366010-17004/coredump" \
 -ex "thread apply all backtrace 2048 full" \
 -ex "info sharedlib" \
 -ex "print (char*)__abort_msg" \
 -ex "print (char*)__glib_assert_msg" \
 -ex "info registers" \
 -ex "disassemble"
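
The quoting matters because each -ex option takes exactly one argument; unquoted, the shell splits a command such as "info sharedlib" into separate words. As a minimal sketch (the helper name build_gdb_argv is hypothetical and is not part of abrt or gdb), the command can be built as an argument list in Python and printed in a shell-quoted form that is safe to paste:

```python
import shlex

def build_gdb_argv(commands):
    """Build a gdb batch-mode argv with one -ex option per gdb command."""
    argv = ["gdb", "-batch"]
    for command in commands:
        # Each gdb command stays a single argv element, spaces included.
        argv += ["-ex", command]
    return argv

argv = build_gdb_argv([
    "file /usr/bin/xfdesktop",
    "core-file /var/cache/abrt/ccpp-1274366010-17004/coredump",
    "thread apply all backtrace 2048 full",
    "info sharedlib",
])
# shlex.join (Python 3.8+) quotes every element that contains spaces.
print(shlex.join(argv))
```

Passing such a list straight to a process-spawning call avoids the shell, so no quoting is needed at all; the quoted form is only needed when the command is pasted into a terminal.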

What is your version of abrt?

Comment 5 sjoerd 2010-05-21 17:56:23 UTC
Created attachment 415741
Coredump

Compressed bzip2 coredump.

Comment 6 sjoerd 2010-05-21 18:08:24 UTC
abrt version:
# rpm -qa abrt
abrt-1.0.9-2.fc12.x86_64

Running gdb by hand as you requested gives me the same result.

  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
32725 root 20  0 10.2g 9.9g 1796 R 99.9 83.8 10:54.74 gdb

Killed the process, as my system eventually became very slow to respond.

Comment 7 Denys Vlasenko 2010-05-21 18:17:42 UTC
Thanks! Surprisingly, it's the "info sharedlib" command:

$ gdb -batch -ex "core-file coredump" -ex "info sharedlib"

results in:

Missing separate debuginfo for the main executable file
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/73/ee4e8f73062f1f874cf52bac31a223bf8b77cf
Core was generated by `xfdesktop --display :0.0 --sm-client-id 2bfcc1a99-b8ab-4687-8916-c4d5ee1c25fb'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003c9b683370 in ?? ()
Missing separate debuginfo for /usr/lib64/libxfcegui4.so.4
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/a4/30483a8a37ab7f08051653d651678b4c041dbd
Missing separate debuginfo for /usr/lib64/libxfconf-0.so.2
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/5b/4fe9f4c08fe93786dc3d29dea2fad3a52f46f2
Missing separate debuginfo for /usr/lib64/libthunar-vfs-1.so.2
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/6c/667ce35dd1e9fdaabdb672819f2e178b913298
Missing separate debuginfo for /usr/lib64/libexo-0.3.so.0
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/d5/f09da64f2aad49a21b8d4694b297e67048d618
Missing separate debuginfo for /usr/lib64/libxfce4util.so.4
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/5d/1fdf10da528cea55e4828c633e42d5e9f6faaa
Missing separate debuginfo for /usr/lib64/libthunarx-1.so.2
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/92/e70c611c625f89a4f90bc446b894f47e77f315
Missing separate debuginfo for /usr/lib64/libexo-hal-0.3.so.0
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/c7/1402354472f7208fc214f0ba63fbdf95c66e8e
Missing separate debuginfo for /usr/lib64/libxcb-aux.so.0
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/9b/1838e94352154046ccb8e95f396253514a11fb
Missing separate debuginfo for /usr/lib64/libxcb-event.so.1
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/0a/fdfc236fadadc174323855991e9afa63360400
Missing separate debuginfo for /usr/lib64/libxcb-atom.so.1
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/8a/e7bb8df77cf7fa71cbcd68cf6abca1b1bd37d4
Missing separate debuginfo for 
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/73/ee4e8f73062f1f874cf52bac31a223bf8b77cf

Note the last, bogus filename in the "Missing separate debuginfo for" message.
Here gdb stops talking and starts eating 100% CPU and consuming memory.

Reassigning to gdb.
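
For anyone reproducing this, it may help to run the command under a time limit and a memory cap so the machine stays responsive. The following is only a sketch under assumptions: it assumes Linux, the helper name run_limited is hypothetical, and the limits in the usage comment are arbitrary example values.

```python
import resource
import subprocess
import sys

def run_limited(argv, timeout_s, max_bytes):
    """Run argv with a wall-clock timeout and an address-space cap (Unix only)."""
    def limit_memory():
        # Runs in the child just before exec: lower the soft limit only,
        # never asking for more than the existing hard limit allows.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes
        if hard != resource.RLIM_INFINITY:
            cap = min(cap, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s, preexec_fn=limit_memory)
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return None, ""

# Example call (not executed here because it needs gdb and the core file):
# run_limited(["gdb", "-batch", "-ex", "core-file coredump",
#              "-ex", "info sharedlib"], timeout_s=60, max_bytes=2 * 1024**3)
```

A return code of None means the time limit was hit. A process that hits the memory cap instead fails its allocations and normally exits with an error of its own.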

Comment 8 Jan Kratochvil 2010-05-21 19:08:29 UTC
It is probably this bug:
http://sourceware.org/ml/gdb-patches/2010-04/msg00820.html
The shared library list is probably corrupted there.
The fix currently isn't in Fedora GDB.

Comment 9 Fedora Update System 2010-05-24 18:51:38 UTC
gdb-7.1-22.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/gdb-7.1-22.fc13

Comment 10 Fedora Update System 2010-05-24 19:17:09 UTC
gdb-7.0.1-46.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/gdb-7.0.1-46.fc12

Comment 11 sjoerd 2010-05-25 03:25:40 UTC
Updated gdb to 7.0.1-46.fc12 and reported my original bug at:
https://bugzilla.redhat.com/show_bug.cgi?id=595574

Still getting some warnings with gdb:
warning: Corrupted shared library list

Dunno if this is a problem but the new gdb package seems to work.

Comment 12 Jan Kratochvil 2010-05-25 05:06:55 UTC
# warning: Corrupted shared library list
is correct in this case as it prevents:
# Here gdb stops talking and starts eating 100% CPU and consuming memory.

For some reason the list of loaded shared libraries certainly got corrupted during the crash.  Memory corruption is common during crashes.  One could probably reconstruct the list by hand in various ways, but GDB currently does not even attempt any solist reconstruction.

In your backtrace in Bug 595574 I do not see anything missing due to the incomplete library list.

Comment 13 sjoerd 2010-05-25 05:32:10 UTC
I think it has to do with manually installing some debuginfo packages before reporting the actual bug 595574:

# debuginfo-install gamin gtk2-engines hal-libs libXau libXcomposite libXcursor libXdamage libXext libXfixes libXinerama libXrandr libXrender libXres libcap-ng libuuid libxcb pixman startup-notification xcb-util

gdb reported those as missing when I ran it manually, as Denys requested in comment 4.

Maybe I should have saved the output, but I didn't.

Comment 14 Jan Kratochvil 2010-05-25 05:51:37 UTC
"Corrupted shared library list" is really unrelated to which debuginfo packages you have / have not installed.

Comment 15 sjoerd 2010-05-25 06:00:41 UTC
Then I'm lost what that message means.

I'm sorry but that's all I did before reporting the new bug besides updating the gdb rpm.

Comment 16 Jan Kratochvil 2010-05-25 06:29:13 UTC
(In reply to comment #15)
> Then I'm lost what that message means.

It means that one data structure (solist) in the core dump is corrupted.  While this is a rare case, it is perfectly OK.  A core dump means something went wrong: the program crashed.  One cannot expect all data structures to be intact in a crashed program.  If everything were perfect in a core dump, the program would have had no reason to crash and would not have dumped core.


> I'm sorry but that's all I did before reporting the new bug besides updating
> the gdb rpm.    

The updated gdb can now detect that the solist is corrupted and stops at the point of corruption.  Older gdb locked up on the corrupted solist.  Since gdb is used to analyze crashed programs, it must cope with any corruption in the crashed program.  Older gdb failed to cope with a corrupted (looped) solist.
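
To illustrate what "looped" means here, the following is a simplified sketch and not GDB's actual code or the actual patch. It models the list as a mapping from each node address to the next one (the addresses and the function name walk_solist are hypothetical) and stops as soon as a node repeats. A walk without such a check would never terminate and would keep collecting entries, which would be consistent with the CPU and memory symptoms reported in this bug.

```python
def walk_solist(head, next_of):
    """Walk a singly linked list of addresses, stopping if it loops.

    next_of maps a node address to the address of the next node;
    address 0 terminates the list. Returns (visited nodes, corrupted flag).
    """
    seen = set()
    order = []
    node = head
    while node != 0:
        if node in seen or node not in next_of:
            # Loop or dangling pointer: stop instead of walking forever.
            return order, True
        seen.add(node)
        order.append(node)
        node = next_of[node]
    return order, False

# Hypothetical addresses: 0x30 points back to 0x10, forming a loop.
looped = {0x10: 0x20, 0x20: 0x30, 0x30: 0x10}
nodes, corrupted = walk_solist(0x10, looped)
if corrupted:
    print("warning: Corrupted shared library list")
```

The entries collected before the loop is detected are still returned, matching the behaviour described above of stopping at the corrupted point while keeping the rest usable.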

Comment 17 Fedora Update System 2010-05-25 18:38:01 UTC
gdb-7.0.1-46.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2010-05-25 18:43:26 UTC
gdb-7.1-22.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 19 sjoerd 2010-05-26 01:40:11 UTC
(In reply to comment #16)

Thank you for your insight into this.
I didn't know. If you need any more info on this, just let me know.