Bug 1191586 - abrt fails to generate core_backtrace from ruby coredumps
Summary: abrt fails to generate core_backtrace from ruby coredumps
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: satyr
Version: 21
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Martin Milata
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-02-11 14:44 UTC by Richard Z.
Modified: 2015-12-02 17:24 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-02 08:54:24 UTC


Attachments
ruby-qt testcase (2.48 KB, application/x-bzip)
2015-02-12 21:16 UTC, Richard Z.
Correct core_backtrace (5.38 KB, text/plain)
2015-02-13 12:12 UTC, Martin Milata
strace of abrt -d ... (26.83 KB, application/x-bzip)
2015-02-16 21:20 UTC, Richard Z.
output of failing reporter-ureport -vvv -d ... (3.82 KB, application/x-bzip)
2015-02-25 13:24 UTC, Richard Z.


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1129756 None None None Never

Internal Links: 1129756

Description Richard Z. 2015-02-11 14:44:00 UTC
Hi,

I hit a reproducible segfault in ruby-qt and clicked to report it in abrt.
Abrt says processing is done, and when I look at the logs it tells me:

--- Running report_uReport ---
A bug was already filed about this problem:
Bugzilla: URL=https://bugzilla.redhat.com/show_bug.cgi?id=1181160

#######

I filed bug 1181160 some time ago, but it is clearly distinct. My current bug has this stack trace:

$ ruby rselect
/home/rz/ruby/ruby-qt/rselect_ui.rb:32:in `method_missing': undefined method `resize' for #<StartQT4:0x86cb8b4> (NoMethodError)
        from /home/rz/ruby/ruby-qt/rselect_ui.rb:32:in `setupUi'
        from rselect:17:in `initialize'
        from rselect:36:in `new'
        from rselect:36:in `<main>'
rselect: [BUG] Segmentation fault at 0x000000
ruby 2.1.5p273 (2014-11-13 revision 48405) [i386-linux]

-- Control frame information -----------------------------------------------
c:0001 p:0000 s:0002 E:001dac TOP    [FINISH]


-- C level backtrace information -------------------------------------------
/lib/libruby.so.2.1() [0x415b7419]
/lib/libruby.so.2.1() [0x415b74e8]
/lib/libruby.so.2.1() [0x41485b11]
/lib/libruby.so.2.1(rb_bug+0x41) [0x41485ec1]
/lib/libruby.so.2.1() [0x41542fe4]
[0xb77b5bb4]
/lib/libQtGui.so.4(_ZN12QApplicationD1Ev+0xb7) [0x42a69287]
/lib/libsmokeqtgui.so.3(_ZN12__smokeqtgui14x_QApplicationD0Ev+0x3f) [0xb6fd445f]
/lib/libsmokeqtgui.so.3(_ZN12__smokeqtgui18xcall_QApplicationEsPvPN5Smoke9StackItemE+0x1165) [0xb6fbdd35]
/lib/libqtruby4shared.so.2(_Z14smokeruby_freePv+0x425) [0xb6c4d485]
/lib/libruby.so.2.1() [0x4149ffb7]
/lib/libruby.so.2.1(rb_gc_call_finalizer_at_exit+0x1e4) [0x414a6594]
/lib/libruby.so.2.1(ruby_cleanup+0x35b) [0x4148c40b]
/lib/libruby.so.2.1(ruby_run_node+0x39) [0x4148c6b9]
/usr/bin/ruby-mri() [0x80486a9]
/lib/libc.so.6(__libc_start_main+0xde) [0x4fd25e7e]
/usr/bin/ruby-mri() [0x80486cf]

When I look at problem details it says
reported-to:
uReport: BTHASH=356a332497ed46f0f0fc1f32f7b29d77997d6242
ABRT Server: URL=https://retrace.fedoraproject.org/faf/reports/bthash/356a332497ed46f0f0fc1f32f7b29d77997d6242
ABRT Server: URL=https://retrace.fedoraproject.org/faf/reports/461330/
Bugzilla: URL=https://bugzilla.redhat.com/show_bug.cgi?id=1181160

While I am at it, one criticism of the ABRT problem details window: I cannot copy and paste its whole contents. When I start selecting at the bottom of reported_to and try to extend the selection above the reported_to line, it stops selecting.

Comment 1 Jakub Filak 2015-02-11 14:56:57 UTC
Hello, thank you for the report. ABRT client apparently failed to get a meaningful backtrace from the file coredump and ABRT server found a dupe for that useless backtrace. To be able to properly fix this issue, we need to have the file coredump from the same directory as the file reported_to.

Could you please open a new bug report for the flaws in the details window?

Comment 2 Richard Z. 2015-02-11 15:05:12 UTC
Is that what you need?

# cat  core_backtrace 
{   "signal": 6
,   "executable": "/usr/bin/ruby-mri"
,   "stacktrace":
      [ {   "crash_thread": true
        ,   "frames":
              [ {   "address": 3078314956
                ,   "build_id": "c89e128b2efc5a51f7199f87a0d20db6cadc0997"
                ,   "build_id_offset": 3020
                } ]
        }
      , {   "frames":
              [ {   "address": 3078314956
                ,   "build_id": "c89e128b2efc5a51f7199f87a0d20db6cadc0997"
                ,   "build_id_offset": 3020
                } ]
        } ]
}

# gdb /usr/bin/ruby coredump
(gdb) bt
#0  0xb77b5bcc in __kernel_vsyscall ()
#1  0x4fd3b297 in ?? ()
#2  0x4fed7000 in ?? ()
#3  0x414e5110 in ?? ()
#4  0x086cd024 in ?? ()
#5  0x086fc7d0 in ?? ()

I can't download all the debuginfo packages at the moment, but I could bzip the coredump and email it.
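The core_backtrace above can be cross-checked against the gdb output with a little arithmetic: the only recorded address, 3078314956, is 0xb77b5bcc in hex, which is the address gdb shows for frame #0 (`__kernel_vsyscall`). A minimal Python sketch of that check follows. It embeds a trimmed copy of the JSON posted above, and the `summarize` helper is a hypothetical name used only for illustration.

```python
import json

# Trimmed copy of the core_backtrace posted above (crash thread only).
CORE_BACKTRACE = """
{ "signal": 6,
  "executable": "/usr/bin/ruby-mri",
  "stacktrace": [
    { "crash_thread": true,
      "frames": [
        { "address": 3078314956,
          "build_id": "c89e128b2efc5a51f7199f87a0d20db6cadc0997",
          "build_id_offset": 3020 } ] } ] }
"""

def summarize(text):
    """Return (frame_count, [hex addresses]) for each thread in the report."""
    data = json.loads(text)
    return [
        (len(thread["frames"]), [hex(f["address"]) for f in thread["frames"]])
        for thread in data["stacktrace"]
    ]

summary = summarize(CORE_BACKTRACE)
print(summary)  # [(1, ['0xb77b5bcc'])]
```

A thread with a single frame that sits at the `__kernel_vsyscall` address suggests the unwinder stopped right after the first frame instead of walking on into libc and libruby.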

Comment 3 Jakub Filak 2015-02-11 15:24:19 UTC
Martin, can you please look into this?

Comment 4 Richard Z. 2015-02-11 15:26:32 UTC
gdb works better if I use the name of the actual binary (ruby-mri) instead of the wrapper script.

(gdb) bt
#0  0xffffffff in __kernel_vsyscall ()
#1  0x4fd3b297 in raise () at /lib/libc.so.6
#2  0x4fd3cb69 in abort () at /lib/libc.so.6
#3  0x41485ec6 in rb_bug () at /lib/libruby.so.2.1
#4  0x41542fe4 in sigsegv () at /lib/libruby.so.2.1
#5  0xffffffff in <signal handler called> ()
#6  0x42a69287 in QApplication::~QApplication() () at /lib/libQtGui.so.4
#7  0xffffffff in __smokeqtgui::x_QApplication::~x_QApplication() ()
    at /lib/libsmokeqtgui.so.3
#8  0xffffffff in __smokeqtgui::xcall_QApplication(short, void*, Smoke::StackItem*) () at /lib/libsmokeqtgui.so.3
#9  0xffffffff in smokeruby_free(void*) () at /lib/libqtruby4shared.so.2
#10 0x4149ffb7 in finalize_list () at /lib/libruby.so.2.1
#11 0x414a6594 in rb_gc_call_finalizer_at_exit () at /lib/libruby.so.2.1
#12 0x4148c40b in ruby_cleanup () at /lib/libruby.so.2.1
#13 0x4148c6b9 in ruby_run_node () at /lib/libruby.so.2.1
#14 0x080486a9 in main ()

Comment 5 Martin Milata 2015-02-11 17:27:12 UTC
I tried to reproduce this with a segfaulting ruby script but core_backtrace turns out fine.

Can you perhaps share your script or a part of it that still crashes?

Comment 6 Richard Z. 2015-02-11 20:47:22 UTC
Here is a test script I made earlier that appears to trigger the abrt problem. It seems to be slightly nondeterministic, so if it does not work I will upload the original test case, which consists of more files.

# cat ruby/rtest
#!/usr/bin/ruby

require "tkscrollbox"

top=TkRoot.new
frame=TkFrame.new(top)

# single, browse, multiple, extended

list_w = TkListbox.new(frame, 'selectmode' => 'multiple')
scroll_bar = TkScrollbar.new(frame,                                          
                             'command' => proc { |*args| list_w.yview *args })
scroll_bar.pack('side' => 'left', 'fill' => 'y')                             
list_w.yscrollcommand(proc { |first,last|
                        scroll_bar.set(first,last) })

list_w.insert "end", "text1"
list_w.insert "end", "text2"
list_w.insert "end", "text3"
list_w.insert "end", "text4"
list_w.insert "end", "text5"
list_w.insert "end", "text6"
list_w.insert "end", "text7"
list_w.insert "end", "text8"
list_w.insert "end", "text9"


list_w.pack
frame.pack

Tk.mainloop

Comment 7 Martin Milata 2015-02-12 15:09:39 UTC
Thanks - your example doesn't segfault on my freshly installed F21 VM, though. Please upload the other test case if you can, multiple files are not a problem.

Interesting facts:
On F20 the script usually terminates with an exception but sometimes the application runs OK (on F21, it always runs OK).
On F21 without X server running the script segfaults but the resulting core_backtrace looks normal.

Comment 8 Richard Z. 2015-02-12 21:16:33 UTC
Created attachment 991150
ruby-qt testcase

# run as
$ ruby rselect

Comment 9 Richard Z. 2015-02-12 21:18:02 UTC
Hope that one works better. It's my first try with qt-designer ;)

Comment 10 Martin Milata 2015-02-13 12:08:35 UTC
Works better (i.e. crashes:))

The core_backtrace I get on F21 (w/ no relevant debuginfo packages) looks normal, though.

If you reproduce the problem on your system, do you still get the core_backtrace you posted in comment 2?

Comment 11 Martin Milata 2015-02-13 12:12:48 UTC
Created attachment 991350
Correct core_backtrace

Comment 12 Richard Z. 2015-02-13 12:23:19 UTC
Yes, exactly the same. Also, when running the test in an xterm I see the automatically displayed "C level backtrace information" mentioned in the bug description, while the abrt backtrace is still wrong.

Is there a way to re-run the analysis from command line?

Comment 13 Richard Z. 2015-02-16 14:00:58 UTC
Not sure if this helps, but today I generated another segfault with ruby, and this time the abrt stack trace "worked":
 https://retrace.fedoraproject.org/faf/reports/538814/

It would seem that abrt did not catch the actual segfault but another one, which happened while ruby was doing its own backtrace reporting and stack mangling (indicated by "rb_print_backtrace" in the backtrace).
There is also some unwind-dw2 in that backtrace. I wonder if that causes the problem for abrt?

The other segfault test case (the Qt GUI) still reliably generates segfaults without any usable backtrace.

Comment 14 Martin Milata 2015-02-16 16:47:17 UTC
The core_backtrace can be re-generated by running "abrt-action-generate-core-backtrace -d /var/tmp/abrt/ccpp-<date_of_the_crash>".

The other segfault really looks like the Ruby SEGV handler segfaulted itself.

WRT the first problem, can you perhaps try uninstalling any relevant -debug packages (or all -debug packages) and check if the core_backtrace is still unusable?

Comment 15 Richard Z. 2015-02-16 21:18:44 UTC
I had already tried removing all debuginfo packages, and I also cleaned up a few leftovers from F19. The ruby-qt test case still results in the same problem.

Running "abrt" by hand does not give much insight and the problem stays the same, but I will attach the strace of it. Perhaps you can diff it against an strace from your (virtual) machine to see if there are any notable differences.

Comment 16 Richard Z. 2015-02-16 21:20:54 UTC
Created attachment 992368
strace of abrt -d ...

Comment 17 Martin Milata 2015-02-17 14:33:38 UTC
Oh, looks like you have a 32-bit system. That might be an important detail - I'll try to reproduce the problem on an i686.

Meanwhile, can you please try running "eu-stack -e /usr/bin/ruby-mri --core coredump" and post the result?

Comment 18 Martin Milata 2015-02-17 15:41:16 UTC
Never mind - I managed to reproduce it on my VM. eu-stack works fine so the problem is in abrt/satyr.

Comment 19 Martin Milata 2015-02-24 16:42:49 UTC
Looks like an incomplete workaround for elfutils bug #1129756 caused this (original issue: https://github.com/abrt/satyr/issues/163).

Patch posted for review: https://github.com/abrt/satyr/pull/218

Can you please test the scratch build with the fixed package if you have time? http://koji.fedoraproject.org/koji/taskinfo?taskID=9054273

Comment 20 Richard Z. 2015-02-25 12:36:03 UTC
After the update it does detect this as a "new problem". Unfortunately, bug reporting now fails with:


--- Running report_uReport ---
Server responded with an error: 'uReport data is invalid.'
reporter-ureport failed with exit code 1
('report_uReport' exited with 1)

The "Problem details" window displays a largish stack trace that looks plausible, but nothing gives me any further hint about what actually went wrong with reporting.

Comment 21 Martin Milata 2015-02-25 13:08:28 UTC
Weird. Can you please run "reporter-ureport -vvv -d /var/tmp/abrt/ccpp-<date_of_the_crash>" and attach the output?

Comment 22 Richard Z. 2015-02-25 13:24:12 UTC
Created attachment 995208
output of failing reporter-ureport -vvv -d ...

Comment 23 Richard Z. 2015-02-25 13:29:57 UTC
By the way, I updated with:
 rpm -Uhv https://kojipkgs.fedoraproject.org//work/tasks/4273/9054273/satyr-0.16-2.fc21.i386.rpm

Comment 24 Martin Milata 2015-02-25 17:18:29 UTC
Marek, do you have any idea why the uReport is rejected? Is it because of the frame without a file name (corresponding to the signal handler)?

Comment 25 Marek Bryša 2015-02-26 06:24:09 UTC
Yes, that's the cause:
"Element 'stacktrace' is invalid: List element is invalid: Element 'frames' is invalid: List element is invalid: Element 'file_name' is missing"
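The rule behind that error message can be illustrated with a short Python sketch that walks a report and lists every frame lacking `file_name`. This is only an illustration of the check described above, not the server's actual validation code. The `frames_missing_file_name` helper and the sample data are hypothetical.

```python
def frames_missing_file_name(report):
    """Yield (thread_index, frame_index) for frames that lack 'file_name'."""
    for ti, thread in enumerate(report.get("stacktrace", [])):
        for fi, frame in enumerate(thread.get("frames", [])):
            if not frame.get("file_name"):
                yield ti, fi

# Made-up sample: the second frame mimics a signal-handler frame
# that was emitted without a file name.
sample = {
    "stacktrace": [
        {
            "crash_thread": True,
            "frames": [
                {"address": 1, "file_name": "/lib/libc.so.6"},
                {"address": 2},
            ],
        }
    ]
}

missing = list(frames_missing_file_name(sample))
print(missing)  # [(0, 1)]
```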

Comment 26 Martin Milata 2015-02-27 14:30:53 UTC
I think I've got it - another patch posted for review: https://github.com/abrt/satyr/pull/219

Now the signal handler (and other VDSO frames) should have "linux-gate.so.1" as a file_name.
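The idea of the fix can be sketched in Python as a fallback applied to frames that have no file name. This is not the code from the pull request (satyr itself is written in C). The `fill_vdso_file_name` helper is hypothetical, and it rests on the simplifying assumption that any frame without `file_name` belongs to the VDSO.

```python
VDSO_NAME = "linux-gate.so.1"  # name given in the comment above

def fill_vdso_file_name(frames, vdso_name=VDSO_NAME):
    """Return copies of the frames with a fallback file_name where none is set.

    Assumption for illustration: a frame without file_name is a VDSO frame.
    """
    return [
        frame if frame.get("file_name") else {**frame, "file_name": vdso_name}
        for frame in frames
    ]

# Made-up sample frames; the second one has no file name.
frames = [{"address": 1, "file_name": "/lib/libc.so.6"}, {"address": 2}]
fixed = fill_vdso_file_name(frames)
print(fixed[1]["file_name"])  # linux-gate.so.1
```

The sketch returns new dictionaries rather than mutating its input, so the original frame list stays available for comparison.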

Comment 27 Richard Z. 2015-04-30 18:29:11 UTC
Are there any test builds that we could try?

Comment 28 Fedora End Of Life 2015-11-04 15:50:58 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 21 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 29 Fedora End Of Life 2015-12-02 08:54:29 UTC
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

