Bug 1125990 - RFE: Add support for kernel source code, aka 'gdb list'
Summary: RFE: Add support for kernel source code, aka 'gdb list'
Keywords:
Status: NEW
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: retrace-server
Version: el6
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Matej Marušák
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-08-01 14:31 UTC by Dave Wysochanski
Modified: 2018-07-19 06:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:



Description Dave Wysochanski 2014-08-01 14:31:21 UTC
Description of problem:
This is an RFE to add kernel source support into retrace-server in some way.
This should be an optional feature.  The design needs to be worked out, and of course this will increase storage requirements.

We could support it directly in retrace-server or via a custom hook 'extension' to retrace-server, provided we implement the custom hooks (see https://bugzilla.redhat.com/show_bug.cgi?id=1082376) and the kernel version of the vmcore is easily available (it is today in /cores/retrace/tasks/<taskid>/kernelver).
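As a rough illustration of the hook idea, the sketch below reads the kernel version recorded for a task and maps it to a source directory. Only the kernelver path comes from this report; the function names, the sources_root argument, and the one-tree-per-version layout are assumptions made for the example and are not part of retrace-server.

```python
from pathlib import Path

# Task root taken from the path quoted in this report.
TASKS_ROOT = Path("/cores/retrace/tasks")


def read_kernelver(task_id, tasks_root=TASKS_ROOT):
    """Return the kernel version string stored for a task, or None if missing."""
    path = Path(tasks_root) / str(task_id) / "kernelver"
    try:
        return path.read_text().strip() or None
    except FileNotFoundError:
        return None


def source_dir_for(task_id, sources_root, tasks_root=TASKS_ROOT):
    """Hypothetical layout: one unpacked source tree per kernel version."""
    ver = read_kernelver(task_id, tasks_root)
    return None if ver is None else Path(sources_root) / ver
```

A hook could call source_dir_for() after a task finishes and fetch or unpack the tree only when that directory does not exist yet.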


Version-Release number of selected component (if applicable):
retrace-server-1.11-4.el6.noarch


Actual results:
Unable to list source code via 'gdb list' on a retrace-server machine.  People have to clone git trees or rpmbuild -bp the kernel source to look at the source tree for every vmcore.


Expected results:
Ability to use 'gdb list' to show source code of vmcores.

crash> bt -l
PID: 6674   TASK: ffff88007b8f0ae0  CPU: 0   COMMAND: "test_locks5"
...
    [exception RIP: locks_remove_flock+253]
    RIP: ffffffff811d92bd  RSP: ffff88007b8f3de8  RFLAGS: 00010246
    RAX: 0000000000000081  RBX: ffff88007abe73c0  RCX: 0000000000000000
    RDX: ffff88007b8f0ae0  RSI: ffff880037cd17c0  RDI: 0000000000000282
    RBP: ffff88007b8f3eb8   R8: ffff880079ccfc80   R9: 0000000000000002
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff88007b28d730
    R13: ffff88007b8f3de8  R14: ffff88007b329780  R15: ffff8800376996c0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #7 [ffff88007b8f3de0] locks_remove_flock at ffffffff811d9283
    /usr/src/debug/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/fs/locks.c: 2026
 #8 [ffff88007b8f3ec0] __fput at ffffffff8118a3c0
    /usr/src/debug/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/fs/file_table.c: 248
 #9 [ffff88007b8f3f10] fput at ffffffff8118a525
    /usr/src/debug/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/fs/file_table.c: 200
#10 [ffff88007b8f3f20] filp_close at ffffffff8118584d
    /usr/src/debug/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/fs/open.c: 977
#11 [ffff88007b8f3f50] sys_close at ffffffff81185925
    /usr/src/debug/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/fs/open.c: 1007
#12 [ffff88007b8f3f80] tracesys at ffffffff8100b288 (via system_call)
    /usr/src/debug////////kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64/arch/x86/kernel/entry_64.S: 606
    RIP: 00007f7fddfd65ad  RSP: 00007f7fde3f5e70  RFLAGS: 00000293
    RAX: ffffffffffffffda  RBX: ffffffff8100b288  RCX: ffffffffffffffff
    RDX: 0000000000000004  RSI: 0000000000400000  RDI: 0000000000000003
    RBP: 00007f7fde3f5eb0   R8: 00000000ffffffff   R9: 00007f7fde3f6700
    R10: 00007f7fde3f5c00  R11: 0000000000000293  R12: 00007f7fde3f69c0
    R13: 00007fffb3475d70  R14: 00007f7fd80008c0  R15: 0000000000000003
    ORIG_RAX: 0000000000000003  CS: 0033  SS: 002b
crash> gdb list fs/locks.c:2026
2021            }
2022    
2023            lock_kernel();
2024            before = &inode->i_flock;
2025    
2026            while ((fl = *before) != NULL) {
2027                    if (fl->fl_file == filp) {
2028                            if (IS_FLOCK(fl)) {
2029                                    locks_delete_lock(before);
2030                                    continue;
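
The file and line arguments for 'gdb list' can be derived from the 'bt -l' output shown above. A minimal sketch, assuming the debuginfo paths always follow the /usr/src/debug/kernel-<version>/linux-<version>.<arch>/ pattern seen in this backtrace (the function name is made up for illustration):

```python
import re

# Matches source-location lines from 'bt -l', for example:
#   /usr/src/debug/kernel-<ver>/linux-<ver>.<arch>/fs/locks.c: 2026
# Repeated slashes after "debug" (as in the entry_64.S frame) are tolerated.
_SRC_LINE = re.compile(
    r"^\s*/usr/src/debug/+kernel-[^/]+/linux-[^/]+/(?P<path>\S+):\s*(?P<line>\d+)\s*$"
)


def gdb_list_targets(bt_output):
    """Return 'relative/path.c:LINE' strings for each source line in bt -l output."""
    targets = []
    for line in bt_output.splitlines():
        m = _SRC_LINE.match(line)
        if m:
            targets.append(f"{m.group('path')}:{m.group('line')}")
    return targets
```

Each returned string can be passed to 'gdb list' as in the example above, provided gdb can find the tree under the path recorded in the debuginfo or is pointed at it some other way.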



Additional info:
For practical / implementation purposes, we'll need to account for increased storage space as well.  Storage space is outside the scope of this RFE, but we may want to mention it in documentation and/or in a comment above any config option in /etc/retrace-server.conf.

The question is, how do we provide the source trees?  If git trees are available, a lot of space can be saved with 'git clone --reference'.  Using "rpmbuild -bp" instead may be more reliable.
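
To make the two options concrete, the sketch below only builds the command lines and does not run them. The helper names, the optional tag argument, and the use of --define "_topdir ..." to keep the rpmbuild tree in a chosen location are assumptions for illustration, not a proposed retrace-server interface.

```python
def git_clone_cmd(url, reference_repo, dest, tag=None):
    """Command for a clone that borrows objects from an existing local repository."""
    cmd = ["git", "clone", "--reference", str(reference_repo)]
    if tag:
        # --branch also accepts a tag name; the tag naming scheme is site specific.
        cmd += ["--branch", tag]
    cmd += [url, str(dest)]
    return cmd


def rpmbuild_prep_cmd(spec_file, topdir):
    """Command that runs only the %prep stage, leaving a patched tree under BUILD."""
    return ["rpmbuild", "-bp", "--define", f"_topdir {topdir}", str(spec_file)]
```

Either list can be handed to subprocess.run(cmd, check=True). The git variant saves space only while the reference repository stays in place, since the clone depends on its objects.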

Comment 5 Lachlan McIlroy 2014-08-04 00:59:40 UTC
This is definitely a useful feature - there have been many times when I've opened a vmcore and wished this feature was available.  The ability to quickly look at the faulting source code could result in an equally quick confirmation of a known problem without needing to find a repository and check out the correct branch.

But with most vmcore analyses I do, it's only a matter of time before I need the full kernel source indexed with cscope/tags so I can search it, plus the recent change history (not to mention other trees to compare against).  So while this is a very convenient feature, it does have its limits.

Comment 6 Fabio Olive Leite 2014-08-04 14:15:38 UTC
This is indeed an interesting feature that can increase our efficiency when doing the "first touch" on a vmcore and, as Lachlan points out, help us quickly confirm known issues or check the vmcore sources against the most current ones.

Since many parts of the kernel remain unchanged between minor releases and z-streams, an engineer could be very efficient with a checkout of the most recent sources, comparing it against what crash shows for the vmcore.  If the sources are the same for the problem under consideration, the engineer can quickly confirm or discard a hypothesis without waiting for the local checkout for that particular release, or decide to check the commit log for the file in question to quickly determine why something changed.

My only concern would be how space-efficient we can make the source storage.  I wonder how much we can use hardlinking and deduplication to keep that manageable.
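
As a rough sketch of the hardlinking idea: given two unpacked trees on the same filesystem, files that are byte-identical at the same relative path can be replaced with hard links to a single copy. The function name and the two-tree layout are assumptions for illustration, and the trees must be treated as read-only afterwards because linked files share their content.

```python
import filecmp
import os


def hardlink_duplicates(base_tree, new_tree):
    """Hard-link files in new_tree that are identical to base_tree; return bytes saved."""
    saved = 0
    for dirpath, _dirs, files in os.walk(new_tree):
        rel = os.path.relpath(dirpath, new_tree)
        for name in files:
            new = os.path.join(dirpath, name)
            base = os.path.join(base_tree, rel, name)
            # Only regular files present in both trees are candidates.
            if os.path.islink(new) or os.path.islink(base):
                continue
            if not os.path.isfile(base) or not os.path.isfile(new):
                continue
            if os.path.samefile(base, new):
                continue  # already linked
            if filecmp.cmp(base, new, shallow=False):
                size = os.path.getsize(new)
                tmp = new + ".lnk"
                os.link(base, tmp)    # fails across filesystems
                os.replace(tmp, new)  # atomically swap in the link
                saved += size
    return saved
```

The byte-for-byte comparison is slow on large trees; comparing sizes or checksums first would be a natural refinement.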

Comment 7 Dave Wysochanski 2014-08-04 21:58:39 UTC
Yes, I was thinking it would be useful to have the source even if all it was used for was listing the source code of the faulting task.  Having the source right there automatically for every vmcore also opens more possibilities for automated analysis.

The other thing is it just feels incomplete to me to have a core (userspace or kernel) analysis system without source code integration.

Point taken that a lot of people who work on vmcores have their own trees and/or may need more than just the one tree.  I would like to hear from some kernel TSE and non-SEGs.

