Bug 510764

Summary: Data disappearing from print_backtrace on s390x between versions
Product: Red Hat Enterprise Linux 5 Reporter: Petr Muller <pmuller>
Component: systemtap Assignee: Frank Ch. Eigler <fche>
Status: CLOSED WONTFIX QA Contact: BaseOS QE <qe-baseos-auto>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.6 CC: mjw, ohudlick
Target Milestone: rc   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-07-12 19:54:50 UTC Type: ---

Description Petr Muller 2009-07-10 16:34:23 UTC
Description of problem:
On s390x, the testcase from bug 503225 exposed a very minor regression: symbol data that an old version of systemtap displayed in the print_backtrace output is no longer displayed by the new version.

Version-Release number of selected component (if applicable):
0.9.7-5.el5
kernel-2.6.18-156.el5

How reproducible:
Always

Steps to Reproduce:
1. Run the reproducer (the testcase under "Additional info")
2. Inspect the last line of the printed backtrace
  
Actual results:
tcp_nagle_check time_us=1247234364123187 pid=31956 exec=sshd mss_now=1348 nonagle=1
 0x00000000002188a2 : __tcp_push_pending_frames+0x656/0x8cc [kernel]               
[0x0000000018a73a58] [0x0000000018a73a70] 0x0000000018a73a70  

Expected results:
tcp_nagle_check time_us=1247234271692047 pid=31956 exec=sshd mss_now=1348 nonagle=1             
 0x00000000002188a2 : __tcp_push_pending_frames+0x656/0x8cc [kernel]                            
[0x0000000018a73a58] [0x0000000018a73a70] 0x0000000018a73a70 : iucv_init+0x186b1838/0x0 [kernel]

Additional info:
the testcase:

global cur_mss, nonagle

probe kernel.function("tcp_snd_test").call,
      kernel.function("__tcp_push_pending_frames").call
{
    cur_mss[tid()] = $cur_mss;
    nonagle[tid()] = $nonagle;
}



probe kernel.function("tcp_snd_test").return,
      kernel.function("__tcp_push_pending_frames").return
{
    delete cur_mss[tid()];
    delete nonagle[tid()];
}



probe kernel.function("tcp_nagle_check")
{
    if(isinstr(execname(), "ssh"))
    {
        printf("tcp_nagle_check time_us=%d pid=%5d exec=%s mss_now=%d nonagle=%d\n",
               gettimeofday_us(), pid(), execname(), cur_mss[tid()], nonagle[tid()]);
        print_backtrace();
    }
}

Comment 1 Mark Wielaard 2009-07-10 17:35:52 UTC
(In reply to comment #0)
> On s390x, the testcase from bug 503225 showed a minor issue where a data
> displayed by an old version of systemtap were not displayed by a new version,
> which is actually a very minor regression. 

Do you happen to have the version number of that old version of systemtap (and preferably also the kernel version)?

Comment 2 Petr Muller 2009-07-12 22:59:54 UTC
Sure.

I've already given a kernel version :) - kernel-2.6.18-156.el5

Sorry for not giving the exact versions - I sometimes tend to suffer from the Curse of Knowledge in the RHEL world...

The old version is the systemtap shipped in RHEL 5.3.z, i.e. systemtap-0.7.2-3.el5_3.

Comment 3 Mark Wielaard 2009-07-17 11:38:54 UTC
I cannot immediately find what could have changed here, and I don't have an s390x machine handy. The symbol lookup code did change somewhat since then, but it should behave similarly for this case. The code that doesn't find the name is in runtime/sym.c:

static void _stp_symbol_print(unsigned long address)
{
        const char *modname;
        const char *name;
        unsigned long offset, size;

        name = _stp_kallsyms_lookup(address, &size, &offset, &modname, NULL, NULL);

        _stp_printf("%p", (int64_t) address);

        if (name) {
                if (modname && *modname)
                        _stp_printf(" : %s+%#lx/%#lx [%s]", name, offset, size, modname);
                else
                        _stp_printf(" : %s+%#lx/%#lx", name, offset, size);
        }
}

So _stp_kallsyms_lookup() failed for the given address, which is why only the bare address is printed. If the kernel version is the same between the runs, I cannot really explain at the moment why that would happen.
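
To make the failure mode concrete, here is a minimal standalone sketch of the same print logic. It is not systemtap's code: the struct sym type, the symtab table with its made-up entries, and the lookup() and format_symbol() helpers are all hypothetical stand-ins, and plain snprintf() replaces _stp_printf(). It only illustrates that when the lookup returns NULL, nothing but the address is emitted, which matches the "Actual results" line above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table entry; NOT systemtap's real data structure. */
struct sym {
    unsigned long addr;   /* start address of the symbol */
    unsigned long size;   /* size of the symbol in bytes */
    const char *name;
};

/* Illustrative table; the addresses and names are made up. */
static const struct sym symtab[] = {
    { 0x1000UL, 0x100UL, "func_a" },
    { 0x2000UL, 0x8ccUL, "func_b" },
};

/* Return the entry whose [addr, addr + size) range contains the address,
 * or NULL when no entry covers it. */
static const struct sym *lookup(unsigned long address, unsigned long *offset)
{
    size_t n = sizeof(symtab) / sizeof(symtab[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        const struct sym *s = &symtab[i];
        if (address >= s->addr && address - s->addr < s->size) {
            *offset = address - s->addr;
            return s;
        }
    }
    return NULL;
}

/* Same shape as _stp_symbol_print(): the address is always written,
 * the " : name+offset/size" suffix only when the lookup succeeds. */
static int format_symbol(char *buf, size_t len, unsigned long address)
{
    unsigned long offset = 0;
    const struct sym *s = lookup(address, &offset);

    if (s)
        return snprintf(buf, len, "0x%016lx : %s+%#lx/%#lx",
                        address, s->name, offset, s->size);
    return snprintf(buf, len, "0x%016lx", address);
}
```

Calling format_symbol() with an address inside func_b yields the full "address : name+offset/size" form, while an address outside every table entry yields the address alone.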

Comment 4 Frank Ch. Eigler 2009-10-27 13:28:44 UTC
Sorry, we don't have a complete analysis/patch for this problem yet.

Comment 5 Frank Ch. Eigler 2010-01-07 00:32:37 UTC
Need to reassess to what extent the 1.0.9 code for rhel5.5 improves this; otherwise defer.

Comment 6 Frank Ch. Eigler 2010-07-12 19:40:14 UTC
We cannot promise to fix this one in time, but the general DWARF unwinders are looking very good in systemtap 1.3; porting s390 to them might be possible.

Comment 7 RHEL Program Management 2010-07-12 19:54:50 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.