RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1439379 - dis -r function+offset doesn't list instructions down to the offset value
Summary: dis -r function+offset doesn't list instructions down to the offset value
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: crash
Version: 7.3
Hardware: x86_64
OS: Linux
low
low
Target Milestone: rc
: ---
Assignee: Dave Anderson
QA Contact: Qiao Zhao
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-04-05 20:41 UTC by John Siddle
Modified: 2019-02-19 21:40 UTC (History)
3 users (show)

Fixed In Version: crash-7.1.9-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 22:04:38 UTC
Target Upstream Version:
Embargoed:
qzhao: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2019 0 normal SHIPPED_LIVE crash bug fix and enhancement update 2017-08-01 19:31:13 UTC

Description John Siddle 2017-04-05 20:41:07 UTC
Description of problem:

In a certain core, running 'dis -r s_show+0xb1' will only display instructions to s_show+0x5f. I've verified that 0xb1 is a valid offset for this function.

Version-Release number of selected component (if applicable):

crash-7.1.8-1.el6_8.x86_64

How reproducible:

Easily

Steps to Reproduce:
1. ssh optimus.gsslab.rdu2.redhat.com
2. login with kerberos credentials
3. $ retrace-server-interact 118000389 crash
4. crash> dis -r s_show+0xb1

Actual results:

Instructions only go up to s_show+0x5f

Expected results:

Instructions should go to the specified offset, s_show+0xb1

Additional info:

Using the RIP instead of function+offset works fine, i.e. 'dis -r ffffffff800abb13' works as expected.

Not sure if relevant but s_show is a duplicate symbol in this kernel.

crash> sym s_show
ffffffff800aba08 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/kernel/kallsyms.c: 354
ffffffff800aba62 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/mm/slab.c: 3905

Comment 2 Dave Anderson 2017-04-05 20:57:42 UTC
> Not sure if relevant but s_show is a duplicate symbol in this kernel.
> 
> crash> sym s_show
> ffffffff800aba08 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/kernel/kallsyms.c: 354
> ffffffff800aba62 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/mm/slab.c: 3905

Well, it's most definitely relevant.  It's disassembling the first one it
finds, and the first one ends at s_show+0x5f.

If you enter "dis s_show", it reminds you of the duplicates:

crash> dis s_show
dis: s_show: duplicate text symbols found:
ffffffff800a7f6a (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18.x86_64/kernel/kallsyms.c: 354
ffffffff800a7fc4 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18.x86_64/mm/slab.c: 3905
crash> 

And the intent is for the "duplicate text symbols" message to be shown
for all argument invocations including "dis -r", but apparently that's
not always the case.  I'll take a look and see what's up.

Comment 3 Dave Anderson 2017-04-07 15:54:17 UTC
A fix has been pushed upstream to github repository:

https://github.com/crash-utility/crash/commit/3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0

  Fix for the "dis" command to detect duplicate symbols in the case
  of a "symbol+offset" argument where the duplicates are contiguous
  in the symbol list.  In addition, reject "symbol+offset" arguments
  if the resultant address goes beyond the end of the function.
  (anderson)

Comment 9 Dave Anderson 2017-06-02 20:07:32 UTC
Actually, the manner in which the error is reported is debatable.

The patch for this bugzilla addresses the specific case reported
in comment #1, in which there are two "s_show" symbols that are
contiguous in the kernel symbol list.  This was the case in RHEL5,
where the kernel referenced in comment #1 is a 2.6.18-409.el5 kernel:

  crash> sys | grep RELEASE
       RELEASE: 2.6.18-409.el5
  crash> sym -l 
  ... [ cut ] ...
  ffffffff800aba08 (t) .text.s_show
  ffffffff800aba08 (t) s_show
  ffffffff800aba62 (t) s_show
  ffffffff800abcf3 (t) .text.update_iter
  ...

The upstream patch that addresses the bug reported in this 
bugzilla is specific to that "contigous" symbol situation:

  https://github.com/crash-utility/crash/commit/3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0

  Fix for the "dis" command to detect duplicate symbols in the case
  of a "symbol+offset" argument where the duplicates are contiguous
  in the symbol list.  In addition, reject "symbol+offset" arguments
  if the resultant address goes beyond the end of the function.
  (anderson)
  
Now, you are testing with a RHEL7 kernel, which has 5 duplicate
"s_show" symbols, but none of them are contiguous.  So when you
used "s_show+0xb1" as an argument, it selected the first one
(with the lowest address), and added 0xb1 to it.  That evaluated
to an higher-address value that is contained in the "mce_severity_amd" 
function (at mce_severity_amd+81).  So that argument makes no sense 
as an argument to the "dis -r" option, and as a result, the command
fails with the error message "invalid expression: s_show+0xb1 
evaluates to: mce_severity_amd+81".

So that is a correct error message.  

However, it can be argued that it could be a "better" error message,
which would be to to indicate "duplicate text symbols found", 
followed by the list of duplicate symbols.

Towards that end, I have posted a patch upstream to change the
error message for the test scenario you have done:

  https://github.com/crash-utility/crash/commit/183a8113274f63579bb19bc9c4054ad0989a835f
  
  Fix for the "dis" command to detect duplicate symbols in the case
  of a "symbol+offset" argument where the duplicates are not contiguous
  in the symbol list.  Without the patch, the first of multiple symbol
  instances is used in the address evaluation.  With the patch, the
  command will fail with the error message "dis: <symbol+offset>:
  duplicate text symbols found:", followed by a list of the duplicate
  symbols, and their file and line numbers if available.
  (anderson)

With the patch above applied, the error message is different:

  crash> dis -r s_show+b1
  dis: s_show+b1: duplicate text symbols found:
  ffffffff81043d80 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/arch/x86/kernel/cpu/mcheck/mce-severity.c: 303
  ffffffff811016d0 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/kallsyms.c: 528
  ffffffff81151130 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/trace/trace.c: 2860
  ffffffff811a5fb0 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/slab_common.c: 635
  ffffffff811bd220 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/vmalloc.c: 2640
  crash> 

Now, as to whether it is worth updating the crash utility for the
rhel-7.4 errata is debatable:

(1) Because it is so late in the cycle, I cannot push the patch 
    and respin the package because I cannot check the new patch 
    into dist-git unless it is approved has the "blocker+" flag set.
(2) The issue described by the bugzilla reporter has been fixed, 
    and it could be argued that this is a separate-but-similar issue.
(3) Given that there will undoubtedly be a rebase for rhel-7.5,
    the new patch will be part of that rebase. 

I don't have a strong opinion either way.  And I do not have the 
permissions to set the "blocker?" flag.

So Qiao or John, do you have a strong opinion either way?

Comment 10 Qiao Zhao 2017-06-06 09:54:47 UTC
(In reply to Dave Anderson from comment #9)
> Actually, the manner in which the error is reported is debatable.
> 
> The patch for this bugzilla addresses the specific case reported
> in comment #1, in which there are two "s_show" symbols that are
> contiguous in the kernel symbol list.  This was the case in RHEL5,
> where the kernel referenced in comment #1 is a 2.6.18-409.el5 kernel:
> 
>   crash> sys | grep RELEASE
>        RELEASE: 2.6.18-409.el5
>   crash> sym -l 
>   ... [ cut ] ...
>   ffffffff800aba08 (t) .text.s_show
>   ffffffff800aba08 (t) s_show
>   ffffffff800aba62 (t) s_show
>   ffffffff800abcf3 (t) .text.update_iter
>   ...
> 
> The upstream patch that addresses the bug reported in this 
> bugzilla is specific to that "contigous" symbol situation:
> 
>  
> https://github.com/crash-utility/crash/commit/
> 3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0
> 
>   Fix for the "dis" command to detect duplicate symbols in the case
>   of a "symbol+offset" argument where the duplicates are contiguous
>   in the symbol list.  In addition, reject "symbol+offset" arguments
>   if the resultant address goes beyond the end of the function.
>   (anderson)
>   
> Now, you are testing with a RHEL7 kernel, which has 5 duplicate
> "s_show" symbols, but none of them are contiguous.  So when you
> used "s_show+0xb1" as an argument, it selected the first one
> (with the lowest address), and added 0xb1 to it.  That evaluated
> to an higher-address value that is contained in the "mce_severity_amd" 
> function (at mce_severity_amd+81).  So that argument makes no sense 
> as an argument to the "dis -r" option, and as a result, the command
> fails with the error message "invalid expression: s_show+0xb1 
> evaluates to: mce_severity_amd+81".
> 
> So that is a correct error message.  
> 
> However, it can be argued that it could be a "better" error message,
> which would be to to indicate "duplicate text symbols found", 
> followed by the list of duplicate symbols.
> 
> Towards that end, I have posted a patch upstream to change the
> error message for the test scenario you have done:
> 
>  
> https://github.com/crash-utility/crash/commit/
> 183a8113274f63579bb19bc9c4054ad0989a835f
>   
>   Fix for the "dis" command to detect duplicate symbols in the case
>   of a "symbol+offset" argument where the duplicates are not contiguous
>   in the symbol list.  Without the patch, the first of multiple symbol
>   instances is used in the address evaluation.  With the patch, the
>   command will fail with the error message "dis: <symbol+offset>:
>   duplicate text symbols found:", followed by a list of the duplicate
>   symbols, and their file and line numbers if available.
>   (anderson)
> 
> With the patch above applied, the error message is different:
> 
>   crash> dis -r s_show+b1
>   dis: s_show+b1: duplicate text symbols found:
>   ffffffff81043d80 (t) s_show
> /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/arch/x86/
> kernel/cpu/mcheck/mce-severity.c: 303
>   ffffffff811016d0 (t) s_show
> /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/
> kallsyms.c: 528
>   ffffffff81151130 (t) s_show
> /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/
> trace/trace.c: 2860
>   ffffffff811a5fb0 (t) s_show
> /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/
> slab_common.c: 635
>   ffffffff811bd220 (t) s_show
> /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/vmalloc.
> c: 2640
>   crash> 
> 
> Now, as to whether it is worth updating the crash utility for the
> rhel-7.4 errata is debatable:
> 
> (1) Because it is so late in the cycle, I cannot push the patch 
>     and respin the package because I cannot check the new patch 
>     into dist-git unless it is approved has the "blocker+" flag set.
> (2) The issue described by the bugzilla reporter has been fixed, 
>     and it could be argued that this is a separate-but-similar issue.
> (3) Given that there will undoubtedly be a rebase for rhel-7.5,
>     the new patch will be part of that rebase. 
> 
> I don't have a strong opinion either way.  And I do not have the 
> permissions to set the "blocker?" flag.
> 
> So Qiao or John, do you have a strong opinion either way?

Hi Dave,

Sorry, I also no idea for this.

--
Thanks,
Qiao

Comment 14 errata-xmlrpc 2017-08-01 22:04:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2019


Note You need to log in before you can comment on or make changes to this bug.