Bug 1439379
| Summary: | dis -r function+offset doesn't list instructions down to the offset value | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | John Siddle <jsiddle> |
| Component: | crash | Assignee: | Dave Anderson <anderson> |
| Status: | CLOSED ERRATA | QA Contact: | Qiao Zhao <qzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 7.3 | CC: | anderson, jsiddle, qzhao |
| Target Milestone: | rc | Flags: | qzhao:
needinfo-
|
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | crash-7.1.9-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-01 22:04:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
John Siddle
2017-04-05 20:41:07 UTC
> Not sure if relevant but s_show is a duplicate symbol in this kernel.
>
> crash> sym s_show
> ffffffff800aba08 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/kernel/kallsyms.c: 354
> ffffffff800aba62 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18-409.el5.x86_64/mm/slab.c: 3905
Well, it's most definitely relevant. It's disassembling the first one it
finds, and the first one ends at s_show+0x5f.
If you enter "dis s_show", it reminds you of the duplicates:
crash> dis s_show
dis: s_show: duplicate text symbols found:
ffffffff800a7f6a (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18.x86_64/kernel/kallsyms.c: 354
ffffffff800a7fc4 (t) s_show /usr/src/debug/kernel-2.6.18/linux-2.6.18.x86_64/mm/slab.c: 3905
crash>
And the intent is for the "duplicate text symbols" message to be shown
for all argument invocations including "dis -r", but apparently that's
not always the case. I'll take a look and see what's up.
A fix has been pushed upstream to github repository: https://github.com/crash-utility/crash/commit/3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0 Fix for the "dis" command to detect duplicate symbols in the case of a "symbol+offset" argument where the duplicates are contiguous in the symbol list. In addition, reject "symbol+offset" arguments if the resultant address goes beyond the end of the function. (anderson) Actually, the manner in which the error is reported is debatable. The patch for this bugzilla addresses the specific case reported in comment #1, in which there are two "s_show" symbols that are contiguous in the kernel symbol list. This was the case in RHEL5, where the kernel referenced in comment #1 is a 2.6.18-409.el5 kernel: crash> sys | grep RELEASE RELEASE: 2.6.18-409.el5 crash> sym -l ... [ cut ] ... ffffffff800aba08 (t) .text.s_show ffffffff800aba08 (t) s_show ffffffff800aba62 (t) s_show ffffffff800abcf3 (t) .text.update_iter ... The upstream patch that addresses the bug reported in this bugzilla is specific to that "contigous" symbol situation: https://github.com/crash-utility/crash/commit/3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0 Fix for the "dis" command to detect duplicate symbols in the case of a "symbol+offset" argument where the duplicates are contiguous in the symbol list. In addition, reject "symbol+offset" arguments if the resultant address goes beyond the end of the function. (anderson) Now, you are testing with a RHEL7 kernel, which has 5 duplicate "s_show" symbols, but none of them are contiguous. So when you used "s_show+0xb1" as an argument, it selected the first one (with the lowest address), and added 0xb1 to it. That evaluated to an higher-address value that is contained in the "mce_severity_amd" function (at mce_severity_amd+81). So that argument makes no sense as an argument to the "dis -r" option, and as a result, the command fails with the error message "invalid expression: s_show+0xb1 evaluates to: mce_severity_amd+81". So that is a correct error message. However, it can be argued that it could be a "better" error message, which would be to to indicate "duplicate text symbols found", followed by the list of duplicate symbols. Towards that end, I have posted a patch upstream to change the error message for the test scenario you have done: https://github.com/crash-utility/crash/commit/183a8113274f63579bb19bc9c4054ad0989a835f Fix for the "dis" command to detect duplicate symbols in the case of a "symbol+offset" argument where the duplicates are not contiguous in the symbol list. Without the patch, the first of multiple symbol instances is used in the address evaluation. With the patch, the command will fail with the error message "dis: <symbol+offset>: duplicate text symbols found:", followed by a list of the duplicate symbols, and their file and line numbers if available. (anderson) With the patch above applied, the error message is different: crash> dis -r s_show+b1 dis: s_show+b1: duplicate text symbols found: ffffffff81043d80 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/arch/x86/kernel/cpu/mcheck/mce-severity.c: 303 ffffffff811016d0 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/kallsyms.c: 528 ffffffff81151130 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/trace/trace.c: 2860 ffffffff811a5fb0 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/slab_common.c: 635 ffffffff811bd220 (t) s_show /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/vmalloc.c: 2640 crash> Now, as to whether it is worth updating the crash utility for the rhel-7.4 errata is debatable: (1) Because it is so late in the cycle, I cannot push the patch and respin the package because I cannot check the new patch into dist-git unless it is approved has the "blocker+" flag set. (2) The issue described by the bugzilla reporter has been fixed, and it could be argued that this is a separate-but-similar issue. (3) Given that there will undoubtedly be a rebase for rhel-7.5, the new patch will be part of that rebase. I don't have a strong opinion either way. And I do not have the permissions to set the "blocker?" flag. So Qiao or John, do you have a strong opinion either way? (In reply to Dave Anderson from comment #9) > Actually, the manner in which the error is reported is debatable. > > The patch for this bugzilla addresses the specific case reported > in comment #1, in which there are two "s_show" symbols that are > contiguous in the kernel symbol list. This was the case in RHEL5, > where the kernel referenced in comment #1 is a 2.6.18-409.el5 kernel: > > crash> sys | grep RELEASE > RELEASE: 2.6.18-409.el5 > crash> sym -l > ... [ cut ] ... > ffffffff800aba08 (t) .text.s_show > ffffffff800aba08 (t) s_show > ffffffff800aba62 (t) s_show > ffffffff800abcf3 (t) .text.update_iter > ... > > The upstream patch that addresses the bug reported in this > bugzilla is specific to that "contigous" symbol situation: > > > https://github.com/crash-utility/crash/commit/ > 3bb49a5a956ba4c1e262d1f812650ecd7e5cb9d0 > > Fix for the "dis" command to detect duplicate symbols in the case > of a "symbol+offset" argument where the duplicates are contiguous > in the symbol list. In addition, reject "symbol+offset" arguments > if the resultant address goes beyond the end of the function. > (anderson) > > Now, you are testing with a RHEL7 kernel, which has 5 duplicate > "s_show" symbols, but none of them are contiguous. So when you > used "s_show+0xb1" as an argument, it selected the first one > (with the lowest address), and added 0xb1 to it. That evaluated > to an higher-address value that is contained in the "mce_severity_amd" > function (at mce_severity_amd+81). So that argument makes no sense > as an argument to the "dis -r" option, and as a result, the command > fails with the error message "invalid expression: s_show+0xb1 > evaluates to: mce_severity_amd+81". > > So that is a correct error message. > > However, it can be argued that it could be a "better" error message, > which would be to to indicate "duplicate text symbols found", > followed by the list of duplicate symbols. > > Towards that end, I have posted a patch upstream to change the > error message for the test scenario you have done: > > > https://github.com/crash-utility/crash/commit/ > 183a8113274f63579bb19bc9c4054ad0989a835f > > Fix for the "dis" command to detect duplicate symbols in the case > of a "symbol+offset" argument where the duplicates are not contiguous > in the symbol list. Without the patch, the first of multiple symbol > instances is used in the address evaluation. With the patch, the > command will fail with the error message "dis: <symbol+offset>: > duplicate text symbols found:", followed by a list of the duplicate > symbols, and their file and line numbers if available. > (anderson) > > With the patch above applied, the error message is different: > > crash> dis -r s_show+b1 > dis: s_show+b1: duplicate text symbols found: > ffffffff81043d80 (t) s_show > /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/arch/x86/ > kernel/cpu/mcheck/mce-severity.c: 303 > ffffffff811016d0 (t) s_show > /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/ > kallsyms.c: 528 > ffffffff81151130 (t) s_show > /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/kernel/ > trace/trace.c: 2860 > ffffffff811a5fb0 (t) s_show > /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/ > slab_common.c: 635 > ffffffff811bd220 (t) s_show > /usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/mm/vmalloc. > c: 2640 > crash> > > Now, as to whether it is worth updating the crash utility for the > rhel-7.4 errata is debatable: > > (1) Because it is so late in the cycle, I cannot push the patch > and respin the package because I cannot check the new patch > into dist-git unless it is approved has the "blocker+" flag set. > (2) The issue described by the bugzilla reporter has been fixed, > and it could be argued that this is a separate-but-similar issue. > (3) Given that there will undoubtedly be a rebase for rhel-7.5, > the new patch will be part of that rebase. > > I don't have a strong opinion either way. And I do not have the > permissions to set the "blocker?" flag. > > So Qiao or John, do you have a strong opinion either way? Hi Dave, Sorry, I also no idea for this. -- Thanks, Qiao Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2019 |