Bug 1360415 - Crash does not always parse correctly the modules symbol tables
Summary: Crash does not always parse correctly the modules symbol tables
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: crash
Version: 7.4
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Dave Anderson
QA Contact: Emma Wu
URL:
Whiteboard:
Depends On:
Blocks: 1394638 1404314
Reported: 2016-07-26 16:23 UTC by Sebastien Piechurski
Modified: 2019-02-19 21:40 UTC (History)

Fixed In Version: crash-7.1.8-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 22:04:38 UTC
Target Upstream Version:
Embargoed:


Attachments
patch for crash (2.39 KB, patch)
2016-07-26 16:23 UTC, Sebastien Piechurski
module which display crash issue (152.02 KB, application/octet-stream)
2016-11-25 13:22 UTC, alexandre.louvet


Links
System: Red Hat Product Errata
ID: RHBA-2017:2019
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: crash bug fix and enhancement update
Last Updated: 2017-08-01 19:31:13 UTC

Description Sebastien Piechurski 2016-07-26 16:23:47 UTC
Created attachment 1184326 [details]
patch for crash

Description of problem:
Crash sometimes prints garbage in place of the symbol name

Version-Release number of selected component (if applicable):
crash-7.1.2-3.el7_2.1.x86_64

Issue description:
Under some circumstances crash fails to decode a symbol name and produces garbage

For example, on a node running Lustre (https://wiki.hpdd.intel.com/display/PUB/HPDD+Wiki+Front+Page), for a given symbol I get:
crash> sym 0xffffffffa0537d70
ffffffffa0537d70 (?) <FF>СT<A0><FF><FF><FF><FF>V<C4>U<A0><FF><FF><FF><FF>ДT<A0><FF><FF><FF><FF>'<C4>U<A0><FF><FF><FF><FF><B0><94>T<A0><FF><FF><FF>
<FF>><C4>U<A0><FF><FF><FF><FF>@<8F>T<A0><FF><FF><FF><FF>^P<C4>U<A0><FF>
<FF><FF><FF><F0><98>T<A0><FF><FF><FF><FF><83><C4>U<A0><FF><FF><FF><FF><A0><95>T<A0><FF><FF><FF><FF><FE><C3>U<A0><FF><FF><FF><FF> [libcfs]

Meanwhile, if I ask crash to resolve the symbol by name, I get the expected result:
crash> sym libcfs_log_return
ffffffffa0537d70 (t) libcfs_log_return [libcfs]

Note that the address used in the previous example is the one associated with the symbol name. Crash was able to resolve the association in one direction, but not in the reverse one.

Digging into the issue a little, I found a problem in the way crash caches symbols in store_module_symbols_v2:
		for (i = first = last = 0; i < nsyms; i++) {
			modsym = (struct kernel_symbol *)
			    (modsymbuf + (i * sizeof(struct kernel_symbol)));
			if (!first)
				first = (ulong)modsym->name;
			last = (ulong)modsym->name;
		}

The code assumes that the symbol names in the string buffer appear in exactly the same order as the symbol descriptions, but nothing guarantees that this is always the case. With this code, first always points to the first non-null name and last always points to the final one. Nothing guarantees that first holds the lowest address or that last holds the highest one. However, the code that resolves symbol names later on relies heavily on this assumption:
			if (strbuf) 
				strcpy(buf1,
					&strbuf[(ulong)modsym->name - first]);

If 'first' does not hold the lowest address, the result is garbage (in the best case).
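
To make the failure mode concrete, here is a minimal sketch of the offset arithmetic, using the two name addresses from the libcfs dump in this report. The helper names (name_offset, offset_in_buffer) are hypothetical and are not part of crash; the sketch only assumes that the offset is computed with unsigned 64-bit arithmetic, as the (ulong) cast does on x86_64.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a crash function): the offset of a name
 * inside a string buffer that is assumed to start at address `first`.
 * It mirrors the unsigned expression (ulong)modsym->name - first.
 */
static uint64_t name_offset(uint64_t name_addr, uint64_t first)
{
	/* Unsigned subtraction wraps around when name_addr < first. */
	return name_addr - first;
}

/* Hypothetical helper: does the offset land inside a buffer of buflen bytes? */
static int offset_in_buffer(uint64_t offset, uint64_t buflen)
{
	assert(buflen > 0);
	return offset < buflen;
}
```

With first taken from the first kernel_symbol entry (0xffffffffa055bc34) and a later name stored at the lower address 0xffffffffa055bc1d, the subtraction wraps to a value close to 2^64, so the index points far outside any reasonably sized string buffer.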

On my system, looking at how the symbols in libcfs are stored, I see:
crash> mod | grep libcfs
ffffffffa0564420  libcfs                521580  (not loaded)  [CONFIG_KALLSYMS]
crash> module.syms ffffffffa0564420
  syms = 0xffffffffa054f030
crash> p ((struct kernel_symbol *)0xffffffffa054f030)[0]
$2 = {
  value = 18446744072104412544,
  name = 0xffffffffa055bc34 "__cfs_fail_check_set" <<<<<< address
}
$3 = {
  value = 18446744072104413168,
  name = 0xffffffffa055bc1d "__cfs_fail_timeout_set"  <<<<< address
}

A similar issue seems to exist for ngplsyms and in store_module_symbols_v1.
The attached patch is a proposal to address this issue (against crash 7.1.5). It makes sure the buffer covers the entire address range of the name strings.
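
The idea of covering the whole range can be sketched as follows. This is an illustration only, not the attached patch: struct ksym_sketch and name_range are hypothetical names, the struct only stands in for the two kernel_symbol fields shown above, and the sketch assumes at least one symbol and no null name pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two kernel_symbol fields used here. */
struct ksym_sketch {
	uint64_t value;
	uint64_t name;	/* address of the name string in module memory */
};

/*
 * Find the lowest and highest name addresses without assuming that the
 * strings are laid out in the same order as the symbol entries.
 */
static void name_range(const struct ksym_sketch *syms, size_t nsyms,
		       uint64_t *first, uint64_t *last)
{
	size_t i;

	assert(nsyms > 0);
	*first = *last = syms[0].name;
	for (i = 1; i < nsyms; i++) {
		if (syms[i].name < *first)
			*first = syms[i].name;	/* new lowest address */
		if (syms[i].name > *last)
			*last = syms[i].name;	/* new highest address */
	}
}
```

A string buffer read from first up to last (plus room for the final name string) then contains every name, so an offset computed as name - first stays inside it whatever the ordering.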

Comment 2 Dave Anderson 2016-08-08 15:41:08 UTC
Would it be possible that you could make a vmlinux/vmcore pair that shows
this problem available?

Comment 3 Dave Anderson 2016-08-09 15:33:10 UTC
You can contact me offline with the location of your vmlinux/vmcore pair.

Comment 4 Dave Anderson 2016-08-16 19:26:06 UTC
I cannot reproduce it to the extent that your example shows, but I do see
occasional "double" symbols, where for example, "sys -M" shows the same
module address mapped to two different symbols, where one of them
has a truncated name string.

In any case, the patch makes sense, and has been checked in upstream:

  https://github.com/crash-utility/crash/commit/2399cce9b7e93ea8b6b21b09873cfa7f091eea7b

  Fix for the gathering of module symbol name strings during session
  initialization.  In the unlikely case where the ordering of module
  symbol name strings does not match the order of the kernel_symbol
  structures, a faulty module symbol list entry may be created that
  contains a bogus name string.
  (sebastien.piechurski)

Comment 5 Qiao Zhao 2016-11-25 09:28:45 UTC
(In reply to Dave Anderson from comment #4)

Hi Dave Anderson,

Sorry to disturb you. I tried to reproduce this problem on my KVM guest, but when I ran "sys -M" at the crash> prompt, I got the error message "sys: invalid option -- 'M'".
Could you give me some idea how to reproduce this problem?

test package: crash-7.1.2-4.el7.x86_64

--
Thanks,
Qiao

Comment 6 alexandre.louvet 2016-11-25 13:22:59 UTC
Created attachment 1224321 [details]
module which display crash issue

Here is how to reproduce the original issue:

- on a box running
  * kernel 3.10.0-327.36.3.el7.x86_64
  * crash-7.1.2-3.el7_2.x86_64

- load the attached libcfs.ko kernel module (I don't know exactly how to generate a synthetic module that shows this issue, so I am giving you the one on which the issue was discovered and which I used to track it down).
  # gzip -d ./libcfs.ko.gz
  # insmod ./libcfs.ko

- start crash on the live system
  # crash

- examine the symbol
  crash> sym libcfs_log_return
  ffffffffa05a2de0 (t) libcfs_log_return [libcfs]

  crash> sym ffffffffa05a2de0
  ffffffffa05a2de0 (?) <FF>@R[<A0><FF><FF><FF><FF>^^u\<A0><FF><FF><FF><FF>@E[<A0><FF><FF><FF><FF><EF>t\<A0>
<FF><FF><FF><FF> E[<A0><FF><FF><FF><FF>^Fu\<A0><FF><FF><FF><FF><B0>?[<A0><FF><FF><FF><FF><D8>t\<A0><FF>
<FF><FF><FF>`I[<A0><FF><FF><FF><FF>Ku\<A0><FF><FF><FF><FF>^PF[<A0><FF><FF><FF><FF><C6>t\<A0><FF><FF><FF>
<FF>pA[<A0><FF><FF><FF><FF>4u\<A0><FF><FF><FF><FF> [libcfs]

Comment 7 Dave Anderson 2016-11-28 14:43:32 UTC
(In reply to Qiao Zhao from comment #5)

Sorry, that was a misprint.  It should be: sym -M

Comment 8 Qiao Zhao 2016-12-02 02:29:33 UTC
(In reply to alexandre.louvet from comment #6)

Thanks for your module; I got the same results. Thanks very much!

Comment 13 errata-xmlrpc 2017-08-01 22:04:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2019