Bug 1718736
Summary: | [debug kernel] crash report: read error: kernel virtual address: ffff20000af33500 type: "idmap_ptrs_per_pgd" on a live system | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Emma Wu <xiawu> |
Component: | crash | Assignee: | Dave Anderson <anderson> |
Status: | CLOSED ERRATA | QA Contact: | Ziqian SUN (Zamir) <zsun> |
Severity: | low | Priority: | unspecified |
Version: | 8.1 | CC: | anderson, ruyang |
Target Milestone: | rc | Keywords: | Reopened |
Target Release: | 8.0 | Hardware: | aarch64 |
OS: | Unspecified | Type: | Bug |
Fixed In Version: | crash-7.2.6-2.el8 | Last Closed: | 2019-11-05 20:53:48 UTC |
Bug Blocks: | 1690227 | | |
Ah, so it is... sorry about that! It's a simple fix:

    --- a/arm64.c
    +++ b/arm64.c
    @@ -285,7 +285,7 @@ arm64_init(int when)
                     case 65536:
                             if (kernel_symbol_exists("idmap_ptrs_per_pgd") &&
                                 readmem(symbol_value("idmap_ptrs_per_pgd"), KVADDR,
    -                            &value, sizeof(ulong), "idmap_ptrs_per_pgd", RETURN_ON_ERROR))
    +                            &value, sizeof(ulong), "idmap_ptrs_per_pgd", QUIET|RETURN_ON_ERROR))
                                     machdep->ptrs_per_pgd = value;

                             if (machdep->machspec->VA_BITS > PGDIR_SHIFT_L3_64K) {

But what I don't understand is why it's not seen on the regular kernel? I sanity-check each new RHEL8 kernel version running live, and for example, here is the latest 4.18.0-103.el8:

    [root@apm-mustang-ev3-07 ~]# crash

    crash 7.2.6-1.el8
    Copyright (C) 2002-2019 Red Hat, Inc.
    Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
    Copyright (C) 1999-2006 Hewlett-Packard Co
    Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
    Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
    Copyright (C) 2005, 2011 NEC Corporation
    Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
    Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
    This program is free software, covered by the GNU General Public License,
    and you are welcome to change it and/or distribute copies of it under
    certain conditions. Enter "help copying" to see the conditions.
    This program has absolutely no warranty. Enter "help warranty" for details.

    GNU gdb (GDB) 7.6
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "aarch64-unknown-linux-gnu"...

          KERNEL: /usr/lib/debug/lib/modules/4.18.0-103.el8.aarch64/vmlinux
        DUMPFILE: /proc/kcore
            CPUS: 8
            DATE: Mon Jun 10 10:36:48 2019
          UPTIME: 00:00:58
    LOAD AVERAGE: 1.66, 0.51, 0.18
           TASKS: 229
        NODENAME: apm-mustang-ev3-07.khw2.lab.eng.bos.redhat.com
         RELEASE: 4.18.0-103.el8.aarch64
         VERSION: #1 SMP Sat Jun 8 15:47:32 UTC 2019
         MACHINE: aarch64 (unknown Mhz)
          MEMORY: 16 GB
             PID: 4031
         COMMAND: "crash"
            TASK: ffff8003724b3200 [THREAD_INFO: ffff8003724b3200]
             CPU: 4
           STATE: TASK_RUNNING (ACTIVE)

    crash>

I'll investigate it further.

(In reply to Dave Anderson from comment #3)
>
> I'll investigate it further.

The issue is related to the kernel configuration of CONFIG_DEVMEM in conjunction with the very first readmem() that is performed.

    $ pwd
    /home/git_repos/rhel8.1.0-git/configs
    $
    $ find . -name CONFIG_DEVMEM
    ./debug/aarch64/CONFIG_DEVMEM
    ./generic/CONFIG_DEVMEM
    ./generic/aarch64/CONFIG_DEVMEM
    $

Generically it is set to y:

    $ cat ./generic/CONFIG_DEVMEM
    CONFIG_DEVMEM=y
    $

But aarch64 overrides the above with two possibilities:

    $ cat ./debug/aarch64/CONFIG_DEVMEM
    CONFIG_DEVMEM=y
    $ cat ./generic/aarch64/CONFIG_DEVMEM
    # CONFIG_DEVMEM is not set
    $

On the debug aarch64, /dev/mem does exist, and so it gets used for the very first readmem() of "idmap_ptrs_per_pgd". But because of CONFIG_STRICT_DEVMEM=y, it fails and prints out the error message -- but then gets retried using /proc/kcore as the live memory source from that point on.

On the other (generic) architectures, /dev/mem also exists, but their very first readmem() is marked QUIET similar to the patch above, so there is no error message displayed.

On the generic aarch64, /dev/mem does *not* exist, so /proc/kcore is set as the live memory source before the first readmem() is done.
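The three cases described above can be modelled with a small C sketch. It is only an illustration of the described behaviour, not crash's actual implementation: every name in it (`live_system`, `session`, `session_init`, `first_read`, `SK_QUIET`) is hypothetical, and the /proc/kcore retry is simply assumed to succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the behaviour described in this bug; not crash code. */

enum mem_source { SRC_DEVMEM, SRC_KCORE };

struct live_system {
    bool devmem_exists;  /* kernel built with CONFIG_DEVMEM=y */
    bool strict_devmem;  /* kernel built with CONFIG_STRICT_DEVMEM=y */
};

struct session {
    enum mem_source source;  /* current live memory source */
    int errors_printed;      /* read-error messages shown to the user */
};

#define SK_QUIET 0x1u  /* stand-in for the QUIET flag added by the patch */

/* /dev/mem is the default source when it exists; otherwise /proc/kcore. */
static void session_init(struct session *s, const struct live_system *sys)
{
    assert(s != NULL && sys != NULL);
    s->source = sys->devmem_exists ? SRC_DEVMEM : SRC_KCORE;
    s->errors_printed = 0;
}

/*
 * Model of the very first read. A restricted /dev/mem makes the read fail;
 * the failure is reported unless SK_QUIET is set, and the session then
 * switches to /proc/kcore and retries (assumed here to succeed).
 */
static bool first_read(struct session *s, const struct live_system *sys,
                       const char *type, unsigned flags)
{
    assert(s != NULL && sys != NULL);
    if (s->source == SRC_DEVMEM && sys->strict_devmem) {
        if (!(flags & SK_QUIET)) {
            fprintf(stderr, "read error: type: \"%s\"\n", type);
            s->errors_printed++;
        }
        s->source = SRC_KCORE;
    }
    return true;
}
```

In this model, a debug aarch64 system (`devmem_exists` and `strict_devmem` both true) prints one error when `first_read()` is called without `SK_QUIET` and none with it, while a generic aarch64 system (`devmem_exists` false) starts on /proc/kcore and never reaches the failing path.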
Patch applied upstream:

https://github.com/crash-utility/crash/commit/bf48dd4e9926515345cad06c1bfce49d7a057a26

    Fix for Linux 4.16 and later ARM64 kernels that contain kernel commit
    fa2a8445b1d3810c52f2a6b3a006456bd1aacb7e, titled "arm64: allow ID map
    to be extended to 52 bits", and which have been configured with both
    CONFIG_DEVMEM=y and CONFIG_STRICT_DEVMEM=y. Without the patch, an
    inconsequential error message indicating "crash: read error: kernel
    virtual address: <address> type: idmap_ptrs_per_pgd" is displayed
    during initialization.
    (anderson)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3349
> The error is reported when trying to read from /dev/mem. Crash should not show this error message unless
> reading from /dev/mem and /proc/kcore both fail or debug flag is on: https://bugzilla.redhat.com/show_bug.cgi?id=1585944#c4

Read closely the last sentence:

    If the default live memory source /dev/mem is determined to be
    unusable because the kernel was configured with CONFIG_STRICT_DEVMEM,
    the first memory read during session initialization will fail. The
    current behavior results in a readmem() error message, followed by
    two notification messages that indicate that /dev/mem is restricted
    and a switch to using /proc/kcore will be attempted; the readmem is
    reattempted from /proc/kcore, and if successful, the session will
    continue initialization. With this patch, the behavior will change
    such that if the switch to /proc/kcore and the reattempted readmem()
    are successful, no messages will be displayed unless the crash
    session is invoked with "crash -d<number>".
    (anderson)

You are invoking the session with -d7, so the debug flag is "on", and therefore the informational messages are intentionally displayed.
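The display rule quoted above reduces to a single condition, shown here as a C predicate. This is a hedged restatement of the quoted text rather than code from crash; the helper name `show_switch_messages` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of crash. Decides whether the read error
 * and the "/dev/mem is restricted" notifications are shown:
 *   - always when the retry from /proc/kcore also failed;
 *   - otherwise only when a debug level was requested ("crash -d<number>").
 */
static bool show_switch_messages(bool kcore_retry_ok, int debug_level)
{
    assert(debug_level >= 0);  /* a debug level is never negative */
    return !kcore_retry_ok || debug_level > 0;
}
```

With a successful retry, `show_switch_messages(true, 0)` is false and `show_switch_messages(true, 7)` is true, which matches the -d7 session discussed in this comment.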