Bug 2226681 - kcore reading is broken on 6.5-rc3
Summary: kcore reading is broken on 6.5-rc3
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Baoquan He
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-07-26 08:13 UTC by Baoquan He
Modified: 2023-08-14 05:57 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-14 05:57:43 UTC
Type: ---
Embargoed:



Description Baoquan He 2023-07-26 08:13:35 UTC
On the latest Linus tree (kernel 6.5-rc3), executing either of the
two commands below triggers a kernel panic.

  # makedumpfile --mem-usage /proc/kcore
or
  # cat /proc/kallsyms | grep ksys_read
  ffffffff8150ebc0 T ksys_read
  # objdump -d  --start-address=0xffffffff8150ebc0 --stop-address=0xffffffff8150ebd0 /proc/kcore 

  /proc/kcore:     file format elf64-x86-64


[13270.314323] Mem abort info:
[13270.317162]   ESR = 0x0000000096000007
[13270.320901]   EC = 0x25: DABT (current EL), IL = 32 bits
[13270.326217]   SET = 0, FnV = 0
[13270.329261]   EA = 0, S1PTW = 0
[13270.332390]   FSC = 0x07: level 3 translation fault
[13270.337270] Data abort info:
[13270.340139]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[13270.345626]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[13270.350666]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[13270.355981] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000400651d64000
[13270.362672] [ffffdc9cf3ea0000] pgd=1000401ffffff003, p4d=1000401ffffff003, pud=1000401fffffe003, pmd=1000401fffffd003, pte=0000000000000000
[13270.375367] Internal error: Oops: 0000000096000007 [#4] SMP
[13270.380934] Modules linked in: mlx5_ib ib_uverbs ib_core rfkill vfat fat joydev cdc_ether usbnet mii mlx5_core acpi_ipmi mlxfw ipmi_ssif psample tls ipmi_devintf pci_hyperv_intf arm_spe_pmu ipmi_msghandler arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram xfs crct10dif_ce polyval_ce polyval_generic ghash_ce uas sbsa_gwdt nvme nvme_core ast usb_storage nvme_common i2c_algo_bit xgene_hwmon
[13270.416751] CPU: 15 PID: 8803 Comm: objdump Tainted: G      D            6.5.0-rc3 #1
[13270.424570] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220531 (SCP: 2.10.20220531) 2022/05/31
[13270.437337] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[13270.444289] pc : __arch_copy_to_user+0x180/0x240
[13270.448910] lr : _copy_to_iter+0x11c/0x5d0
[13270.453002] sp : ffff8000b15a37c0
[13270.456306] x29: ffff8000b15a37c0 x28: ffffdc9cf3ea0000 x27: ffffdc9cf6938158
[13270.463431] x26: ffff8000b15a3ba8 x25: 0000000000000690 x24: ffff8000b15a3b80
[13270.470556] x23: 00000000000038ac x22: ffffdc9cf3ea0000 x21: ffff8000b15a3b80
[13270.477682] x20: ffffdc9cf64fdf00 x19: 0000000000000400 x18: 0000000000000000
[13270.484806] x17: 0000000000000000 x16: 0000000000000000 x15: ffffdc9cf3ea0000
[13270.491931] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[13270.499056] x11: 0001000000000000 x10: ffffdc9cf64fdf00 x9 : 0000000000000690
[13270.506182] x8 : 000000007c000000 x7 : 0000fd007e000000 x6 : 000000000eee0b60
[13270.513306] x5 : 000000000eee0f60 x4 : 0000000000000000 x3 : 0000000000000400
[13270.520431] x2 : 0000000000000380 x1 : ffffdc9cf3ea0000 x0 : 000000000eee0b60
[13270.527556] Call trace:
[13270.529992]  __arch_copy_to_user+0x180/0x240
[13270.534250]  read_kcore_iter+0x718/0x878
[13270.538167]  proc_reg_read_iter+0x8c/0xe8
[13270.542168]  vfs_read+0x214/0x2c0
[13270.545478]  ksys_read+0x78/0x118
[13270.548782]  __arm64_sys_read+0x24/0x38
[13270.552608]  invoke_syscall+0x78/0x108
[13270.556351]  el0_svc_common.constprop.0+0x4c/0xf8
[13270.561044]  do_el0_svc+0x34/0x50
[13270.564347]  el0_svc+0x34/0x108
[13270.567482]  el0t_64_sync_handler+0x100/0x130
[13270.571829]  el0t_64_sync+0x194/0x198
[13270.575483] Code: d503201f d503201f d503201f d503201f (a8c12027)
[13270.581567] ---[ end trace 0000000000000000 ]---
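
The "Mem abort info" lines at the top of the oops are a decoding of the ESR value. As a rough cross-check, the sketch below splits 0x96000007 into the same fields. It assumes the standard ARMv8 ESR_ELx layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, FSC in ISS bits 5:0, WnR in ISS bit 6); the helper name `decode_esr` is made up for illustration and is not kernel code.

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit arm64 ESR value into the fields shown in the oops."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class: bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "FSC": iss & 0x3F,             # fault status code: ISS bits [5:0]
        "WnR": (iss >> 6) & 0x1,       # data aborts: 0 = read, 1 = write
    }


# ESR value taken from the oops above.
fields = decode_esr(0x96000007)
print({name: hex(value) for name, value in fields.items()})
```

The result agrees with the log: EC = 0x25 (data abort at the current EL), FSC = 0x07 (level 3 translation fault) and WnR = 0, i.e. the fault was a read from the kernel address whose pte is zero.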


Reproducible: Always
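
The objdump reproducer above can be scripted roughly as follows. This is only a sketch: the helper names `find_symbol` and `objdump_command` are hypothetical, the 0x10-byte window simply mirrors the addresses used in the description, and the sample input is the kallsyms line quoted there. It only prints the command; actually running it against /proc/kcore on an affected kernel may crash the machine.

```python
import shlex


def find_symbol(kallsyms_text: str, name: str) -> int:
    """Return the address of `name` from /proc/kallsyms-style text."""
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Each line looks like: "<hex address> <type> <symbol> [module]"
        if len(parts) >= 3 and parts[2] == name:
            return int(parts[0], 16)
    raise KeyError(name)


def objdump_command(start: int, length: int = 0x10,
                    core: str = "/proc/kcore") -> list[str]:
    """Build the objdump invocation that disassembles `length` bytes."""
    return [
        "objdump", "-d",
        f"--start-address={start:#x}",
        f"--stop-address={start + length:#x}",
        core,
    ]


# Sample line copied from the description; on a live system, read
# /proc/kallsyms as root instead (unprivileged reads may show zeros).
sample = "ffffffff8150ebc0 T ksys_read\n"
addr = find_symbol(sample, "ksys_read")
cmd = objdump_command(addr)
print(shlex.join(cmd))
```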

Comment 1 Baoquan He 2023-07-26 08:15:55 UTC
According to others' comments, this is caused by the following commit:

2e1c0170771e fs/proc/kcore: avoid bounce buffer for ktext data

This is still under investigation.

Comment 2 Dave Young 2023-08-01 04:07:48 UTC
Hi Baoquan, it is strange that the file format is elf64-x86-64, but the call trace is from the arm64 architecture...

Comment 3 Baoquan He 2023-08-01 06:38:45 UTC
(In reply to Dave Young from comment #2)
> Hi Baoquan, it is strange that the file format is elf64-x86-64, but the
> calltrace is arm64 architecture...

I copied the objdump part of the report from upstream. Jiri Olsa reported that he hit an objdump execution failure; when I tested it, I ran into the corruption. I should have copied only the command part. Please see the thread below:

https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava/T/#u

Comment 4 Dave Young 2023-08-01 10:03:51 UTC
OK, I got it. So the bug description includes two different reports for the same bug: one is from Jiri, based on x86, and the other is copied from another test on arm64.

Comment 5 Baoquan He 2023-08-01 12:03:42 UTC
(In reply to Dave Young from comment #4)
> Ok, I Got it, so the bug description includes two different reports for same
> bug, one is from Jiri which is based on x86, the latter one is copied from
> another test on arm64.

Yes. Jiri said he only saw the objdump failure and the kernel was not harmed. When I tried his command,
the kernel crashed directly. I forget why I didn't copy my own command.

Lorenzo has now posted a patch that reverts the handling of the KCORE_TEXT part, and it seems people
will accept it. I will test it to verify.

[PATCH] fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
https://lore.kernel.org/all/20230731215021.70911-1-lstoakes@gmail.com/T/#u

Comment 6 Baoquan He 2023-08-14 05:57:43 UTC
The patch has been merged into Linus's tree:

17457784004c fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions

Closing this bug, since the commit will take effect once the ARK kernel is synchronized with the upstream kernel.

