Bug 1628574 - Fedora 28 aarch64 orangepi prime unable to handle kernel paging request after running for about 10 hours
Summary: Fedora 28 aarch64 orangepi prime unable to handle kernel paging request after...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 28
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2018-09-13 13:14 UTC by Zamir SUN
Modified: 2019-02-21 21:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-21 21:10:46 UTC




Links
System ID Priority Status Summary Last Updated
Linux Kernel 201243 None None None Never

Description Zamir SUN 2018-09-13 13:14:34 UTC
Description of problem:
An "Unable to handle kernel paging request" oops occurs after the OrangePi Prime has been running for about 10 hours. No specific workload is running on it, just a Fedora minimal install plus an SSH server.


Version-Release number of selected component (if applicable):
kernel-4.18.5-200.fc28.aarch64

How reproducible:
Frequently

Steps to Reproduce:
1. Power on the OrangePi Prime with the given kernel, then leave it idle.

Actual results:
The system hangs with "Unable to handle kernel paging request" on the console.
(There are many "wlan0: link is not ready" messages before and after the oops that I did not paste here.)

[36630.569110] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[36945.596633] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[37124.218855] Unable to handle kernel paging request at virtual address ffff000009687dd8
[37124.226850] Mem abort info:
[37124.229645]   ESR = 0x96000007
[37124.232700]   Exception class = DABT (current EL), IL = 32 bits
[37124.238655]   SET = 0, FnV = 0
[37124.241715]   EA = 0, S1PTW = 0
[37124.244855] Data abort info:
[37124.247753]   ISV = 0, ISS = 0x00000007
[37124.251589]   CM = 0, WnR = 0
[37124.254562] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000014395a12
[37124.261447] [ffff000009687dd8] pgd=00000000bfffe803, pud=00000000bfffd803, pmd=00000000bfff9803, pte=00f8000041687f13
[37124.272077] Internal error: Oops: 96000007 [#1] SMP
[37124.276955] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables vfat fat realtek snd_soc_hdmi_codec rc_cec dw_hdmi_cec dw_hdmi_i2s_audio sun8i_codec_analog sun4i_codec dwmac_sun8i r8723bs(C) stmmac_platform snd_soc_core stmmac mdio_mux ac97_bus of_mdio snd_pcm_dmaengine sun8i_drm_hdmi fixed_phy snd_pcm libphy dw_hdmi snd_timer crc32_ce cec crct10dif_ce snd sunxi_cir sun4i_drm ghash_ce rc_core sun4i_frontend sun8i_mixer sun4i_tcon soundcore
[37124.347855]  cfg80211 drm_kms_helper sunxi_wdt drm rfkill fb_sys_fops syscopyarea sysfillrect leds_gpio sysimgblt mmc_block sunxi phy_generic musb_hdrc udc_core ohci_platform ehci_platform sunxi_mmc phy_sun4i_usb gpio_keys
[37124.367625] CPU: 3 PID: 2538 Comm: dnf Tainted: G         C        4.18.5-200.fc28.aarch64 #1
[37124.376148] Hardware name: sunxi sunxi/sunxi, BIOS 2018.03 04/15/2018
[37124.382589] pstate: 20400005 (nzCv daif +PAN -UAO)
[37124.387395] pc : memblock_is_map_memory+0x24/0xa0
[37124.392105] lr : pfn_valid+0x20/0x30
[37124.395679] sp : ffff00000e033c60
[37124.398993] x29: ffff00000e033c60 x28: ffff8000763a8000 
[37124.404308] x27: ffff000008a42000 x26: 000000000000004f 
[37124.409620] x25: 00000000000007ff x24: ffff00000e033e38 
[37124.414924] x23: 0000000000000000 x22: ffff800074cbb000 
[37124.420236] x21: 0000000074cba020 x20: 0000000000000fe0 
[37124.425550] x19: 00000000b4cba000 x18: 000000000000023f 
[37124.430863] x17: 0000000000000000 x16: 0000000000000000 
[37124.436176] x15: 0000000000000000 x14: 000000000000002b 
[37124.441490] x13: 736e6967756c702d x12: 666e642f73656761 
[37124.446803] x11: 6b6361702d657469 x10: 732f362e336e6f68 
[37124.452117] x9 : 736567616b636170 x8 : ffff800074cbe000 
[37124.457430] x7 : ffff800074cba000 x6 : 0000000000004000 
[37124.462743] x5 : 0000000000015831 x4 : 0000ffffffffffff 
[37124.468055] x3 : 0000000000000000 x2 : 0000000000000000 
[37124.473368] x1 : 0000000000000fe0 x0 : ffff000009687dc8 
[37124.478684] Process dnf (pid: 2538, stack limit = 0x00000000956e6cc5)
[37124.485120] Call trace:
[37124.487571]  memblock_is_map_memory+0x24/0xa0
[37124.491928]  pfn_valid+0x20/0x30
[37124.495162]  __check_object_size+0x68/0x1e0
[37124.499351]  strncpy_from_user+0x48/0x368
[37124.503362]  getname_flags+0x6c/0x1b0
[37124.507026]  user_path_at_empty+0x40/0x78
[37124.511039]  vfs_statx+0x80/0xe0
[37124.514269]  sys_newfstatat+0x40/0x68
[37124.517934]  el0_svc_naked+0x30/0x34
[37124.521515] Code: d503201f 90009fa0 91372000 52800003 (b9401002) 
[37124.527610] ---[ end trace 68c59fb7d3d25bfc ]---
[37260.624594] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
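
For readers decoding the "Mem abort info" lines above, here is a minimal sketch that splits the ESR value 0x96000007 into its fields. The bit positions follow the Armv8-A ESR_ELx layout for data aborts; the function names are made up for illustration and this is not how the kernel itself prints these lines.

```python
# Sketch: decode an AArch64 ESR value for a data abort.
# Assumption: bit positions follow the Armv8-A ESR_ELx layout;
# decode_data_abort_esr and describe_dfsc are hypothetical helper names.

EXCEPTION_CLASSES = {
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}

# Upper four bits of the 6-bit DFSC for the level-qualified fault types.
FAULT_TYPES = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_data_abort_esr(esr):
    """Return the ESR fields that the oops prints, as a dict of integers."""
    iss = esr & 0x1FFFFFF            # ISS is bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,    # exception class, bits [31:26]
        "IL": (esr >> 25) & 1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 1,
        "SET": (iss >> 11) & 0x3,
        "FnV": (iss >> 10) & 1,
        "EA": (iss >> 9) & 1,
        "CM": (iss >> 8) & 1,
        "S1PTW": (iss >> 7) & 1,
        "WnR": (iss >> 6) & 1,       # 0 means the access was a read
        "DFSC": iss & 0x3F,          # data fault status code
    }


def describe_dfsc(dfsc):
    """Describe the level-qualified fault codes; anything else is reported raw."""
    kind = FAULT_TYPES.get(dfsc >> 2)
    if kind is None:
        return "other (0x%02x)" % dfsc
    return "%s, level %d" % (kind, dfsc & 0x3)


fields = decode_data_abort_esr(0x96000007)
print(EXCEPTION_CLASSES.get(fields["EC"], hex(fields["EC"])))  # DABT (current EL)
print(describe_dfsc(fields["DFSC"]))                           # translation fault, level 3
```

So the oops reports a read (WnR = 0) from kernel mode that took a level 3 translation fault, which matches the EC, IL, and ISS values printed in the log.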


Expected results:
The system should keep running without an oops.

Additional info:
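Registers x9 to x13 in the dump above hold printable bytes. The sketch below reads them as little-endian ASCII, assuming a little-endian kernel (which Fedora aarch64 is). That these are leftovers from a string copy of a path is only a guess from the call trace (strncpy_from_user under getname_flags), and the helper name is made up for illustration.

```python
# Sketch: interpret register values from the oops as little-endian ASCII.
# Assumption: the kernel runs little-endian; register_as_text is a
# hypothetical helper name, and the values are copied from the dump above.

REGISTERS = {
    "x9": 0x736567616B636170,
    "x10": 0x732F362E336E6F68,
    "x11": 0x6B6361702D657469,
    "x12": 0x666E642F73656761,
    "x13": 0x736E6967756C702D,
}


def register_as_text(value):
    """Return the 8 bytes of a 64-bit register value as ASCII text."""
    return value.to_bytes(8, "little").decode("ascii")


# x10..x13 read in order form one contiguous piece of a path.
fragment = "".join(register_as_text(REGISTERS[name])
                   for name in ("x10", "x11", "x12", "x13"))

print(register_as_text(REGISTERS["x9"]))  # packages
print(fragment)                           # hon3.6/site-packages/dnf-plugins
```

The fragment looks like part of a python3.6 site-packages path ending in dnf-plugins, which is consistent with dnf calling stat on a path when the oops happened.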

Comment 1 Peter Robinson 2018-09-13 13:23:46 UTC
It looks like dnf is causing some issue, like using too much memory or stack space:

[37124.478684] Process dnf (pid: 2538, stack limit = 0x00000000956e6cc5)

There are currently a number of dnf blocker bugs, so it might be worth checking them or other reported dnf issues: https://qa.fedoraproject.org/blockerbugs/

I suspect it's the "dnf makecache" timer running that's causing this. I have an OrangePi PC which has been up for weeks without any issue, but I also have makecache disabled on it.

Comment 2 Zamir SUN 2018-09-13 13:43:44 UTC
Thanks. I just disabled the dnf-makecache timer and service. Will follow up here tomorrow.

Comment 3 Peter Robinson 2018-09-13 14:45:58 UTC
(In reply to Zamir SUN from comment #2)
> Thanks. I just disabled the dnf-makecache timer and service. Will follow up
> here tomorrow.

It should be reported as a dnf bug if so

Comment 4 Zamir SUN 2018-09-26 12:34:06 UTC
(In reply to Peter Robinson from comment #3)
> (In reply to Zamir SUN from comment #2)
> > Thanks. I just disabled the dnf-makecache timer and service. Will follow up
> > here tomorrow.
> 

With dnf-makecache disabled, this problem no longer happens.

> It should be reported as a dnf bug if so

Well, I also talked to one of the linux-sunxi kernel developers. She said the kernel should not oops no matter how badly userspace is coded, so I prefer to keep this bug under the kernel component.

I will report this to upstream kernel as well.

Comment 5 Peter Robinson 2018-09-26 12:46:32 UTC
> Well, I also talked to one of the linux-sunxi kernel developer, she said
> kernel should not oops no matter how bad the userspace is coded. So I prefer
> to keep it in kernel.

Sure, but without an easy way to recreate it, and with a lack of resources, it'll likely just remain here, as I've not seen the issue across numerous devices with other applications.

> I will report this to upstream kernel as well.

That would be by far the best.

Comment 6 Zamir SUN 2018-09-26 13:29:04 UTC
Upstream bug filed
https://bugzilla.kernel.org/show_bug.cgi?id=201243

Comment 7 Justin M. Forbes 2019-01-29 16:24:22 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.

Fedora 28 has now been rebased to 4.20.5-100.fc28.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.

If you experience different issues, please open a new bug report for those.

Comment 8 Justin M. Forbes 2019-02-21 21:10:46 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

