Bug 1250613 - kernel BUG at arch/arm/mm/highmem.c:114!
Summary: kernel BUG at arch/arm/mm/highmem.c:114!
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: arm
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Robinson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2015-08-05 14:57 UTC by Paul Whalen
Modified: 2015-09-06 06:21 UTC

Fixed In Version: 4.1.6-100.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-25 16:07:05 UTC



Description Paul Whalen 2015-08-05 14:57:36 UTC
Description of problem:
kernel-4.2.0-0.rc's have been crashing with:

[   46.164634] ------------[ cut here ]------------
[   46.169287] kernel BUG at arch/arm/mm/highmem.c:114!
[   46.174261] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   46.179406] Modules linked in: loop mmc_block at803x fec ptp ahci_imx libahci_platform sdhci_esdhc_imx i2c_imx sdhci_pltfm sdhci pps_core mmc_core phy_mxs_usb imxdrm drm_kms_helper rtc_snvs syscopyarea sysfillrect sysimgblt drm xts lrw gf128mul sha256_arm dm_crypt dm_round_robin linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor xor_neon async_tx raid6_pq raid1 raid0 scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs dm_multipath
[   46.226528] CPU: 2 PID: 1173 Comm: pigz Not tainted 4.2.0-0.rc5.git0.2.fc23.armv7hl #1
[   46.234452] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   46.240986] task: edf02e00 ti: df5da000 task.ti: df5da000
[   46.246412] PC is at __kunmap_atomic+0x54/0x178
[   46.250956] LR is at copy_page_to_iter+0x15c/0x258
[   46.255755] pc : [<c0222f84>]    lr : [<c052dcf0>]    psr: 20010013
[   46.255755] sp : df5dbe78  ip : 000000dd  fp : b4b41208
[   46.267236] r10: df5dbf0c  r9 : df5dbf14  r8 : 00000200
[   46.272466] r7 : ffedf200  r6 : 00000000  r5 : c0d48554  r4 : ffedf000
[   46.278999] r3 : 00021000  r2 : ffede000  r1 : 2d879000  r0 : ffedf000
[   46.285532] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   46.292671] Control: 10c5387d  Table: 2f5e404a  DAC: 00000015
[   46.298423] Process pigz (pid: 1173, stack limit = 0xdf5da220)
[   46.304261] Stack: (0xdf5dbe78 to 0xdf5dc000)
[   46.308625] be60:                                                       df5dbf0c 00000200
[   46.316810] be80: 00000000 c052dcf0 ef7c7538 ffedf000 00000200 00000000 dfba0280 dfc9ad50
[   46.324994] bea0: 00000200 0001be00 00000001 df5cbcc0 df5da000 c0387908 00000000 00000000
[   46.333181] bec0: 0000000e c08ea4a4 00000000 00000150 dfc9ac00 df5dbf14 df5da000 df5cbcc0
[   46.341366] bee0: 00000000 df5dbf88 0001be00 c020fae4 df5da000 00000200 000394c0 c037fe88
[   46.349550] bf00: 0001be00 c020fae4 df5da000 b4b41208 0001be00 00000000 00000000 0001be00
[   46.357734] bf20: df5dbf0c 00000001 df5cbcc0 00000000 00000000 00000000 00000000 00000000
[   46.365920] bf40: 00000000 00000000 eb827d88 b4b41208 df5cbcc0 df5dbf88 0001be00 c0380550
[   46.374105] bf60: df5cbcc0 b4b41208 0001be00 df5cbcc0 df5cbcc1 b4b41208 0001be00 c020fae4
[   46.382292] bf80: df5da000 c0380e04 00000000 00000000 0001be00 00000200 0001be00 b4b41208
[   46.390475] bfa0: 00000003 c020fad0 00000200 0001be00 00000000 b4b41208 0001be00 00000000
[   46.398659] bfc0: 00000200 0001be00 b4b41208 00000003 00000000 012d48e8 00025de8 000394c0
[   46.406845] bfe0: 00000000 bee0a558 b6fe84c0 b6f7fad8 800e0010 00000000 3f7fd821 3f7fdc21
[   46.415048] [<c0222f84>] (__kunmap_atomic) from [<c052dcf0>] (copy_page_to_iter+0x15c/0x258)
[   46.423510] [<c052dcf0>] (copy_page_to_iter) from [<c0387908>] (pipe_read+0xc8/0x260)
[   46.431356] [<c0387908>] (pipe_read) from [<c037fe88>] (__vfs_read+0xb0/0xd8)
[   46.438502] [<c037fe88>] (__vfs_read) from [<c0380550>] (vfs_read+0x8c/0x13c)
[   46.445646] [<c0380550>] (vfs_read) from [<c0380e04>] (SyS_read+0x48/0x88)
[   46.452542] [<c0380e04>] (SyS_read) from [<c020fad0>] (__sys_trace_return+0x0/0x10)
[   46.460208] Code: e1a03603 e0632002 e1540002 0a000000 (e7f001f2) 
[   46.466310] ---[ end trace 32dc3f318c68a352 ]---

The system remains functional, but new kernel installations usually fail when running dracut.

Comment 1 Peter Robinson 2015-08-05 20:21:18 UTC
Upstream thread: http://www.spinics.net/lists/arm-kernel/msg437204.html

Russell's response:

What it looks like from your oops is that the address which was passed
in was 0xffedf000, but the address we calculated via the following for
the current index was 0xfff00000:

type = kmap_atomic_idx();
idx = type + KM_TYPE_NR * smp_processor_id();
__fix_to_virt(idx)

Doing a bit of maths... the address 0xffedf000 corresponds to a fixmap
index of... (0xffeff000 - 0xffedf000) >> 12 = 32.  KM_TYPE_NR is 16 on
ARM, so the mapping was created by CPU 2, and type was zero.

On unmap, 0xfff00000 gives... (0xffeff000 - 0xfff00000) >> 12 = -1.
That suggests we're on CPU 0, and type is -1 - in other words, there
are no atomically mapped mappings on CPU 0.

Since kmap_atomic() disables preemption and page faults, how did your
kernel migrate this thread from CPU 2 to CPU 0... and I can't see how
that happened.
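
To make the arithmetic above easy to check, here is a minimal userspace sketch in C. It is not kernel code: FIXADDR_TOP, PAGE_SHIFT and KM_TYPE_NR are simply the values used in the quoted reply, the helper names fixmap_index(), kmap_cpu() and kmap_type() are hypothetical, and any base offset the real kernel may add to the fixmap index is ignored, as it is in the reply.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the arithmetic quoted above; treat them as assumptions. */
#define FIXADDR_TOP 0xffeff000u  /* address that corresponds to fixmap index 0 */
#define PAGE_SHIFT  12           /* 4 KiB pages */
#define KM_TYPE_NR  16           /* kmap slots per CPU on ARM, per the reply */

/* Signed fixmap index of a page-aligned address (hypothetical helper). */
static int fixmap_index(uint32_t vaddr)
{
    /* The reply only deals with page-aligned addresses. */
    assert((vaddr & ((1u << PAGE_SHIFT) - 1)) == 0);

    /* Widen before subtracting so that an address above FIXADDR_TOP
     * yields a negative index instead of wrapping around. */
    int64_t delta = (int64_t)FIXADDR_TOP - (int64_t)vaddr;

    /* delta is a multiple of the page size, so division is exact. */
    return (int)(delta / (1 << PAGE_SHIFT));
}

/* Inverse of idx = type + KM_TYPE_NR * cpu.
 * C integer division truncates toward zero, so idx = -1 gives
 * cpu 0 and type -1. */
static int kmap_cpu(int idx)  { return idx / KM_TYPE_NR; }
static int kmap_type(int idx) { return idx % KM_TYPE_NR; }
```

With these definitions, fixmap_index(0xffedf000) is 32, which splits into CPU 2 and type 0, while fixmap_index(0xfff00000) is -1, which splits into CPU 0 and type -1. Both match the figures in the reply.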

Comment 2 Mark Salter 2015-08-11 17:49:40 UTC
Added this to upstream thread:

The Fedora kernel is using PREEMPT_VOLUNTARY with !PREEMPT and
!PREEMPT_COUNT. So preempt_disable() is a nop. I added some code
to catch the kernel scheduling between kmap_atomic() and
kunmap_atomic() and got this straightaway:

[    2.958651] ------------[ cut here ]------------
[    2.963263] kernel BUG at arch/arm/mm/highmem.c:61!
[    2.968132] Internal error: Oops - BUG: 0 [#1] SMP ARM
[    2.973261] Modules linked in:
[    2.976313] CPU: 0 PID: 199 Comm: systemd-udevd Tainted: G        W       4.2.0-rc5 #9
[    2.984218] Hardware name: Highbank
[    2.987699] task: ecb9bf40 ti: eccce000 task.ti: eccce000
[    2.993097] PC is at check_kmap_atomic+0x20/0x2c
[    2.997710] LR is at __schedule+0x254/0x60c
[    3.001885] pc : [<c022318c>]    lr : [<c08d49a0>]    psr: 200d0093
[    3.001885] sp : ecccfdd8  ip : 00000000  fp : ecccfe1c
[    3.013350] r10: c0d508a0  r9 : ecb9c244  r8 : ffeff000
[    3.018565] r7 : c0d4a140  r6 : ec90a280  r5 : ed3b7140  r4 : ecb9bf40
[    3.025081] r3 : 00000001  r2 : 2c66d000  r1 : eccce000  r0 : 00000000
[    3.031599] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[    3.038810] Control: 10c5387d  Table: 2cc9c04a  DAC: 00000015
[    3.044546] Process systemd-udevd (pid: 199, stack limit = 0xeccce220)
[    3.051063] Stack: (0xecccfdd8 to 0xeccd0000)
[    3.055410] fdc0:                                                       eccad500 00000000
[    3.063580] fde0: ecc83e80 eccbc488 ecc83e80 c08d5030 2c66d000 00000000 00000002 eccce000
[    3.071749] fe00: 00000000 0000012a b6fe0000 ffeff000 ecccff14 ecccff0c ecccfe2c c08d5030
[    3.079918] fe20: ecb5f538 00000000 ecccfe34 c08d5088 b6fe0000 c08d6a14 00000000 c05285a0
[    3.088086] fe40: b6fe0000 ffeff000 0000012a 0000012a 00000000 ec3da524 ffeff000 0000012a
[    3.096255] fe60: ecccff14 c052dcec ef7b1678 ffeff000 0000012a 00000000 eccbc480 ec3da524
[    3.104423] fe80: ef7b1678 00000000 0000012a ecccff28 00000000 c032add4 00000000 c04b6804
[    3.112592] fea0: ecccff14 ffffffff 00000fff ec3da438 eccbc4e8 00000001 00000129 00000000
[    3.120760] fec0: ecc72c18 00000000 00000000 000b6fe0 00000000 00000000 b6fe1000 eccbc480
[    3.128929] fee0: 00000000 ecccff88 00001000 c020fae4 eccce000 00000200 00000000 c037ff08
[    3.137098] ff00: 00001000 c020fae4 eccce000 b6fe0000 00001000 00000000 00000000 00001000
[    3.145266] ff20: ecccff0c 00000001 eccbc480 00000000 00000000 00000000 00000000 00000000
[    3.153435] ff40: 00000000 00000000 00000000 b6fe0000 eccbc480 ecccff88 00001000 c03805d0
[    3.161603] ff60: eccbc480 b6fe0000 00001000 eccbc480 eccbc480 b6fe0000 00001000 c020fae4
[    3.169772] ff80: eccce000 c0380e84 00000000 00000000 00001000 8066e1c0 00003ffe 8066e1c0
[    3.177940] ffa0: 00000003 c020fad0 8066e1c0 00003ffe 00000006 b6fe0000 00001000 00000040
[    3.186109] ffc0: 8066e1c0 00003ffe 8066e1c0 00000003 0000000a bee856e4 00000000 00000000
[    3.194277] ffe0: 00000000 bee8528c b6daa470 b6e0f5b0 600d0010 00000006 00000000 00000000
[    3.202454] [<c022318c>] (check_kmap_atomic) from [<c08d49a0>] (__schedule+0x254/0x60c)
[    3.210454] [<c08d49a0>] (__schedule) from [<c08d5030>] (preempt_schedule_common+0x24/0x40)
[    3.218799] [<c08d5030>] (preempt_schedule_common) from [<c08d5088>] (_cond_resched+0x3c/0x4c)
[    3.227404] [<c08d5088>] (_cond_resched) from [<c08d6a14>] (down_read+0x14/0x48)
[    3.234799] [<c08d6a14>] (down_read) from [<c05285a0>] (__copy_to_user_memcpy+0x54/0x17c)
[    3.242974] [<c05285a0>] (__copy_to_user_memcpy) from [<c052dcec>] (copy_page_to_iter+0xd8/0x258)
[    3.251844] [<c052dcec>] (copy_page_to_iter) from [<c032add4>] (generic_file_read_iter+0x370/0x5dc)
[    3.260885] [<c032add4>] (generic_file_read_iter) from [<c037ff08>] (__vfs_read+0xb0/0xd8)
[    3.269142] [<c037ff08>] (__vfs_read) from [<c03805d0>] (vfs_read+0x8c/0x13c)
[    3.276270] [<c03805d0>] (vfs_read) from [<c0380e84>] (SyS_read+0x48/0x88)
[    3.283141] [<c0380e84>] (SyS_read) from [<c020fad0>] (__sys_trace_return+0x0/0x10)
[    3.290790] Code: e7922100 e7923003 e3530000 012fff1e (e7f001f2) 
[    3.296876] ---[ end trace cb88537fdc8fa202 ]---
[    3.301485] note: systemd-udevd[199] exited with preempt_count 2097152
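
Why a nop preempt_disable() allows this can be illustrated with a toy userspace model in C. This is only a sketch under stated assumptions, not the debug code used to produce the trace above and not the kernel's implementation: struct model_cpu and all model_* functions are hypothetical, and the only behaviour modelled is that a voluntary preemption point reschedules when the preempt count is zero and a reschedule is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state; not the kernel's real data structures. */
struct model_cpu {
    int  preempt_count;  /* nesting depth of "preemption disabled" */
    bool need_resched;   /* a reschedule has been requested */
};

/* With counting compiled out (the !PREEMPT_COUNT case), disabling
 * preemption leaves the state untouched. */
static void model_preempt_disable(struct model_cpu *cpu, bool count_enabled)
{
    if (count_enabled)
        cpu->preempt_count++;
}

static void model_preempt_enable(struct model_cpu *cpu, bool count_enabled)
{
    if (count_enabled) {
        assert(cpu->preempt_count > 0);  /* enable must pair with disable */
        cpu->preempt_count--;
    }
}

/* A voluntary preemption point only looks at the count, so it cannot
 * see an atomic section that never raised it. */
static bool model_cond_resched_would_schedule(const struct model_cpu *cpu)
{
    return cpu->preempt_count == 0 && cpu->need_resched;
}
```

In this model, a CPU with a pending reschedule that calls model_preempt_disable() with counting off still reports that it would schedule at the next voluntary preemption point. With counting on it does not until the matching model_preempt_enable().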

Comment 3 Mark Salter 2015-08-11 21:24:20 UTC
So, from what rmk and nico say, CONFIG_UACCESS_WITH_MEMCPY is broken with the Fedora config, which uses PREEMPT_VOLUNTARY and !PREEMPT_COUNT (so preempt_disable() is a no-op). There's an easy enough fix for that, but it sounds like Fedora should not be using CONFIG_UACCESS_WITH_MEMCPY at all, since it was a workaround for performance issues on Orion CPUs. Using "dd if=/dev/zero of=/dev/null bs=4k" on highbank, I see 1.8GB/s with CONFIG_UACCESS_WITH_MEMCPY turned on, and 2.1GB/s with it turned off.
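
For reference, the options discussed in this bug would look roughly like this in .config notation. This is a sketch assembled from the comments here, not a copy of the actual Fedora config: the first three lines restate the preemption settings described above, and the last line shows CONFIG_UACCESS_WITH_MEMCPY turned off as suggested.

```
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_COUNT is not set
# CONFIG_UACCESS_WITH_MEMCPY is not set
```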

Comment 4 Peter Robinson 2015-08-11 21:53:42 UTC
I'll turn it off. I'd sooner not use anything that isn't dealt with upstream, and TBH I'm not sure why it was enabled; I suspect it was carried over from something historical.

Mark, thanks for your help on this.

Comment 5 Fedora Update System 2015-08-13 11:30:51 UTC
kernel-4.2.0-0.rc6.git0.2.fc23 has been submitted as an update for Fedora 23.
https://admin.fedoraproject.org/updates/kernel-4.2.0-0.rc6.git0.2.fc23

Comment 6 Fedora Update System 2015-08-15 02:20:09 UTC
Package kernel-4.2.0-0.rc6.git0.2.fc23:
* should fix your issue,
* was pushed to the Fedora 23 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-4.2.0-0.rc6.git0.2.fc23'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-13503/kernel-4.2.0-0.rc6.git0.2.fc23
then log in and leave karma (feedback).

Comment 7 Paul Whalen 2015-08-17 13:10:49 UTC
No longer seeing the oops on highbank, bpi and rawhide installs. Many thanks!

Comment 8 Fedora Update System 2015-08-18 00:19:41 UTC
kernel-4.1.6-200.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/kernel-4.1.6-200.fc22

Comment 9 Fedora Update System 2015-08-18 02:23:06 UTC
kernel-4.1.6-100.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/kernel-4.1.6-100.fc21

Comment 10 Fedora Update System 2015-08-25 16:07:01 UTC
kernel-4.2.0-0.rc6.git0.2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 11 Fedora Update System 2015-08-26 17:49:49 UTC
kernel-4.1.6-200.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2015-09-06 06:20:57 UTC
kernel-4.1.6-100.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.

