Bug 910593 - Git causes kernel soft lockup
Summary: Git causes kernel soft lockup
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 18
Hardware: arm
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Jon Masters
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-02-12 23:03 UTC by Quentin Armitage
Modified: 2014-05-21 13:08 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-13 08:09:06 UTC



Description Quentin Armitage 2013-02-12 23:03:49 UTC
Description of problem:
Git (or other largish transfer e.g. ftp) causes kernel soft lockup

Version-Release number of selected component (if applicable):
kernel-kirkwood-3.7.5-201.fc18.armv5tel
I first experienced this problem with kernel-kirkwood-3.6.10-6.fc18.armv5tel. The kernel I used before that was 3.1.4, which doesn't have the problem.

How reproducible:
Always

Steps to Reproduce:
1. Start a git transfer (e.g. a clone) from git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
2. Wait about a minute
  
Actual results:
Kernel soft lockup

Expected results:
Successful data transfer.

Additional info:

On one occasion there were multiple mv643xx_eth_port ... messages for about 20 seconds, then a pause, and then the BUG: soft lockup messages started. Normally there is just a pause of 25 seconds or so before the BUG: soft lockup messages start.

When I did an ftp transfer, approximately 35Mb was transferred before the problem occurred; on another occasion a few tens of kb were transferred by git before the lockup.

[371414.773444] mv643xx_eth_port mv643xx_eth_port.0 eth0: tx error
[371414.950024] mv643xx_eth_port mv643xx_eth_port.0 eth0: tx error


[371437.822892] BUG: soft lockup - CPU#0 stuck for 22s! [git:17118]
[371437.828927] Modules linked in: 8021q garp stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables vfat fat btmrvl_sdio btmrvl snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore libertas_sdio libertas bluetooth cfg80211 rfkill mv643xx_eth leds_gpio mmc_block sata_mv mvsdio mmc_core mv_cesa usb_storage
[371437.881787] 
[371437.883367] Pid: 17118, comm:                  git
[371437.888261] CPU: 0    Not tainted  (3.7.5-201.fc18.armv5tel.kirkwood #1)
[371437.895086] PC is at feroceon_l2_inv_range+0x44/0xb8
[371437.900155] LR is at __dma_page_cpu_to_dev+0x80/0x94
[371437.905232] pc : [<c0014a50>]    lr : [<c001058c>]    psr: 00000013
[371437.905232] sp : dec61cc0  ip : c00147dc  fp : c0675f60
[371437.916931] r10: c090fe80  r9 : 00000000  r8 : 1d9f85fe
[371437.922260] r7 : 0016be80  r6 : 00000000  r5 : 0b5f4000  r4 : 0b5f4000
[371437.928905] r3 : c0014a0c  r2 : 20000013  r1 : 0b5f4012  r0 : 0b5f4000
[371437.935549] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[371437.942805] Control: 0005397f  Table: 1f368000  DAC: 00000015
[371437.948682] [<c000f354>] (unwind_backtrace+0x0/0x124) from [<c00771dc>] (watchdog_timer_fn+0xf4/0x148)
[371437.958128] [<c00771dc>] (watchdog_timer_fn+0xf4/0x148) from [<c003c0c4>] (__run_hrtimer+0xb0/0x1d4)
[371437.967395] [<c003c0c4>] (__run_hrtimer+0xb0/0x1d4) from [<c003c8d8>] (hrtimer_interrupt+0x104/0x250)
[371437.976752] [<c003c8d8>] (hrtimer_interrupt+0x104/0x250) from [<c0017508>] (orion_timer_interrupt+0x24/0x34)
[371437.986719] [<c0017508>] (orion_timer_interrupt+0x24/0x34) from [<c0077998>] (handle_irq_event_percpu+0x38/0x23c)
[371437.997120] [<c0077998>] (handle_irq_event_percpu+0x38/0x23c) from [<c0077bcc>] (handle_irq_event+0x30/0x40)
[371438.007087] [<c0077bcc>] (handle_irq_event+0x30/0x40) from [<c007a24c>] (handle_level_irq+0xcc/0xdc)
[371438.016348] [<c007a24c>] (handle_level_irq+0xcc/0xdc) from [<c00773c4>] (generic_handle_irq+0x28/0x38)
[371438.025782] [<c00773c4>] (generic_handle_irq+0x28/0x38) from [<c0009c40>] (handle_IRQ+0x68/0x8c)
[371438.034698] [<c0009c40>] (handle_IRQ+0x68/0x8c) from [<c0450cb4>] (__irq_svc+0x34/0x78)
[371438.042832] [<c0450cb4>] (__irq_svc+0x34/0x78) from [<c0014a50>] (feroceon_l2_inv_range+0x44/0xb8)
[371438.051926] [<c0014a50>] (feroceon_l2_inv_range+0x44/0xb8) from [<c001058c>] (__dma_page_cpu_to_dev+0x80/0x94)
[371438.062065] [<c001058c>] (__dma_page_cpu_to_dev+0x80/0x94) from [<c0010618>] (arm_dma_map_page+0x40/0x6c)
[371438.071767] [<c0010618>] (arm_dma_map_page+0x40/0x6c) from [<c023ce2c>] (dma_async_memcpy_buf_to_pg+0xe0/0x1ec)
[371438.081994] [<c023ce2c>] (dma_async_memcpy_buf_to_pg+0xe0/0x1ec) from [<c023e448>] (dma_memcpy_to_iovec+0xd4/0x158)
[371438.092571] [<c023e448>] (dma_memcpy_to_iovec+0xd4/0x158) from [<c038d964>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4)
[371438.103238] [<c038d964>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4) from [<c03b4250>] (tcp_recvmsg+0x630/0xac4)
[371438.113288] [<c03b4250>] (tcp_recvmsg+0x630/0xac4) from [<c03d48dc>] (inet_recvmsg+0x48/0x5c)
[371438.121938] [<c03d48dc>] (inet_recvmsg+0x48/0x5c) from [<c0365b2c>] (sock_aio_read+0x100/0x120)
[371438.130766] [<c0365b2c>] (sock_aio_read+0x100/0x120) from [<c00e9b1c>] (do_sync_read+0x98/0xd4)
[371438.139598] [<c00e9b1c>] (do_sync_read+0x98/0xd4) from [<c00ea268>] (vfs_read+0xb4/0x184)
[371438.147906] [<c00ea268>] (vfs_read+0xb4/0x184) from [<c00ea378>] (sys_read+0x40/0x6c)
[371438.155859] [<c00ea378>] (sys_read+0x40/0x6c) from [<c0008cc0>] (ret_fast_syscall+0x0/0x2c)

The above output, from the "BUG: soft lockup" line onwards, repeats every 30 seconds or so.
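To check how often the report repeats, the watchdog header lines can be pulled out of a saved log and their timestamps compared. The following is only a minimal sketch: the function names `find_soft_lockups` and `repeat_intervals` are made up for illustration, and the regular expression assumes the header format shown in the log above.

```python
import re

# Matches the watchdog header line as it appears in the log above, e.g.
# "[371437.822892] BUG: soft lockup - CPU#0 stuck for 22s! [git:17118]"
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per 'BUG: soft lockup' header found in log_text."""
    reports = []
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            reports.append({
                "timestamp": float(m.group("ts")),   # seconds since boot
                "cpu": int(m.group("cpu")),
                "stuck_seconds": int(m.group("secs")),
                "comm": m.group("comm"),             # process name
                "pid": int(m.group("pid")),
            })
    return reports


def repeat_intervals(reports):
    """Return the gaps, in seconds, between consecutive lockup reports."""
    stamps = [r["timestamp"] for r in reports]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

Feeding it the text of a saved dmesg capture, for example `repeat_intervals(find_soft_lockups(open("dmesg.txt").read()))` with a hypothetical file name, gives the spacing between successive reports.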

It makes no difference whether the transfer is over ipv4 or ipv6.

The hardware is a Dreamplug, and I have had the same problem on 2 different Dreamplugs.

Comment 1 Peter Robinson 2013-04-13 08:09:06 UTC
Please update to 3.8.x

Comment 2 Stefan Ring 2014-05-21 11:11:51 UTC
I have the same thing reproducibly happening on my SheevaPlug with kernel 3.9.10-100.fc17.armv5tel.kirkwood. Not git, but also a large network transfer (that does not get very far):

$ wget https://java.net/downloads/openjdk6/openjdk-6-src-b31-15_apr_2014.tar.xz
--2014-05-21 13:07:38--  https://java.net/downloads/openjdk6/openjdk-6-src-b31-15_apr_2014.tar.xz
Resolving java.net (java.net)... 137.254.56.25
Connecting to java.net (java.net)|137.254.56.25|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33245892 (32M) [application/x-tar]
Saving to: `openjdk-6-src-b31-15_apr_2014.tar.xz'

 0% [                                          ] 46,009      45.3K/s              


[  214.214083] BUG: soft lockup - CPU#0 stuck for 22s! [wget:578]
[  214.219939] Modules linked in: lockd sunrpc mtdchar ofpart cmdlinepart orion_nand marvell nand nand_ecc nand_ids mtd leds_gpio orion_wdt mv_cesa mv643xx_eth ums_cypress usb_storage mmc_core
[  214.237029]
[  214.238522] Pid: 578, comm:                 wget
[  214.243154] CPU: 0    Not tainted  (3.9.10-100.fc17.armv5tel.kirkwood #1)
[  214.249978] PC is at page_address+0x68/0xd0
[  214.254180] LR is at dma_cache_maint_page+0xc0/0xfc
[  214.259079] pc : [<c00cb4c4>]    lr : [<c000f88c>]    psr: 80000013
[  214.259079] sp : de17bcb8  ip : 0001dd88  fp : c0bb8100
[  214.270604] r10: c07e8900  r9 : c06c3da0  r8 : 00000744
[  214.275845] r7 : 00000001  r6 : 0001dd88  r5 : 00000003  r4 : 00000003
[  214.282394] r3 : c07e8900  r2 : c071867c  r1 : 000002ac  r0 : c0bb8100
[  214.288951] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  214.296119] Control: 0005397f  Table: 1f3ac000  DAC: 00000015
[  214.301897] [<c000e488>] (unwind_backtrace+0x0/0x124) from [<c007b108>] (watchdog_timer_fn+0xe4/0x134)
[  214.311248] [<c007b108>] (watchdog_timer_fn+0xe4/0x134) from [<c003aca8>] (__run_hrtimer+0xd0/0x1c8)
[  214.320429] [<c003aca8>] (__run_hrtimer+0xd0/0x1c8) from [<c003b498>] (hrtimer_interrupt+0x110/0x25c)
[  214.329693] [<c003b498>] (hrtimer_interrupt+0x110/0x25c) from [<c001674c>] (orion_timer_interrupt+0x24/0x34)
[  214.339572] [<c001674c>] (orion_timer_interrupt+0x24/0x34) from [<c007b8f8>] (handle_irq_event_percpu+0x5c/0x234)
[  214.349886] [<c007b8f8>] (handle_irq_event_percpu+0x5c/0x234) from [<c007bb00>] (handle_irq_event+0x30/0x40)
[  214.359767] [<c007bb00>] (handle_irq_event+0x30/0x40) from [<c007e1c4>] (handle_level_irq+0xcc/0xdc)
[  214.368949] [<c007e1c4>] (handle_level_irq+0xcc/0xdc) from [<c007b300>] (generic_handle_irq+0x28/0x38)
[  214.378304] [<c007b300>] (generic_handle_irq+0x28/0x38) from [<c0009540>] (handle_IRQ+0x68/0x8c)
[  214.387129] [<c0009540>] (handle_IRQ+0x68/0x8c) from [<c04872f4>] (__irq_svc+0x34/0x78)
[  214.395179] [<c04872f4>] (__irq_svc+0x34/0x78) from [<c00cb4c4>] (page_address+0x68/0xd0)
[  214.403402] [<c00cb4c4>] (page_address+0x68/0xd0) from [<c000f88c>] (dma_cache_maint_page+0xc0/0xfc)
[  214.412583] [<c000f88c>] (dma_cache_maint_page+0xc0/0xfc) from [<c00104f8>] (__dma_page_dev_to_cpu+0x7c/0xcc)
[  214.422550] [<c00104f8>] (__dma_page_dev_to_cpu+0x7c/0xcc) from [<c02606d0>] (dma_async_memcpy_buf_to_pg+0x158/0x1ec)
[  214.433215] [<c02606d0>] (dma_async_memcpy_buf_to_pg+0x158/0x1ec) from [<c0261fc8>] (dma_memcpy_to_iovec+0xd4/0x158)
[  214.443792] [<c0261fc8>] (dma_memcpy_to_iovec+0xd4/0x158) from [<c03c362c>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4)
[  214.454371] [<c03c362c>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4) from [<c03ea840>] (tcp_recvmsg+0x634/0xac8)
[  214.464337] [<c03ea840>] (tcp_recvmsg+0x634/0xac8) from [<c040be98>] (inet_recvmsg+0x48/0x5c)
[  214.472910] [<c040be98>] (inet_recvmsg+0x48/0x5c) from [<c0399aa0>] (sock_aio_read+0xfc/0x118)
[  214.481568] [<c0399aa0>] (sock_aio_read+0xfc/0x118) from [<c00f1f20>] (do_sync_read+0x98/0xd4)
[  214.490224] [<c00f1f20>] (do_sync_read+0x98/0xd4) from [<c00f2684>] (vfs_read+0xb4/0x184)
[  214.498438] [<c00f2684>] (vfs_read+0xb4/0x184) from [<c00f2910>] (sys_read+0x40/0x6c)
[  214.506304] [<c00f2910>] (sys_read+0x40/0x6c) from [<c0008c40>] (ret_fast_syscall+0x0/0x2c)
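For anyone comparing this trace with the one in the description: both pass through dma_async_memcpy_buf_to_pg, dma_memcpy_to_iovec, dma_skb_copy_datagram_iovec and tcp_recvmsg, which suggests the same receive-side DMA copy path is involved on both kernels. A rough sketch for extracting and comparing the call chains follows; `call_chain` and `shared_frames` are illustrative names, and the pattern assumes the "(func+0x.../0x...) from (func+0x.../0x...)" frame format printed by these kernels.

```python
import re

# One backtrace line looks like:
# "[<c0010618>] (arm_dma_map_page+0x40/0x6c) from [<c023ce2c>] (dma_async_memcpy_buf_to_pg+0xe0/0x1ec)"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \((?P<callee>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\) from "
    r"\[<[0-9a-f]+>\] \((?P<caller>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def call_chain(trace_text):
    """Return function names from innermost frame to outermost caller."""
    chain = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if not m:
            continue
        # Each line repeats the previous line's caller as its callee,
        # so only add the callee when it starts a new chain.
        if not chain or chain[-1] != m.group("callee"):
            chain.append(m.group("callee"))
        chain.append(m.group("caller"))
    return chain


def shared_frames(chain_a, chain_b):
    """Return the functions of chain_a that also appear in chain_b, in order."""
    in_b = set(chain_b)
    return [name for name in chain_a if name in in_b]
```

Running `shared_frames(call_chain(first_trace), call_chain(second_trace))` on the two pasted traces lists the frames they have in common.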

Comment 3 Josh Boyer 2014-05-21 13:03:24 UTC
F17 has been EOL for some time.  The Fedora ARM effort no longer supports armv5tel binaries at all, as everything has moved to armv7hl.  There isn't anything we can do to help you here.

Comment 4 Stefan Ring 2014-05-21 13:08:02 UTC
Too bad. FWIW, I have a self-built 3.3.2 kernel, interestingly with PREEMPT, that works perfectly. I have been using this one with F17 for a long time before trying to switch to the distro kernel for convenience. Unsuccessfully, as can be seen here... :(

