Bug 663807 - jme driver locks up and network dies
Summary: jme driver locks up and network dies
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: i686
OS: Linux
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2010-12-16 21:44 UTC by David Rees
Modified: 2010-12-17 00:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2010-12-17 00:06:20 UTC
Type: ---


Description David Rees 2010-12-16 21:44:41 UTC
Description of problem:
Shortly after bringing up the network interface, it locks up.  Reloading the jme kernel module doesn't seem to help.

Version-Release number of selected component (if applicable):

How reproducible:
Every time. Sometimes it locks up right away, but it always locks up within a short period of time.

Additional info:

lspci -v output seems to be a bit broken for the device:

08:00.5 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev ff) (prog-if ff)
	!!! Unknown header type 7f
	Kernel modules: jme
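
The all-ones values in that entry (rev ff, prog-if ff, header type 7f) are what lspci typically shows when reads of a device's PCI configuration space return 0xff, which usually means the device has stopped responding on the bus. As a rough sketch, not part of the original report, the following Python helper scans `lspci -v` text for that pattern. The function name `find_unresponsive_devices` is made up for illustration, and the matching is based only on the two markers seen in the output above.

```python
import re

# First line of an `lspci -v` entry, e.g.
# "08:00.5 Ethernet controller: ... (rev ff) (prog-if ff)"
# An optional PCI domain prefix ("0000:") is tolerated.
DEVICE_LINE = re.compile(
    r"^(?P<slot>(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(?P<desc>.*)$"
)


def find_unresponsive_devices(lspci_output):
    """Return slots whose entry shows both '(rev ff)' and 'Unknown header type 7f'.

    This is a heuristic: it only looks for the two markers seen in the
    report, which suggest config-space reads are returning all ones.
    """
    suspects = []
    slot = None
    rev_ff = False
    for line in lspci_output.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            # Start of a new device entry.
            slot = match.group("slot")
            rev_ff = "(rev ff)" in match.group("desc")
        elif slot and rev_ff and "Unknown header type 7f" in line:
            suspects.append(slot)
            rev_ff = False  # avoid reporting the same slot twice
    return suspects
```

Fed the three lspci lines quoted above, this flags slot 08:00.5; a healthy entry such as one reporting "(rev 03)" is not flagged.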

The computer is an Asus ET2010AG.

Kernel messages: IRQ 19 is reported as unhandled and gets disabled, and then the netdev watchdog chimes in.

[  177.802050] irq 19: nobody cared (try booting with the "irqpoll" option)
[  177.802064] Pid: 0, comm: swapper Not tainted #1
[  177.802071] Call Trace:
[  177.802090]  [<c048430f>] __report_bad_irq.clone.1+0x33/0x73
[  177.802101]  [<c048444b>] note_interrupt+0xfc/0x155
[  177.802112]  [<c0484d37>] handle_fasteoi_irq+0x89/0xa7
[  177.802122]  [<c0404e54>] handle_irq+0x40/0x4c
[  177.802130]  [<c0404bd5>] do_IRQ+0x46/0x91
[  177.802138]  [<c04038f0>] common_interrupt+0x30/0x38
[  177.802149]  [<c042261c>] ? native_safe_halt+0xa/0xc
[  177.802158]  [<c0408c86>] default_idle+0x42/0x60
[  177.802166]  [<c040214c>] cpu_idle+0x8e/0xaf
[  177.802176]  [<c07a2a5e>] start_secondary+0x241/0x281
[  177.802182] handlers:
[  177.802185] [<c069ed65>] (usb_hcd_irq+0x0/0x68)
[  177.802197] Disabling IRQ #19
[  195.712027] ------------[ cut here ]------------
[  195.712047] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0xc6/0x12e()
[  195.712055] Hardware name: ET2010AG        
[  195.712061] NETDEV WATCHDOG: eth0 (jme): transmit queue 0 timed out
[  195.712066] Modules linked in: cpufreq_ondemand powernow_k8 mperf ipv6 snd_hda_codec_idt arc4 ecb ath9k ath9k_common snd_hda_intel ath9k_hw snd_hda_codec ath snd_hwdep snd_seq joydev snd_seq_device mac80211 snd_pcm uvcvideo videodev nw_fermi snd_timer v4l1_compat cfg80211 snd i2c_piix4 rfkill soundcore k10temp sparse_keymap snd_page_alloc asus_atk0110 jme wmi mii pata_acpi ata_generic pata_atiixp sdhci_pci sdhci mmc_core radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[  195.712147] Pid: 0, comm: swapper Not tainted #1
[  195.712153] Call Trace:
[  195.712167]  [<c0439321>] warn_slowpath_common+0x6a/0x7f
[  195.712177]  [<c072a78b>] ? dev_watchdog+0xc6/0x12e
[  195.712186]  [<c04393a9>] warn_slowpath_fmt+0x2b/0x2f
[  195.712195]  [<c072a78b>] dev_watchdog+0xc6/0x12e
[  195.712206]  [<c0450c39>] ? hrtimer_forward+0x114/0x128
[  195.712215]  [<c0407d24>] ? read_tsc+0xa/0x28
[  195.712224]  [<c0419ce2>] ? apic_write+0x14/0x16
[  195.712233]  [<c04438b4>] run_timer_softirq+0x167/0x20e
[  195.712242]  [<c072a6c5>] ? dev_watchdog+0x0/0x12e
[  195.712251]  [<c043e75e>] __do_softirq+0xa9/0x14a
[  195.712260]  [<c043e832>] do_softirq+0x33/0x3d
[  195.712267]  [<c043ea3b>] irq_exit+0x31/0x64
[  195.712275]  [<c041a1f2>] smp_apic_timer_interrupt+0x65/0x73
[  195.712284]  [<c07a7de5>] apic_timer_interrupt+0x31/0x38
[  195.712295]  [<c042261c>] ? native_safe_halt+0xa/0xc
[  195.712303]  [<c0408c86>] default_idle+0x42/0x60
[  195.712311]  [<c040214c>] cpu_idle+0x8e/0xaf
[  195.712320]  [<c07a2a5e>] start_secondary+0x241/0x281
[  195.712327] ---[ end trace bd5630fcfcfdffca ]---
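
For anyone triaging similar logs, here is a minimal Python sketch (not part of the original report) that pulls the "nobody cared" reports out of dmesg text and lists the handlers registered on each affected IRQ. The function name `summarize_spurious_irqs` and the regular expressions are assumptions based solely on the message format shown above. In this log, the only handler listed for IRQ 19 is usb_hcd_irq, the USB host controller interrupt handler.

```python
import re

# Patterns derived from the log lines in this report (assumed format).
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")


def summarize_spurious_irqs(log_text):
    """Map each IRQ reported as 'nobody cared' to its handlers and disabled flag."""
    report = {}
    current = None       # IRQ whose report is currently being read
    in_handlers = False  # True once the "handlers:" line has been seen
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw)

        match = NOBODY_CARED.search(line)
        if match:
            current = int(match.group(1))
            report[current] = {"handlers": [], "disabled": False}
            in_handlers = False
            continue
        if current is None:
            continue

        if line.strip() == "handlers:":
            in_handlers = True
            continue

        match = DISABLING.search(line)
        if match:
            irq = int(match.group(1))
            if irq in report:
                report[irq]["disabled"] = True
            current = None
            in_handlers = False
            continue

        if in_handlers:
            match = HANDLER.search(line)
            if match:
                report[current]["handlers"].append(match.group(1))
    return report
```

Running it over the first trace above yields one entry for IRQ 19 with the handler usb_hcd_irq and the disabled flag set.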

Let me know if there is any other information I can provide.

Comment 1 David Rees 2010-12-16 23:27:30 UTC
Also tried a kernel which has a jme change in it, but there was no change in behavior.

Tried booting with various boot parameters: pci=noapic and nolapic_timer each seem to "fix" the issue (based on limited testing).

With the card not stuck, I was able to get more information out of lspci on the card.

08:00.5 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 03)
	Subsystem: ASUSTeK Computer Inc. Device 842e
	Flags: bus master, fast devsel, latency 0, IRQ 21
	Memory at 91000000 (32-bit, non-prefetchable) [size=16K]
	I/O ports at 2100 [size=128]
	I/O ports at 2000 [size=256]
	Capabilities: [68] Power Management version 3
	Capabilities: [50] Express Legacy Endpoint, MSI 00
	Capabilities: [40] MSI-X: Enable- Count=8 Masked-
	Capabilities: [70] MSI: Enable+ Count=1/8 Maskable+ 64bit+
	Kernel driver in use: jme
	Kernel modules: jme

Comment 2 David Rees 2010-12-17 00:06:20 UTC
Grr, closing bug, it's invalid.

Appears to be invalid as it's triggered when a 3rd party NextWindow Touchscreen driver (nw-fermi) is loaded and used.

Comment 3 David Rees 2010-12-17 00:44:55 UTC
Just a bit more info for the record.

The 3rd party nw-fermi driver isn't completely to blame.  I loaded the usbtouchscreen driver instead and added the NextWindow Touchscreen device id to new_id and ended up with the same results.

pci=noapic seems to be the most reliable way to prevent the issue from happening, but at this point the touchscreen seems to be the cause (and it does not work without the nw-fermi driver).
