Bug 1013054

Summary:	irq 16: nobody cared
Product:	[Fedora] Fedora	Reporter:	Harald Reindl <h.reindl>
Component:	kernel	Assignee:	fedora-kernel-wireless-ath
Status:	CLOSED WONTFIX	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	19	CC:	gansalmon, h.reindl, itamar, jogreene, jonathan, kernel-maint, madhu.chinakonda, marcelo.barbosa, wheiss
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-02-04 19:10:37 UTC	Type:	Bug

Description Harald Reindl 2013-09-27 17:24:36 UTC

My colleague uses 100% identical hardware and has reported this issue multiple times. Since I had never hit it myself, we assumed I was simply lucky to have the WLAN card in a different slot.

Well, a short while ago I hit it for the first time, on 3.11.2-200.fc19.x86_64.
Because my colleague has been seeing this problem more often, and for months, the new kernel build is very unlikely to be the cause. I am unsure what triggers it: I was not doing anything special, the machine was not under load, and the WLAN was not in use at that moment.

[ 8254.358785] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 8254.358789] CPU: 7 PID: 0 Comm: swapper/7 Tainted: GF          O 3.11.2-200.fc19.x86_64 #1
[ 8254.358790] Hardware name: Hewlett-Packard HP Compaq Elite 8300 CMT/3396, BIOS K01 v02.57 11/16/2012
[ 8254.358791]  ffff88040801298c ffff88041ebc3e50 ffffffff816476ef ffff880408012900
[ 8254.358793]  ffff88041ebc3e78 ffffffff810f80c2 ffff880408012900 0000000000000010
[ 8254.358794]  0000000000000000 ffff88041ebc3eb8 ffffffff810f84d8 ffffffff81500292
[ 8254.358795] Call Trace:
[ 8254.358796]  <IRQ>  [<ffffffff816476ef>] dump_stack+0x45/0x56
[ 8254.358804]  [<ffffffff810f80c2>] __report_bad_irq+0x32/0xd0
[ 8254.358805]  [<ffffffff810f84d8>] note_interrupt+0x138/0x1f0
[ 8254.358808]  [<ffffffff81500292>] ? cpuidle_enter_state+0x52/0xc0
[ 8254.358809]  [<ffffffff810f5ee1>] handle_irq_event_percpu+0xe1/0x1e0
[ 8254.358811]  [<ffffffff810f6016>] handle_irq_event+0x36/0x60
[ 8254.358812]  [<ffffffff810f9015>] handle_fasteoi_irq+0x55/0xf0
[ 8254.358815]  [<ffffffff8101459f>] handle_irq+0xbf/0x150
[ 8254.358816]  [<ffffffff8165220a>] ? atomic_notifier_call_chain+0x1a/0x20
[ 8254.358819]  [<ffffffff81658a4d>] do_IRQ+0x4d/0xc0
[ 8254.358820]  [<ffffffff8164e3ed>] common_interrupt+0x6d/0x6d
[ 8254.358821]  <EOI>  [<ffffffff81500292>] ? cpuidle_enter_state+0x52/0xc0
[ 8254.358823]  [<ffffffff815003c9>] cpuidle_idle_call+0xc9/0x210
[ 8254.358825]  [<ffffffff8101b5fe>] arch_cpu_idle+0xe/0x30
[ 8254.358827]  [<ffffffff810b66ae>] cpu_startup_entry+0xce/0x280
[ 8254.358829]  [<ffffffff8103ed77>] start_secondary+0x217/0x2c0
[ 8254.358830] handlers:
[ 8254.358832] [<ffffffff81469e90>] usb_hcd_irq
[ 8254.358838] [<ffffffffa018e640>] ath_isr [ath9k]
[ 8254.358839] Disabling IRQ #16
______________________________________________

[root@srv-rhsoft:~]$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation Q77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 Network controller: Qualcomm Atheros AR5418 Wireless Network Adapter [AR5008E 802.11(a)bgn] (PCI-Express) (rev 01)
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
______________________________________________

[root@srv-rhsoft:~]$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  0:         29          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:          3          0          0          0          0          0          0          0   IO-APIC-edge      i8042
  4:          0          0          0          0          2          0          0          0   IO-APIC-edge    
  8:          1          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          4          0          0          0          0          0          0          0   IO-APIC-edge      i8042
 16:       5358       2547       1504       3127      24399      24620      17373      31193   IO-APIC-fasteoi   ehci_hcd:usb1, ath9k
 23:        106        266         95        183        471       2456       1353       3010   IO-APIC-fasteoi   ehci_hcd:usb2
 40:      11033       3846       8593       1285      25317      10203      17852       4605   PCI-MSI-edge      ahci
 41:          0          0          0          0          0          1          0          0   PCI-MSI-edge      xhci_hcd
 42:      37913       3392       3034       2220      21068       3995       4589       2807   PCI-MSI-edge      i915
 43:         42         24         35          2         82        100         55        145   PCI-MSI-edge      eth0
 44:         38        122         19          0        302         57         13        107   PCI-MSI-edge      snd_hda_intel
 45:         17       1736       2279        554       4069      36987       6979        924   PCI-MSI-edge      eth1-rx-0
 46:          6        197        308       8160       4888       1013        658       3510   PCI-MSI-edge      eth1-tx-0
 47:          0          0          0          0          1          0          1          0   PCI-MSI-edge      eth1
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:     111363      95999      96334      98733      40733      44101      45915      54367   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:       5387       3250       2150       2076       1663       1687       2401       1568   IRQ work interrupts
RTR:          5          0          0          0          0          0          0          0   APIC ICR read retries
RES:     127300     117662     109438     114419      50639      54077      56062      50163   Rescheduling interrupts
CAL:       1140        886        945        999        758        904        910        836   Function call interrupts
TLB:        303        603        353        335        697        734        398        544   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          3          3          3          3          3          3          3          3   Machine check polls
ERR:          0
MIS:          0
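
For anyone comparing machines, the relevant entry in the output above is IRQ 16, which is shared by ehci_hcd:usb1 and ath9k. The following is a minimal Python sketch that lists such shared lines. It assumes the column layout shown in this report (one count per CPU, then a single chip token such as IO-APIC-fasteoi, then the comma-separated handlers); newer kernels print the chip column differently. The name shared_irqs and the trimmed SAMPLE are only illustrative.

```python
# SAMPLE: header plus rows 4, 16 and 23 copied from the report above.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  4:          0          0          0          0          2          0          0          0   IO-APIC-edge    
 16:       5358       2547       1504       3127      24399      24620      17373      31193   IO-APIC-fasteoi   ehci_hcd:usb1, ath9k
 23:        106        266         95        183        471       2456       1353       3010   IO-APIC-fasteoi   ehci_hcd:usb2
"""


def shared_irqs(text):
    """Return {irq: (total_count, [handlers])} for lines with more than one handler."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():           # skip NMI, LOC, ERR and friends
            continue
        # ncpu counts, one chip token, then the handler list as one field
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:        # no handler registered on this line
            continue
        total = sum(int(f) for f in fields[:ncpu])
        handlers = [h.strip() for h in fields[ncpu + 1].split(",")]
        if len(handlers) > 1:
            shared[int(label)] = (total, handlers)
    return shared


if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/interrupts instead of SAMPLE.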

Comment 1 Harald Reindl 2013-09-27 17:30:17 UTC

Sorry, I forgot to describe the effect of this problem:

Everything still works more or less, but the desktop gets unusably slow and the mouse pointer lags. I guess it was only thanks to the power of the machine that I could save everything and reboot more or less smoothly.

Comment 2 Harald Reindl 2013-09-28 17:33:25 UTC

OK, now I would say 3.11.2 makes things worse: this is the second time I have seen this problem here, while my colleague has had it regularly on F18 for months.
_____________________________________

I guess the ath9k bugfixes are making things worse:

https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.11.2

[harry@srv-rhsoft:~/Desktop]$ cat ChangeLog-3.11.2 | grep ath9
    ath9k: avoid accessing MRC registers on single-chain devices
    ath9k: fix rx descriptor related race condition
    ath9k: always clear ps filter bit on new assoc
_____________________________________

[65998.741193] irq 16: nobody cared (try booting with the "irqpoll" option)
[65998.741205] CPU: 3 PID: 0 Comm: swapper/3 Tainted: GF          O 3.11.2-200.fc19.x86_64 #1
[65998.741206] Hardware name: Hewlett-Packard HP Compaq Elite 8300 CMT/3396, BIOS K01 v02.57 11/16/2012
[65998.741207]  ffff88040801298c ffff88041eac3e50 ffffffff816476ef ffff880408012900
[65998.741208]  ffff88041eac3e78 ffffffff810f80c2 ffff880408012900 0000000000000010
[65998.741210]  0000000000000000 ffff88041eac3eb8 ffffffff810f84d8 ffffffff81500292
[65998.741211] Call Trace:
[65998.741212]  <IRQ>  [<ffffffff816476ef>] dump_stack+0x45/0x56
[65998.741220]  [<ffffffff810f80c2>] __report_bad_irq+0x32/0xd0
[65998.741221]  [<ffffffff810f84d8>] note_interrupt+0x138/0x1f0
[65998.741223]  [<ffffffff81500292>] ? cpuidle_enter_state+0x52/0xc0
[65998.741225]  [<ffffffff810f5ee1>] handle_irq_event_percpu+0xe1/0x1e0
[65998.741226]  [<ffffffff810f6016>] handle_irq_event+0x36/0x60
[65998.741228]  [<ffffffff810f9015>] handle_fasteoi_irq+0x55/0xf0
[65998.741230]  [<ffffffff8101459f>] handle_irq+0xbf/0x150
[65998.741232]  [<ffffffff8165220a>] ? atomic_notifier_call_chain+0x1a/0x20
[65998.741235]  [<ffffffff81658a4d>] do_IRQ+0x4d/0xc0
[65998.741236]  [<ffffffff8164e3ed>] common_interrupt+0x6d/0x6d
[65998.741237]  <EOI>  [<ffffffff81500292>] ? cpuidle_enter_state+0x52/0xc0
[65998.741239]  [<ffffffff815003c9>] cpuidle_idle_call+0xc9/0x210
[65998.741241]  [<ffffffff8101b5fe>] arch_cpu_idle+0xe/0x30
[65998.741243]  [<ffffffff810b66ae>] cpu_startup_entry+0xce/0x280
[65998.741245]  [<ffffffff8103ed77>] start_secondary+0x217/0x2c0
[65998.741246] handlers:
[65998.741248] [<ffffffff81469e90>] usb_hcd_irq
[65998.741254] [<ffffffffa0436640>] ath_isr [ath9k]
[65998.741255] Disabling IRQ #16
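
Both traces end with the same "handlers:" list, which is the part that names the drivers sharing the disabled line. A small Python sketch for pulling that information out of a saved dmesg or syslog file follows. It assumes only the message layout visible in this report; the function name bad_irq_reports is made up for illustration.

```python
import re

# Lines copied from the first trace in this report (call trace omitted).
SAMPLE = """\
[ 8254.358785] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 8254.358830] handlers:
[ 8254.358832] [<ffffffff81469e90>] usb_hcd_irq
[ 8254.358838] [<ffffffffa018e640>] ath_isr [ath9k]
[ 8254.358839] Disabling IRQ #16
"""

BAD_IRQ = re.compile(r"irq (\d+): nobody cared")
# "[<address>] function" optionally followed by "[module]"
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)(?:\s+\[(\w+)\])?\s*$")


def bad_irq_reports(log_text):
    """Return one dict per 'nobody cared' report: IRQ number and (function, module) pairs."""
    reports = []
    current = None
    in_handlers = False
    for line in log_text.splitlines():
        match = BAD_IRQ.search(line)
        if match:
            current = {"irq": int(match.group(1)), "handlers": []}
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if "Disabling IRQ" in line:       # end of this report
            current = None
            in_handlers = False
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                # module is None when the log shows no [module] suffix
                current["handlers"].append((handler.group(1), handler.group(2)))
    return reports


if __name__ == "__main__":
    for report in bad_irq_reports(SAMPLE):
        print(report)
```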

Comment 3 Harald Reindl 2013-09-28 17:34:51 UTC

Interesting:

look at the time of my initial report, exactly 24 hours ago.

Comment 4 Josh Boyer 2013-09-30 14:59:33 UTC

Please try and recreate this without loading whatever out-of-tree modules you have loaded.

Comment 5 Harald Reindl 2013-09-30 15:06:23 UTC

Sorry, I can't shut down VMware Workstation on this machine: it hosts all our internal services and build environments. My other machine has no WLAN card, and the problem has happened exactly 2 times so far.

Maybe the changes in 3.11.2-201.fc19.x86_64 fix it for now. But my colleague has had the problem randomly for months on F18 with identical hardware, and ath9k has shown up in the kernel changelogs again and again over those months, so I believe something is wrong that is not related to the VMware modules.

Comment 6 wheiss 2013-12-25 20:45:46 UTC

Hi,
this also happens on a Debian system with the aptosid kernel 3.12-5.slh.

It has been happening since I installed Chrome and started using heavy Flash apps.

Dec 22 13:43:22 osiris kernel: [  321.852928] irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 22 13:43:22 osiris kernel: [  321.852937] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           O 3.12-0.slh.2-aptosid-amd64 #1
Dec 22 13:43:22 osiris kernel: [  321.852940] Hardware name: System manufacturer Maximus II Formula/Maximus II Formula, BIOS 2302    04/15/2010
Dec 22 13:43:22 osiris kernel: [  321.852942]  0000000000000006 ffffffff813722d2 ffff88022439d600 ffffffff8107647a
Dec 22 13:43:22 osiris kernel: [  321.852946]  ffff88022439d600 0000000000000000 0000000000000010 ffffffff81076804
Dec 22 13:43:22 osiris kernel: [  321.852950]  0000000000000000 ffff88022439d600 0000000000000010 0000000000000000
Dec 22 13:43:22 osiris kernel: [  321.852954] Call Trace:
Dec 22 13:43:22 osiris kernel: [  321.852957]  <IRQ>  [<ffffffff813722d2>] ? dump_stack+0x50/0x89
Dec 22 13:43:22 osiris kernel: [  321.852968]  [<ffffffff8107647a>] ? __report_bad_irq+0x2c/0xb4
Dec 22 13:43:22 osiris kernel: [  321.852971]  [<ffffffff81076804>] ? note_interrupt+0x145/0x1c5
Dec 22 13:43:22 osiris kernel: [  321.852976]  [<ffffffff81074c4c>] ? handle_irq_event_percpu+0x104/0x112
Dec 22 13:43:22 osiris kernel: [  321.852980]  [<ffffffff81074c8e>] ? handle_irq_event+0x34/0x51
Dec 22 13:43:22 osiris kernel: [  321.852984]  [<ffffffff81077019>] ? handle_fasteoi_irq+0x75/0xa6
Dec 22 13:43:22 osiris kernel: [  321.852988]  [<ffffffff8100bf90>] ? handle_irq+0x15/0x1d
Dec 22 13:43:22 osiris kernel: [  321.852992]  [<ffffffff8100bc5e>] ? do_IRQ+0x40/0x95
Dec 22 13:43:22 osiris kernel: [  321.852996]  [<ffffffff81376bed>] ? common_interrupt+0x6d/0x6d
Dec 22 13:43:22 osiris kernel: [  321.852998]  <EOI>  [<ffffffff8128ea01>] ? arch_local_irq_enable+0x4/0x8
Dec 22 13:43:22 osiris kernel: [  321.853007]  [<ffffffff8128ecd1>] ? cpuidle_enter_state+0x50/0xa9
Dec 22 13:43:22 osiris kernel: [  321.853019]  [<ffffffff8128edf9>] ? cpuidle_idle_call+0xcf/0x119
Dec 22 13:43:22 osiris kernel: [  321.853023]  [<ffffffff81011c87>] ? arch_cpu_idle+0x5/0x17
Dec 22 13:43:22 osiris kernel: [  321.853027]  [<ffffffff8107448a>] ? cpu_startup_entry+0xed/0x146
Dec 22 13:43:22 osiris kernel: [  321.853031]  [<ffffffff8102b54d>] ? start_secondary+0x1ed/0x1f0
Dec 22 13:43:22 osiris kernel: [  321.853033] handlers:
Dec 22 13:43:22 osiris kernel: [  321.853047] [<ffffffffa0009f87>] usb_hcd_irq [usbcore]
Dec 22 13:43:22 osiris kernel: [  321.853056] [<ffffffffa00ca3d7>] ata_bmdma_interrupt [libata]
Dec 22 13:43:22 osiris kernel: [  321.853142] [<ffffffffa052c21c>] nv_kern_isr [nvidia]
Dec 22 13:43:22 osiris kernel: [  321.853144] Disabling IRQ #16

.. updated kernel, booted with irqpoll:
Dec 25 13:03:34 osiris kernel: [  329.462408] irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 25 13:03:34 osiris kernel: [  329.462415] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           O 3.12-5.slh.2-aptosid-amd64 #1
Dec 25 13:03:34 osiris kernel: [  329.462418] Hardware name: System manufacturer Maximus II Formula/Maximus II Formula, BIOS 2302    04/15/2010
Dec 25 13:03:34 osiris kernel: [  329.462420]  0000000000000006 ffffffff813728c6 ffff88022439d800 ffffffff81076551
Dec 25 13:03:34 osiris kernel: [  329.462425]  ffff88022439d800 0000000000000000 00000000000002c8 ffffffff810768db
Dec 25 13:03:34 osiris kernel: [  329.462429]  0000000000000000 ffff88022439d800 0000000000000010 0000000000000000
Dec 25 13:03:34 osiris kernel: [  329.462433] Call Trace:
Dec 25 13:03:34 osiris kernel: [  329.462435]  <IRQ>  [<ffffffff813728c6>] ? dump_stack+0x50/0x89
Dec 25 13:03:34 osiris kernel: [  329.462447]  [<ffffffff81076551>] ? __report_bad_irq+0x2c/0xb4
Dec 25 13:03:34 osiris kernel: [  329.462451]  [<ffffffff810768db>] ? note_interrupt+0x145/0x1c5
Dec 25 13:03:34 osiris kernel: [  329.462456]  [<ffffffff81074d23>] ? handle_irq_event_percpu+0x104/0x112
Dec 25 13:03:34 osiris kernel: [  329.462460]  [<ffffffff81074d65>] ? handle_irq_event+0x34/0x51
Dec 25 13:03:34 osiris kernel: [  329.462464]  [<ffffffff810770f0>] ? handle_fasteoi_irq+0x75/0xa6
Dec 25 13:03:34 osiris kernel: [  329.462468]  [<ffffffff8100bf90>] ? handle_irq+0x15/0x1d
Dec 25 13:03:34 osiris kernel: [  329.462472]  [<ffffffff8100bc5e>] ? do_IRQ+0x40/0x95
Dec 25 13:03:34 osiris kernel: [  329.462476]  [<ffffffff813771ad>] ? common_interrupt+0x6d/0x6d
Dec 25 13:03:34 osiris kernel: [  329.462478]  <EOI>  [<ffffffff8128ed26>] ? arch_local_irq_enable+0x4/0x8
Dec 25 13:03:34 osiris kernel: [  329.462486]  [<ffffffff8128eff6>] ? cpuidle_enter_state+0x50/0xa9
Dec 25 13:03:34 osiris kernel: [  329.462500]  [<ffffffff8128f11e>] ? cpuidle_idle_call+0xcf/0x119
Dec 25 13:03:34 osiris kernel: [  329.462505]  [<ffffffff81011c91>] ? arch_cpu_idle+0x5/0x17
Dec 25 13:03:34 osiris kernel: [  329.462508]  [<ffffffff8107455f>] ? cpu_startup_entry+0x109/0x164
Dec 25 13:03:34 osiris kernel: [  329.462512]  [<ffffffff8102b54d>] ? start_secondary+0x1ed/0x1f0
Dec 25 13:03:34 osiris kernel: [  329.462515] handlers:
Dec 25 13:03:34 osiris kernel: [  329.462528] [<ffffffffa000a013>] usb_hcd_irq [usbcore]
Dec 25 13:03:34 osiris kernel: [  329.462537] [<ffffffffa014d3d4>] ata_bmdma_interrupt [libata]
Dec 25 13:03:34 osiris kernel: [  329.462623] [<ffffffffa062c21e>] nv_kern_isr [nvidia]
Dec 25 13:03:34 osiris kernel: [  329.462625] Disabling IRQ #16


IRQ 16:
$ grep  '16:' /proc/interrupts 
 16:    9379585    9251522        252        199   IO-APIC-fasteoi   uhci_hcd:usb2, pata_marvell, nvidia
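
A single snapshot like the one above only gives totals. To see whether the line is being flooded, the count can be sampled twice and turned into a rate. The following Python sketch does this under the assumption that /proc/interrupts has the layout shown in this report; the helper names irq_total, irq_rate and sample_rate are invented for the example.

```python
import time

# The IRQ 16 line exactly as posted above.
LINE = " 16:    9379585    9251522        252        199   IO-APIC-fasteoi   uhci_hcd:usb2, pata_marvell, nvidia"


def irq_total(interrupts_text, irq):
    """Sum the per-CPU counters of one numbered IRQ line."""
    prefix = f"{irq}:"
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == prefix:
            total = 0
            for field in fields[1:]:
                if not field.isdigit():   # first non-numeric field is the chip name
                    break
                total += int(field)
            return total
    raise KeyError(f"IRQ {irq} not found")


def irq_rate(total_before, total_after, seconds):
    """Interrupts per second between two samples."""
    return (total_after - total_before) / seconds


def sample_rate(irq, seconds=1.0, path="/proc/interrupts"):
    """Read the live counters twice and return the rate (Linux only)."""
    with open(path) as f:
        before = irq_total(f.read(), irq)
    time.sleep(seconds)
    with open(path) as f:
        after = irq_total(f.read(), irq)
    return irq_rate(before, after, seconds)


if __name__ == "__main__":
    print(irq_total(LINE, 16))
```

Calling sample_rate(16) on an affected machine shortly before the line gets disabled would show whether the counters are climbing far faster than the attached devices can plausibly explain.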

Comment 7 Harald Reindl 2013-12-25 21:15:43 UTC

Well, that means we have different systems and different hardware, and IRQ 16 is involved every time. It looks like a deeper kernel problem. I am glad it has happened to me only twice, but as I said, my co-developer has hardware 100% identical to mine and it happens to him far too often.

He has the same machine at the office without the WLAN card, and it has not happened there once, so I suspect that the more PCI/PCI-X cards a machine has, the more likely this is to be triggered.

Comment 8 wheiss 2013-12-25 21:20:23 UTC

PS, more info:
what is new: I use eSATA via the Marvell controller.

I moved the nVidia card to the other slot. Does anyone have an idea why it sticks to IRQ 16?

Comment 9 Harald Reindl 2013-12-25 21:26:47 UTC

https://www.google.at/search?q=+irq+16%3A+nobody+cared
* Fedora
* CentOS
* Debian
* Arch Linux
* SuSE
................

Comment 10 John Greene 2014-01-27 15:56:31 UTC

Have you tried booting with the irqpoll option, as the trace suggests?  Any relief with that?
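
When testing this, it is worth confirming that the option really reached the running kernel. A minimal Python sketch follows, assuming a Linux system that exposes the boot parameters in /proc/cmdline; the helper names are hypothetical.

```python
def has_boot_option(cmdline, option):
    """True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def running_with_irqpoll(path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return has_boot_option(f.read(), "irqpoll")


if __name__ == "__main__":
    print("irqpoll active:", running_with_irqpoll())
```

Matching whole words avoids a false positive from a parameter that merely contains the string.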

Comment 11 Harald Reindl 2014-01-27 16:00:43 UTC

I have faced this issue exactly 2 times, so it is hard for me to nail down. I only know that others are affected again and again, and I opened this bug report after the first time my machine went down.

One of the other, more heavily affected users stated that irqpoll does not help.

What is really interesting is the large number of Google hits.

Comment 12 wheiss 2014-01-29 14:14:52 UTC

Hi,
irqpoll did not fix it; however, I think I found the reason.
Motherboard: Maximus II Formula
IRQ 16 is in use by nVidia, .. and Marvell. pata_marvell is also responsible for eSATA.
After switching from eSATA to a (new) USB3 adapter, no more lost IRQs.
I assume either the card or the Marvell driver has a bug, and the kernel disables unhandled IRQs.

Comment 13 John Greene 2014-01-29 21:33:29 UTC

(In reply to wheiss from comment #12)
> Hi,
> irqpoll did not fix it; however, I think I found the reason.
> Motherboard: Maximus II Formula
> IRQ 16 is in use by nVidia, .. and Marvell. pata_marvell is also responsible
> for eSATA.
> After switching from eSATA to a (new) USB3 adapter, no more lost IRQs.
> I assume either the card or the Marvell driver has a bug, and the kernel
> disables unhandled IRQs.

Yes, that might do it.  Unless they both fully support shared IRQs, chaos ensues.  Even then, my experience has been to avoid sharing if at all possible: latency problems, etc.
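
To illustrate what "the kernel disables unhandled IRQs" means on a shared line: every registered handler is called for each interrupt and reports whether its device raised it. If nearly all interrupts in a window go unclaimed, the line is shut off and the "nobody cared" report is printed. The Python model below is a simplified sketch, not kernel code. The window of 100000 interrupts and the limit of 99900 unclaimed ones are the thresholds as I understand them from the kernel's spurious-interrupt detection (kernel/irq/spurious.c); treat them as an assumption. The time-based reset of the counter and the irqpoll/irqfixup paths are left out.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1

WINDOW = 100_000          # interrupts per evaluation window (assumed)
UNHANDLED_LIMIT = 99_900  # unclaimed interrupts tolerated per window (assumed)


class IrqLine:
    """Toy model of one shared interrupt line."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def fire(self):
        if self.disabled:
            return
        # On a shared line every handler is asked, not just the first one.
        results = [handler() for handler in self.handlers]
        if IRQ_HANDLED not in results:
            self.unhandled += 1           # nobody claimed this interrupt
        self.count += 1
        if self.count < WINDOW:
            return
        # End of window: decide whether the line looks stuck.
        self.count = 0
        if self.unhandled > UNHANDLED_LIMIT:
            self.disabled = True          # corresponds to "Disabling IRQ #16"
        self.unhandled = 0


def not_mine():
    return IRQ_NONE


def mine():
    return IRQ_HANDLED


if __name__ == "__main__":
    stuck = IrqLine([not_mine, not_mine])   # a device asserts the line, no driver claims it
    healthy = IrqLine([not_mine, mine])     # one driver claims every interrupt
    for _ in range(WINDOW):
        stuck.fire()
        healthy.fire()
    print("stuck line disabled:", stuck.disabled)
    print("healthy line disabled:", healthy.disabled)
```

In this model the devices still attached to a disabled line stop getting interrupts delivered, which is consistent with the sluggish desktop described earlier in this report.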

Harald, does this help you as well?

Comment 14 Harald Reindl 2014-01-29 21:50:32 UTC

My colleague doesn't want to replace his WLAN AP with USB3 :-)

We both compared our machines multiple times.
They are 100% identical, including the IRQ sharing.
No idea why it has affected me only twice in 2 years and him much more often.

Even funnier: of the two times it happened to me, the second was exactly 24 hours after the first https://bugzilla.redhat.com/show_bug.cgi?id=1013054#c3

I really don't get it :-(

Comment 15 John Greene 2014-01-30 13:48:48 UTC

(In reply to Harald Reindl from comment #14)
> My colleague doesn't want to replace his WLAN AP with USB3 :-)
> 
> We both compared our machines multiple times.
> They are 100% identical, including the IRQ sharing.
> No idea why it has affected me only twice in 2 years and him much more often.
> 
> Even funnier: of the two times it happened to me, the second was exactly
> 24 hours after the first
> https://bugzilla.redhat.com/show_bug.cgi?id=1013054#c3
> 
> I really don't get it :-(

Perhaps the applications you run differ from what he runs, exposing the problem with different frequency: video demand, disk access on eSATA.  It may take a specific sequence to expose the timing window.  Do you both run a similar application load?

Comment 16 Harald Reindl 2014-01-30 14:32:55 UTC

His workload is more Eclipse, and mine is more VMware machines with a lot of I/O.
My machine has existed much longer, and because the hostapd WLAN AP worked so well we ordered exactly the same hardware again, with a nearly identical config (LAN/WAN/bridges/routing/VPN).

I fear there is not much more to debug, given that for him it sometimes takes 8 days and sometimes happens daily :-(

Comment 17 John Greene 2014-02-04 19:10:37 UTC

Nor do I see much more to debug at this point, at least not from my side.  Perhaps the hardware is bad or just different.  Closing this; please feel free to reopen if you get more information.