Description of problem:
A KVM-based guest VM halts after a large rsync transfer.

Version-Release number of selected component (if applicable):

How reproducible:
Just rsync more than 15G of data from any live host.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I am using FC11 64-bit on the host and the same OS on the KVM-based guest machine.
I am using TAP-based routed networking on the host.
I am using the virtio network driver for my guest machine.

When I transfer a large amount of data via rsync from another live host, my guest machine halts with the following errors in the log. I need to restart the network service to bring the guest VM back.

With 512M RAM for the guest VM, it halts after a 5G data transfer via rsync.
With 1G RAM for the guest VM, it halts if I try to transfer 15G of data via rsync.

Aug 28 12:43:08 phili-p kernel: swapper: page allocation failure. order:0, mode:0x20
Aug 28 12:43:08 phili-p kernel: Pid: 0, comm: swapper Not tainted 2.6.29.6-217.2.8.fc11.x86_64 #1
Aug 28 12:43:08 phili-p kernel: Call Trace:
Aug 28 12:43:08 phili-p kernel: <IRQ> [<ffffffff810a50c8>] __alloc_pages_internal+0x40d/0x429
Aug 28 12:43:08 phili-p kernel: [<ffffffff810c7311>] alloc_pages_current+0xb7/0xc0
Aug 28 12:43:08 phili-p kernel: [<ffffffffa0057913>] try_fill_recv+0xa8/0x18d [virtio_net]
Aug 28 12:43:08 phili-p kernel: [<ffffffffa00584a7>] virtnet_poll+0x518/0x57d [virtio_net]
Aug 28 12:43:08 phili-p kernel: [<ffffffff8130eccd>] net_rx_action+0xb7/0x1b1
Aug 28 12:43:08 phili-p kernel: [<ffffffff8104df7f>] __do_softirq+0x94/0x155
Aug 28 12:43:08 phili-p kernel: [<ffffffff8101274c>] call_softirq+0x1c/0x30
Aug 28 12:43:08 phili-p kernel: [<ffffffff810138ce>] do_softirq+0x52/0xb9
Aug 28 12:43:08 phili-p kernel: [<ffffffff8104dba2>] irq_exit+0x53/0x90
Aug 28 12:43:08 phili-p kernel: [<ffffffff81013bf7>] do_IRQ+0x12c/0x151
Aug 28 12:43:08 phili-p kernel: [<ffffffff81011e93>] ret_from_intr+0x0/0x2e
Aug 28 12:43:08 phili-p kernel: <EOI> [<ffffffff8102942c>] ? native_safe_halt+0xb/0xd
Aug 28 12:43:08 phili-p kernel: [<ffffffff81017d30>] ? default_idle+0x51/0x7c
Aug 28 12:43:08 phili-p kernel: [<ffffffff813af519>] ? atomic_notifier_call_chain+0x13/0x15
Aug 28 12:43:08 phili-p kernel: [<ffffffff81010237>] ? enter_idle+0x27/0x29
Aug 28 12:43:08 phili-p kernel: [<ffffffff810102a1>] ? cpu_idle+0x68/0xb3
Aug 28 12:43:08 phili-p kernel: [<ffffffff81398367>] ? rest_init+0x6b/0x6d
Aug 28 12:43:08 phili-p kernel: Mem-Info:
Aug 28 12:43:08 phili-p kernel: Node 0 DMA per-cpu:
Aug 28 12:43:08 phili-p kernel: CPU 0: hi: 0, btch: 1 usd: 0
Aug 28 12:43:08 phili-p kernel: Node 0 DMA32 per-cpu:
Aug 28 12:43:08 phili-p kernel: CPU 0: hi: 186, btch: 31 usd: 191
Aug 28 12:43:08 phili-p kernel: Active_anon:5593 active_file:26509 inactive_anon:7932
Aug 28 12:43:08 phili-p kernel: inactive_file:177229 unevictable:0 dirty:16171 writeback:52 unstable:0
Aug 28 12:43:08 phili-p kernel: free:1334 slab:10907 mapped:3728 pagetables:1340 bounce:0
Aug 28 12:43:08 phili-p kernel: Node 0 DMA free:3892kB min:24kB low:28kB high:36kB active_anon:0kB inactive_anon:0kB active_file:24kB inactive_file:4024kB unevictable:0kB present:6368kB pages_scanned:0 all_unreclaimable? no
Aug 28 12:43:08 phili-p kernel: lowmem_reserve[]: 0 970 970 970
Aug 28 12:43:08 phili-p kernel: Node 0 DMA32 free:1444kB min:3972kB low:4964kB high:5956kB active_anon:22372kB inactive_anon:31728kB active_file:106012kB inactive_file:704892kB unevictable:0kB present:993776kB pages_scanned:0 all_unreclaimable? no
Aug 28 12:43:08 phili-p kernel: lowmem_reserve[]: 0 0 0 0
Aug 28 12:43:08 phili-p kernel: Node 0 DMA: 1*4kB 0*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3892kB
Aug 28 12:43:08 phili-p kernel: Node 0 DMA32: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1428kB
Aug 28 12:43:08 phili-p kernel: 203802 total pagecache pages
Aug 28 12:43:08 phili-p kernel: 0 pages in swap cache
Aug 28 12:43:08 phili-p kernel: Swap cache stats: add 0, delete 0, find 0/0
Aug 28 12:43:08 phili-p kernel: Free swap = 2096440kB
Aug 28 12:43:08 phili-p kernel: Total swap = 2096440kB
Aug 28 12:43:08 phili-p kernel: 255984 pages RAM
Aug 28 12:43:08 phili-p kernel: 8458 pages reserved
Aug 28 12:43:08 phili-p kernel: 214776 pages shared
Aug 28 12:43:08 phili-p kernel: 43672 pages non-shared
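A quick way to read the Mem-Info dump above is to compare each zone's free memory with its min watermark. The sketch below is only an illustration: the function name zones_below_min and the abridged LOG sample are made up for this example, and the zone lines are copied from the log with the trailing fields dropped. Note that atomic allocations (mode:0x20 is, as far as I know, GFP_ATOMIC on kernels of this era, which is an assumption and not something stated in this report) may dip somewhat below min, so "free below min" is an indicator of memory pressure rather than the exact failure condition.

```python
import re

# Zone lines copied (abridged) from the Mem-Info dump in this report.
LOG = """\
Node 0 DMA free:3892kB min:24kB low:28kB high:36kB
Node 0 DMA32 free:1444kB min:3972kB low:4964kB high:5956kB
"""

# Matches the per-zone summary lines; per-cpu and buddy-list lines lack "free:".
ZONE_RE = re.compile(
    r"Node (?P<node>\d+) (?P<zone>\w+) free:(?P<free>\d+)kB "
    r"min:(?P<min>\d+)kB low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def zones_below_min(text):
    """Return (zone, free_kB, min_kB) for each zone whose free memory is below min."""
    hits = []
    for m in ZONE_RE.finditer(text):
        free_kb, min_kb = int(m["free"]), int(m["min"])
        if free_kb < min_kb:
            hits.append((m["zone"], free_kb, min_kb))
    return hits


if __name__ == "__main__":
    for zone, free_kb, min_kb in zones_below_min(LOG):
        print(f"{zone}: free {free_kb}kB is below min {min_kb}kB")
```

On this sample only DMA32 is flagged: the guest has almost all of its RAM tied up in page cache (203802 pagecache pages out of 255984 pages of RAM) while free memory sits well under the watermark.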
I got the same error with the new kernel and on a different VM. The problem always occurs while doing rsync, even when the amount of data transferred is not very large.

I am using the following CPU. Please let me know if you need more details.

vendor_id : AuthenticAMD
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 5200+

Regards
Bug #480822 was a somewhat similar-looking bug which was fixed before 2.6.29 was released.

This thread:
http://www.mail-archive.com/kvm@vger.kernel.org/msg14432.html
suggests it may be due to OOM.

This looks like it might fix it:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3161e453e4
For testing purposes I removed the virtio driver from my VM configuration and then ran rsync from 3 different locations holding 4G, 4G and 7G of data simultaneously. In the end I got the same error with the standard driver.

---------------------------------------------
Sep 7 18:36:52 phili-p kernel: swapper: page allocation failure. order:0, mode:0x4020
Sep 7 18:36:52 phili-p kernel: Pid: 0, comm: swapper Not tainted 2.6.29.6-217.2.16.fc11.x86_64 #1
Sep 7 18:36:52 phili-p kernel: Call Trace:
Sep 7 18:36:52 phili-p kernel: <IRQ> [<ffffffff810a50e5>] __alloc_pages_internal+0x40d/0x42c
Sep 7 18:36:52 phili-p kernel: [<ffffffff810c732d>] alloc_pages_current+0xb7/0xc0
Sep 7 18:36:52 phili-p kernel: [<ffffffff810cc644>] alloc_slab_page+0x22/0x2f
Sep 7 18:36:52 phili-p kernel: [<ffffffff810cc6ba>] new_slab+0x69/0x1cd
Sep 7 18:36:52 phili-p kernel: [<ffffffff810ccd5c>] __slab_alloc+0x207/0x394
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130acbe>] ? __netdev_alloc_skb+0x34/0x50
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130acbe>] ? __netdev_alloc_skb+0x34/0x50
Sep 7 18:36:52 phili-p kernel: [<ffffffff810cdf2a>] __kmalloc_node_track_caller+0xb7/0x127
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130a0d6>] __alloc_skb+0x80/0x14d
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130acbe>] __netdev_alloc_skb+0x34/0x50
Sep 7 18:36:52 phili-p kernel: [<ffffffffa0042444>] cp_rx_poll+0x121/0x31c [8139cp]
Sep 7 18:36:52 phili-p kernel: [<ffffffff8105f99c>] ? ktime_get_ts+0x4e/0x53
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130ecc9>] net_rx_action+0xb7/0x1b1
Sep 7 18:36:52 phili-p kernel: [<ffffffff8130e8ee>] ? list_add_tail+0x15/0x17
Sep 7 18:36:52 phili-p kernel: [<ffffffff8104df87>] __do_softirq+0x94/0x155
Sep 7 18:36:52 phili-p kernel: [<ffffffff8101274c>] call_softirq+0x1c/0x30
Sep 7 18:36:52 phili-p kernel: [<ffffffff810138ce>] do_softirq+0x52/0xb9
Sep 7 18:36:52 phili-p kernel: [<ffffffff8104dbaa>] irq_exit+0x53/0x90
Sep 7 18:36:52 phili-p kernel: [<ffffffff81013bf7>] do_IRQ+0x12c/0x151
Sep 7 18:36:52 phili-p kernel: [<ffffffff81011e93>] ret_from_intr+0x0/0x2e
Sep 7 18:36:52 phili-p kernel: <EOI> [<ffffffff8102942c>] ? native_safe_halt+0xb/0xd
Sep 7 18:36:52 phili-p kernel: [<ffffffff81017d30>] ? default_idle+0x51/0x7c
Sep 7 18:36:52 phili-p kernel: [<ffffffff813af529>] ? atomic_notifier_call_chain+0x13/0x15
Sep 7 18:36:52 phili-p kernel: [<ffffffff81010237>] ? enter_idle+0x27/0x29
Sep 7 18:36:52 phili-p kernel: [<ffffffff810102a1>] ? cpu_idle+0x68/0xb3
Sep 7 18:36:52 phili-p kernel: [<ffffffff81398377>] ? rest_init+0x6b/0x6d
Sep 7 18:36:52 phili-p kernel: Mem-Info:
Sep 7 18:36:52 phili-p kernel: Node 0 DMA per-cpu:
Sep 7 18:36:52 phili-p kernel: CPU 0: hi: 0, btch: 1 usd: 0
Sep 7 18:36:52 phili-p kernel: Node 0 DMA32 per-cpu:
Sep 7 18:36:52 phili-p kernel: CPU 0: hi: 186, btch: 31 usd: 130
Sep 7 18:36:52 phili-p kernel: Active_anon:4721 active_file:11992 inactive_anon:7552
Sep 7 18:36:52 phili-p kernel: inactive_file:73300 unevictable:0 dirty:9737 writeback:13 unstable:0
Sep 7 18:36:52 phili-p kernel: free:724 slab:8329 mapped:3790 pagetables:1224 bounce:0
Sep 7 18:36:52 phili-p kernel: Node 0 DMA free:1924kB min:36kB low:44kB high:52kB active_anon:0kB inactive_anon:0kB active_file:376kB inactive_file:5536kB unevictable:0kB present:6368kB pages_scanned:0 all_unreclaimable? no
Sep 7 18:36:52 phili-p kernel: lowmem_reserve[]: 0 477 477 477
Sep 7 18:36:52 phili-p kernel: Node 0 DMA32 free:972kB min:2772kB low:3464kB high:4156kB active_anon:18884kB inactive_anon:30208kB active_file:47592kB inactive_file:287664kB unevictable:0kB present:488776kB pages_scanned:0 all_unreclaimable? no
Sep 7 18:36:52 phili-p kernel: lowmem_reserve[]: 0 0 0 0
Sep 7 18:36:52 phili-p kernel: Node 0 DMA: 1*4kB 0*8kB 2*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1924kB
Sep 7 18:36:52 phili-p kernel: Node 0 DMA32: 174*4kB 5*8kB 6*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 960kB
Sep 7 18:36:52 phili-p kernel: 85350 total pagecache pages
Sep 7 18:36:52 phili-p kernel: 0 pages in swap cache
Sep 7 18:36:52 phili-p kernel: Swap cache stats: add 0, delete 0, find 0/0
Sep 7 18:36:52 phili-p kernel: Free swap = 2096440kB
Sep 7 18:36:52 phili-p kernel: Total swap = 2096440kB
Sep 7 18:36:52 phili-p kernel: 127984 pages RAM
Sep 7 18:36:52 phili-p kernel: 5634 pages reserved
Sep 7 18:36:52 phili-p kernel: 94075 pages shared
Sep 7 18:36:52 phili-p kernel: 37370 pages non-shared
------------------------------------------------------

Would you still say it is a virtio_net failure?

Regards,
Mateen
One more thing I forgot to mention: my machine never goes to swap.

Regards,
Mateen
Sorry, I forgot to mention another important point. With the standard driver the VM keeps working and does not halt; it just logs the above-mentioned errors again and again, almost 50 times, until rsync finishes.

--------------
grep 'swapper: page allocation failure' /var/log/messages |wc -l
50
--------------

The fix suggested above is related to the following error:
"[929492.154634] pdflush: page allocation failure. order:0, mode:0x20"

while I am getting a different error:
"phili-p kernel: swapper: page allocation failure. order:0"

I think it is purely a kernel issue.

Regards,
Mateen
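Since the process name in these messages varies (swapper here, pdflush in the linked thread), it can help to count failures per process and per mode instead of grepping for one name. The following is a minimal sketch under that assumption; count_failures and SAMPLE are hypothetical names for this example, and the sample lines are copied from the logs in this report.

```python
import re
from collections import Counter

# Captures the process name and the allocation mode from a syslog line.
FAIL_RE = re.compile(
    r"kernel: (?P<comm>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def count_failures(lines):
    """Count page allocation failures, keyed by (process name, mode)."""
    counts = Counter()
    for line in lines:
        m = FAIL_RE.search(line)
        if m:
            counts[(m["comm"], m["mode"])] += 1
    return counts


# Lines copied from the logs posted in this bug.
SAMPLE = [
    "Aug 28 12:43:08 phili-p kernel: swapper: page allocation failure. order:0, mode:0x20",
    "Sep 7 18:36:52 phili-p kernel: swapper: page allocation failure. order:0, mode:0x4020",
    "Sep 7 18:36:52 phili-p kernel: Call Trace:",
]

if __name__ == "__main__":
    for (comm, mode), n in count_failures(SAMPLE).items():
        print(f"{comm} mode={mode}: {n}")
```

To run it against the real log, pass an open file instead of SAMPLE, for example count_failures(open("/var/log/messages")).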
Any update on this issue?

For your information, I got the same errors on FC10-based VMs as well. I have opened a new bug for FC10. See https://bugzilla.redhat.com/show_bug.cgi?id=523299
Mohammad: could you try and reproduce with the 2.6.30 kernel in F11 updates? Thanks
Here is what I am getting with the new kernel and the Fedora default network driver with TAP networking:

Sep 24 12:09:58 phili-p kernel: kjournald: page allocation failure. order:0, mode:0x4020
Sep 24 12:09:58 phili-p kernel: Pid: 814, comm: kjournald Not tainted 2.6.30.5-43.fc11.x86_64 #1
Sep 24 12:09:58 phili-p kernel: Call Trace:
Sep 24 12:09:58 phili-p kernel: <IRQ> [<ffffffff810da7f0>] __alloc_pages_internal+0x434/0x468
Sep 24 12:09:58 phili-p kernel: [<ffffffff81101ec2>] alloc_pages_current+0xca/0xe9
Sep 24 12:09:58 phili-p kernel: [<ffffffff81108cdf>] alloc_slab_page+0x35/0x56
Sep 24 12:09:58 phili-p kernel: [<ffffffff81108d78>] new_slab+0x78/0x1ee
Sep 24 12:09:58 phili-p kernel: [<ffffffff81109409>] __slab_alloc+0x21a/0x3ba
Sep 24 12:09:58 phili-p kernel: [<ffffffff813da548>] ? __netdev_alloc_skb+0x43/0x76
Sep 24 12:09:58 phili-p kernel: [<ffffffff8110a2e1>] __kmalloc_node_track_caller+0xd1/0x16f
Sep 24 12:09:58 phili-p kernel: [<ffffffff813da548>] ? __netdev_alloc_skb+0x43/0x76
Sep 24 12:09:58 phili-p kernel: [<ffffffff813d97f7>] __alloc_skb+0x8f/0x17d
Sep 24 12:09:58 phili-p kernel: [<ffffffff813da548>] __netdev_alloc_skb+0x43/0x76
Sep 24 12:09:58 phili-p kernel: [<ffffffffa0062c7b>] cp_rx_poll+0x133/0x348 [8139cp]
Sep 24 12:09:58 phili-p kernel: [<ffffffff813e2591>] net_rx_action+0xc3/0x1df
Sep 24 12:09:58 phili-p kernel: [<ffffffff812599eb>] ? irq_2_iommu+0x21/0x68
Sep 24 12:09:58 phili-p kernel: [<ffffffff8105db54>] __do_softirq+0xd2/0x1d2
Sep 24 12:09:58 phili-p kernel: [<ffffffff81259be1>] ? irq_remapped+0x21/0x40
Sep 24 12:09:58 phili-p kernel: [<ffffffff8102aaad>] ? ack_apic_level+0x5b/0x115
Sep 24 12:09:58 phili-p kernel: [<ffffffff8101323c>] call_softirq+0x1c/0x30
Sep 24 12:09:58 phili-p kernel: [<ffffffff81014ba3>] do_softirq+0x5f/0xd7
Sep 24 12:09:58 phili-p kernel: [<ffffffff8105d633>] irq_exit+0x66/0xb7
Sep 24 12:09:58 phili-p kernel: [<ffffffff814986d8>] ? trace_hardirqs_off_thunk+0x3a/0x6c
Sep 24 12:09:58 phili-p kernel: [<ffffffff810143f8>] do_IRQ+0xc1/0xee
Sep 24 12:09:58 phili-p kernel: [<ffffffff81012a93>] ret_from_intr+0x0/0x16
Sep 24 12:09:58 phili-p kernel: <EOI> [<ffffffff811bbbcc>] ? journal_remove_journal_head+0x54/0x55
Sep 24 12:09:58 phili-p kernel: [<ffffffff811b8e5b>] ? journal_commit_transaction+0x57d/0xe42
Sep 24 12:09:58 phili-p kernel: [<ffffffff81498fef>] ? _spin_lock_irqsave+0x38/0x6a
Sep 24 12:09:58 phili-p kernel: [<ffffffff81063218>] ? try_to_del_timer_sync+0x69/0x87
Sep 24 12:09:58 phili-p kernel: [<ffffffff811bccdf>] ? kjournald+0xfd/0x253
Sep 24 12:09:58 phili-p kernel: [<ffffffff81070b57>] ? autoremove_wake_function+0x0/0x5f
Sep 24 12:09:58 phili-p kernel: [<ffffffff811bcbe2>] ? kjournald+0x0/0x253
Sep 24 12:09:58 phili-p kernel: [<ffffffff81070665>] ? kthread+0x6d/0xae
Sep 24 12:09:58 phili-p kernel: [<ffffffff8101313a>] ? child_rip+0xa/0x20
Sep 24 12:09:58 phili-p kernel: [<ffffffff81012afd>] ? restore_args+0x0/0x30
Sep 24 12:09:58 phili-p kernel: [<ffffffff810705f8>] ? kthread+0x0/0xae
Sep 24 12:09:58 phili-p kernel: [<ffffffff81013130>] ? child_rip+0x0/0x20
Sep 24 12:09:58 phili-p kernel: Mem-Info:
Sep 24 12:09:58 phili-p kernel: Node 0 DMA per-cpu:
Sep 24 12:09:58 phili-p kernel: CPU 0: hi: 0, btch: 1 usd: 0
Sep 24 12:09:58 phili-p kernel: Node 0 DMA32 per-cpu:
Sep 24 12:09:58 phili-p kernel: CPU 0: hi: 186, btch: 31 usd: 198
Sep 24 12:09:58 phili-p kernel: Active_anon:1382 active_file:10532 inactive_anon:11842
Sep 24 12:09:58 phili-p kernel: inactive_file:85709 unevictable:0 dirty:11820 writeback:381 unstable:0
Sep 24 12:09:58 phili-p kernel: free:724 slab:8511 mapped:2198 pagetables:1185 bounce:0
Sep 24 12:09:58 phili-p kernel: Node 0 DMA free:1920kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:1176kB inactive_file:3956kB unevictable:0kB present:5324kB pages_scanned:0 all_unreclaimable? no
Sep 24 12:09:58 phili-p kernel: lowmem_reserve[]: 0 477 477 477
Sep 24 12:09:58 phili-p kernel: Node 0 DMA32 free:976kB min:2776kB low:3468kB high:4164kB active_anon:5528kB inactive_anon:47368kB active_file:40952kB inactive_file:338880kB unevictable:0kB present:488776kB pages_scanned:0 all_unreclaimable? no
Sep 24 12:09:58 phili-p kernel: lowmem_reserve[]: 0 0 0 0
Sep 24 12:09:58 phili-p kernel: Node 0 DMA: 16*4kB 8*8kB 10*16kB 5*32kB 5*64kB 7*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1920kB
Sep 24 12:09:58 phili-p kernel: Node 0 DMA32: 109*4kB 15*8kB 5*16kB 0*32kB 1*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 956kB
Sep 24 12:09:58 phili-p kernel: 96396 total pagecache pages
Sep 24 12:09:58 phili-p kernel: 104 pages in swap cache
Sep 24 12:09:58 phili-p kernel: Swap cache stats: add 669, delete 565, find 7234/7251
Sep 24 12:09:58 phili-p kernel: Free swap = 1042324kB
Sep 24 12:09:58 phili-p kernel: Total swap = 1044184kB
Sep 24 12:09:58 phili-p kernel: 127984 pages RAM
Sep 24 12:09:58 phili-p kernel: 6097 pages reserved
Sep 24 12:09:58 phili-p kernel: 102917 pages shared
Sep 24 12:09:58 phili-p kernel: 27165 pages non-shared

Please let me know if you need anything else.
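The per-order free lists in the dump (the "N*SkB" terms) can be summed to cross-check the reported total and to see which block sizes are still available. This is a rough sketch only; buddy_total_kb, largest_free_block_kb and the DMA32 constant are names invented for this example, and the input line is copied from the 2.6.30.5 log with the timestamp prefix removed.

```python
import re

# Each term looks like "109*4kB": <free block count>*<block size in kB>.
BUDDY_RE = re.compile(r"(\d+)\*(\d+)kB")


def buddy_total_kb(line):
    """Sum count*size over every 'N*SkB' term to the left of the '=' sign."""
    terms = line.split("=")[0]
    return sum(int(n) * int(size) for n, size in BUDDY_RE.findall(terms))


def largest_free_block_kb(line):
    """Return the largest block size that has at least one free block, or 0."""
    terms = line.split("=")[0]
    sizes = [int(size) for n, size in BUDDY_RE.findall(terms) if int(n) > 0]
    return max(sizes, default=0)


# DMA32 free-list line from the 2.6.30.5-43.fc11 log in this report.
DMA32 = (
    "Node 0 DMA32: 109*4kB 15*8kB 5*16kB 0*32kB 1*64kB 0*128kB "
    "1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 956kB"
)

if __name__ == "__main__":
    print(buddy_total_kb(DMA32), "kB free in total")
    print(largest_free_block_kb(DMA32), "kB largest free block")
```

The sum matches the 956kB printed by the kernel, and there are still 109 free 4kB blocks, so the order:0 failure here does not look like a fragmentation problem; it looks more like the zone simply being too far below its watermarks for an allocation made in interrupt context. That reading is my interpretation of the numbers, not something confirmed in this thread.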
Created attachment 363335
backport of 2.6.31 virtio_net patch

Okay, I think the best thing is to try this virtio_net patch from 2.6.31.

It certainly looks like it should fix the issue.

Justin, could you apply this? I've backported it, but haven't even tried compiling it.
Tested and applied, it should appear in the next F-11 update kernel.
kernel-2.6.30.9-90.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/kernel-2.6.30.9-90.fc11
kernel-2.6.30.9-90.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.