Created attachment 510566 [details]
XML file used to define the guest, and the /var/log/messages log

Description of problem:
Starting 512 guests in a loop intermittently triggers kernel call trace errors, and the start process then stops.

Version-Release number of selected component (if applicable):
libvirt-0.9.2-1.el6.x86_64
qemu-kvm-0.12.1.2-2.165.el6.x86_64
kernel-2.6.32-161.el6.x86_64

How reproducible:
Intermittent; it happens at some point during the start loop.

Steps to Reproduce:
1. Create 512 guests via libvirt
2. service cgconfig stop
3. for i in {1..512} ; do virsh start guest$i ; done

Actual results:
Domain test-kvm-rhel6-x86_64427 started
Domain test-kvm-rhel6-x86_64428 started
Domain test-kvm-rhel6-x86_64429 started

Message from syslogd@intel-e7450-512-1 at Jun 30 05:04:06 ...
 kernel:Stack:
Message from syslogd@intel-e7450-512-1 at Jun 30 05:04:06 ...
 kernel:Call Trace:
Message from syslogd@intel-e7450-512-1 at Jun 30 05:04:06 ...
 kernel: <IRQ>
Message from syslogd@intel-e7450-512-1 at Jun 30 05:04:06 ...
 kernel: <EOI>
Message from syslogd@intel-e7450-512-1 at Jun 30 05:04:06 ...
 kernel:Code: 00 00 00 01 74 05 e8 79 08 d9 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa 66 ff 02 66 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 f0 ff 07

Expected results:
All guests start successfully without kernel errors.

Additional info:
Guest OS: RHEL6.0 (released), minimal installation
RHEL6_X86_64=http://download.englab.nay.redhat.com/pub/rhel/released/RHEL-6/6.0/Server/x86_64/os/
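The start loop in step 3 keeps going even if a guest fails to start, which makes it hard to see where the run stalled. A minimal sketch of a variant that stops at the first failure is shown here. The function name `start_guests` and the `VIRSH` override variable are hypothetical helpers introduced for illustration (they are not part of libvirt); only `virsh start <domain>` comes from the report.

```shell
#!/bin/sh
# Sketch only: start guests <prefix>1 .. <prefix>N one at a time and stop at
# the first failure. VIRSH is a hypothetical override (not a libvirt feature)
# so the loop can be dry-run, for example with VIRSH=echo.
start_guests() {
    count=$1
    prefix=${2:-guest}
    i=1
    while [ "$i" -le "$count" ]; do
        # Run "virsh start <name>" (or the override) and bail out on error.
        if ! ${VIRSH:-virsh} start "${prefix}${i}"; then
            echo "start failed at ${prefix}${i}" >&2
            return 1
        fi
        i=$((i + 1))
    done
}
```

Under these assumptions, `start_guests 512` would mirror the loop from step 3, and `start_guests 512 test-kvm-rhel6-x86_64` would match the domain names shown in the actual results.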
Seems like a non virt kernel bug: dm_mod [last unloaded: nf_conntrack] Jun 29 02:41:28 intel-e7450-512-1 kernel: CPU 51 Jun 29 02:41:28 intel-e7450-512-1 kernel: Modules linked in: ext3 jbd scsi_dh_rdac dm_round_robin dm_multipath ipt_MASQUERADE xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables ip6table_filter ip6_tables ebtable_nat ebtables ipt_REJECT xt_CHECKSUM bridge stp llc sunrpc ipv6 dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm_intel kvm microcode bnx2 ibmaem ipmi_msghandler sg iTCO_wdt iTCO_vendor_support shpchp ext4 mbcache jbd2 sd_mod crc_t10dif usb_storage qla2xxx lpfc scsi_transport_fc scsi_tgt megaraid_sas sr_mod cdrom ata_generic pata_acpi ata_piix radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod [last unloaded: nf_conntrack] Jun 29 02:41:28 intel-e7450-512-1 kernel: Jun 29 02:41:28 intel-e7450-512-1 kernel: Pid: 207, comm: migration/51 Not tainted 2.6.32-161.el6.x86_64 #1 IBM IBM 3850 M2 / x3950 M2 -[72335SC]-/Node1 Processor Card Jun 29 02:41:28 intel-e7450-512-1 kernel: RIP: 0010:[<ffffffff814df657>] [<ffffffff814df657>] _spin_unlock_irqrestore+0x17/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: RSP: 0018:ffff888100e63e20 EFLAGS: 00000246 Jun 29 02:41:28 intel-e7450-512-1 kernel: RAX: 0000000000000004 RBX: ffff888100e63e20 RCX: 0000000000000000 Jun 29 02:41:28 intel-e7450-512-1 kernel: RDX: ffff887f85ce4350 RSI: 0000000000000246 RDI: 0000000000000246 Jun 29 02:41:28 intel-e7450-512-1 kernel: RBP: ffffffff8100bbd3 R08: 0000000000000018 R09: 0000000000000000 Jun 29 02:41:28 intel-e7450-512-1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100e63da0 Jun 29 02:41:28 intel-e7450-512-1 kernel: R13: 0000000000000000 R14: ffff887f85ce4988 R15: ffffffff814e513b Jun 29 02:41:28 intel-e7450-512-1 kernel: FS: 0000000000000000(0000) GS:ffff888100e60000(0000) knlGS:0000000000000000 Jun 29 02:41:28 intel-e7450-512-1 kernel: CS: 0010 DS: 0018 ES: 
0018 CR0: 000000008005003b Jun 29 02:41:28 intel-e7450-512-1 kernel: CR2: 00007fff907f7b48 CR3: 000000ca14a88000 CR4: 00000000000026e0 Jun 29 02:41:28 intel-e7450-512-1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jun 29 02:41:28 intel-e7450-512-1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jun 29 02:41:28 intel-e7450-512-1 kernel: Process migration/51 (pid: 207, threadinfo ffff883f9727c000, task ffff883f97278a80) Jun 29 02:41:28 intel-e7450-512-1 kernel: Stack: Jun 29 02:41:28 intel-e7450-512-1 kernel: ffff888100e63e50 ffffffff8124b193 ffff886728cc2690 ffff887f796a2c00 Jun 29 02:41:28 intel-e7450-512-1 kernel: <0> 0000000000000000 ffff887f85ce4988 ffff888100e63e60 ffffffff8124b1cf Jun 29 02:41:28 intel-e7450-512-1 kernel: <0> ffff888100e63ea0 ffffffffa00023fc ffff883f9727dfd8 ffff888100e63eb0 Jun 29 02:41:28 intel-e7450-512-1 kernel: Call Trace: Jun 29 02:41:28 intel-e7450-512-1 kernel: <IRQ> Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8124b193>] ? blk_end_bidi_request+0x63/0x80 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8124b1cf>] ? blk_end_request_all+0x1f/0x40 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffffa00023fc>] ? dm_softirq_done+0xcc/0x130 [dm_mod] Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff81250615>] ? blk_done_softirq+0x85/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8106fc71>] ? __do_softirq+0xc1/0x1d0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff810932a5>] ? hrtimer_interrupt+0x1b5/0x250 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c20c>] ? call_softirq+0x1c/0x30 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100de45>] ? do_softirq+0x65/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8106fa55>] ? irq_exit+0x85/0x90 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff814e5140>] ? smp_apic_timer_interrupt+0x70/0x9b Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100bbd3>] ? 
apic_timer_interrupt+0x13/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: <EOI> Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff81060410>] ? migration_thread+0x260/0x2e0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff810601b0>] ? migration_thread+0x0/0x2e0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8108e386>] ? kthread+0x96/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c10a>] ? child_rip+0xa/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8108e2f0>] ? kthread+0x0/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c100>] ? child_rip+0x0/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: Code: 00 00 00 01 74 05 e8 79 08 d9 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa 66 ff 02 66 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 f0 ff 07 Jun 29 02:41:28 intel-e7450-512-1 kernel: Call Trace: Jun 29 02:41:28 intel-e7450-512-1 kernel: <IRQ> [<ffffffff8124b193>] ? blk_end_bidi_request+0x63/0x80 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8124b1cf>] ? blk_end_request_all+0x1f/0x40 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffffa00023fc>] ? dm_softirq_done+0xcc/0x130 [dm_mod] Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff81250615>] ? blk_done_softirq+0x85/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8106fc71>] ? __do_softirq+0xc1/0x1d0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff810932a5>] ? hrtimer_interrupt+0x1b5/0x250 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c20c>] ? call_softirq+0x1c/0x30 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100de45>] ? do_softirq+0x65/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8106fa55>] ? irq_exit+0x85/0x90 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff814e5140>] ? smp_apic_timer_interrupt+0x70/0x9b Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100bbd3>] ? apic_timer_interrupt+0x13/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: <EOI> [<ffffffff81060410>] ? 
migration_thread+0x260/0x2e0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff810601b0>] ? migration_thread+0x0/0x2e0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8108e386>] ? kthread+0x96/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c10a>] ? child_rip+0xa/0x20 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8108e2f0>] ? kthread+0x0/0xa0 Jun 29 02:41:28 intel-e7450-512-1 kernel: [<ffffffff8100c100>] ? child_rip+0x0/0x20 Jun 29 02:41:35 intel-e7450-512-1 kernel: [drm:radeon_dvi_detect] *ERROR* DVI-D-1: probed a monitor but no|invalid EDID Jun 29 02:42:11 intel-e7450-512-1 abrt: Kerneloops: Reported 1 kernel oopses to Abrt Jun 29 02:42:11 intel-e7450-512-1 abrtd: Directory 'kerneloops-1309286531-5694-1' creation detected Jun 29 02:42:11 intel-e7450-512-1 abrtd: Crash is in database already (dup of /var/spool/abrt/kerneloops-1309248540-5694-1) Jun 29 02:42:11 intel-e7450-512-1 abrtd: Deleting crash kerneloops-1309286531-5694-1 (dup of kerneloops-1309248540-5694-1), sending dbus signal Jun 29 02:48:33 intel-e7450-512-1 kernel: kvm: 61754: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0 Jun 29 02:48:33 intel-e7450-512-1 kernel: kvm: 61754: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079 Jun 29 02:48:33 intel-e7450-512-1 kernel: kvm: 61754: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdb6570 Jun 29 02:48:33 intel-e7450-512-1 kernel: kvm: 61754: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079 Jun 29 02:51:30 intel-e7450-512-1 kernel: [drm:radeon_dvi_detect] *ERROR* DVI-D-1: probed a monitor but no|invalid EDID Jun 29 02:52:29 intel-e7450-512-1 kernel: BUG: soft lockup - CPU#75 stuck for 67s! 
[migration/75:303] Jun 29 02:52:30 intel-e7450-512-1 kernel: Modules linked in: ext3 jbd scsi_dh_rdac dm_round_robin dm_multipath ipt_MASQUERADE xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables ip6table_filter ip6_tables ebtable_nat ebtables ipt_REJECT xt_CHECKSUM bridge stp llc sunrpc ipv6 dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm_intel kvm microcode bnx2 ibmaem ipmi_msghandler sg iTCO_wdt iTCO_vendor_support shpchp ext4 mbcache jbd2 sd_mod crc_t10dif usb_storage qla2xxx lpfc scsi_transport_fc scsi_tgt megaraid_sas sr_mod cdrom ata_generic pata_acpi ata_piix radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod [last unloaded: nf_conntrack] Jun 29 02:52:30 intel-e7450-512-1 kernel: CPU 75 Jun 29 02:52:30 intel-e7450-512-1 kernel: Modules linked in: ext3 jbd scsi_dh_rdac dm_round_robin dm_multipath ipt_MASQUERADE xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables ip6table_filter ip6_tables ebtable_nat ebtables ipt_REJECT xt_CHECKSUM bridge stp llc sunrpc ipv6 dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm_intel kvm microcode bnx2 ibmaem ipmi_msghandler sg iTCO_wdt iTCO_vendor_support shpchp ext4 mbcache jbd2 sd_mod crc_t10dif usb_storage qla2xxx lpfc scsi_transport_fc scsi_tgt megaraid_sas sr_mod cdrom ata_generic pata_acpi ata_piix radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod [last unloaded: nf_conntrack] Jun 29 02:52:30 intel-e7450-512-1 kernel: Jun 29 02:52:30 intel-e7450-512-1 kernel: Pid: 303, comm: migration/75 Not tainted 2.6.32-161.el6.x86_64 #1 IBM IBM 3850 M2 / x3950 M2 -[72335SC]-/Node1 Processor Card Jun 29 02:52:30 intel-e7450-512-1 kernel: RIP: 0010:[<ffffffff814df657>] [<ffffffff814df657>] _spin_unlock_irqrestore+0x17/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: RSP: 0018:ffff88c0f0e63d60 EFLAGS: 00000286 Jun 29 02:52:30 intel-e7450-512-1 kernel: 
RAX: 0000000000000400 RBX: ffff88c0f0e63d60 RCX: ffff88ccad347200 Jun 29 02:52:30 intel-e7450-512-1 kernel: RDX: ffff88c0f0e75fc0 RSI: 0000000000000286 RDI: 0000000000000286 Jun 29 02:52:30 intel-e7450-512-1 kernel: RBP: ffffffff8100bbd3 R08: 0000000000000001 R09: 0000000000000000 Jun 29 02:52:30 intel-e7450-512-1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88c0f0e63ce0 Jun 29 02:52:30 intel-e7450-512-1 kernel: R13: ffff88ff571bce08 R14: ffff88c0f0e75fc0 R15: ffffffff814e513b Jun 29 02:52:30 intel-e7450-512-1 kernel: FS: 0000000000000000(0000) GS:ffff88c0f0e60000(0000) knlGS:0000000000000000 Jun 29 02:52:30 intel-e7450-512-1 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Jun 29 02:52:30 intel-e7450-512-1 kernel: CR2: 00007fe76466f000 CR3: 0000002c57310000 CR4: 00000000000026e0 Jun 29 02:52:30 intel-e7450-512-1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jun 29 02:52:30 intel-e7450-512-1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jun 29 02:52:30 intel-e7450-512-1 kernel: Process migration/75 (pid: 303, threadinfo ffff883f96c0e000, task ffff883f96c0ab00) Jun 29 02:52:30 intel-e7450-512-1 kernel: Stack: Jun 29 02:52:30 intel-e7450-512-1 kernel: ffff88c0f0e63dc0 ffffffff81057642 ffff88c0f0e63d80 0000000000015fc0 Jun 29 02:52:30 intel-e7450-512-1 kernel: <0> 000000000000004b ffff88c0f0e767f0 0000000000000000 0000000000015fc0 Jun 29 02:52:30 intel-e7450-512-1 kernel: <0> 000000000000004b 0000000000015fc0 0000000000000001 000000011c1d961e Jun 29 02:52:30 intel-e7450-512-1 kernel: Call Trace: Jun 29 02:52:30 intel-e7450-512-1 kernel: <IRQ> Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff81057642>] ? update_shares+0xd2/0x110 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8105d462>] ? rebalance_domains+0x52/0x5b0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8105da0c>] ? run_rebalance_domains+0x4c/0x160 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8106fc71>] ? 
__do_softirq+0xc1/0x1d0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff810932a5>] ? hrtimer_interrupt+0x1b5/0x250 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c20c>] ? call_softirq+0x1c/0x30 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100de45>] ? do_softirq+0x65/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8106fa55>] ? irq_exit+0x85/0x90 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff814e5140>] ? smp_apic_timer_interrupt+0x70/0x9b Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100bbd3>] ? apic_timer_interrupt+0x13/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: <EOI> Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff81060410>] ? migration_thread+0x260/0x2e0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff810601b0>] ? migration_thread+0x0/0x2e0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8108e386>] ? kthread+0x96/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c10a>] ? child_rip+0xa/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8108e2f0>] ? kthread+0x0/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c100>] ? child_rip+0x0/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: Code: 00 00 00 01 74 05 e8 79 08 d9 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa 66 ff 02 66 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 f0 ff 07 Jun 29 02:52:30 intel-e7450-512-1 kernel: Call Trace: Jun 29 02:52:30 intel-e7450-512-1 kernel: <IRQ> [<ffffffff81057642>] ? update_shares+0xd2/0x110 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8105d462>] ? rebalance_domains+0x52/0x5b0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8105da0c>] ? run_rebalance_domains+0x4c/0x160 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8106fc71>] ? __do_softirq+0xc1/0x1d0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff810932a5>] ? hrtimer_interrupt+0x1b5/0x250 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c20c>] ? 
call_softirq+0x1c/0x30 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100de45>] ? do_softirq+0x65/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8106fa55>] ? irq_exit+0x85/0x90 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff814e5140>] ? smp_apic_timer_interrupt+0x70/0x9b Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100bbd3>] ? apic_timer_interrupt+0x13/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: <EOI> [<ffffffff81060410>] ? migration_thread+0x260/0x2e0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff810601b0>] ? migration_thread+0x0/0x2e0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8108e386>] ? kthread+0x96/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c10a>] ? child_rip+0xa/0x20 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8108e2f0>] ? kthread+0x0/0xa0 Jun 29 02:52:30 intel-e7450-512-1 kernel: [<ffffffff8100c100>] ? child_rip+0x0/0x20 Jun 29 02:54:11 intel-e7450-512-1 abrt: Kerneloops: Reported 1 kernel oopses to Abrt Jun 29 02:54:11 intel-e7450-512-1 abrtd: Directory 'kerneloops-1309287251-5694-1' creation detected Jun 29 02:54:11 intel-e7450-512-1 abrtd: Crash is in database already (dup of /var/spool/abrt/kerneloops-1309248540-5694-1) Please do what Don asked.
Tested with:
[host]
libvirt-0.9.4-11.el6.x86_64
qemu-kvm-0.12.1.2-2.185.el6.x86_64
kernel-2.6.32-193.el6.x86_64
[guest]
tree RHEL6.2-20110907.1

Did not encounter this bug with the irqbalance service running.
(In reply to comment #4)
After leaving 328 guests running through the morning, I suddenly saw the same kernel call trace again. That run was with the irqbalance service running, so I'll retest with irqbalance off as soon as possible and post the result here.
(In reply to comment #4)
(In reply to comment #5)
I continued testing with the packages from comment #4, turned off the irqbalance service, and left 428 guests running through the weekend. The bug did not occur.

Summary:
1. With irqbalance on: started 328 guests and left them running through the morning; got kernel call trace errors.
2. With irqbalance off: started 428 guests and left them running for two days over the weekend; did not get any kernel call trace errors.
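When comparing long runs with irqbalance on and off, it can help to count the error signatures in the log rather than watch the console. The following is a minimal sketch, assuming the two strings that appear in the logs quoted in this report ("BUG: soft lockup" and "Call Trace:") are the signatures of interest; the function name `count_kernel_traces` is hypothetical.

```shell
#!/bin/sh
# Sketch only: count lines on stdin that look like the kernel errors quoted in
# this report ("BUG: soft lockup" or "Call Trace:").
count_kernel_traces() {
    # grep -c prints the number of matching lines; it exits 1 when that
    # number is 0, so "|| true" keeps the function usable under "set -e".
    grep -c -E 'BUG: soft lockup|Call Trace:' || true
}
```

A possible use is `service irqbalance status` to record the service state, followed by `count_kernel_traces < /var/log/messages` before and after each run; a nonzero difference would indicate new traces during that run.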
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.
For RHEL7.3:
I cannot reproduce this bug with qemu-kvm-1.5.3-126, qemu-kvm-rhev-2.6.0-27 and kernel 3.10.0-514.el7. I booted 600 RHEL7.3 guests on a single host. All guests work well, and I found no call trace errors in the dmesg log. I also tested memory and CPU overcommit; all guests still work well.

Host info:
MemTotal: 394680668 kB
On-line CPU(s) list: 0-191

Detailed steps:
1. Set up the host configuration
#ulimit -n 4096
#/etc/libvirt/qemu.conf
max_files = 4096
#service libvirtd restart
#echo 655360 >/proc/sys/fs/aio-max-nr

2. Guest XML
<domain type='kvm'>
  <name>g1</name>
  <memory unit='M'>1024</memory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source dev='/home/images/g1.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <controller type='virtio-serial' index='0'>
    </controller>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes'/>
    <video>
      <model type='qxl' ram='65536' vram='65536' heads='1'/>
    </video>
    <memballoon model='none'>
    </memballoon>
  </devices>
</domain>

3. Prepare guest images
for i in `seq 1 600`
do
cp g1.qcow2 images/g$i.qcow2
done

4. Create multiple XML files
for i in `seq 1 600`
do
sed -i "s,#num#,g$i,g" g$i.xml
virsh define g$i.xml
#sleep 1
done

5. Start all guests
for i in `seq 61 600`;do virsh destroy g$i;virsh start g$i;sleep 5;done

6. Check the guest log
#virsh console domain
#dmesg

In addition, QE will test RHEL6.8 with the same steps and will update the result in this bz asap.
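Step 4 edits g$i.xml in place but does not show how those files are created. One possible way, sketched here, is to render each file from a single template. This assumes a template in which the guest name and image path contain the `#num#` placeholder used by the `sed` expression in step 4; the template layout and the function name `render_guest_xml` are assumptions, not taken from the report.

```shell
#!/bin/sh
# Sketch only: render <outdir>/g<index>.xml from a template, replacing every
# "#num#" placeholder with "g<index>" (same substitution as step 4).
# The template file and its placeholder layout are assumptions.
render_guest_xml() {
    template=$1
    index=$2
    outdir=$3
    sed "s,#num#,g${index},g" "$template" > "${outdir}/g${index}.xml"
}
```

With such a template, the loop in step 4 could call `render_guest_xml template.xml "$i" .` before `virsh define g$i.xml`.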
For RHEL6.8:
QE cannot reproduce it either. 600 guests work well on the host (intel-purley-lr-02.khw.lab.eng.bos.redhat.com).
kernel: 2.6.32-642.el6.x86_64
qemu-kvm: qemu-kvm-rhev-0.12.1.2-2.491.el6.x86_64