Red Hat Bugzilla – Bug 1116398
RHEV-H crashes and reboots when ksmd (MOM) is enabled
Last modified: 2015-07-22 04:09:45 EDT
Description of problem:

When KSM is active, hypervisors lock up and occasionally reboot:

Jun 9 17:17:58 rhhyper11 kernel: BUG: soft lockup - CPU#4 stuck for 67s! [qemu-kvm:6084]
Jun 9 17:17:58 rhhyper11 kernel: Modules linked in: iptable_nat nf_nat ebt_arp nfs fscache auth_rpcgss nfs_acl ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp ebtable_nat ebtables bnx2fc fcoe libfcoe libfc lockd sunrpc bridge bonding ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_amd kvm sg hpwdt amd64_edac_mod edac_core edac_mce_amd i2c_piix4 shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sd_mod hpsa lpfc scsi_transport_fc scsi_tgt crc_t10dif be2net pata_acpi ata_generic pata_atiixp ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscs
Jun 9 17:17:58 rhhyper11 kernel: i_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan]
Jun 9 17:17:58 rhhyper11 kernel: CPU 4
Jun 9 17:17:58 rhhyper11 kernel: Modules linked in: iptable_nat nf_nat ebt_arp nfs fscache auth_rpcgss nfs_acl ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp ebtable_nat ebtables bnx2fc fcoe libfcoe libfc lockd sunrpc bridge bonding ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_amd kvm sg hpwdt amd64_edac_mod edac_core edac_mce_amd i2c_piix4 shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sd_mod hpsa lpfc scsi_transport_fc scsi_tgt crc_t10dif be2net pata_acpi ata_generic pata_atiixp ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscs
Jun 9 17:17:58 rhhyper11 kernel: i_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan]
Jun 9 17:17:58 rhhyper11 kernel:
Jun 9 17:17:58 rhhyper11 kernel: Pid: 6084, comm: qemu-kvm Not tainted 2.6.32-431.11.2.el6.x86_64 #1 HP ProLiant BL685c G7
Jun 9 17:17:58 rhhyper11 kernel: RIP: 0010:[<ffffffff8152a92e>] [<ffffffff8152a92e>] _spin_lock+0x1e/0x30
Jun 9 17:17:58 rhhyper11 kernel: RSP: 0000:ffff880f80a0d5e8 EFLAGS: 00000287
Jun 9 17:17:58 rhhyper11 kernel: RAX: 000000000000dfa0 RBX: ffff880f80a0d5e8 RCX: ffff880000000000
Jun 9 17:17:58 rhhyper11 kernel: RDX: 000000000000df9f RSI: ffff880c39721d20 RDI: ffffea002ab1b5b8
Jun 9 17:17:58 rhhyper11 kernel: RBP: ffffffff8100bb8e R08: 0000000000000000 R09: 0000000000001000
Jun 9 17:17:58 rhhyper11 kernel: R10: 0000000000013560 R11: 0000000000000000 R12: ffffffffa04b90e9
Jun 9 17:17:58 rhhyper11 kernel: R13: ffff880f80a0d5b8 R14: ffff8813e123d000 R15: 0000000000000003
Jun 9 17:17:58 rhhyper11 kernel: FS: 00007f7001a2c980(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
Jun 9 17:17:58 rhhyper11 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 9 17:17:58 rhhyper11 kernel: CR2: 00007f6dd9a00000 CR3: 0000001b77d51000 CR4: 00000000000007e0
Jun 9 17:17:58 rhhyper11 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 9 17:17:58 rhhyper11 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 9 17:17:58 rhhyper11 kernel: Process qemu-kvm (pid: 6084, threadinfo ffff880f80a0c000, task ffff88103a570040)
Jun 9 17:17:58 rhhyper11 kernel: Stack:
Jun 9 17:17:58 rhhyper11 kernel: ffff880f80a0d628 ffffffff81153e81 ffff880f80a0d608 ffff881c39222d40
Jun 9 17:17:58 rhhyper11 kernel: <d> 0000000000000301 00007f30b49ae000 ffff8808348c04a8 0000000000000301
Jun 9 17:17:58 rhhyper11 kernel: <d> ffff880f80a0d6b8 ffffffff81154670 ffff880c3923ccd0 ffffffff81173b00
Jun 9 17:17:58 rhhyper11 kernel: Call Trace:
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81153e81>] ? page_check_address+0x141/0x1d0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81154670>] ? try_to_unmap_one+0x40/0x500
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81173b00>] ? remove_migration_pte+0x0/0x300
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81155294>] ? rmap_walk+0x184/0x230
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff811696ab>] ? compaction_alloc+0x3b/0x460
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff811553ee>] ? try_to_unmap_anon+0xae/0x140
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81155cd5>] ? try_to_unmap+0x55/0x70
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81174de3>] ? migrate_pages+0x333/0x480
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81169670>] ? compaction_alloc+0x0/0x460
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8116a1b1>] ? compact_zone+0x581/0x950
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8116a82c>] ? compact_zone_order+0xac/0x100
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814898e9>] ? nf_iterate+0x69/0xb0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8113b898>] ? zone_reclaim+0x558/0x650
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814b1001>] ? tcp_send_delayed_ack+0xf1/0x100
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814ad88b>] ? tcp_rcv_established+0x39b/0x7f0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8112d82c>] ? get_page_from_freelist+0x6ac/0x870
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814af44e>] ? tcp_transmit_skb+0x40e/0x7b0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814a4d91>] ? tcp_recvmsg+0x821/0xe80
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8112f3a3>] ? __alloc_pages_nodemask+0x113/0x8d0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8144a929>] ? sock_common_recvmsg+0x39/0x50
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81449f23>] ? sock_recvmsg+0x133/0x160
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81167baa>] ? alloc_pages_vma+0x9a/0x150
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8118344d>] ? do_huge_pmd_anonymous_page+0x14d/0x3b0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8114b360>] ? handle_mm_fault+0x2f0/0x300
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8104a8d8>] ? __do_page_fault+0x138/0x480
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8144a11b>] ? sys_recvfrom+0x16b/0x180
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81144de0>] ? sys_madvise+0x350/0x790
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8152da7e>] ? do_page_fault+0x3e/0xa0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8152ae35>] ? page_fault+0x25/0x30
Jun 9 17:17:58 rhhyper11 kernel: Code: 00 00 00 01 74 05 e8 22 41 d6 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
Jun 9 17:17:58 rhhyper11 kernel: Call Trace:
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81154cb0>] ? page_add_anon_rmap+0x10/0x20
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81153e81>] ? page_check_address+0x141/0x1d0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81154670>] ? try_to_unmap_one+0x40/0x500
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81173b00>] ? remove_migration_pte+0x0/0x300
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81155294>] ? rmap_walk+0x184/0x230
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff811696ab>] ? compaction_alloc+0x3b/0x460
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff811553ee>] ? try_to_unmap_anon+0xae/0x140
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81155cd5>] ? try_to_unmap+0x55/0x70
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81174de3>] ? migrate_pages+0x333/0x480
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81169670>] ? compaction_alloc+0x0/0x460
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8116a1b1>] ? compact_zone+0x581/0x950
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8116a82c>] ? compact_zone_order+0xac/0x100
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814898e9>] ? nf_iterate+0x69/0xb0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8113b898>] ? zone_reclaim+0x558/0x650
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814b1001>] ? tcp_send_delayed_ack+0xf1/0x100
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814ad88b>] ? tcp_rcv_established+0x39b/0x7f0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8112d82c>] ? get_page_from_freelist+0x6ac/0x870
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814af44e>] ? tcp_transmit_skb+0x40e/0x7b0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff814a4d91>] ? tcp_recvmsg+0x821/0xe80
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8112f3a3>] ? __alloc_pages_nodemask+0x113/0x8d0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8144a929>] ? sock_common_recvmsg+0x39/0x50
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81449f23>] ? sock_recvmsg+0x133/0x160
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81167baa>] ? alloc_pages_vma+0x9a/0x150
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8118344d>] ? do_huge_pmd_anonymous_page+0x14d/0x3b0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8114b360>] ? handle_mm_fault+0x2f0/0x300
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8104a8d8>] ? __do_page_fault+0x138/0x480
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8144a11b>] ? sys_recvfrom+0x16b/0x180
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff81144de0>] ? sys_madvise+0x350/0x790
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8152da7e>] ? do_page_fault+0x3e/0xa0
Jun 9 17:17:58 rhhyper11 kernel: [<ffffffff8152ae35>] ? page_fault+0x25/0x30
Jun 9 17:18:58 rhhyper11 kernel: BUG: soft lockup - CPU#0 stuck for 67s! [ksmd:507]
Jun 9 17:18:58 rhhyper11 kernel: Modules linked in: iptable_nat nf_nat ebt_arp nfs fscache auth_rpcgss nfs_acl ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp ebtable_nat ebtables bnx2fc fcoe libfcoe libfc lockd sunrpc bridge bonding ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_multiport iptable_filter ip_tables ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_amd kvm sg hpwdt amd64_edac_mod edac_core edac_mce_amd i2c_piix4 shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sd_mod hpsa lpfc scsi_transport_fc scsi_tgt crc_t10dif
Jun 9 17:26:59 rhhyper11 kernel: imklog 5.8.10, log source = /proc/kmsg started. [crash]
Jun 9 17:26:59 rhhyper11 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="10319" x-info="http://www.rsyslog.com"] start
Jun 9 17:26:59 rhhyper11 kernel: Initializing cgroup subsys cpuset
Jun 9 17:26:59 rhhyper11 kernel: Initializing cgroup subsys cpu
Jun 9 17:26:59 rhhyper11 kernel: Linux version 2.6.32-431.11.2.el6.x86_64 (mockbuild@x86-027.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Mon Mar 3 13:32:45 EST 2014

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.5 (20140520.0.el6ev)

How reproducible:
Constantly on the customer's hardware when KSM is enabled.

Steps to Reproduce:
1. Run VMs on the hypervisor to trigger KSM activation.

Actual results:
Hypervisor crashes.

Expected results:
Hypervisor works normally with KSM.

Additional info:
Disabling KSM completely works around the issue; the environment has been running stable for weeks.

# vdsClient -s 0 setMOMPolicyParameters ksmEnabled=False
# service ksmtuned status
# service ksmtuned stop
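For reference, a minimal sketch of how one might confirm the workaround actually took effect, using the stock KSM sysfs interface (whether the change needs extra steps to persist across RHEV-H reboots is not covered here):

# Stop the tuning daemon and the KSM service itself
service ksmtuned stop
service ksm stop

# 0 = KSM stopped, 1 = running, 2 = stop and unmerge all merged pages
cat /sys/kernel/mm/ksm/run

# Both counters should stay at 0 once KSM is fully disabled
cat /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing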
Moving this to qemu-kvm for now, but I am not sure if this is a qemu-kvm-rhev or kernel issue.
This is affecting (at least) RHEV-H, so requesting 6.5.z.
Hi,

The first thing that comes to mind when I see a soft lockup like this is that you are trying to run too many vCPUs for the number of physical cores you have in the hardware. Could you share some more information about your configuration:

1. /proc/cpuinfo on the host
2. Details about the VM load on the host
   a. how many VMs
   b. how many vCPUs per VM
   c. how much memory assigned to each VM
   d. workload that is driving memory consumption inside the VMs
3. /var/log/vdsm/mom.log around the time of the crash so I can see the ksm settings that are being activated.
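A sketch of how the requested data could be gathered in one pass on the hypervisor; the vdsClient invocation is only illustrative and assumes the RHEV 3.x vdsm CLI is present on the host:

# Host CPU topology
cat /proc/cpuinfo > /tmp/cpuinfo.txt

# Running VMs as vdsm sees them (names, vCPUs, memory)
vdsClient -s 0 list table > /tmp/vm-list.txt

# Current memory pressure on the host
free -m > /tmp/free.txt

# MOM log covering the time of the crash
cp /var/log/vdsm/mom.log /tmp/mom.log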
RHEV-H QE reached the following conclusions after running three different scenarios.

Test version:
rhev-hypervisor6-6.5-20140624.0.el6ev
ovirt-node-3.0.1-18.el6_5.11.noarch
vdsm-4.14.7-3.el6ev.x86_64
RHEVM av10

Test scenario 1 (Host: RHEV-H):
Run enough VMs to fill memory and trigger KSM activation.

Test result:
The ksm service starts automatically and RHEV-H does not crash.

Test scenario 2 (Host: RHEV-H):
1. Create a VM and set its memory close to the host's total (e.g. host memory = 48G, so set the VM's memory to 48G).
2. Run the eatmemory script in the VM to run it out of memory.

Test result:
RHEV-H crashes.

Test scenario 3 (Host: RHEL):
Do the same steps as scenario 2 on a RHEL host.

Test result:
1. The eatmemory process is killed automatically.
2. The host (RHEL) does not crash.

So the crash only occurs on RHEV-H, not on RHEL.
You can find the script and crash.png in the attachments. Thanks!
Created attachment 917023 [details] eatmemory
Created attachment 917024 [details] RHEV-H-crash.png
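The eatmemory attachment itself is not shown inline; as a rough, hypothetical equivalent, scenario 2 could also be driven with the stress tool (the same tool used in the later verification testing), with the allocation size as a placeholder matching the 48G guest:

# Inside the guest: allocate and keep touching ~48G of anonymous memory
stress -m 1 --vm-bytes 48000M --vm-keep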
Hey Ying, thanks for the extensive testing. Please provide some more details (see inline).

(In reply to shaochen from comment #6)
> Test scenario 2 (Host: RHEV-H):
> 1. Create a VM and set its memory close to the host's total (e.g. host
> memory = 48G, so set the VM's memory to 48G).
> 2. Run the eatmemory script in the VM to run it out of memory.
>
> Test result:
> RHEV-H crashes.

Please provide the output of `free -m`, or some details on how much swap was available.

> Test scenario 3 (Host: RHEL):
> Do the same steps as scenario 2 on a RHEL host.
>
> Test result:
> 1. The eatmemory process is killed automatically.
> 2. The host (RHEL) does not crash.

The same as above - please provide `free -m` or some details on the swap available.

> So the crash only occurs on RHEV-H, not on RHEL.
> You can find the script and crash.png in the attachments.

IIUIC it's quite normal for memory hogs to get killed when the kernel runs out of memory.
> Please provide the output of `free -m`, or some details on how much swap
> was available.

=====================================================
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259       4505      43754          0         30        195
-/+ buffers/cache:        4278      43981
Swap:        32323        868      31455
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259       9817      38442          0         30        195
-/+ buffers/cache:        9591      38668
Swap:        32323        868      31455
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259      22919      25340          0         30        196
-/+ buffers/cache:       22692      25567
Swap:        32323        867      31456
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259      36447      11812          0         30        196
-/+ buffers/cache:       36220      12039
Swap:        32323        864      31459
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259      46712       1547          0         30        196
-/+ buffers/cache:       46485       1774
Swap:        32323        864      31459
[root@ibm-x3650m3-01 admin]# free -m
             total       used       free     shared    buffers     cached
Mem:         48259      47990        269          0         22        148
-/+ buffers/cache:       47819        440
Swap:        32323       1137      31186

The error in scenario 2 is different from the error in the original bug.

> > Test scenario 3 (Host: RHEL):
> > Do the same steps as scenario 2 on a RHEL host.
> >
> > Test result:
> > 1. The eatmemory process is killed automatically.
> > 2. The host (RHEL) does not crash.
>
> The same as above - please provide `free -m` or some details on the swap
> available.

====================================================
[root@dell-op740-03 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7808       1428       6380          0         81        163
-/+ buffers/cache:        1183       6625
Swap:            0          0          0
[root@dell-op740-03 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7808       3955       3853          0         81        163
-/+ buffers/cache:        3710       4098
Swap:            0          0          0
[root@dell-op740-03 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7808       5995       1813          0         81        163
-/+ buffers/cache:        5750       2058
Swap:            0          0          0
[root@dell-op740-03 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7808       7583        225          0         81        163
-/+ buffers/cache:        7338        470
Swap:            0          0          0

> > So the crash only occurs on RHEV-H, not on RHEL.
> > You can find the script and crash.png in the attachments.
>
> IIUIC it's quite normal for memory hogs to get killed when the kernel runs
> out of memory.
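A small sketch of how the memory consumption above could be logged continuously instead of by hand while the guest eats memory (the interval and output path are arbitrary choices):

# Sample host memory and swap every 5 seconds until interrupted
while true; do
    date >> /tmp/free-m.log
    free -m >> /tmp/free-m.log
    sleep 5
done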
Chen, could you also please provide the information Adam needs? See comment 5.
(In reply to Fabian Deutsch from comment #14)
> Chen, could you also please provide the information Adam needs? See comment 5.

1. /proc/cpuinfo on the host
Please see attachment "cpuinfo.txt".

2. Details about the VM load on the host
a. how many VMs
9 VMs
b. how many vCPUs per VM
1 or 2 vCPUs per VM
c. how much memory assigned to each VM
The total memory of the host is 48G, and 4800M of memory is assigned to each VM.
d. workload that is driving memory consumption inside the VMs
99%

3. /var/log/vdsm/mom.log around the time of the crash so I can see the ksm settings that are being activated.
Please see attachment "mom.log".
Created attachment 918569 [details] mom.log
Created attachment 918570 [details] cpuinfo.txt
Hey Linqing,

As comment 10 said, the error in scenario 2 is different from the error in the original bug. We suspect it is another, new bug, not the same issue as this one. And we cannot be sure that Shao Chen's test procedure is an exact reproducer of the customer's bug.

Could you help reproduce this bug on the kernel side?

Thanks
Ying
Hey Alexander,

See comment 5 - could you reply and provide the information?

Thanks
Ying
Hey Adam, does the information from comment 15 (thanks, Chen) shed some more light on this?
The other two needinfo flags got removed unintentionally. Adding them back. Sorry for the confusion.
The attachment in comment #8 is a different bug than the one in comment #0 and comment #1. For comment #8, please file another bug report; in my upstream aa git tree I have fixed several issues with OOM handling related to ext4 I/O errors that even led to the filesystem being remounted read-only (found with trinity triggering floods of OOMs).

For this bug (comment #0 and comment #1), it looks like some sort of deadlock in smp_call_function_single/many. I would have expected the NMI watchdog to trigger too, but checking the sos report it didn't. The soft lockup shows the deadlock kept running for 67 seconds before the full crash; the NMI watchdog should fire within 5 seconds, much sooner than that. It's unclear whether it's a lock inversion between all those smp_call_functions running simultaneously or something else. One wouldn't expect bugs in the IPI delivery logic because it runs all the time.

It would help if you could run SysRq+L and SysRq+T while syslog is still able to log (i.e. within the first 67 seconds) and report the output. A crash dump would also help, as then we could see the stack traces of all CPUs. Just as an example, CPU 1 is not shown; it is possible the culprit is on one of the CPUs that don't show the soft lockup. I'll think more about the available stack trace next week.

And if this is only reproducible on a single NUMA system and not everywhere else, we could evaluate whether there are hardware issues in the NUMA IPI delivery. A lost IPI could explain this too: there is a CPU waiting in csd_lock_wait in generic_exec_single that is just waiting for the IPI to run. (Again, if the IPI doesn't run, it normally means interrupts have been disabled for too long on that CPU, but then the NMI watchdog should have fired, or the IPI was lost by the hardware, or there is some other software bug in the IPI delivery.)

"grep NMI /proc/interrupts" and "cat /proc/sys/kernel/nmi_watchdog" can also verify the NMI watchdog is running.
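A minimal sketch of how the requested data could be captured during the first ~60 seconds of a lockup, while syslog is still writing (this assumes the magic SysRq interface is available on the host; a kdump-generated vmcore would be the way to get the full per-CPU stack traces, but configuring kdump is not shown here):

# Verify the NMI watchdog is enabled and counting
cat /proc/sys/kernel/nmi_watchdog      # expect 1
grep NMI /proc/interrupts              # per-CPU NMI counts should keep increasing

# Make sure all SysRq functions are allowed for this boot
echo 1 > /proc/sys/kernel/sysrq

# SysRq+L: backtraces of all active CPUs; SysRq+T: state of all tasks
echo l > /proc/sysrq-trigger
echo t > /proc/sysrq-trigger

# Save the kernel log while the host can still write it
dmesg > /tmp/sysrq-dump.txt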
Shao Chen,
As noted in comment 25, could you help submit a new bug for your comment #8?
Thanks.
(In reply to Ying Cui from comment #28)
> Shao Chen,
> As noted in comment 25, could you help submit a new bug for your comment #8?
> Thanks.

I can't reproduce this issue with rhev-hypervisor6-6.5-20140821.1.el6ev (kernel-2.6.32-431.29.2.el6.x86_64 + qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64). The eatmemory process gets killed automatically, so please ignore my comment. Thanks!

./eatmemory 20000M
Eating 20971520000 bytes in chunks of 1024...
Killed
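For completeness, a quick way to confirm that it was the kernel's OOM killer (rather than a host crash) that terminated the process would be something like the following; the exact log wording varies by kernel version:

# Look for OOM-killer activity in the kernel log after running eatmemory
dmesg | grep -iE "out of memory|oom-killer|killed process" | tail

# Or in syslog on RHEL 6 hosts
grep -iE "out of memory|killed process" /var/log/messages | tail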
I'm building a patch after discussion with Andrea.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
*** Bug 1083448 has been marked as a duplicate of this bug. ***
Patch(es) available on kernel-2.6.32-527.el6
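A quick sketch for checking whether a given host already runs a kernel containing this fix; the changelog grep mirrors the check used in the verification comment below:

# The fix first shipped in kernel-2.6.32-527.el6; compare against the running kernel
uname -r

# The installed kernel's changelog should reference this bug ID once the fix is in
rpm -q --changelog kernel | grep 1116398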
Tested the following scenarios with:

# uname -r
2.6.32-550.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.464.el6.x86_64
# rpm -qpi kernel-2.6.32-550.el6.x86_64.rpm --changelog | grep 1116398
- [x86] kvm: Avoid pagefault in kvm_lapic_sync_to_vapic (Paolo Bonzini) [1116398]

ENV: the host has 512G of memory:

# free -g
             total       used       free     shared    buffers     cached
Mem:           504          3        501          0          0          0
-/+ buffers/cache:           2        501
Swap:            3          0          3

The command line of the guests looks like this:

/usr/libexec/qemu-kvm -cpu Opteron_G1 -M rhel6.5.0 -enable-kvm -m 52G -smp 4,sockets=1,cores=4,threads=1 -name rhel6.4-64 -uuid 9a0e67ec-f286-d8e7-0548-0c1c9ec93009 -nodefconfig -nodefaults -monitor stdio -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive file=/home/RHEL-Server-6.7-64-virtio-scsi.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,snapshot=on -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d5:51:12,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -vga qxl -spice port=5911,disable-ticketing,seamless-migration=on -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864

Scenario 1:
1. Start 14 guests:
# ps aux | grep qemu -c
14
2. Start stress in each guest:
# stress -m 1 --vm-bytes 50000M --vm-keep
3. Wait until the host's memory is fully used, to trigger KSM activation.
# free -m
             total       used       free     shared    buffers     cached
Mem:        516858     516337        521          0          1         46
-/+ buffers/cache:      516289        569
Swap:         4095       4040         55
# service ksm status
ksm is running

Result: after waiting a long time, the host and guests work well; no crash or soft lockup occurs.

Scenario 2:
# service ksmtuned status
ksmtuned is stopped
# service ksm status
ksm is running
1. Start a guest with 50G of memory.
2. Start stress in the guest:
# stress -m 1 --vm-bytes 50000M --vm-keep
3. Try to run the eatmemory program:
# ./eatmemory 500000M
Eating 524288000000 bytes in chunks of 1024...
Killed

Result: the guest and host work well.

So the tests pass, and this bug is verified as fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1272.html