Bug 1726486 - System crash with exception RIP: kvm_zap_rmapp+0x34
Summary: System crash with exception RIP: kvm_zap_rmapp+0x34
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-03 02:34 UTC by Amit Kumar Das
Modified: 2019-12-11 12:41 UTC (History)
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-11 12:41:04 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2924271 0 Troubleshoot None Kernel Panic and crash while executing "kvm_zap_rmapp" 2019-07-03 02:43:07 UTC

Description Amit Kumar Das 2019-07-03 02:34:57 UTC
Description of problem:

This issue was seen in the RHEL 7.3 kernel and was fixed there.
However, it has reproduced in the RHEL 7.4 kernel.

//Previous BUG//

https://bugzilla.redhat.com/show_bug.cgi?id=1421296
https://access.redhat.com/solutions/2924271


#retrace-server-interact 233607565 crash

crash> sys
      KERNEL: /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/3.10.0-693.11.6.el7.x86_64/vmlinux
    DUMPFILE: /cores/retrace/tasks/233607565/crash/vmcore  [PARTIAL DUMP]
        CPUS: 32
        DATE: Mon Jul  1 09:10:59 2019
      UPTIME: 477 days, 18:04:56
LOAD AVERAGE: 7.00, 5.38, 4.18
       TASKS: 6110
    NODENAME: xxxx.xx.xx.xx.localdomain
     RELEASE: 3.10.0-693.11.6.el7.x86_64
     VERSION: #1 SMP Thu Dec 28 14:23:39 EST 2017
     MACHINE: x86_64  (2396 Mhz)
      MEMORY: 511.9 GB
       PANIC: "general protection fault: 0000 [#1] SMP "
crash> 


crash> bt
PID: 737365  TASK: ffff8837407ccf10  CPU: 1   COMMAND: "CPU 2/KVM"
 #0 [ffff882751ee3970] machine_kexec at ffffffff8105c58b
 #1 [ffff882751ee39d0] __crash_kexec at ffffffff81106742
 #2 [ffff882751ee3aa0] crash_kexec at ffffffff81106830
 #3 [ffff882751ee3ab8] oops_end at ffffffff816b0aa8
 #4 [ffff882751ee3ae0] die at ffffffff8102e87b
 #5 [ffff882751ee3b10] do_general_protection at ffffffff816b042e
 #6 [ffff882751ee3b40] general_protection at ffffffff816af898
    [exception RIP: kvm_zap_rmapp+52]
    RIP: ffffffffc0676434  RSP: ffff882751ee3bf8  RFLAGS: 00010206
    RAX: 0000000000000000  RBX: ffffc9009e2383a0  RCX: 00000000002c7e74
    RDX: 000080930000000c  RSI: 000080930000000c  RDI: ffff8854fdb08000
    RBP: ffff882751ee3c08   R8: 0000000000000001   R9: 0000000000000000
    R10: ffffc90034f04000  R11: 0000000000840000  R12: ffff8854fdb08000
    R13: ffffffffc0676460  R14: 0000000000000000  R15: ffffc90034f04008
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff882751ee3c10] kvm_unmap_rmapp at ffffffffc067646e [kvm]
 #8 [ffff882751ee3c20] kvm_handle_hva_range at ffffffffc0673a14 [kvm]
 #9 [ffff882751ee3cc0] kvm_unmap_hva_range at ffffffffc067fa87 [kvm]
#10 [ffff882751ee3cd0] kvm_mmu_notifier_invalidate_range_start at ffffffffc0656aa3 [kvm]
#11 [ffff882751ee3d10] __mmu_notifier_invalidate_range_start at ffffffff811d7d24
#12 [ffff882751ee3d50] change_protection_range at ffffffff811bbd44
#13 [ffff882751ee3e50] change_protection at ffffffff811bbdf5
#14 [ffff882751ee3e88] change_prot_numa at ffffffff811d48bb
#15 [ffff882751ee3e98] task_numa_work at ffffffff810caf92
#16 [ffff882751ee3ef0] task_work_run at ffffffff810aede7
#17 [ffff882751ee3f30] do_notify_resume at ffffffff8102ab52
#18 [ffff882751ee3f50] int_signal at ffffffff816b8d37
    RIP: 00007fb9c81b3107  RSP: 00007fb9b718c858  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffffffffff
    RDX: 0000000000000000  RSI: 000000000000ae80  RDI: 000000000000001a
    RBP: 0000000000000000   R8: 0000556a2b644bf0   R9: 000000000275d9be
    R10: 000000005d1a0658  R11: 0000000000000246  R12: 0000556a2b62b8e0
    R13: 0000000000000000  R14: 00007fb9e0b18000  R15: 0000556a2d634000
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
crash> 


[41282315.573796] kvm_get_msr_common: 6 callbacks suppressed
[41282315.573800] kvm [230918]: vcpu0 unhandled rdmsr: 0x140
[41282315.600438] kvm [230918]: vcpu1 unhandled rdmsr: 0x140
[41282315.630459] kvm [230918]: vcpu2 unhandled rdmsr: 0x140
[41282315.660057] kvm [230918]: vcpu3 unhandled rdmsr: 0x140
[41282316.450382] kvm [230918]: vcpu0 unhandled rdmsr: 0x34
[41282378.638141] general protection fault: 0000 [#1] SMP 
[41282378.644265] Modules linked in: xt_set fuse btrfs raid6_pq xor vfat msdos fat ext4 mbcache jbd2 mptctl mptbase nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 nf_conntrack_netlink ip6table_raw ip6table_filter ip6_tables xt_CT iptable_raw xt_mac xt_comment xt_physdev xt_multiport xt_conntrack iptable_filter ip_tables ip_set_hash_net ip_set nfnetlink sctp_diag sctp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag vhost_net vhost macvtap macvlan tun veth nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache bonding vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 br_netfilter nls_utf8 isofs iTCO_wdt iTCO_vendor_support sb_edac edac_core intel_powerclamp coretemp
[41282378.725448]  intel_rapl iosf_mbi kvm_intel kvm irqbypass pcspkr i2c_i801 lpc_ich sg hpilo hpwdt ioatdma dca ipmi_si ipmi_devintf ipmi_msghandler pcc_cpufreq shpchp acpi_power_meter nf_conntrack bridge nfsd stp llc binfmt_misc nfs_acl lockd grace auth_rpcgss xfs sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi syscopyarea sysfillrect sysimgblt bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul fb_sys_fops crc32c_intel ttm ghash_clmulni_intel drm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd serio_raw tg3 hpsa i2c_core ptp pps_core scsi_transport_sas mdio libcrc32c wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: xt_physdev]
[41282378.795510] CPU: 1 PID: 737365 Comm: CPU 2/KVM Not tainted 3.10.0-693.11.6.el7.x86_64 #1
[41282378.805325] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016
[41282378.815424] task: ffff8837407ccf10 ti: ffff882751ee0000 task.ti: ffff882751ee0000
[41282378.824609] RIP: 0010:[<ffffffffc0676434>]  [<ffffffffc0676434>] kvm_zap_rmapp+0x34/0x60 [kvm]
[41282378.835607] RSP: 0018:ffff882751ee3bf8  EFLAGS: 00010206
[41282378.842992] RAX: 0000000000000000 RBX: ffffc9009e2383a0 RCX: 00000000002c7e74
[41282378.851882] RDX: 000080930000000c RSI: 000080930000000c RDI: ffff8854fdb08000
[41282378.861049] RBP: ffff882751ee3c08 R08: 0000000000000001 R09: 0000000000000000
[41282378.869894] R10: ffffc90034f04000 R11: 0000000000840000 R12: ffff8854fdb08000
[41282378.878779] R13: ffffffffc0676460 R14: 0000000000000000 R15: ffffc90034f04008
[41282378.887633] FS:  00007fb9b718f700(0000) GS:ffff883f7fa40000(0000) knlGS:0000000000000000
[41282378.897671] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41282378.905161] CR2: 00007f0a2edfd000 CR3: 0000006c6be40000 CR4: 00000000001627e0
[41282378.914094] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[41282378.923642] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[41282378.933249] Call Trace:
[41282378.937096]  [<ffffffffc067646e>] kvm_unmap_rmapp+0xe/0x20 [kvm]
[41282378.945495]  [<ffffffffc0673a14>] kvm_handle_hva_range+0x134/0x1a0 [kvm]
[41282378.954603]  [<ffffffffc067fa87>] kvm_unmap_hva_range+0x17/0x20 [kvm]
[41282378.962794]  [<ffffffffc0656aa3>] kvm_mmu_notifier_invalidate_range_start+0x53/0x90 [kvm]
[41282378.972908]  [<ffffffff811d7d24>] __mmu_notifier_invalidate_range_start+0x64/0xc0
[41282378.982281]  [<ffffffff811bbd44>] change_protection_range+0x794/0x7e0
[41282378.990513]  [<ffffffffc066e7d2>] ? kvm_arch_vcpu_put+0x22/0x40 [kvm]
[41282378.998817]  [<ffffffff811bbdf5>] change_protection+0x65/0xa0
[41282379.006301]  [<ffffffff811d48bb>] change_prot_numa+0x1b/0x40
[41282379.013682]  [<ffffffff810caf92>] task_numa_work+0x202/0x350
[41282379.021204]  [<ffffffff810aede7>] task_work_run+0xa7/0xf0
[41282379.028323]  [<ffffffff8102ab52>] do_notify_resume+0x92/0xb0
[41282379.035721]  [<ffffffff816b8d37>] int_signal+0x12/0x17
[41282379.042612] Code: 41 54 53 48 8b 16 48 89 f3 48 85 d2 74 3c 49 89 fc 31 c0 0f 1f 40 00 f6 c2 01 48 89 d6 74 07 48 83 e2 fe 48 8b 32 48 85 f6 74 1a <f6> 06 01 74 1e 4c 89 e7 e8 1f ff ff ff 48 8b 13 b8 01 00 00 00 
[41282379.066773] RIP  [<ffffffffc0676434>] kvm_zap_rmapp+0x34/0x60 [kvm]
[41282379.075684]  RSP <ffff882751ee3bf8>
crash> 

crash> mod -t
no tainted modules
crash> ps -S
  RU: 67
  IN: 6042
  WA: 1


crash> dis -lr ffffffffc0676434
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1397
0xffffffffc0676400 <kvm_zap_rmapp>:	nopl   0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0676405 <kvm_zap_rmapp+5>:	push   %rbp
0xffffffffc0676406 <kvm_zap_rmapp+6>:	mov    %rsp,%rbp
0xffffffffc0676409 <kvm_zap_rmapp+9>:	push   %r12
0xffffffffc067640b <kvm_zap_rmapp+11>:	push   %rbx
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1137
0xffffffffc067640c <kvm_zap_rmapp+12>:	mov    (%rsi),%rdx
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1397
0xffffffffc067640f <kvm_zap_rmapp+15>:	mov    %rsi,%rbx
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1137
0xffffffffc0676412 <kvm_zap_rmapp+18>:	test   %rdx,%rdx
0xffffffffc0676415 <kvm_zap_rmapp+21>:	je     0xffffffffc0676453 <kvm_zap_rmapp+83>
0xffffffffc0676417 <kvm_zap_rmapp+23>:	mov    %rdi,%r12
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1400
0xffffffffc067641a <kvm_zap_rmapp+26>:	xor    %eax,%eax
0xffffffffc067641c <kvm_zap_rmapp+28>:	nopl   0x0(%rax)
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1140
0xffffffffc0676420 <kvm_zap_rmapp+32>:	test   $0x1,%dl
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1142
0xffffffffc0676423 <kvm_zap_rmapp+35>:	mov    %rdx,%rsi
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1140
0xffffffffc0676426 <kvm_zap_rmapp+38>:	je     0xffffffffc067642f <kvm_zap_rmapp+47>
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1145
0xffffffffc0676428 <kvm_zap_rmapp+40>:	and    $0xfffffffffffffffe,%rdx
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1147
0xffffffffc067642c <kvm_zap_rmapp+44>:	mov    (%rdx),%rsi
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1402
0xffffffffc067642f <kvm_zap_rmapp+47>:	test   %rsi,%rsi
0xffffffffc0676432 <kvm_zap_rmapp+50>:	je     0xffffffffc067644e <kvm_zap_rmapp+78>
/usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1403
0xffffffffc0676434 <kvm_zap_rmapp+52>:	testb  $0x1,(%rsi)
crash> 

1594 int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
1595 {       
1596         return kvm_handle_hva_range(kvm, start, end, 0, kvm_unmap_rmapp);
1597 }       

crash> struct kvm.mmu_notifier -ox
struct kvm {
  [0x3368] struct mmu_notifier mmu_notifier;
}

crash> sym kvm_unmap_rmapp
ffffffffc0676460 (t) kvm_unmap_rmapp [kvm] /usr/src/debug/kernel-3.10.0-693.11.6.el7/linux-3.10.0-693.11.6.el7.x86_64/arch/x86/kvm/mmu.c: 1416

1531 static int kvm_handle_hva_range(struct kvm *kvm,
1532                                 unsigned long start,
1533                                 unsigned long end,
1534                                 unsigned long data,
1535                                 int (*handler)(struct kvm *kvm,
1536                                                struct kvm_rmap_head *rmap_head,
1537                                                struct kvm_memory_slot *slot,
1538                                                gfn_t gfn,
1539                                                int level,
1540                                                unsigned long data))
1541 {
......
1570                                 ret |= handler(kvm, iterator.rmap, memslot,
1571                                                iterator.gfn, iterator.level, data);

1413 static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
1414                            struct kvm_memory_slot *slot, gfn_t gfn, int level,
1415                            unsigned long data)
1416 {
1417         return kvm_zap_rmapp(kvm, rmap_head);
1418 }



Additional info:
Possibly the issue was not fully fixed in the previous kernel, or a later change reintroduced it.

The panic backtrace differs slightly from the previous bug: this crash is a general protection fault (the dereferenced pointer in RSI, 000080930000000c, is non-canonical), whereas bug 1421296 faulted through page_fault()/bad_area_nosemaphore().


==xxx.xx.xxx.localdomain==
crash> bt
PID: 737365  TASK: ffff8837407ccf10  CPU: 1   COMMAND: "CPU 2/KVM"
 #0 [ffff882751ee3970] machine_kexec at ffffffff8105c58b
 #1 [ffff882751ee39d0] __crash_kexec at ffffffff81106742
 #2 [ffff882751ee3aa0] crash_kexec at ffffffff81106830
 #3 [ffff882751ee3ab8] oops_end at ffffffff816b0aa8
 #4 [ffff882751ee3ae0] die at ffffffff8102e87b
 #5 [ffff882751ee3b10] do_general_protection at ffffffff816b042e               
 #6 [ffff882751ee3b40] general_protection at ffffffff816af898       <- calls general_protection()
    [exception RIP: kvm_zap_rmapp+0x34]


==previous BUG(1421296)==
crash> bt
  PID: 135844  TASK: ffff89a26c148000  CPU: 14  COMMAND: "qemu-kvm"
   #0 [ffff89cfa993f8c8] machine_kexec at ffffffff81051e9b
   #1 [ffff89cfa993f928] crash_kexec at ffffffff810f27a2
   #2 [ffff89cfa993f9f8] oops_end at ffffffff8163f448
   #3 [ffff89cfa993fa20] no_context at ffffffff8162f57b
   #4 [ffff89cfa993fa70] __bad_area_nosemaphore at ffffffff8162f611
   #5 [ffff89cfa993fab8] bad_area_nosemaphore at ffffffff8162f77b
   #6 [ffff89cfa993fac8] __do_page_fault at ffffffff816421be
   #7 [ffff89cfa993fb28] do_page_fault at ffffffff81642353
   #8 [ffff89cfa993fb50] page_fault at ffffffff8163e648             <- calls page_fault() then bad_area_nosemaphore()
      [exception RIP: kvm_zap_rmapp+0x34]

Comment 2 Amit Kumar Das 2019-07-03 02:39:58 UTC
Additional details

crash> sys -i
        DMI_BIOS_VENDOR: HP 
       DMI_BIOS_VERSION: P89
          DMI_BIOS_DATE: 09/13/2016
         DMI_SYS_VENDOR: HP
       DMI_PRODUCT_NAME: ProLiant DL360 Gen9
    DMI_PRODUCT_VERSION: 
     DMI_PRODUCT_SERIAL: MXQ6460079
       DMI_PRODUCT_UUID: 32353537-3835-584D-5136-343630303739
       DMI_BOARD_VENDOR: HP
         DMI_BOARD_NAME: ProLiant DL360 Gen9
      DMI_BOARD_VERSION: 
       DMI_BOARD_SERIAL: MXQ6460079
    DMI_BOARD_ASSET_TAG: B-123743
     DMI_CHASSIS_VENDOR: HP
       DMI_CHASSIS_TYPE: 23
    DMI_CHASSIS_VERSION: 
     DMI_CHASSIS_SERIAL: MXQ6460079
  DMI_CHASSIS_ASSET_TAG: B-123743

crash> cpuinfo
<<< Physical CPU   0 >>>
	CPU   0, core   0 : 0xffff883f7fa18100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   1, core   1 : 0xffff883f7fa58100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   2, core   2 : 0xffff883f7fa98100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   3, core   3 : 0xffff883f7fad8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   4, core   4 : 0xffff883f7fb18100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   5, core   5 : 0xffff883f7fb58100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   6, core   6 : 0xffff883f7fb98100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   7, core   7 : 0xffff883f7fbd8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  16, core   0 : 0xffff883f7fc18100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  17, core   1 : 0xffff883f7fc58100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  18, core   2 : 0xffff883f7fc98100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  19, core   3 : 0xffff883f7fcd8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  20, core   4 : 0xffff883f7fd18100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  21, core   5 : 0xffff883f7fd58100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  22, core   6 : 0xffff883f7fd98100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  23, core   7 : 0xffff883f7fdd8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
<<< Physical CPU   1 >>>
	CPU   8, core   0 : 0xffff887f7f218100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU   9, core   1 : 0xffff887f7f258100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  10, core   2 : 0xffff887f7f298100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  11, core   3 : 0xffff887f7f2d8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  12, core   4 : 0xffff887f7f318100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  13, core   5 : 0xffff887f7f358100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  14, core   6 : 0xffff887f7f398100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  15, core   7 : 0xffff887f7f3d8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  24, core   0 : 0xffff887f7f418100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  25, core   1 : 0xffff887f7f458100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  26, core   2 : 0xffff887f7f498100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  27, core   3 : 0xffff887f7f4d8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  28, core   4 : 0xffff887f7f518100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  29, core   5 : 0xffff887f7f558100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  30, core   6 : 0xffff887f7f598100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
	CPU  31, core   7 : 0xffff887f7f5d8100 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz


crash> mod|grep kvm
ffffffffc06b9ae0  kvm        566562  /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/3.10.0-693.11.6.el7.x86_64/kernel/arch/x86/kvm/kvm.ko.debug 
ffffffffc1840920  kvm_intel  174296  /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/3.10.0-693.11.6.el7.x86_64/kernel/arch/x86/kvm/kvm-intel.ko.debug


# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-10.el7.x86_64


# modinfo kvm kvm_intel
filename:       /lib/modules/3.10.0-693.11.6.el7.x86_64/kernel/arch/x86/kvm/kvm.ko.xz
license:        GPL
author:         Qumranet
rhelversion:    7.4
srcversion:     F8C2B40929758F8488B2EA8
depends:        irqbypass
intree:         Y
vermagic:       3.10.0-693.11.6.el7.x86_64 SMP mod_unload modversions
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        5F:D8:EB:CF:C4:C5:20:3C:3C:B7:90:52:19:FB:66:9D:5F:4B:E3:FF
sig_hashalgo:   sha256
parm:           ignore_msrs:bool
parm:           min_timer_period_us:uint
parm:           kvmclock_periodic_sync:bool
parm:           tsc_tolerance_ppm:uint
parm:           lapic_timer_advance_ns:uint
parm:           vector_hashing:bool
parm:           halt_poll_ns:uint
parm:           halt_poll_ns_grow:uint
parm:           halt_poll_ns_shrink:uint
filename:       /lib/modules/3.10.0-693.11.6.el7.x86_64/kernel/arch/x86/kvm/kvm-intel.ko.xz
license:        GPL
author:         Qumranet
rhelversion:    7.4
srcversion:     028C6733CB4503A506E2FFA
alias:          x86cpu:vendor:*:family:*:model:*:feature:*0085*
depends:        kvm
intree:         Y
vermagic:       3.10.0-693.11.6.el7.x86_64 SMP mod_unload modversions
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        5F:D8:EB:CF:C4:C5:20:3C:3C:B7:90:52:19:FB:66:9D:5F:4B:E3:FF
sig_hashalgo:   sha256
parm:           vpid:bool
parm:           flexpriority:bool
parm:           ept:bool
parm:           unrestricted_guest:bool
parm:           eptad:bool
parm:           emulate_invalid_guest_state:bool
parm:           vmm_exclusive:bool
parm:           fasteoi:bool
parm:           enable_apicv:bool
parm:           enable_shadow_vmcs:bool
parm:           nested:bool
parm:           pml:bool
parm:           ple_gap:int
parm:           ple_window:int
parm:           ple_window_grow:int
parm:           ple_window_shrink:int
parm:           ple_window_max:int

Comment 4 Paolo Bonzini 2019-07-12 12:41:52 UTC
This is not related to bz 1421296; the VMWRITE errors are not present in the dmesg output. The call to kvm_zap_rmapp was initially triggered by autonuma, so if this is reproducible, disabling autonuma could be a workaround.

There have been many changes since 7.4 so my immediate suggestion would be to upgrade.

Comment 5 Amit Kumar Das 2019-07-15 04:37:51 UTC
Thanks Paolo. The system meets the autonuma balancing conditions.
The suggested workaround has been forwarded to the case.
========================
#cat proc/sys/kernel/numa_balancing
1

#cat numactl_--hardware 
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 262018 MB
node 0 free: 254207 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 262144 MB
node 1 free: 226137 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
#cat numactl_--show 
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
cpubind: 0 1 
nodebind: 0 1 
membind: 0 1 
========================

To disable automatic NUMA balancing:   # echo 0 > /proc/sys/kernel/numa_balancing
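For reference, the runtime echo above is lost on reboot; the workaround can be made persistent via sysctl, the standard RHEL 7 mechanism (the drop-in file name below is an assumption, any /etc/sysctl.d/*.conf name works):

```shell
# Runtime change (takes effect immediately, lost on reboot):
echo 0 > /proc/sys/kernel/numa_balancing

# Persistent change (hypothetical file name; any sysctl.d drop-in works):
echo "kernel.numa_balancing = 0" > /etc/sysctl.d/99-disable-numa-balancing.conf
sysctl -p /etc/sysctl.d/99-disable-numa-balancing.conf

# Verify:
cat /proc/sys/kernel/numa_balancing
```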

