
Bug 1667560

Summary: When scheduling an instance with PCI PT NIC SR-IOV on a hypervisor with a swapfile on the root partition, the IO can kill the OS
Product: Red Hat Enterprise Linux 7 Reporter: David Vallee Delisle <dvd>
Component: qemu-kvm Assignee: Alex Williamson <alex.williamson>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Pei Zhang <pezhang>
Severity: low Docs Contact:
Priority: low    
Version: 7.5 CC: aarcange, alex.williamson, chayang, dvd, juzhang, knoel, michen, pbonzini, pezhang, virt-maint, yfu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-03-20 21:45:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Vallee Delisle 2019-01-18 19:50:49 UTC
Description of problem:
When a VM is launched with an SR-IOV device, all of its memory is allocated immediately. If the hypervisor has swap, the resulting IO can overwhelm the root block device.

Version-Release number of selected component (if applicable):
libvirt-3.9.0-14.el7_5.7.x86_64                             Tue Aug 21 20:44:42 2018
qemu-kvm-common-rhev-2.10.0-21.el7_5.4.x86_64               Tue Aug 21 20:41:42 2018
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64                      Tue Aug 21 20:44:23 2018


How reproducible:
All the time

Steps to Reproduce:
1. Configure SR-IOV on a hypervisor
2. Add a swapfile on the root partition
3. Fill that hypervisor with VMs
4. Spawn a VM with SR-IOV that will use the swap

Actual results:
The hypervisor starts invoking the OOM killer [1]

Expected results:
TBD

Additional info:

[1]
~~~
[Fri Jan 18 18:42:17 2019] ixgbe 0000:05:00.1 p2p2: VF Reset msg received from vf 0
[Fri Jan 18 18:42:17 2019] ixgbe 0000:05:00.1: setting MAC fa:16:3e:39:d5:df on VF 0
[Fri Jan 18 18:42:17 2019] ixgbe 0000:05:00.1: Reload the VF driver to make this change effective.
[Fri Jan 18 18:42:17 2019] ixgbe 0000:05:00.1: Setting VLAN 906, QOS 0x0 on VF 0
[Fri Jan 18 18:44:51 2019] CPU 4/KVM invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[Fri Jan 18 18:44:51 2019] CPU 4/KVM cpuset=vcpu4 mems_allowed=0-1
[Fri Jan 18 18:44:51 2019] CPU: 16 PID: 62175 Comm: CPU 4/KVM Kdump: loaded Not tainted 3.10.0-862.11.6.el7.x86_64 #1
[Fri Jan 18 18:44:51 2019] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 01/09/2017
[Fri Jan 18 18:44:51 2019] Call Trace:
[Fri Jan 18 18:44:51 2019]  [<ffffffffa97135d4>] dump_stack+0x19/0x1b
[Fri Jan 18 18:44:51 2019]  [<ffffffffa970e79f>] dump_header+0x90/0x229
[Fri Jan 18 18:44:51 2019]  [<ffffffffa92dc63b>] ? cred_has_capability+0x6b/0x120
[Fri Jan 18 18:44:51 2019]  [<ffffffffa919ac64>] oom_kill_process+0x254/0x3d0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa92dc71e>] ? selinux_capable+0x2e/0x40
[Fri Jan 18 18:44:51 2019]  [<ffffffffa919b4a6>] out_of_memory+0x4b6/0x4f0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa970f2a3>] __alloc_pages_slowpath+0x5d6/0x724
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91a17f5>] __alloc_pages_nodemask+0x405/0x420
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91ef7c5>] alloc_pages_vma+0xb5/0x200
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91dde85>] __read_swap_cache_async+0x115/0x190
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91ddf26>] read_swap_cache_async+0x26/0x60
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91de008>] swapin_readahead+0xa8/0x110
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91c89a2>] handle_pte_fault+0x812/0xd10
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91cae3d>] ? handle_mm_fault+0x39d/0x9b0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91cae3d>] handle_mm_fault+0x39d/0x9b0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa91c16c6>] __get_user_pages+0x1c6/0x760
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08cc549>] __gfn_to_pfn_memslot+0x179/0x480 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08f29b7>] try_async_pf+0x67/0x1f0 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08f4a6a>] tdp_page_fault+0x13a/0x260 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc063fea2>] ? vmx_vcpu_run+0x352/0xa90 [kvm_intel]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08ebf51>] kvm_mmu_page_fault+0x71/0x120 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc063fea2>] ? vmx_vcpu_run+0x352/0xa90 [kvm_intel]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc06389dd>] handle_ept_violation+0x8d/0x100 [kvm_intel]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc0641a14>] vmx_handle_exit+0x294/0xc90 [kvm_intel]
[Fri Jan 18 18:44:51 2019]  [<ffffffffa914bdf4>] ? rcu_eqs_exit_common.isra.31+0x24/0xe0
[Fri Jan 18 18:44:51 2019]  [<ffffffffc063feae>] ? vmx_vcpu_run+0x35e/0xa90 [kvm_intel]
[Fri Jan 18 18:44:51 2019]  [<ffffffffa914bf00>] ? rcu_eqs_exit+0x50/0xa0
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08de74d>] vcpu_enter_guest+0x64d/0x12c0 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08f292f>] ? kvm_can_do_async_pf+0x4f/0x70 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08e64e1>] ? kvm_arch_can_inject_async_page_present+0x21/0x30 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08e5ea8>] kvm_arch_vcpu_ioctl_run+0x358/0x480 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffc08cb641>] kvm_vcpu_ioctl+0x2b1/0x650 [kvm]
[Fri Jan 18 18:44:51 2019]  [<ffffffffa922045e>] ? do_readv_writev+0x19e/0x260
[Fri Jan 18 18:44:51 2019]  [<ffffffffa9234040>] do_vfs_ioctl+0x360/0x550
[Fri Jan 18 18:44:51 2019]  [<ffffffffa914bdf4>] ? rcu_eqs_exit_common.isra.31+0x24/0xe0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa92dccdf>] ? file_has_perm+0x9f/0xb0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa914bf00>] ? rcu_eqs_exit+0x50/0xa0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa92342d1>] SyS_ioctl+0xa1/0xc0
[Fri Jan 18 18:44:51 2019]  [<ffffffffa9725a1b>] tracesys+0xa3/0xc9
[Fri Jan 18 18:44:51 2019] Mem-Info:
[Fri Jan 18 18:44:51 2019] active_anon:27720003 inactive_anon:1311526 isolated_anon:736
 active_file:75 inactive_file:73 isolated_file:0
 unevictable:89096 dirty:0 writeback:31 unstable:0
 slab_reclaimable:25897 slab_unreclaimable:53986
 mapped:3867 shmem:11 pagetables:61810 bounce:0
 free:88806 free_pcp:168 free_cma:0
[Fri Jan 18 18:44:51 2019] Node 0 DMA free:15896kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Fri Jan 18 18:44:51 2019] lowmem_reserve[]: 0 1770 64206 64206
[Fri Jan 18 18:44:51 2019] Node 0 DMA32 free:250940kB min:1236kB low:1544kB high:1852kB active_anon:1135332kB inactive_anon:379496kB active_file:4kB inactive_file:40kB unevictable:14820kB isolated(anon):0kB isolated(file):0kB present:1985268kB managed:1813236kB mlocked:14820kB dirty:0kB writeback:80kB mapped:556kB shmem:0kB slab_reclaimable:4608kB slab_unreclaimable:3668kB kernel_stack:272kB pagetables:3476kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:94994 all_unreclaimable? no
[Fri Jan 18 18:44:51 2019] lowmem_reserve[]: 0 0 62435 62435
[Fri Jan 18 18:44:51 2019] Node 0 Normal free:43364kB min:43704kB low:54628kB high:65556kB active_anon:53639012kB inactive_anon:2425736kB active_file:296kB inactive_file:252kB unevictable:340048kB isolated(anon):768kB isolated(file):0kB present:65011712kB managed:63933900kB mlocked:340048kB dirty:0kB writeback:32kB mapped:13148kB shmem:20kB slab_reclaimable:78764kB slab_unreclaimable:116780kB kernel_stack:8496kB pagetables:127244kB unstable:0kB bounce:0kB free_pcp:552kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1929972 all_unreclaimable? no
[Fri Jan 18 18:44:51 2019] lowmem_reserve[]: 0 0 0 0
[Fri Jan 18 18:44:51 2019] Node 1 Normal free:45024kB min:45152kB low:56440kB high:67728kB active_anon:56105668kB inactive_anon:2440872kB active_file:0kB inactive_file:0kB unevictable:1516kB isolated(anon):2176kB isolated(file):0kB present:67108864kB managed:66046496kB mlocked:1516kB dirty:0kB writeback:12kB mapped:1764kB shmem:24kB slab_reclaimable:20216kB slab_unreclaimable:95496kB kernel_stack:3408kB pagetables:116520kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:95139533 all_unreclaimable? yes
[Fri Jan 18 18:44:51 2019] lowmem_reserve[]: 0 0 0 0
[Fri Jan 18 18:44:51 2019] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Fri Jan 18 18:44:51 2019] Node 0 DMA32: 45*4kB (UE) 160*8kB (UE) 125*16kB (UE) 90*32kB (UE) 74*64kB (UEM) 36*128kB (UEM) 23*256kB (UEM) 4*512kB (U) 2*1024kB (UE) 0*2048kB 55*4096kB (M) = 250948kB
[Fri Jan 18 18:44:51 2019] Node 0 Normal: 2799*4kB (UEM) 2953*8kB (UE) 558*16kB (UEM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43748kB
[Fri Jan 18 18:44:51 2019] Node 1 Normal: 42*4kB (UEM) 471*8kB (UEM) 701*16kB (UE) 450*32kB (UE) 174*64kB (UE) 5*128kB (U) 5*256kB (UE) 3*512kB (UM) 1*1024kB (U) 0*2048kB 0*4096kB = 45168kB
[Fri Jan 18 18:44:51 2019] Node 0 hugepages_total=6 hugepages_free=2 hugepages_surp=0 hugepages_size=1048576kB
[Fri Jan 18 18:44:51 2019] Node 1 hugepages_total=6 hugepages_free=6 hugepages_surp=0 hugepages_size=1048576kB
[Fri Jan 18 18:44:51 2019] 1315825 total pagecache pages
[Fri Jan 18 18:44:51 2019] 1312355 pages in swap cache
[Fri Jan 18 18:44:51 2019] Swap cache stats: add 1616206, delete 303861, find 8076/11080
[Fri Jan 18 18:44:51 2019] Free swap  = 10389872kB
[Fri Jan 18 18:44:51 2019] Total swap = 16777212kB
[Fri Jan 18 18:44:51 2019] 33530456 pages RAM
[Fri Jan 18 18:44:51 2019] 0 pages HighMem/MovableOnly

~~~
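
In the Mem-Info dump above, both Normal zones report less free memory than their `min` watermark (43364kB vs 43704kB on node 0, 45024kB vs 45152kB on node 1). A minimal sketch for pulling that out of a log follows; the function name `zones_below_min` and the regular expression are illustrative assumptions, not an existing tool, and the pattern only targets the per-zone `free:`/`min:` lines in the format shown here.

```python
import re

# Matches per-zone lines such as "Node 0 Normal free:43364kB min:43704kB ..."
# (hypothetical pattern written for the log format in this report).
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def zones_below_min(log_text):
    """Return (node, zone, free_kB, min_kB) for zones with free < min."""
    below = []
    for match in ZONE_RE.finditer(log_text):
        node = int(match.group(1))
        zone = match.group(2)
        free_kb = int(match.group(3))
        min_kb = int(match.group(4))
        if free_kb < min_kb:
            below.append((node, zone, free_kb, min_kb))
    return below


# Values copied from the dump in this report.
sample_log = """\
Node 0 DMA free:15896kB min:8kB low:8kB high:12kB
Node 0 DMA32 free:250940kB min:1236kB low:1544kB high:1852kB
Node 0 Normal free:43364kB min:43704kB low:54628kB high:65556kB
Node 1 Normal free:45024kB min:45152kB low:56440kB high:67728kB
"""
starved = zones_below_min(sample_log)
```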

Comment 4 Alex Williamson 2019-01-19 16:17:02 UTC
This appears to be a case of user error: a VM making use of device assignment cannot be swapped.  The January 11th statement in the customer case that a 40G VM should be able to be launched on a 20G host with 20G swap is incorrect for an assigned device VM.  All of the memory for the VM must be pinned in memory at the instantiation of the VM for device assignment.  If swap is present, the host will try very hard to free memory, often invoking the OOM killer to do so, and host behavior is undesirable during this phase.  Running the host system without swap can improve the behavior through this transition: the VM will fail more quickly, without such a stall on the host system overall.  This looks like a configuration error.  Is there more to this request?
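
The sizing rule described in this comment can be sketched as a quick check. This is only an illustration under stated assumptions: `parse_meminfo_kb`, `fits_for_device_assignment`, and the `host_reserve_kb` parameter are hypothetical names, not part of qemu-kvm or libvirt, and the check reads only the `MemTotal` field of `/proc/meminfo`-style text. Swap is deliberately left out of the budget, because pinned guest memory cannot be backed by swap.

```python
def parse_meminfo_kb(text):
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def fits_for_device_assignment(meminfo_text, guest_kb, host_reserve_kb=0):
    """Return True if the guest fits in physical RAM alone.

    SwapTotal is intentionally ignored: a guest with an assigned device
    has all of its memory pinned, so swap cannot back any of it.
    """
    info = parse_meminfo_kb(meminfo_text)
    return guest_kb + host_reserve_kb <= info["MemTotal"]


GIB_KB = 1024 * 1024

# Hypothetical 20G host with 20G swap, as in the customer statement.
sample = "MemTotal: 20971520 kB\nSwapTotal: 20971520 kB\n"
result_40g = fits_for_device_assignment(sample, 40 * GIB_KB)
result_16g = fits_for_device_assignment(sample, 16 * GIB_KB)
```

On a real host the same check could be fed the contents of `/proc/meminfo`, with `host_reserve_kb` set to whatever the host itself and other guests need.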

Comment 5 David Vallee Delisle 2019-01-22 21:25:33 UTC
Alex,

I believe I might have not reproduced the exact symptoms as the customer. After looking closely, our stack traces aren't the same.

This is the customer's [1] and this is mine [2]. In my case, there are 3 retries, and after the 3rd one the compute is back to an operational state. In the customer's case, the compute is still frozen after 15h.

When looking in messages [3], we see that the driver (?) is unable to remove the VF's MAC. About 45s later, when the second try starts, it sets a new one, and it looks like this is the moment the host freezes completely.

Customer has enabled sysrq and will try to reproduce this issue and have a dump for us.

In the meantime, do we have anything helpful in these traces or with this new information?

Thank you very much,

DVD

[1]
~~~
Dec 14 23:33:41 compute-038 kernel: [<ffffffff885135d4>] dump_stack+0x19/0x1b
Dec 14 23:33:41 compute-038 kernel: [<ffffffff8850e79f>] dump_header+0x90/0x229
Dec 14 23:33:41 compute-038 kernel: [<ffffffff880dc63b>] ? cred_has_capability+0x6b/0x120
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87f9ac64>] oom_kill_process+0x254/0x3d0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff880dc71e>] ? selinux_capable+0x2e/0x40
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87f9b4a6>] out_of_memory+0x4b6/0x4f0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff8850f2a3>] __alloc_pages_slowpath+0x5d6/0x724
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fa17f5>] __alloc_pages_nodemask+0x405/0x420
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fef7c5>] alloc_pages_vma+0xb5/0x200
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fc8a17>] handle_pte_fault+0x887/0xd10
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fcae3d>] handle_mm_fault+0x39d/0x9b0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fb8223>] ? zone_statistics+0x63/0xa0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fc16c6>] __get_user_pages+0x1c6/0x760
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87fc1fcd>] get_user_pages_unlocked+0x15d/0x1f0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff87e7911f>] get_user_pages_fast+0x9f/0x1a0
Dec 14 23:33:41 compute-038 kernel: [<ffffffffc0856396>] vaddr_get_pfn+0x156/0x170 [vfio_iommu_type1]
Dec 14 23:33:41 compute-038 kernel: [<ffffffffc085695f>] vfio_pin_pages_remote+0x11f/0x370 [vfio_iommu_type1]
Dec 14 23:33:41 compute-038 kernel: [<ffffffffc0857c42>] vfio_iommu_type1_ioctl+0x532/0x970 [vfio_iommu_type1]
Dec 14 23:33:41 compute-038 kernel: [<ffffffffc08748c8>] vfio_fops_unl_ioctl+0x68/0x2b0 [vfio]
Dec 14 23:33:41 compute-038 kernel: [<ffffffff88034040>] do_vfs_ioctl+0x360/0x550
Dec 14 23:33:41 compute-038 kernel: [<ffffffff880dccdf>] ? file_has_perm+0x9f/0xb0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff880342d1>] SyS_ioctl+0xa1/0xc0
Dec 14 23:33:41 compute-038 kernel: [<ffffffff8852579b>] system_call_fastpath+0x22/0x27
~~~

[2]
~~~
Jan 18 18:45:01 compute-0 kernel: CPU 4/KVM invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Jan 18 18:45:01 compute-0 kernel: CPU 4/KVM cpuset=vcpu4 mems_allowed=0-1
Jan 18 18:45:01 compute-0 kernel: CPU: 16 PID: 62175 Comm: CPU 4/KVM Kdump: loaded Not tainted 3.10.0-862.11.6.el7.x86_64 #1
Jan 18 18:45:01 compute-0 kernel: Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 01/09/2017
Jan 18 18:45:01 compute-0 kernel: Call Trace:
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa97135d4>] dump_stack+0x19/0x1b
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa970e79f>] dump_header+0x90/0x229
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa92dc63b>] ? cred_has_capability+0x6b/0x120
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa919ac64>] oom_kill_process+0x254/0x3d0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa92dc71e>] ? selinux_capable+0x2e/0x40
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa919b4a6>] out_of_memory+0x4b6/0x4f0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa970f2a3>] __alloc_pages_slowpath+0x5d6/0x724
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91a17f5>] __alloc_pages_nodemask+0x405/0x420
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91ef7c5>] alloc_pages_vma+0xb5/0x200
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91dde85>] __read_swap_cache_async+0x115/0x190
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91ddf26>] read_swap_cache_async+0x26/0x60
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91de008>] swapin_readahead+0xa8/0x110
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91c89a2>] handle_pte_fault+0x812/0xd10
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91cae3d>] ? handle_mm_fault+0x39d/0x9b0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91cae3d>] handle_mm_fault+0x39d/0x9b0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa91c16c6>] __get_user_pages+0x1c6/0x760
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08cc549>] __gfn_to_pfn_memslot+0x179/0x480 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08f29b7>] try_async_pf+0x67/0x1f0 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08f4a6a>] tdp_page_fault+0x13a/0x260 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc063fea2>] ? vmx_vcpu_run+0x352/0xa90 [kvm_intel]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08ebf51>] kvm_mmu_page_fault+0x71/0x120 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc063fea2>] ? vmx_vcpu_run+0x352/0xa90 [kvm_intel]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc06389dd>] handle_ept_violation+0x8d/0x100 [kvm_intel]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc0641a14>] vmx_handle_exit+0x294/0xc90 [kvm_intel]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa914bdf4>] ? rcu_eqs_exit_common.isra.31+0x24/0xe0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc063feae>] ? vmx_vcpu_run+0x35e/0xa90 [kvm_intel]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa914bf00>] ? rcu_eqs_exit+0x50/0xa0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08de74d>] vcpu_enter_guest+0x64d/0x12c0 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08f292f>] ? kvm_can_do_async_pf+0x4f/0x70 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08e64e1>] ? kvm_arch_can_inject_async_page_present+0x21/0x30 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08e5ea8>] kvm_arch_vcpu_ioctl_run+0x358/0x480 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffc08cb641>] kvm_vcpu_ioctl+0x2b1/0x650 [kvm]
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa922045e>] ? do_readv_writev+0x19e/0x260
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa9234040>] do_vfs_ioctl+0x360/0x550
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa914bdf4>] ? rcu_eqs_exit_common.isra.31+0x24/0xe0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa92dccdf>] ? file_has_perm+0x9f/0xb0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa914bf00>] ? rcu_eqs_exit+0x50/0xa0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa92342d1>] SyS_ioctl+0xa1/0xc0
Jan 18 18:45:01 compute-0 kernel: [<ffffffffa9725a1b>] tracesys+0xa3/0xc9
~~~

[3]
~~~
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.0: removing MAC on VF 29
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.0: Could NOT remove the VF MAC address.
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.1: removing MAC on VF 29
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.1: Could NOT remove the VF MAC address.
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.0: removing MAC on VF 28
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.0: Could NOT remove the VF MAC address.
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.1: removing MAC on VF 28
Dec 14 23:33:45 compute-038 kernel: ixgbe 0000:09:00.1: Could NOT remove the VF MAC address.
Dec 14 23:33:45 compute-038 libvirtd: 2018-12-14 23:33:45.414+0000: 11521: error : virNetDevSetVfConfig:1701 : Cannot set interface MAC/vlanid to 00:00:00:00:00:00/0 for ifname ens1f0 vf 29: Cannot allocate memory
Dec 14 23:33:45 compute-038 libvirtd: 2018-12-14 23:33:45.416+0000: 11521: error : virNetDevSetVfConfig:1701 : Cannot set interface MAC/vlanid to 00:00:00:00:00:00/0 for ifname ens1f1 vf 29: Cannot allocate memory
Dec 14 23:33:45 compute-038 libvirtd: 2018-12-14 23:33:45.419+0000: 11521: error : virNetDevSetVfConfig:1701 : Cannot set interface MAC/vlanid to 00:00:00:00:00:00/0 for ifname ens1f0 vf 28: Cannot allocate memory
Dec 14 23:33:45 compute-038 libvirtd: 2018-12-14 23:33:45.422+0000: 11521: error : virNetDevSetVfConfig:1701 : Cannot set interface MAC/vlanid to 00:00:00:00:00:00/0 for ifname ens1f1 vf 28: Cannot allocate memory
<snip>
Dec 14 23:34:32 compute-038 kernel: ixgbe 0000:09:00.0: setting MAC fa:16:3e:cf:b5:4f on VF 28
Dec 14 23:34:32 compute-038 kernel: ixgbe 0000:09:00.0: Reload the VF driver to make this change effective.
Dec 14 23:34:32 compute-038 kernel: ixgbe 0000:09:00.1: setting MAC fa:16:3e:6f:48:da on VF 28
Dec 14 23:34:32 compute-038 kernel: ixgbe 0000:09:00.1: Reload the VF driver to make this change effective.
~~~

Comment 6 Alex Williamson 2019-01-22 21:53:43 UTC
(In reply to David Vallee Delisle from comment #5)
> Alex,
> 
> I believe I might have not reproduced the exact symptoms as the customer.
> After looking closely, our stack traces aren't the same.

No, you're not reproducing the issue.  The customer stack trace is the point at which the vfio driver is trying to pin the guest memory, and it appears to be a classic attempt to over-commit the host memory.  In your case, I don't see what the issue is.  If you run enough VMs, you can always induce an out-of-memory condition; the limit is simply much higher with VMs that allow over-committing than with device assignment VMs, which do not.  The ixgbe messages in your logs suggest you haven't enabled device assignment and are probably using some sort of macvlan approach, which does not involve the vfio driver.
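
The difference between the two traces can be checked mechanically. The sketch below is a rough heuristic, not a diagnostic tool: the function name `oom_trace_origin` and its return labels are made up for illustration, and it only looks for frame names that appear in the traces quoted in comment 5.

```python
def oom_trace_origin(trace_text):
    """Roughly classify which path led into the OOM killer.

    "vfio-pin": frames from vfio_iommu_type1 are present, i.e. the
        allocation happened while pinning guest memory for an assigned device.
    "kvm-page-fault": the allocation came from an ordinary guest page fault
        handled by kvm, with no vfio pinning involved.
    """
    if "vfio_pin_pages_remote" in trace_text or "[vfio_iommu_type1]" in trace_text:
        return "vfio-pin"
    if "kvm_mmu_page_fault" in trace_text or "tdp_page_fault" in trace_text:
        return "kvm-page-fault"
    return "unknown"


# Frames copied from the customer trace [1] and the reproducer trace [2].
customer_trace = (
    "[<ffffffffc085695f>] vfio_pin_pages_remote+0x11f/0x370 [vfio_iommu_type1]\n"
    "[<ffffffffc0857c42>] vfio_iommu_type1_ioctl+0x532/0x970 [vfio_iommu_type1]\n"
)
reproducer_trace = (
    "[<ffffffffc08f4a6a>] tdp_page_fault+0x13a/0x260 [kvm]\n"
    "[<ffffffffc08ebf51>] kvm_mmu_page_fault+0x71/0x120 [kvm]\n"
)
customer_origin = oom_trace_origin(customer_trace)
reproducer_origin = oom_trace_origin(reproducer_trace)
```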

Comment 7 Alex Williamson 2019-03-20 21:45:49 UTC
As in comment 4, this seems to be a case of a VM being provisioned on a host with insufficient resources for it, resulting in a swap storm, OOM, and generally poor behavior on the host.  Device assignment VMs do not support memory over-commit.  Please re-open with additional information if there's reason to pursue further.