Bug 1496788 - Live migration fails when an instance has high load average (RAM)
Summary: Live migration fails when an instance has high load average (RAM)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: async
Target Release: 8.0 (Liberty)
Assignee: Kashyap Chamarthy
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On: 1497275
Blocks: 1381612
 
Reported: 2017-09-28 12:17 UTC by Sergii Mykhailushko
Modified: 2023-09-15 00:04 UTC (History)
CC List: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-24 12:55:58 UTC
Target Upstream Version:
Embargoed:


Attachments
QEMU command-line log for the instance instance-00000bf4 on host "cfs1lnc02" (37.13 KB, text/plain)
2017-09-29 13:08 UTC, Kashyap Chamarthy
QEMU command-line log for the instance instance-00000bf4 on host "cfs1lnc03" (59.73 KB, text/plain)
2017-09-29 13:09 UTC, Kashyap Chamarthy

Description Sergii Mykhailushko 2017-09-28 12:17:40 UTC
- Description of problem:

The customer is performing a stress test in a lab environment, and under high memory load live migration fails for an instance (8 vCPUs, 16 GB RAM). This leaves the instance in an inconsistent state. This is critical for the customer, as they need to migrate many heavily loaded instances during procedures such as upgrades in their production environment.


- Version-Release number of selected component (if applicable):

Red Hat OpenStack Platform 8
openstack-nova-12.0.6-20.el7ost.noarch


- Steps to Reproduce:

[test #1]

1. Create an instance with 8 vCPUs and 16 GB of RAM.
2. Run "stress --cpu 5 --io 5 --timeout 600s" inside the instance and live-migrate it.
As the command was launched without the memory (--vm) option, the live migration finished properly. The total load was around 10.
3. Run "stress --cpu 2 --io 2 --vm 6 --timeout 600s" inside the instance and live-migrate it again (see the example migration command below).


Actual results:
The process has failed with this error message:
~~~
ERROR nova.virt.libvirt.driver [req-b3852272-97ac-41ac-a5ed-8bc99c95e980 1bb8b87bb73c4c2b9f4d7237c146ad38 621f1bb5601c4de0a121093b6f2896bd - - -] [instance: 541d8efc-3606-4fe7-976e-fddde08f652f] Live Migration failure: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required
~~~
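
The lower-level reason behind this error can usually be found in the per-domain QEMU log on the hypervisors (the same kind of log that is attached to this bug later on), for example:

~~~
# QEMU/KVM messages for the affected domain; check both source and destination hosts
[root@sourcenode ~]# less /var/log/libvirt/qemu/instance-00000bf7.log
~~~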

The instance remains in 'Error' state though it is still running on the source hypervisor:

~~~
[root@sourcenode ~]# virsh list --all | grep instance-00000bf7
 12    instance-00000bf7              running
[root@sourcenode ~]# virsh domiflist instance-00000bf7
Interface  Type       Source     Model       MAC
-------------------------------------------------------
tapabdb5ccf-86 bridge     qbrabdb5ccf-86 virtio      fa:16:3e:03:f0:5b
~~~

But the OVS and Linux bridge ports are present on the destination host:

~~~
[root@destnode ~]# ovs-vsctl show | grep abdb5ccf-86
        Port "qvoabdb5ccf-86"
            Interface "qvoabdb5ccf-86"
[root@destnode ~]# brctl show | grep abdb5ccf-86
qbrabdb5ccf-86          8000.fee217ced539       no              qvbabdb5ccf-86
~~~

The OpenFlow rules have also been created on the destination host.
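
For completeness, one way to inspect those leftover OpenFlow rules on the destination host is sketched below, assuming the default OVS integration bridge name br-int (the qvo port name is the one from this test):

~~~
# find the OpenFlow port number assigned to the stale qvo interface
[root@destnode ~]# ovs-vsctl get Interface qvoabdb5ccf-86 ofport

# list any flows on the integration bridge that reference that port
[root@destnode ~]# ovs-ofctl dump-flows br-int | grep "in_port=<ofport>"
~~~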

[test #2]

1. Create an instance with 8 vCPUs and 16 GB of RAM.
2. Run "stress --cpu 5 --io 5 --timeout 600s" inside the instance and live-migrate it.
Without the memory option, the live migration finished properly with a total load of around 10.
3. Run "stress --cpu 2 --io 2 --vm 6 --timeout 600s" inside the instance and live-migrate it again.

In this case, with the memory option, the live migration fails.
The error messages are the following:

~~~
ERROR nova.compute.manager [req-36b649c8-f2e8-4297-9047-33f09ad027e8 1bb8b87bb73c4c2b9f4d7237c146ad38 621f1bb5601c4de0a121093b6f2896bd - - -] [instance: 4e0b9ede-2d08-485f-a3e5-a9cd73d06d44] Unexpected error during post live migration at destination host.
DEBUG nova.compute.manager [req-36b649c8-f2e8-4297-9047-33f09ad027e8 1bb8b87bb73c4c2b9f4d7237c146ad38 621f1bb5601c4de0a121093b6f2896bd - - -] [instance: 4e0b9ede-2d08-485f-a3e5-a9cd73d06d44] Checking state _get_power_state /usr/lib/python2.7/site-packages/nova/compute/manager.py:1331
ERROR oslo_messaging.rpc.dispatcher [req-36b649c8-f2e8-4297-9047-33f09ad027e8 1bb8b87bb73c4c2b9f4d7237c146ad38 621f1bb5601c4de0a121093b6f2896bd - - -] Exception during message handling: Instance instance-00000bf4 could not be found.
~~~

The instance has also remained in 'Error' state and is still running on the source hypervisor:

[root@sourcenode ~]# virsh list --all | grep instance-00000bf4
 50    instance-00000bf4              running

[root@sourcenode ~]# virsh domiflist instance-00000bf4
Interface  Type       Source     Model       MAC
-------------------------------------------------------
tapd1dfc419-09 bridge     qbrd1dfc419-09 virtio      fa:16:3e:78:6b:e3



But the OVS and Linux bridge ports are present on the destination host:

[root@destnode ~]# ovs-vsctl show | grep d1dfc419-09
        Port "qvod1dfc419-09"
            Interface "qvod1dfc419-09"
[root@destnode ~]# brctl show | grep d1dfc419-09
qbrd1dfc419-09          8000.fecd31b18d15       no              qvbd1dfc419-09


- Expected results:

An instance under high load can be live-migrated to the destination hypervisor.


- Additional info:

It seems that the issue appears when the instance to be migrated is under high RAM load.
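
One way to confirm this on the source hypervisor is to watch the libvirt migration job statistics while a migration attempt is in progress; on reasonably recent libvirt, the "Dirty rate" field there shows how quickly the guest is re-dirtying memory during the transfer:

~~~
# run on the source hypervisor while the live migration is in progress
[root@sourcenode ~]# virsh domjobinfo instance-00000bf4
~~~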

Comment 3 Kashyap Chamarthy 2017-09-29 13:08:17 UTC
Created attachment 1332388 [details]
QEMU command-line log for the instance instance-00000bf4 on host "cfs1lnc02"

Comment 4 Kashyap Chamarthy 2017-09-29 13:09:38 UTC
Created attachment 1332389 [details]
QEMU command-line log for the instance instance-00000bf4 on host "cfs1lnc03"

Comment 5 Kashyap Chamarthy 2017-09-29 13:15:28 UTC
If you look at the last QEMU instance launch on the host "cfs1lnc02",
in the attachment from comment #3
(https://bugzilla.redhat.com/attachment.cgi?id=1332388), you will see the
following KVM error after the migration was initiated:

    [...]
    2017-09-26 14:25:25.110+0000: initiating migration
    KVM internal error. Suberror: 3
    extra data[0]: 80000b0e
    extra data[1]: 3e
    RAX=00007ff01f800010 RBX=00007ff0107b3010 RCX=00007ff0107b3010 RDX=000000000f04d000
    RSI=0000000010001000 RDI=0000000000000000 RBP=0000000010000000 RSP=00007ffe13cb1d40
    R8 =ffffffffffffffff R9 =0000000010000000 R10=0000000000000022 R11=0000000000001000
    R12=0000000000001000 R13=00007ff0207b2010 R14=0000000000000002 R15=fffffffffffff000
    RIP=00007ff02109be60 RFL=00010206 [-----P-] CPL=3 II=0 A20=1 SMM=0 HLT=0
    ES =0000 0000000000000000 ffffffff 00000000
    CS =0033 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
    SS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
    DS =0000 0000000000000000 ffffffff 00000000
    FS =0000 00007ff02108d740 ffffffff 00000000
    GS =0000 0000000000000000 ffffffff 00000000
    LDT=0000 0000000000000000 000fffff 00000000
    TR =0040 ffff88043fd94000 00002087 00008b00 DPL=0 TSS64-busy
    GDT=     ffff88043fd89000 0000007f
    IDT=     ffffffffff529000 00000fff
    CR0=80050033 CR2=00007ff01f800010 CR3=000000041aefe000 CR4=003406e0
    DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
    DR6=00000000fffe0ff0 DR7=0000000000000400
    EFER=0000000000000d01
    Code=27 02 00 00 48 85 ed 48 89 d8 7e 19 0f 1f 84 00 00 00 00 00 <c6> 00 5a 4c 01 e0 48 89 c2 48 29 da 48 39 d5 7f ef 48 83 3c 24 00 0f 84 ad 02 00 00 7e 19
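
For reference, "Suberror: 3" corresponds to KVM_INTERNAL_ERROR_DELIVERY_EV,
i.e. KVM hit an unexpected VM exit while it was delivering an event (here a
page fault, per the analysis in the next comment) to the guest. The suberror
codes can be looked up in the kernel UAPI header on any host that has
kernel-headers installed:

    # suberror codes for "KVM internal error" messages
    $ grep 'KVM_INTERNAL_ERROR' /usr/include/linux/kvm.h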

Comment 6 Kashyap Chamarthy 2017-09-29 15:19:54 UTC
Given the error in
https://bugzilla.redhat.com/show_bug.cgi?id=1496788#c5, and after talking
to the KVM & QEMU migration maintainers, the error indicates a page fault
that was being delivered while the PML (Page Modification Log) buffer was
full.

This was fixed by the kernel commit b244c9f ("KVM: VMX: handle PML full
VMEXIT that occurs during event delivery"), which is available in RHEL 7.4
via kernel-3.10.0-581.el7.
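
To check whether a given Compute node already carries that fix, it is enough
to compare the running kernel against that build:

   # the fix is included in kernel-3.10.0-581.el7 and later
   $ uname -r
   $ rpm -q kernel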


---

Or, for an immediate workaround, you can try turning off PML (Page
Modification Logging -- a feature on newer CPUs to speed up dirty-bit
tracking during migration) in the kvm_intel kernel module on both Compute nodes:

  $ rmmod kvm-intel
  $ echo "options kvm-intel pml=n" > /etc/modprobe.d/disable-pml.conf
  $ modprobe kvm-intel

Which should result in:

   $ cat /sys/module/kvm_intel/parameters/pml
   N
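
Note that the kvm_intel module cannot be unloaded while guests are still
running on the node, so drain the Compute node (migrate or stop its
instances) before applying the workaround. The module use count can be
checked with:

   # a non-zero use count means running guests still hold kvm_intel
   $ lsmod | grep kvm_intel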

Comment 7 Kashyap Chamarthy 2017-09-29 16:01:17 UTC
I have also filed a Z-stream bug against the RHEL 7.3 kernel, requesting a backport of the related PML fixes:

     https://bugzilla.redhat.com/show_bug.cgi?id=1497275 --
     Backport PML (Page Modification Logging) related fixes to RHEL 7.3.z

Comment 10 Paolo Bonzini 2018-08-24 12:55:58 UTC
This is a kernel bug, not an openstack bug; the last batch update (EUS) for rhel-7.3 is behind us, and this is fixed in 7.4.  Closing.

Comment 11 Red Hat Bugzilla 2023-09-15 00:04:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

