Bug 1545330 - I/O latency of cinder volume after live migration increases [NEEDINFO]
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 9.0 (Mitaka)
Hardware: x86_64 Linux
Priority: high  Severity: high
Target Milestone: zstream
Target Release: 9.0 (Mitaka)
Assigned To: Lee Yarwood
QA Contact: awaugama
Keywords: TestOnly, Triaged, ZStream
Depends On: 1463897 1482921 1545324
Blocks:
Reported: 2018-02-14 11:35 EST by Lee Yarwood
Modified: 2018-10-02 14:53 EDT
CC: 24 users

See Also:
Fixed In Version: openstack-nova-13.1.4-18.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1545324
Environment:
Last Closed: 2018-10-02 14:52:25 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Flags: jamsmith: needinfo? (lyarwood)




External Trackers
Tracker                 Tracker ID      Priority  Status  Summary  Last Updated
Launchpad               1706083         None      None    None     2018-02-14 11:35 EST
OpenStack gerrit        485752          None      None    None     2018-02-14 11:35 EST
OpenStack gerrit        488959          None      None    None     2018-02-14 11:35 EST
Red Hat Product Errata  RHSA-2018:2855  None      None    None     2018-10-02 14:53 EDT

Description Lee Yarwood 2018-02-14 11:35:53 EST
+++ This bug was initially created as a clone of Bug #1545324 +++

+++ This bug was initially created as a clone of Bug #1463897 +++

Description of problem:

The I/O latency of a Cinder volume increases significantly after live migration of the instance it is attached to, and stays elevated until the VM is stopped and started again. (The VM is booted from the Cinder volume.)

This is not the case when the instance uses a disk from the Nova store backend (no Cinder volume), or at least the difference after a live migration is not nearly as large.

The backend is Ceph 2.0.

Version-Release number of selected component (if applicable):

 


How reproducible:


Steps to Reproduce:
1. Create a VM booted from a Cinder volume and live migrate it.
2. Check the I/O latency with ioping before and after the migration (see the sketch below).
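
For step 2, a minimal stdlib-only Python sketch like the one below can capture the ioping summary line so the pre- and post-migration numbers are easy to compare. It is not part of this report; the device path /dev/vdb and the "min/avg/max" summary line in ioping's output are assumptions based on the setup described here.

    #!/usr/bin/env python
    # Hedged sketch: run ioping against the attached Cinder volume inside the
    # guest and print the latency summary line, so numbers taken before and
    # after the live migration can be compared.
    # Assumptions: ioping is installed in the guest and the Cinder volume is
    # attached as /dev/vdb (as in the libvirt XML later in this bug).
    import subprocess

    DEVICE = "/dev/vdb"  # assumption: Cinder volume attached as vdb

    def ioping_latency(device=DEVICE, count=10):
        out = subprocess.check_output(
            ["ioping", "-c", str(count), device],
            universal_newlines=True)
        for line in out.splitlines():
            if "min/avg/max" in line:
                return line.strip()
        return out  # fall back to full output if the summary format differs

    if __name__ == "__main__":
        print(ioping_latency())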

Actual results:


Expected results:


Additional info:
The I/O latency of a Cinder volume attached to an instance increases significantly after live migration of that instance, and stays elevated until the VM is stopped and started again.

--- Additional comment from Kashyap Chamarthy on 2017-07-29 11:21:04 EDT ---

The patch for Git master has been merged.

Here is the upstream stable/newton backport, in progress:

    https://review.openstack.org/#/c/488959/

--- Additional comment from Kashyap Chamarthy on 2017-09-18 09:37:30 EDT ---

Verification notes for this bug:

*Without* this bug fix (from openstack-nova-14.0.8-2.el7ost), when you 
migrate a Nova instance with a Cinder volume -- where both Nova
instance's disk and the Cinder volume are on Ceph -- the cache value for 
the Cinder volume (erroneously) changes from 'writeback' to 'none':

    [Check by doing `ps -ef | grep qemu`, and look for the relevant QEMU
    process associated with the Nova instance.]

    Pre-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=writeback

    Post-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=none

*With* the bug fix (from openstack-nova-14.0.8-2.el7ost), the cache 
value for the Cinder volume should remain 'writeback':

    Pre-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=writeback

    Post-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=writeback
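
To avoid eyeballing the `ps` output by hand, a small stdlib-only sketch along these lines could report the cache= setting of every rbd-backed -drive on the host's QEMU processes. The -drive option parsing below is an assumption based on the command lines quoted above, not a nova or libvirt API.

    #!/usr/bin/env python
    # Hedged sketch: list the cache= mode of every rbd-backed -drive on the
    # running QEMU processes, automating the manual `ps -ef | grep qemu`
    # check described in these verification notes.
    import re
    import subprocess

    def rbd_drive_cache_modes():
        ps_out = subprocess.check_output(
            ["ps", "-eww", "-o", "pid=,args="], universal_newlines=True)
        for line in ps_out.splitlines():
            pid, _, args = line.strip().partition(" ")
            if "qemu" not in args:
                continue
            # e.g. -drive file=rbd:volumes/volume-...,cache=writeback,...
            for drive in re.findall(r"-drive\s+(\S+)", args):
                if not drive.startswith("file=rbd:"):
                    continue
                opts = dict(kv.split("=", 1)
                            for kv in drive.split(",") if "=" in kv)
                yield pid, opts.get("file"), opts.get("cache", "<unset>")

    if __name__ == "__main__":
        for pid, rbd, cache in rbd_drive_cache_modes():
            print("pid %s  cache=%s  %s" % (pid, cache, rbd))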

--- Additional comment from Martin Schuppert on 2018-02-14 11:25:30 EST ---

OSP8 is also affected by this:

# rpm -q openstack-nova-compute
openstack-nova-compute-12.0.6-21.el7ost.noarch

* before migration:
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='475b69d9-9ea3-4356-ac22-762b17a875e3'/>
      </auth>
      <source protocol='rbd' name='osp8-vms/9715a493-60be-4d76-9d4c-34b37dad7366_disk'>
        <host name='192.168.122.5' port='6789'/>
        <host name='192.168.122.6' port='6789'/>
        <host name='192.168.122.7' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='475b69d9-9ea3-4356-ac22-762b17a875e3'/>
      </auth>
      <source protocol='rbd' name='osp8-volumes/volume-ce556e6c-dab1-40c2-b186-762d1f8afd4e'>
        <host name='192.168.122.5' port='6789'/>
        <host name='192.168.122.6' port='6789'/>
        <host name='192.168.122.7' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <serial>ce556e6c-dab1-40c2-b186-762d1f8afd4e</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>


* after migration:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='475b69d9-9ea3-4356-ac22-762b17a875e3'/>
      </auth>
      <source protocol='rbd' name='osp8-vms/9715a493-60be-4d76-9d4c-34b37dad7366_disk'>
        <host name='192.168.122.5' port='6789'/>
        <host name='192.168.122.6' port='6789'/>
        <host name='192.168.122.7' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='475b69d9-9ea3-4356-ac22-762b17a875e3'/>
      </auth>
      <source protocol='rbd' name='osp8-volumes/volume-ce556e6c-dab1-40c2-b186-762d1f8afd4e'>
        <host name='192.168.122.5' port='6789'/>
        <host name='192.168.122.6' port='6789'/>
        <host name='192.168.122.7' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <serial>ce556e6c-dab1-40c2-b186-762d1f8afd4e</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
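
To pull just the cache attributes out of dumps like these, a short sketch such as the following could be run against `virsh dumpxml` on the source and destination computes. The standalone script form and its domain-name argument are assumptions; the XML layout it reads is exactly what is shown above.

    #!/usr/bin/env python
    # Hedged sketch: print target device, source and cache mode for each
    # <disk> of a libvirt domain, so before/after-migration XML like the
    # dumps above can be compared at a glance.
    # Usage (assumption): python disk_cache.py <domain-name>
    import subprocess
    import sys
    import xml.etree.ElementTree as ET

    def disk_cache_modes(domain):
        xml_text = subprocess.check_output(
            ["virsh", "dumpxml", domain], universal_newlines=True)
        root = ET.fromstring(xml_text)
        for disk in root.findall("./devices/disk"):
            driver = disk.find("driver")
            source = disk.find("source")
            target = disk.find("target")
            yield (
                target.get("dev") if target is not None else "?",
                (source.get("name") or source.get("file") or "?")
                if source is not None else "?",
                driver.get("cache", "<unset>") if driver is not None else "<unset>",
            )

    if __name__ == "__main__":
        for dev, src, cache in disk_cache_modes(sys.argv[1]):
            print("%-4s cache=%-10s %s" % (dev, cache, src))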

Works with the following change, which moves the _set_cache_mode() call into _get_volume_config() so that the cache mode is applied everywhere a volume's libvirt config is built, including during live migration:
# diff -u driver.py.org driver.py
--- driver.py.org       2018-02-14 11:00:23.986251918 -0500
+++ driver.py   2018-02-14 11:12:07.310126939 -0500
@@ -1074,8 +1074,10 @@
         driver.disconnect_volume(connection_info, disk_dev)
 
     def _get_volume_config(self, connection_info, disk_info):
-        driver = self._get_volume_driver(connection_info)
-        return driver.get_config(connection_info, disk_info)
+        vol_driver = self._get_volume_driver(connection_info)
+        conf = vol_driver.get_config(connection_info, disk_info)
+        self._set_cache_mode(conf)
+        return conf
 
     def _get_volume_encryptor(self, connection_info, encryption):
         encryptor = encryptors.get_volume_encryptor(connection_info,
@@ -1119,7 +1121,6 @@
             instance, CONF.libvirt.virt_type, image_meta, bdm)
         self._connect_volume(connection_info, disk_info)
         conf = self._get_volume_config(connection_info, disk_info)
-        self._set_cache_mode(conf)
 
         try:
             state = guest.get_power_state(self._host)
@@ -3489,9 +3490,6 @@
             vol['connection_info'] = connection_info
             vol.save()
 
-        for d in devices:
-            self._set_cache_mode(d)
-
         if image_meta.properties.get('hw_scsi_model'):
             hw_scsi_model = image_meta.properties.hw_scsi_model
             scsi_controller = vconfig.LibvirtConfigGuestController()
Comment 2 Lon Hohberger 2018-05-22 06:36:12 EDT
According to our records, this should be resolved by openstack-nova-13.1.4-21.el7ost.  This build is available now.
Comment 7 errata-xmlrpc 2018-10-02 14:52:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2855
