Bug 1727871 - [V2V] VM migration fails when using slow NFS remote storage
Summary: [V2V] VM migration fails when using slow NFS remote storage
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: V2V
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.2.0
Assignee: Tomáš Golembiovský
QA Contact: Igor Braginsky
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-08 12:04 UTC by Igor Braginsky
Modified: 2020-01-20 07:15 UTC
CC: 11 users

Fixed In Version: kubevirt-v2v-conversion-container-v2.1.0-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-13 05:40:21 UTC
Target Upstream Version:
Embargoed:



Description Igor Braginsky 2019-07-08 12:04:43 UTC
Description of problem:
VM migration fails when using a slow remote NFS share as target storage: the conversion pod is terminated before it finishes preparing, so the migration never completes.

Version-Release number of selected component (if applicable): v2.0.0-33


How reproducible: 100%


Steps to Reproduce:
1. Create NFS PVs
2. Create the VDDK PVC on those PVs (a sketch of the manifests follows these steps)
3. Run a migration, selecting the NFS share as storage
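
A minimal sketch of the manifests behind steps 1-2, reusing the names, sizes, access modes, and storage class visible in the `oc get pv`/`oc get pvc` output below; the NFS server address and export path are hypothetical placeholders:

$ oc apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsshare-nfs-vol3
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: nfs-sc
  nfs:
    server: 172.16.0.100      # hypothetical NFS server address
    path: /exports/vol3       # hypothetical export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vddk-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-sc
  resources:
    requests:
      storage: 20Gi
EOF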

Actual results:
The conversion pod goes into a Failed state and the migration cannot be completed.

Expected results:
The migration should complete successfully.

Additional info:
$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                               STORAGECLASS   REASON   AGE
local-pv-19e5241f   10Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-3217bf2c   25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-527a68b6   25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-9116487e   25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-9bdd9589   25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-a0a6c2b6   10Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-a794aed    25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-a85c2833   25Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-b74f0dfd   10Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-ba561a78   10Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-c507043c   10Gi       RWO            Delete           Available                                       local-sc                46h
local-pv-c58b5d7    10Gi       RWO            Delete           Available                                       local-sc                25h
nfsshare-nfs-vol1   20Gi       RWX            Recycle          Available                                       nfs-sc                  25h
nfsshare-nfs-vol2   20Gi       RWX            Recycle          Available                                       nfs-sc                  25h
nfsshare-nfs-vol3   20Gi       RWX            Recycle          Bound       default/vddk-pvc                    nfs-sc                  25h
nfsshare-nfs-vol4   20Gi       RWX            Recycle          Available                                       nfs-sc                  25h
nfsshare-nfs-vol5   20Gi       RWX            Recycle          Bound       default/v2v-conversion-temp-ks7gg   nfs-sc                  25h
nfsshare-nfs-vol6   20Gi       RWX            Recycle          Bound       default/hd1-m72xs                   nfs-sc                  25h

$ oc get pvc
NAME                        STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
hd1-m72xs                   Bound    nfsshare-nfs-vol6   20Gi       RWX            nfs-sc         3h16m
v2v-conversion-temp-ks7gg   Bound    nfsshare-nfs-vol5   20Gi       RWX            nfs-sc         3h16m
vddk-pvc                    Bound    nfsshare-nfs-vol3   20Gi       RWX            nfs-sc         25h

$ oc describe pod kubevirt-v2v-conversion-kqghd
Name:               kubevirt-v2v-conversion-kqghd
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               host-172-16-0-41/172.16.0.41
Start Time:         Thu, 04 Jul 2019 08:43:42 +0000
Labels:             <none>
Annotations:        k8s.v1.cni.cncf.io/networks-status:
Status:             Failed
IP:                 10.129.2.57
Containers:
  kubevirt-v2v-conversion:
    Container ID:   cri-o://f9a7ae0cb810bb54b8efdacefa5c55d553491bba5ca4546467a74fc9548f2788
    Image:          brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/container-native-virtualization/kubevirt-v2v-conversion:v2.0.0-14.8
    Image ID:       brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/container-native-virtualization/kubevirt-v2v-conversion@sha256:920797c3081212fad2c3459fd56d64d21951c3d4b0effb24edab12400f73ca20
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    142
      Started:      Thu, 04 Jul 2019 08:43:56 +0000
      Finished:     Thu, 04 Jul 2019 08:54:20 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data/input from configuration (rw)
      /data/vddklib from vddk-pvc (rw)
      /data/vm/disk1 from hd1 (rw)
      /dev/kvm from kvm (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kubevirt-v2v-conversion-thskp-token-rfdts (ro)
      /var/tmp from v2v-conversion-temp (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  configuration:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubevirt-v2v-conversion-snm6l
    Optional:    false
  kvm:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/kvm
    HostPathType:  
  hd1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hd1-m72xs
    ReadOnly:   false
  v2v-conversion-temp:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  v2v-conversion-temp-ks7gg
    ReadOnly:   false
  vddk-pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  vddk-pvc
    ReadOnly:   false
  kubevirt-v2v-conversion-thskp-token-rfdts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubevirt-v2v-conversion-thskp-token-rfdts
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

$ oc logs kubevirt-v2v-conversion-kqghd
+ VDDK=/opt/vmware-vix-disklib-distrib/
+ ls -l /usr/lib64/nbdkit/plugins/nbdkit-vddk-plugin.so
-rwxr-xr-x. 1 root root 15520 Nov  8  2018 /usr/lib64/nbdkit/plugins/nbdkit-vddk-plugin.so
+ ls -ld /opt/vmware-vix-disklib-distrib/
drwxr-xr-x. 7 root root 4096 Oct  8  2018 /opt/vmware-vix-disklib-distrib/
++ find /opt/vmware-vix-disklib-distrib/ -name libvixDiskLib.so.6
+ lib=/opt/vmware-vix-disklib-distrib/lib64/libvixDiskLib.so.6
++ dirname /opt/vmware-vix-disklib-distrib/lib64/libvixDiskLib.so.6
+ LD_LIBRARY_PATH=/opt/vmware-vix-disklib-distrib/lib64
+ nbdkit --dump-plugin vddk
path=/usr/lib64/nbdkit/plugins/nbdkit-vddk-plugin.so
name=vddk
version=1.2.6
api_version=1
struct_size=200
thread_model=serialize_all_requests
errno_is_preserved=0
has_longname=1
has_load=1
has_unload=1
has_dump_plugin=1
has_config=1
has_config_complete=1
has_config_help=1
has_open=1
has_close=1
has_get_size=1
has_pread=1
has_pwrite=1
vddk_default_libdir=/usr/lib64/vmware-vix-disklib
vddk_has_nfchostport=1
+ LIBGUESTFS_BACKEND=direct
+ libguestfs-test-tool
     ************************************************************
     *                    IMPORTANT NOTICE
     *
     * When reporting bugs, include the COMPLETE, UNEDITED
     * output below in your bug report.
     *
     ************************************************************
LIBGUESTFS_BACKEND=direct
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
SELinux: Disabled
guestfs_get_append: (null)
guestfs_get_autosync: 1
guestfs_get_backend: direct
guestfs_get_backend_settings: []
guestfs_get_cachedir: /var/tmp
guestfs_get_hv: /usr/libexec/qemu-kvm
guestfs_get_memsize: 500
guestfs_get_network: 0
guestfs_get_path: /usr/lib64/guestfs
guestfs_get_pgroup: 0
guestfs_get_program: libguestfs-test-tool
guestfs_get_recovery_proc: 1
guestfs_get_smp: 1
guestfs_get_sockdir: /tmp
guestfs_get_tmpdir: /tmp
guestfs_get_trace: 0
guestfs_get_verbose: 1
host_cpu: x86_64
Launching appliance, timeout set to 600 seconds.
libguestfs: launch: program=libguestfs-test-tool
libguestfs: launch: version=1.38.2rhel=7,release=12.100.lp.el7ev,libvirt
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: libvirt
libguestfs: launch: backend registered: direct
libguestfs: launch: backend=direct
libguestfs: launch: tmpdir=/tmp/libguestfsexZMZq
libguestfs: launch: umask=0022
libguestfs: launch: euid=0
libguestfs: begin building supermin appliance
libguestfs: run supermin
libguestfs: command: run: /usr/bin/supermin5
libguestfs: command: run: \ --build
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ --if-newer
libguestfs: command: run: \ --lock /var/tmp/.guestfs-0/lock
libguestfs: command: run: \ --copy-kernel
libguestfs: command: run: \ -f ext2
libguestfs: command: run: \ --host-cpu x86_64
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ -o /var/tmp/.guestfs-0/appliance.d
supermin: version: 5.1.19
supermin: rpm: detected RPM version 4.11
supermin: package handler: fedora/rpm
supermin: acquiring lock on /var/tmp/.guestfs-0/lock
supermin: build: /usr/lib64/guestfs/supermin.d
supermin: reading the supermin appliance
supermin: build: visiting /usr/lib64/guestfs/supermin.d/base.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/daemon.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/excludefiles type uncompressed excludefiles
supermin: build: visiting /usr/lib64/guestfs/supermin.d/hostfiles type uncompressed hostfiles
supermin: build: visiting /usr/lib64/guestfs/supermin.d/init.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/packages type uncompressed packages
supermin: build: visiting /usr/lib64/guestfs/supermin.d/udev-rules.tar.gz type gzip base image (tar)
supermin: build: visiting /usr/lib64/guestfs/supermin.d/zz-winsupport.tar.gz type gzip base image (tar)
supermin: mapping package names to installed packages
BDB2053 Freeing read locks for locker 0x64: 9/140660052699264
BDB2053 Freeing read locks for locker 0x66: 9/140660052699264
BDB2053 Freeing read locks for locker 0x67: 9/140660052699264
BDB2053 Freeing read locks for locker 0x68: 9/140660052699264
supermin: resolving full list of package dependencies
supermin: build: 208 packages, including dependencies
supermin: build: 31986 files
supermin: build: 7696 files, after matching excludefiles
supermin: build: 7702 files, after adding hostfiles
supermin: build: 7676 files, after removing unreadable files
supermin: build: 7701 files, after munging
supermin: kernel: looking for kernel using environment variables ...
supermin: kernel: looking for kernels in /lib/modules/*/vmlinuz ...
supermin: kernel: looking for kernels in /boot ...
supermin: kernel: kernel version of /boot/vmlinuz-3.10.0-957.21.3.el7.x86_64 = 3.10.0-957.21.3.el7.x86_64 (from content)
supermin: kernel: picked modules path /lib/modules/3.10.0-957.21.3.el7.x86_64
supermin: kernel: picked vmlinuz /boot/vmlinuz-3.10.0-957.21.3.el7.x86_64
supermin: kernel: kernel_version 3.10.0-957.21.3.el7.x86_64
supermin: kernel: modpath /lib/modules/3.10.0-957.21.3.el7.x86_64
supermin: ext2: creating empty ext2 filesystem '/var/tmp/.guestfs-0/appliance.d.p0lf0anr/root'
supermin: ext2: populating from base image
supermin: ext2: copying files from host filesystem
/usr/local/bin/entrypoint: line 18:    14 Alarm clock             LIBGUESTFS_BACKEND='direct' libguestfs-test-tool

Comment 1 Tomas Jelinek 2019-07-08 12:56:31 UTC
The problem is the 10-minute timeout within which the conversion pod must finish preparing. On faster NFS storage it manages to do so; on slower storage it does not.
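
For reference, the pod's exit code 142 in the `oc describe pod` output above decodes as 128 + 14, i.e. the process was terminated by SIGALRM (signal 14), which matches the "Alarm clock" message at the end of the pod log. A minimal shell sketch of that failure mode, assuming a watchdog alarm kills the preparation step; the real entrypoint's timeout mechanism may differ:

$ sleep 600 &                 # stands in for the slow appliance preparation
$ pid=$!
$ kill -ALRM $pid             # the watchdog alarm fires before the work finishes
$ wait $pid; echo "exit: $?"  # prints "exit: 142" (128 + SIGALRM)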

Comment 3 Nelly Credi 2019-07-10 11:47:12 UTC
This bug was moved by mistake.

Comment 7 Daniel Gur 2019-07-15 07:43:42 UTC
Close: no. Retarget: probably yes.
Restoring the needinfo on tgolembi.

Comment 8 Tomas Jelinek 2019-07-15 10:18:37 UTC
Slower storage certainly affects VM performance, but nothing else would time out, so in this sense it is a v2v-only issue.

Targeting to 2.1

Comment 17 Igor Braginsky 2019-09-11 10:58:09 UTC
@Brett, I will check whether the environment is functional; if not, I will need to build one.
I can do that without any problem; it will just take some time.

Comment 19 Igor Braginsky 2019-09-18 12:26:04 UTC
I don't have an environment in which to test this: all my deployments are failing, and even when a deployment succeeds, migration doesn't work due to V2V issues.
Do you have a working environment where I can add NFS storage and try a migration?

Comment 21 Fabien Dupont 2019-09-18 13:44:54 UTC
From the IMS point of view, we don't consider this BZ a blocker.

Comment 22 Tomáš Golembiovský 2019-09-18 13:46:37 UTC
Adding a needinfo on QE (for tracking).

Comment 23 Federico Simoncelli 2019-09-19 09:12:04 UTC
(In reply to Fabien Dupont from comment #21)
> From the IMS point of view, we don't consider this BZ a blocker.

If everyone agrees, can anyone move this out of 2.1.0? (Either to 2.1.1 or 2.2)
Thank you.

Comment 24 Brett Thurber 2019-09-19 13:46:53 UTC
(In reply to Federico Simoncelli from comment #23)
> (In reply to Fabien Dupont from comment #21)
> > From the IMS point of view, we don't consider this BZ a blocker.
> 
> If everyone agrees, can anyone move this out of 2.1.0? (Either to 2.1.1 or
> 2.2)
> Thank you.

Will make the call on this next week.

Comment 26 Brett Thurber 2019-10-21 18:40:41 UTC
Moving this back to ON_QA for additional testing with the new conversion pod.

Comment 27 Nelly Credi 2019-11-11 12:30:54 UTC
Please add the Fixed In Version.

Comment 29 Brett Thurber 2020-01-13 05:40:21 UTC
Closing, as NFS is not supported storage for CNV.

