Bug 2025279

Summary: "Importer" pod fails during warm migration
Product: Migration Toolkit for Virtualization
Reporter: Amos Mastbaum <amastbau>
Component: Controller
Assignee: Fabien Dupont <fdupont>
Status: CLOSED ERRATA
QA Contact: Amos Mastbaum <amastbau>
Severity: unspecified
Docs Contact: Avital Pinnick <apinnick>
Priority: unspecified
Version: 2.2.0
CC: istein, marnold
Target Milestone: ---
Target Release: 2.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-09 19:21:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Amos Mastbaum 2021-11-21 12:03:47 UTC
Description of problem:
warm migration (VMware) fails: 


Importer Pod is Crash Looping:

E1121 11:41:40.964296       1 importer.go:172] ServerFaultCode: Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000002.vmdk


The change ID is passed by MTV in 2.2.0-90 (where this was found):

spec:
  checkpoints:
  - current: snapshot-8314
    previous: ""
  - current: snapshot-8315
    previous: 52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/169
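
The checkpoint entries above are expected to chain: after the first full copy, each subsequent entry's `previous` field carries the change ID recorded for the prior copy. A minimal illustrative check (plain Python; the list-of-dicts representation and function name are hypothetical, not MTV code):

```python
def checkpoints_chain(checkpoints):
    """Return True if every checkpoint after the first carries a
    non-empty 'previous' change ID to base its delta copy on."""
    return all(cp.get("previous") for cp in checkpoints[1:])

# Values taken from the DataVolume spec above.
cps = [
    {"current": "snapshot-8314", "previous": ""},
    {"current": "snapshot-8315",
     "previous": "52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/169"},
]
```

Here the spec itself chains correctly; the failure below comes from the change ID belonging to the wrong disk, not from a missing one.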





Version-Release number of selected component (if applicable):


How reproducible:
100%


Steps to Reproduce:
1. Start a Warm Migration and wait.

Actual results:


Expected results:


Additional info:
Plan: http://pastebin.test.redhat.com/1009842
DV: http://pastebin.test.redhat.com/1009843
CDI-Deployment log attached

Comment 1 Amos Mastbaum 2021-11-21 13:05:29 UTC
Created attachment 1842896 [details]
cd-deployment-log

Comment 2 Matthew Arnold 2021-11-22 02:37:14 UTC
I haven't seen that error before. Is there any chance you can catch an importer log before it gets deleted?

Comment 3 Amos Mastbaum 2021-11-22 08:30:19 UTC
@marnold 

This is all there is in the importer log, AFAICS:

I1122 08:28:10.625006       1 importer.go:52] Starting importer
I1122 08:28:10.625863       1 importer.go:135] begin import process
I1122 08:28:10.625876       1 vddk-datasource_amd64.go:756] Creating VDDK data source: backingFile [[v2v_general_porpuse_ISCSI_DC] mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm.vmdk], currentCheckpoint [snapshot-8319], previousCheckpoint [52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/171], finalCheckpoint [false]
I1122 08:28:10.673314       1 vddk-datasource_amd64.go:426] VM 42030def-cb7b-cd22-d401-541b110d3658 found in datacenter Datacenter:datacenter-2 @ /Datacenter: vm-8272
I1122 08:28:10.674915       1 vddk-datasource_amd64.go:283] Current VM power state: poweredOff
I1122 08:28:10.711239       1 vddk-datasource_amd64.go:304] Logged out of VMware.
E1122 08:28:10.711256       1 importer.go:172] ServerFaultCode: Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk

Comment 4 Amos Mastbaum 2021-11-22 08:34:01 UTC
(In reply to Amos Mastbaum from comment #3)
> @marnold 
> 
> This is all there is in the importer log, AFAICS:
> 
> I1122 08:28:10.625006       1 importer.go:52] Starting importer
> I1122 08:28:10.625863       1 importer.go:135] begin import process
> I1122 08:28:10.625876       1 vddk-datasource_amd64.go:756] Creating VDDK
> data source: backingFile [[v2v_general_porpuse_ISCSI_DC]
> mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm.vmdk],
> currentCheckpoint [snapshot-8319], previousCheckpoint [52 4c bd 71 27 c9 0d
> 5c-84 52 48 a6 a9 99 ee d4/171], finalCheckpoint [false]
> I1122 08:28:10.673314       1 vddk-datasource_amd64.go:426] VM
> 42030def-cb7b-cd22-d401-541b110d3658 found in datacenter
> Datacenter:datacenter-2 @ /Datacenter: vm-8272
> I1122 08:28:10.674915       1 vddk-datasource_amd64.go:283] Current VM power
> state: poweredOff
> I1122 08:28:10.711239       1 vddk-datasource_amd64.go:304] Logged out of
> VMware.
> E1122 08:28:10.711256       1 importer.go:172] ServerFaultCode: Error caused
> by file
> /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-
> warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk

This is the log for the failed copy.

Comment 5 Matthew Arnold 2021-11-22 16:44:45 UTC
Here is the error from the VMware side. The change ID definitely doesn't match up with the snapshot, like you were saying in the other bug.

2021-11-22T15:38:59.753Z info vpxa[CB26B70] [Originator@6876 sub=vpxLro opID=69d928ce-47] [VpxLRO] -- BEGIN lro-359982 -- vm-2149 -- vim.VirtualMachine.queryChangedDiskAreas -- 526c33ce-48dd-670a-be73-133977b94f0d
2021-11-22T15:38:59.772Z error vpxa[CB26B70] [Originator@6876 sub=vpxaVmomi opID=69d928ce-47] [VpxaClientAdapter::InvokeCommon] Re-throwing method-fault 'N3Vim5Fault9FileFault9ExceptionE(vim.fault.FileFault)
--> [context]zKq7AUoCAgAAAOXL6AAidnB4YQAAHywubGlidm1hY29yZS5zbwAA4lUSAG+8DQFmT95saWJ2aW0tdHlwZXMuc28AAVHD1QKHWw1saWJ2bW9taS5zbwAC7GANA5i4FXZweGEAA+u6FQLwmg8BmSjEA6xPIANbCRoBMq6zAjcsDALaUQwDwBQpA4qoFQM7DSgDM7knA8lyFgNkAigD9R4oA9u4JwNdwCcDbncoA42VEwPJchYAE5sgAD6fIADl5iAAZKouBMpsAGxpYnB0aHJlYWQuc28uMAAFPkwNbGliYy5zby42AA==[/context]' received while invoking queryChangedDiskAreas on vim.VirtualMachine:2580
2021-11-22T15:38:59.777Z info vpxa[CB26B70] [Originator@6876 sub=vpxLro opID=69d928ce-47] [VpxLRO] -- FINISH lro-359982
2021-11-22T15:38:59.777Z info vpxa[CB26B70] [Originator@6876 sub=Default opID=69d928ce-47] [VpxLRO] -- ERROR lro-359982 -- vm-2149 -- vim.VirtualMachine.queryChangedDiskAreas: vim.fault.FileFault:
--> Result:
--> (vim.fault.FileFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "vim.hostd.vmsvc.cbt.cannotGetChanges",
-->          arg = (vmodl.KeyAnyValue) [
-->             (vmodl.KeyAnyValue) {
-->                key = "path",
-->                value = "/vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
-->             },
-->             (vmodl.KeyAnyValue) {
-->                key = "reason",
-->                value = "Nonmatching change UUID"
-->             }
-->          ],
-->          message = "Cannot compute changes for disk /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk: Nonmatching change UUID."
-->       }
-->    ],
-->    file = "/vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
-->    msg = "Received SOAP response fault from [<cs p:05879328, TCP:localhost:8307>]: queryChangedDiskAreas
--> Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
--> }
--> Args:
-->
--> Arg snapshot:
--> 'vim.vm.Snapshot:2580-snapshot-14'
--> Arg deviceKey:
--> 2000
--> Arg startOffset:
--> 0
--> Arg changeId:
--> "52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/171"
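
For context, a VMware CBT change ID has the form `<change-tracking UUID>/<sequence number>`, and the "Nonmatching change UUID" fault means the UUID half passed to queryChangedDiskAreas does not match the UUID currently recorded on that disk, which is exactly what happens when one disk's change ID is applied to a different disk. A small self-contained sketch of that check (helper names are illustrative, not vSphere API calls):

```python
def parse_change_id(change_id):
    """Split a CBT change ID of the form '<uuid>/<sequence>'."""
    uuid, _, seq = change_id.rpartition("/")
    return uuid, int(seq)

def uuid_matches(requested_id, disk_current_id):
    """queryChangedDiskAreas only accepts a change ID whose UUID half
    matches the disk's current change-tracking UUID."""
    return parse_change_id(requested_id)[0] == parse_change_id(disk_current_id)[0]

# The change ID from the failing request above.
requested = "52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/171"
```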

Comment 6 Matthew Arnold 2021-11-23 20:44:23 UTC
It looks like I told Sam a slightly wrong way to get change IDs, so they weren't working for snapshots with multiple disks.
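
The failure mode can be illustrated in miniature (pure Python; the data shapes are hypothetical and this is not the actual MTV fix): reusing one disk's change ID for every disk in the snapshot hands a change ID whose UUID belongs to a different disk to queryChangedDiskAreas, producing the "Nonmatching change UUID" fault above, whereas the fix records each disk's own change ID keyed by its device key.

```python
def reused_change_ids(disks):
    # Buggy pattern: reuse the first disk's change ID for every disk.
    first = disks[0]["change_id"]
    return {d["key"]: first for d in disks}

def per_disk_change_ids(disks):
    # Correct pattern: record each disk's own change ID per device key.
    return {d["key"]: d["change_id"] for d in disks}

# Two disks, as in the mtv-rhel8-2disks2nics-for-warm VM (values made up).
disks = [
    {"key": 2000, "change_id": "uuid-disk1/171"},
    {"key": 2001, "change_id": "uuid-disk2/93"},
]
```

With the buggy pattern, disk 2001 is queried with disk 2000's UUID, so the second and subsequent disks fail exactly as in the log above.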

Comment 7 Fabien Dupont 2021-11-24 10:56:07 UTC
Please verify with mtv-operator-bundle-container-2.2.0-96 / iib:137780, or later.

Comment 8 Ilanit Stein 2021-11-25 15:24:01 UTC
Verification of this bug is blocked by bug 2024138.

Comment 9 Fabien Dupont 2021-12-02 14:54:51 UTC
No need to wait for bug 2024138. Warm migration will fail 100% of the time at the 2nd iteration if this is not fixed.

Comment 10 Ilanit Stein 2021-12-06 06:30:12 UTC
Verified on MTV-2.2.0-104, based on the verification of bug 2024138 (VMware warm migrations ran successfully).

Comment 13 errata-xmlrpc 2021-12-09 19:21:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (MTV 2.2.0 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:5066