Bug 2025279 - "Importer" pod fails during warm migration
Summary: "Importer" pod fails during warm migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Migration Toolkit for Virtualization
Classification: Red Hat
Component: Controller
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 2.2.0
Assignee: Fabien Dupont
QA Contact: Amos Mastbaum
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-11-21 12:03 UTC by Amos Mastbaum
Modified: 2021-12-09 19:21 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-09 19:21:13 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
GitHub konveyor/forklift-operator pull 188 (Merged): Revert "Bug 1942651: Make the boolean controller specific" - last updated 2021-11-24 10:56:07 UTC
Red Hat Product Errata RHEA-2021:5066 - last updated 2021-12-09 19:21:18 UTC

Description Amos Mastbaum 2021-11-21 12:03:47 UTC
Description of problem:
Warm migration from VMware fails.


The importer pod is crash-looping:

E1121 11:41:40.964296       1 importer.go:172] ServerFaultCode: Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000002.vmdk


Checkpoints (including the change ID) passed by MTV in 2.2.0-90, where this was found:

spec:
  checkpoints:
  - current: snapshot-8314
    previous: ""
  - current: snapshot-8315
    previous: 52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/169
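
For context, these checkpoints end up in the CDI DataVolume spec, and, as the importer log later in this bug shows, "current" is the snapshot to copy from while "previous" is consumed as the CBT change ID for the delta copy. A minimal Go sketch of building such a list for one disk, assuming a hypothetical changeIDByDisk lookup filled in when each precopy snapshot is taken (illustrative only, not MTV's actual code; the CDI API import path varies by version):

import (
	cdiv1 "kubevirt.io/containerized-data-importer-api/pkg/apis/core/v1beta1"
)

// buildCheckpoints builds the warm-import checkpoint list for a single disk.
// snapshots is the ordered list of precopy snapshot IDs; changeIDByDisk maps
// snapshot ID -> disk key -> the change ID recorded for that disk at that
// snapshot (hypothetical helper data, not an MTV structure).
func buildCheckpoints(diskKey int32, snapshots []string,
	changeIDByDisk map[string]map[int32]string) []cdiv1.DataVolumeCheckpoint {
	var checkpoints []cdiv1.DataVolumeCheckpoint
	previous := "" // an empty "previous" means a full copy on the first iteration
	for _, snap := range snapshots {
		checkpoints = append(checkpoints, cdiv1.DataVolumeCheckpoint{
			Current:  snap,
			Previous: previous,
		})
		// The next iteration's delta must use the change ID recorded for
		// *this* disk at *this* snapshot; reusing another disk's change ID
		// is exactly what fails in this bug.
		previous = changeIDByDisk[snap][diskKey]
	}
	return checkpoints
}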





Version-Release number of selected component (if applicable):
MTV 2.2.0-90

How reproducible:
100%


Steps to Reproduce:
1. Start a warm migration and wait for the second iteration.

Actual results:
The importer pod crash-loops with a "ServerFaultCode: Error caused by file" error and the warm migration fails.

Expected results:
The warm migration completes successfully.

Additional info:
Plan: http://pastebin.test.redhat.com/1009842
DV: http://pastebin.test.redhat.com/1009843
CDI deployment log attached.

Comment 1 Amos Mastbaum 2021-11-21 13:05:29 UTC
Created attachment 1842896 [details]
cd-deployment-log

Comment 2 Matthew Arnold 2021-11-22 02:37:14 UTC
I haven't seen that error before. Is there any chance you can catch an importer log before it gets deleted?

Comment 3 Amos Mastbaum 2021-11-22 08:30:19 UTC
@marnold 

This is all there is in the importer log, AFAICS:

I1122 08:28:10.625006       1 importer.go:52] Starting importer
I1122 08:28:10.625863       1 importer.go:135] begin import process
I1122 08:28:10.625876       1 vddk-datasource_amd64.go:756] Creating VDDK data source: backingFile [[v2v_general_porpuse_ISCSI_DC] mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm.vmdk], currentCheckpoint [snapshot-8319], previousCheckpoint [52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/171], finalCheckpoint [false]
I1122 08:28:10.673314       1 vddk-datasource_amd64.go:426] VM 42030def-cb7b-cd22-d401-541b110d3658 found in datacenter Datacenter:datacenter-2 @ /Datacenter: vm-8272
I1122 08:28:10.674915       1 vddk-datasource_amd64.go:283] Current VM power state: poweredOff
I1122 08:28:10.711239       1 vddk-datasource_amd64.go:304] Logged out of VMware.
E1122 08:28:10.711256       1 importer.go:172] ServerFaultCode: Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk

Comment 4 Amos Mastbaum 2021-11-22 08:34:01 UTC
(In reply to Amos Mastbaum from comment #3)

The importer log above is for the failed copy.

Comment 5 Matthew Arnold 2021-11-22 16:44:45 UTC
Here is the error from the VMware side. The change ID definitely doesn't match the snapshot, as you were saying in the other bug.

2021-11-22T15:38:59.753Z info vpxa[CB26B70] [Originator@6876 sub=vpxLro opID=69d928ce-47] [VpxLRO] -- BEGIN lro-359982 -- vm-2149 -- vim.VirtualMachine.queryChangedDiskAreas -- 526c33ce-48dd-670a-be73-133977b94f0d
2021-11-22T15:38:59.772Z error vpxa[CB26B70] [Originator@6876 sub=vpxaVmomi opID=69d928ce-47] [VpxaClientAdapter::InvokeCommon] Re-throwing method-fault 'N3Vim5Fault9FileFault9ExceptionE(vim.fault.FileFault)
--> [context]zKq7AUoCAgAAAOXL6AAidnB4YQAAHywubGlidm1hY29yZS5zbwAA4lUSAG+8DQFmT95saWJ2aW0tdHlwZXMuc28AAVHD1QKHWw1saWJ2bW9taS5zbwAC7GANA5i4FXZweGEAA+u6FQLwmg8BmSjEA6xPIANbCRoBMq6zAjcsDALaUQwDwBQpA4qoFQM7DSgDM7knA8lyFgNkAigD9R4oA9u4JwNdwCcDbncoA42VEwPJchYAE5sgAD6fIADl5iAAZKouBMpsAGxpYnB0aHJlYWQuc28uMAAFPkwNbGliYy5zby42AA==[/context]' received while invoking queryChangedDiskAreas on vim.VirtualMachine:2580
2021-11-22T15:38:59.777Z info vpxa[CB26B70] [Originator@6876 sub=vpxLro opID=69d928ce-47] [VpxLRO] -- FINISH lro-359982
2021-11-22T15:38:59.777Z info vpxa[CB26B70] [Originator@6876 sub=Default opID=69d928ce-47] [VpxLRO] -- ERROR lro-359982 -- vm-2149 -- vim.VirtualMachine.queryChangedDiskAreas: vim.fault.FileFault:
--> Result:
--> (vim.fault.FileFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "vim.hostd.vmsvc.cbt.cannotGetChanges",
-->          arg = (vmodl.KeyAnyValue) [
-->             (vmodl.KeyAnyValue) {
-->                key = "path",
-->                value = "/vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
-->             },
-->             (vmodl.KeyAnyValue) {
-->                key = "reason",
-->                value = "Nonmatching change UUID"
-->             }
-->          ],
-->          message = "Cannot compute changes for disk /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk: Nonmatching change UUID."
-->       }
-->    ],
-->    file = "/vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
-->    msg = "Received SOAP response fault from [<cs p:05879328, TCP:localhost:8307>]: queryChangedDiskAreas
--> Error caused by file /vmfs/volumes/5bfe667e-7bd9825c-0be5-b8ca3a638940/mtv-rhel8-2disks2nics-for-warm/mtv-rhel8-2disks2nics-for-warm-000003.vmdk"
--> }
--> Args:
-->
--> Arg snapshot:
--> 'vim.vm.Snapshot:2580-snapshot-14'
--> Arg deviceKey:
--> 2000
--> Arg startOffset:
--> 0
--> Arg changeId:
--> "52 4c bd 71 27 c9 0d 5c-84 52 48 a6 a9 99 ee d4/171"

Comment 6 Matthew Arnold 2021-11-23 20:44:23 UTC
It looks like I told Sam a slightly wrong way to get change IDs, so they weren't working for snapshots with multiple disks.
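
For reference, the per-disk change ID is recorded on the disk backing in the snapshot's config, so it has to be looked up per disk key rather than taken from whichever disk happens to come first. A minimal govmomi-style sketch (assuming a mo.VirtualMachineSnapshot with its Config property already retrieved; illustrative only, not the actual change in pull 188):

import (
	"github.com/vmware/govmomi/vim25/mo"
	"github.com/vmware/govmomi/vim25/types"
)

// changeIDsByDiskKey returns the CBT change ID recorded for each disk in a
// snapshot, keyed by the disk's device key. Using one disk's change ID to
// query another disk produces the "Nonmatching change UUID" fault.
func changeIDsByDiskKey(snapshot mo.VirtualMachineSnapshot) map[int32]string {
	ids := map[int32]string{}
	for _, device := range snapshot.Config.Hardware.Device {
		disk, ok := device.(*types.VirtualDisk)
		if !ok {
			continue
		}
		// Flat VMDK backing is the common case; other backing types also
		// carry a ChangeId but are omitted from this sketch.
		if backing, ok := disk.Backing.(*types.VirtualDiskFlatVer2BackingInfo); ok {
			ids[disk.Key] = backing.ChangeId
		}
	}
	return ids
}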

Comment 7 Fabien Dupont 2021-11-24 10:56:07 UTC
Please verify with mtv-operator-bundle-container-2.2.0-96 / iib:137780, or later.

Comment 8 Ilanit Stein 2021-11-25 15:24:01 UTC
This bug verification is blocked by bug 2024138.

Comment 9 Fabien Dupont 2021-12-02 14:54:51 UTC
No need to wait for bug 2024138. Warm migration will fail 100% of the time at the second iteration if this is not fixed.

Comment 10 Ilanit Stein 2021-12-06 06:30:12 UTC
Verified on MTV 2.2.0-104, based on the verification of bug 2024138 (successful VMware warm migration runs).

Comment 13 errata-xmlrpc 2021-12-09 19:21:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (MTV 2.2.0 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:5066

