Bug 2011207

Summary: Migration with container disk ":latest" images is failing when image changed
Product: Container Native Virtualization (CNV)
Reporter: Israel Pinto <ipinto>
Component: Virtualization
Assignee: Roman Mohr <rmohr>
Status: CLOSED ERRATA
QA Contact: zhe peng <zpeng>
Severity: high
Docs Contact:
Priority: medium
Version: 4.9.0
CC: cnv-qe-bugs, fdeutsch, rmohr, sgott, zpeng
Target Milestone: ---
Target Release: 4.9.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: virt-operator-container-v4.9.1-4 hco-bundle-registry-container-v4.9.1-15
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-13 19:59:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Israel Pinto 2021-10-06 09:18:13 UTC
Description of problem:
Live migration of a VM whose container disk uses a ":latest" image fails when the image behind the tag has changed.
The tag resolves to a different image when pulled on the source node and, some time later, when pulled on the destination node.


Version-Release number of selected component (if applicable):
CNV 4.9


Steps to Reproduce:
1. Create VM with ":latest" image  
2. Change the latest image 
3. Migrate the VM
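
The steps above correspond roughly to the following commands (a sketch only; the manifest name, registry path, and VM name are placeholders rather than values from this report):

$ oc create -f vm-latest.yaml                       # hypothetical manifest whose containerDisk points at <registry>/<repo>:latest
$ podman tag <new-image> <registry>/<repo>:latest   # point ":latest" at a different build
$ podman push <registry>/<repo>:latest
$ virtctl migrate <vm-name>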

Actual results:
Migration is failing with reason: 
migration of disk vda failed: Source and target image have different sizes
 

Expected results:
The migration should be blocked in this case.


Additional info:


$ oc logs virt-launcher-vm-c3-v8hnk -c compute | grep "Source and target image have different sizes" | tail

{"component":"virt-launcher","kind":"","level":"error","msg":"Live migration failed","name":"vm-c3","namespace":"default","pos":"manager.go:731","reason":"virError(Code=9, Domain=10, Message='operation failed: migration of disk vda failed: Source and target image have different sizes')","timestamp":"2021-10-05T16:04:37.674708Z","uid":"d32341c6-722d-426a-9aaa-f11c15a271ee"}


From Running VMI:

    machine:
      type: pc-q35-rhel8.2.0


Image difference between the old and the new virt-launcher pod:
$ diff virt-launcher-vm-c3-v8hnk.yaml virt-launcher-vm-c3-zm4fd.yaml | grep image
<     image: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:3211e8f15932a1a43ec9aa03a800bc690b08d3b2596afcfa9827b50ae1f5f183
>     image: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:1b951d299a36d557600bb8376d8879621f390c059c36bedd856dcb6a4ec5d6d0

Comment 1 Roman Mohr 2021-10-06 12:30:22 UTC
One option would be to disallow such migrations, as you said, but we can actually get the shasum pretty easily, so I would go directly to falling back to the shasum for the migration target pod. Here is the info from the pod status once it is started:

```
  - containerID: cri-o://51feeec858a19f95172fe1c8767dfae72a63c1b45b9b542f90b3fd7b2cc16458
    image: registry:5000/kubevirt/cirros-container-disk-demo:devel
    imageID: registry:5000/kubevirt/cirros-container-disk-demo@sha256:90e064fca2f47eabce210d218a45ba48cc7105b027d3f39761f242506cad15d6
    lastState: {}
    name: volumecontainerdisk
```

You can see what the user requested ("registry:5000/kubevirt/cirros-container-disk-demo:devel") and also the exact shasum that was pulled ("registry:5000/kubevirt/cirros-container-disk-demo@sha256:90e064fca2f47eabce210d218a45ba48cc7105b027d3f39761f242506cad15d6").
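
For reference, that digest can also be read directly from the pod status with a jsonpath query (a sketch; the pod name is a placeholder for the relevant virt-launcher pod, and the container name is the one shown above):

```
$ oc get pod <virt-launcher-pod> -o jsonpath='{.status.containerStatuses[?(@.name=="volumecontainerdisk")].imageID}'
```

This should print the imageID field shown in the snippet above.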

Comment 3 Roman Mohr 2021-11-02 16:27:47 UTC
Backport: https://github.com/kubevirt/kubevirt/pull/6716

Comment 4 zhe peng 2021-11-12 03:37:04 UTC
I can reproduce this with 4.9.0 and get the error message:
{"component":"virt-launcher","kind":"","level":"error","msg":"Recevied a live migration error. Will check the latest migration status.","name":"vm-cirros","namespace":"default","pos":"live-migration-source.go:663","reason":"error encountered during MigrateToURI3 libvirt api call: virError(Code=9, Domain=10, Message='operation failed: migration of disk vda failed: Source and target image have different sizes')","timestamp":"2021-11-12T03:07:21.470842Z","uid":"6051713a-3d3a-4b00-ae10-4a0f0c0e3f71"}

Verified with build HCO: [v4.9.1-15]
Steps:
1. Create a registry image in OpenShift.
2. Create a container disk VM with a ":latest" image from that registry.
3. Start the VM, then change the registry image with podman and push it.
4. Migrate the VM (status-check commands are sketched below).

Migration succeeded with no error message.
Moving to verified.
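
The migration from step 4 and its result can be checked with commands along these lines (a sketch only; the VMI name "vm-cirros" and the "default" namespace are taken from the log above):

$ virtctl migrate vm-cirros
$ oc get vmim -n default
$ oc get vmi vm-cirros -n default -o jsonpath='{.status.migrationState.completed}'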

Comment 10 errata-xmlrpc 2021-12-13 19:59:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 4.9.1 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:5091