Bug 1948376 - Failed to migrate vm when migration encryption is enabled - new deployments
Summary: Failed to migrate vm when migration encryption is enabled - new deployments
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 4.4.6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.6
Target Release: 4.4.6.5
Assignee: Milan Zamazal
QA Contact: Qin Yuan
URL:
Whiteboard:
Depends On: 1949134
Blocks:
Reported: 2021-04-12 06:18 UTC by Qin Yuan
Modified: 2021-05-05 05:35 UTC (History)

Fixed In Version: ovirt-engine-4.4.6.5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-05 05:35:56 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+
pm-rhel: blocker?



Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 114315 | 0 | master | MERGED | ansible: Create client migration certificates | 2021-04-21 10:12:34 UTC

Description Qin Yuan 2021-04-12 06:18:40 UTC
Description of problem:
When migration encryption is enabled at the VM or Cluster level, VM migration fails with the following error in the vdsm log on the source host:

2021-04-12 08:36:42,835+0300 INFO  (migsrc/1b609da3) [virt.vm] (vmId='1b609da3-a60b-4a17-a366-1ce7ded3655b') starting migration to qemu+tls://private.redhat.com/system with miguri tcp://privateip (migration:527)
2021-04-12 08:36:50,507+0300 ERROR (migsrc/1b609da3) [virt.vm] (vmId='1b609da3-a60b-4a17-a366-1ce7ded3655b') operation failed: migration out job: Cannot write to TLS channel: Input/output error (migration:294)
2021-04-12 08:36:50,964+0300 ERROR (migsrc/1b609da3) [virt.vm] (vmId='1b609da3-a60b-4a17-a366-1ce7ded3655b') Failed to migrate (migration:460)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 441, in _regular_run
    time.time(), machineParams
  File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 530, in _startUnderlyingMigration
    self._perform_with_conv_schedule(duri, muri)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 619, in _perform_with_conv_schedule
    self._perform_migration(duri, muri)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 548, in _perform_migration
    self._migration_flags)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 159, in call
    return getattr(self._vm._dom, name)(*a, **kw)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2119, in migrateToURI3
    raise libvirtError('virDomainMigrateToURI3() failed')
libvirt.libvirtError: operation failed: migration out job: Cannot write to TLS channel: Input/output error


Version-Release number of selected component (if applicable):
vdsm-4.40.60.3-1.el8ev.x86_64
libvirt-7.0.0-13.module+el8.4.0+10604+5608c2b4.x86_64
ovirt-engine-4.4.6.3-0.8.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create and run a VM (the latest-rhel-guest-image-8.3-infra template was used).
2. Enable migration encryption at Cluster or VM level.
3. Migrate the VM.

Actual results:
1. Migration fails with the error shown above.

Expected results:
1. Migration should succeed.

Additional info:

Comment 1 Milan Zamazal 2021-04-13 10:57:40 UTC
I can reproduce the bug. QEMU on the destination complains:

qemu-kvm: Verify failed: No certificate was found.

It looks like some change in RHEL/AV 8.4; I must look into it further.

Comment 2 Milan Zamazal 2021-04-13 13:55:13 UTC
Encrypted migrations work when migrating between 8.3 hosts or from 8.4 to 8.3, but encrypted migrations to 8.4 fail.

It looks like a change or regression in the platform, so I filed a libvirt bug: BZ 1949134.

Comment 3 Milan Zamazal 2021-04-14 11:29:17 UTC
The problem is caused, as explained in BZ 1949134, by a change in the default libvirt configuration. libvirt now requires not only server migration certificates but also client migration certificates, for good reasons.
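
For illustration only, here is a minimal Python sketch of how one might check whether a certificate directory holds both the server and the client set of files. The file names are the conventional QEMU x509 credential names and are an assumption, as is the idea that the host's libvirt-migrate directory uses them unchanged. The directory path is supplied by the caller, and missing_migration_certs is a hypothetical helper, not part of vdsm or libvirt.

```python
import os

# Conventional QEMU x509 credential file names (an assumption, not taken from this report).
SERVER_FILES = ("ca-cert.pem", "server-cert.pem", "server-key.pem")
CLIENT_FILES = ("ca-cert.pem", "client-cert.pem", "client-key.pem")


def missing_migration_certs(cert_dir):
    """Return a dict listing which server and client files are absent from cert_dir."""
    def absent(names):
        # Keep only the names that do not exist inside cert_dir.
        return [n for n in names if not os.path.exists(os.path.join(cert_dir, n))]

    return {"server": absent(SERVER_FILES), "client": absent(CLIENT_FILES)}
```

On a host that has only the server set, the "client" list would name the client certificate and key, which would be consistent with the "No certificate was found" message from QEMU on the destination.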

The simplest way to remedy the problem is to reuse the migration server certificates as migration client certificates, by creating the corresponding links in the libvirt-migrate certificate directory on the host. Unless anybody objects to this solution, I'll make a patch implementing it.
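
A rough sketch of the linking idea in Python, under stated assumptions: the file names (server-cert.pem, server-key.pem, client-cert.pem, client-key.pem) are the conventional QEMU credential names rather than names confirmed in this report, the directory is passed in by the caller, and link_client_certs is a hypothetical helper. This is not the merged patch, which was done in the Ansible host deployment code.

```python
import os

# Assumed mapping: client file name -> existing server file it should point to.
CLIENT_LINKS = {
    "client-cert.pem": "server-cert.pem",
    "client-key.pem": "server-key.pem",
}


def link_client_certs(cert_dir):
    """Create missing client-* symlinks pointing at the server-* files in cert_dir.

    Returns the list of link names that were created.
    """
    created = []
    for link_name, target in CLIENT_LINKS.items():
        link_path = os.path.join(cert_dir, link_name)
        target_path = os.path.join(cert_dir, target)
        if not os.path.exists(target_path):
            # Refuse to create a dangling link.
            raise FileNotFoundError(target_path)
        if os.path.lexists(link_path):
            continue  # leave any existing file or link untouched
        # A relative target keeps the link valid if the directory is relocated.
        os.symlink(target, link_path)
        created.append(link_name)
    return created
```

Running the helper a second time on the same directory creates nothing and returns an empty list, so it can be repeated safely on hosts that already have client files.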

Comment 6 Qin Yuan 2021-04-23 04:57:18 UTC
Verified with:
ovirt-engine-4.4.6.5-0.17.el8ev.noarch
vdsm-4.40.60.5-1.el8ev.x86_64
libvirt-7.0.0-13.module+el8.4.0+10604+5608c2b4.x86_64
host kernel: kernel-4.18.0-304.el8.x86_64
guest kernel: kernel-4.18.0-240.el8.x86_64

Steps:
1. Create a Data Center with Compatibility Version 4.6 on ovirt-engine-4.4.6.5
2. Add a new Cluster
3. Add two RHEL 8.4 hosts
4. Create and run a VM:
   - Template: latest-rhel-guest-image-8.3-infra
   - Enable migration encryption
5. Migrate the VM

Result:
Migrating a VM with migration encryption enabled from one RHEL 8.4 host to another RHEL 8.4 host succeeds.

Comment 7 RHEL Program Management 2021-04-25 07:16:32 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 8 Sandro Bonazzola 2021-05-05 05:35:56 UTC
This bugzilla is included in oVirt 4.4.6 release, published on May 4th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

