Bug 1712481 - Migration fail between non-FIPS <-> FIPS enabled host
Summary: Migration fail between non-FIPS <-> FIPS enabled host
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.5
Target Release: ---
Assignee: Liran Rotenberg
QA Contact: Qin Yuan
URL:
Whiteboard:
Depends On: 1712325
Blocks: 1919809
Reported: 2019-05-21 15:16 UTC by Liran Rotenberg
Modified: 2021-03-18 15:13 UTC (History)

Fixed In Version: ovirt-engine-4.4.5.5
Clone Of:
: 1919809 (view as bug list)
Environment:
Last Closed: 2021-03-18 15:13:04 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+
pm-rhel: planning_ack+
ahadas: devel_ack+
pm-rhel: testing_ack+


Attachments
logs (427.63 KB, application/x-xz)
2019-05-21 15:16 UTC, Liran Rotenberg
SHE trying to start on FIPS after being on non-FIPS (13.59 MB, text/plain)
2019-05-27 06:51 UTC, Liran Rotenberg


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 112640 0 master MERGED core: introduce fips mode to cluster 2021-02-18 12:09:33 UTC
oVirt gerrit 112684 0 master MERGED webadmin: introduce cluster fips mode 2021-02-18 12:09:33 UTC
oVirt gerrit 112685 0 master MERGED core: handle hosts after fips update 2021-02-18 12:09:33 UTC
oVirt gerrit 112686 0 master MERGED api: introduce fips mode to cluster 2021-02-18 12:09:33 UTC
oVirt gerrit 112687 0 master MERGED api-model: add fips mode to cluster 2021-02-18 12:09:33 UTC
oVirt gerrit 113163 0 master MERGED Update to model 4.4.23, metamodel 1.3.4 2021-02-18 12:09:33 UTC

Description Liran Rotenberg 2019-05-21 15:16:20 UTC
Created attachment 1571679 [details]
logs

Description of problem:
For a VM with a VNC console (VNC or VNC+SPICE), migration between a FIPS-enabled host and a non-FIPS host fails in both directions.

From source host's log, non-FIPS to FIPS host:
2019-05-21 17:14:39,593+0300 ERROR (migsrc/cb2cf375) [virt.vm] (vmId='cb2cf375-36ef-446c-9b39-c35921f1ef65') Failed to migrate (migration:450)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
    time.time(), migrationParams, machineParams
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 507, in _startUnderlyingMigration
    self._perform_with_downtime_thread(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 584, in _perform_with_downtime_thread
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
    self._migration_flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1779, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-05-21T14:14:38.496187Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2019-05-21T14:14:38.520052Z qemu-kvm: -vnc 10.35.30.6:2,password,tls,x509=/etc/pki/vdsm/libvirt-vnc,sasl: Failed to start VNC server: VNC password auth disabled due to FIPS mode, consider using the VeNCrypt or SASL authentication methods as an alternative

From destination host's log, FIPS to non-FIPS:
2019-05-21 17:16:58,469+0300 ERROR (jsonrpc/2) [api] FINISH create error=Error creating the requested VM (api:131)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 234, in create
    "VNC is allowed to start VNC with no password auth and "
CannotCreateVM: Error creating the requested VM
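
To spot this failure quickly in a large vdsm.log, the qemu message can be searched for directly. The following is only a minimal sketch: find_fips_vnc_failures is a hypothetical helper, not part of vdsm, and it assumes that the FIPS error line follows a log record carrying the vmId, as in the excerpts above.

```python
import re

# Marker text copied from the qemu error in the log excerpt above.
FIPS_VNC_ERROR = "VNC password auth disabled due to FIPS mode"

# Matches the vmId='...' field that appears on the [virt.vm] log records.
VM_ID_RE = re.compile(r"vmId='([0-9a-f-]+)'")


def find_fips_vnc_failures(log_text):
    """Return the vmIds whose records are followed by the FIPS VNC error.

    Hypothetical helper: it remembers the most recent vmId seen and
    attributes any later FIPS VNC error line to that VM. If no vmId
    was seen before the error, None is recorded instead.
    """
    failed = []
    last_vm_id = None
    for line in log_text.splitlines():
        match = VM_ID_RE.search(line)
        if match:
            last_vm_id = match.group(1)
        if FIPS_VNC_ERROR in line and last_vm_id not in failed:
            failed.append(last_vm_id)
    return failed
```

Calling find_fips_vnc_failures(open("/var/log/vdsm/vdsm.log").read()) on the source host would list the affected VMs; the log path is the usual vdsm default and may differ on a given setup.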


Version-Release number of selected component (if applicable):
ovirt-engine-4.3.4-0.1.el7.noarch
vdsm-4.30.15-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Have an environment with a normal host (without FIPS), host1, and another host with FIPS enabled, host2.
2. Create a VM with a VNC console or a VNC+SPICE console.
3. Migrate the VM from host1 to host2.
4. Migrate the VM from host2 to host1.
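
Before running the steps, it can help to confirm which host actually has FIPS enabled. A minimal sketch, assuming a Linux host where the kernel exposes its FIPS state in /proc/sys/crypto/fips_enabled; is_fips_enabled is a hypothetical helper name, not a vdsm function.

```python
from pathlib import Path

# Standard Linux kernel flag: contains "1" when the kernel runs in FIPS mode.
FIPS_FLAG = Path("/proc/sys/crypto/fips_enabled")


def is_fips_enabled(flag_path=FIPS_FLAG):
    """Return True if the flag file exists and contains "1"."""
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        # The flag file is missing or unreadable: treat the host as non-FIPS.
        return False


if __name__ == "__main__":
    print("FIPS enabled:", is_fips_enabled())
```

Running the script on host1 and host2 should report False and True respectively for the setup described in step 1.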

Actual results:
Migration fails.

Expected results:
Migration succeeds.

Additional info:
Starting the VM is possible on host1/host2. Only migration fails.

Comment 1 Michal Skrivanek 2019-05-22 07:32:56 UTC
It should be a cluster-level policy, and "misbehaving" hosts should become non-operational.

Comment 2 Liran Rotenberg 2019-05-27 06:50:03 UTC
For the Hosted Engine VM with a VNC console, started on a non-FIPS host, migration fails as above.
Setting global maintenance mode, shutting the VM down, and starting it on the FIPS-enabled host also fails:

2019-05-27 09:48:15,143+0300 ERROR (vm/71e1cf2d) [virt.vm] (vmId='71e1cf2d-9f24-471f-871d-0c2b3fbcb70e') The vm start process failed (vm:933)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 867, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2880, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-05-27T06:48:14.341127Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2019-05-27T06:48:14.373663Z qemu-kvm: -vnc 10.35.30.6:2,password,tls,x509=/etc/pki/vdsm/libvirt-vnc,sasl: Failed to start VNC server: VNC password auth disabled due to FIPS mode, consider using the VeNCrypt or SASL authentication methods as an alternative
2019-05-27 09:48:15,144+0300 INFO  (vm/71e1cf2d) [virt.vm] (vmId='71e1cf2d-9f24-471f-871d-0c2b3fbcb70e') Changed state to Down: internal error: qemu unexpectedly closed the monitor: 2019-05-27T06:48:14.341127Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2019-05-27T06:48:14.373663Z qemu-kvm: -vnc 10.35.30.6:2,password,tls,x509=/etc/pki/vdsm/libvirt-vnc,sasl: Failed to start VNC server: VNC password auth disabled due to FIPS mode, consider using the VeNCrypt or SASL authentication methods as an alternative (code=1) (vm:1690)
2019-05-27 09:48:15,147+0300 INFO  (vm/71e1cf2d) [virt.vm] (vmId='71e1cf2d-9f24-471f-871d-0c2b3fbcb70e') Stopping connection (guestagent:455)

Comment 3 Liran Rotenberg 2019-05-27 06:51:57 UTC
Created attachment 1573769 [details]
SHE trying to start on FIPS after being on non-FIPS

Comment 4 Ryan Barry 2019-12-09 15:25:58 UTC
Still reproducible?

Comment 5 Beni Pelled 2019-12-17 11:52:24 UTC
Yes, still reproducible.

1. Install host with RHEL 7.7 and FIPS enabled.
2. Add the host to RHV 4.3.8.0-0.1.el7 (another host is connected to the same env. with FIPS disabled)
3. Then try migration as follows:

---- migration non-FIPS --> FIPS (log from source host):

2019-12-17 13:44:32,850+0200 ERROR (migsrc/9aec7b74) [virt.vm] (vmId='9aec7b74-1c75-475b-bb9a-a97aabfb8105') Failed to migrate (migration:450)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 431, in _regular_run
    time.time(), migrationParams, machineParams
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 507, in _startUnderlyingMigration
    self._perform_with_downtime_thread(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 584, in _perform_with_downtime_thread
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 525, in _perform_migration
    self._migration_flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1779, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-12-17T11:44:31.606368Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2019-12-17T11:44:31.631296Z qemu-kvm: -vnc 10.35.30.5:5,password,tls,x509=/etc/pki/vdsm/libvirt-vnc,sasl: Failed to start VNC server: VNC password auth disabled due to FIPS mode, consider using the VeNCrypt or SASL authentication methods as an alternative
2019-12-17 13:44:32,896+0200 INFO  (jsonrpc/6) [api.virt] START getMigrationStatus() from=::ffff:10.35.30.101,41166, vmId=9aec7b74-1c75-475b-bb9a-a97aabfb8105 (api:48)
2019-12-17 13:44:32,896+0200 INFO  (jsonrpc/6) [api.virt] FINISH getMigrationStatus return={'status': {'message': 'Done', 'code': 0}, 'migrationStats': {'status': {'message': 'Fatal error during migration', 'code': 12}, 'progress': 0}} from=::ffff:10.35.30.101,41166, vmId=9aec7b74-1c75-475b-bb9a-a97aabfb8105 (api:54)

---- migration FIPS --> non-FIPS (log from destination host):

2019-12-17 13:47:38,277+0200 ERROR (jsonrpc/0) [api] FINISH create error=Error creating the requested VM (api:131)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 234, in create
    "VNC is allowed to start VNC with no password auth and "
CannotCreateVM: Error creating the requested VM
2019-12-17 13:47:38,278+0200 INFO  (jsonrpc/0) [api.virt] FINISH create return={'status': {'message': 'Error creating the requested VM', 'code': 9}} from=::ffff:10.35.30.5,60548, vmId=ad67a8c2-7fc4-4470-a036-724cb85f42f8 (api:54)

Comment 6 Tomasz Barański 2020-02-17 15:49:22 UTC
Fix to bug #1712325 might have fixed it.

Comment 7 Beni Pelled 2020-05-07 06:34:32 UTC
(In reply to Tomasz Barański from comment #6)
> Fix to bug #1712325 might have fixed it.

Nope, still reproducible.

Verified with: 
- ovirt-engine-4.4.0-0.33.master.el8ev.noarch
- vdsm-4.40.13-1.el8ev.x86_64
- HOST with RHEL8.2

Comment 10 Qin Yuan 2021-02-24 12:47:21 UTC
Verified with:
ovirt-engine-4.4.5.6-0.11.el8ev.noarch

Steps:
1. Create cluster with undefined, enabled or disabled fips mode using REST API.
<cluster>
  <name>cluster1</name>
  <fips_mode>undefined</fips_mode>
  <data_center id="1"/>
</cluster>
<cluster>
  <name>cluster2</name>
  <fips_mode>enabled</fips_mode>
  <data_center id="1"/>
</cluster>
<cluster>
  <name>cluster3</name>
  <fips_mode>disabled</fips_mode>
  <data_center id="1"/>
</cluster>
2. Get clusters info using REST API, check fips mode in response content.
3. Update cluster fips mode from undefined to enabled using REST API.
4. Create a new cluster4 with fips mode "Auto Detect" on UI.
5. Add a fips enabled host1 to cluster4, check cluster4 fips mode, and host1 status.
6. Add a fips disabled host2 to cluster4, check cluster4 fips mode, and host2 status.
7. Update cluster4 fips mode to Disabled on UI.
8. Move all hosts in cluster4 into maintenance, then update cluster4 fips mode to Disabled on UI.
9. Activate all hosts in cluster4, check cluster4 fips mode, and hosts status.
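
The request bodies from step 1 and the check from step 2 can be sketched with the Python standard library alone. This is an illustration, not the test code used for verification: cluster_body and fips_mode_of are hypothetical helper names, and the bodies would be sent to the engine's clusters collection (typically a POST to /ovirt-engine/api/clusters) by whatever HTTP client is at hand.

```python
import xml.etree.ElementTree as ET

# The three values exercised in step 1.
FIPS_MODES = ("undefined", "enabled", "disabled")


def cluster_body(name, fips_mode, data_center_id):
    """Build a <cluster> request body like the ones shown in step 1."""
    if fips_mode not in FIPS_MODES:
        raise ValueError("unexpected fips_mode: %r" % (fips_mode,))
    cluster = ET.Element("cluster")
    ET.SubElement(cluster, "name").text = name
    ET.SubElement(cluster, "fips_mode").text = fips_mode
    ET.SubElement(cluster, "data_center", id=str(data_center_id))
    return ET.tostring(cluster, encoding="unicode")


def fips_mode_of(cluster_xml):
    """Extract fips_mode from a <cluster> document (step 2); None if absent."""
    return ET.fromstring(cluster_xml).findtext("fips_mode")


if __name__ == "__main__":
    for index, mode in enumerate(FIPS_MODES, start=1):
        print(cluster_body("cluster%d" % index, mode, 1))
```

For the update in step 3, the same body with only the fips_mode element changed would normally be sent as a PUT to the individual cluster resource.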

Results:
1. Could create cluster with undefined, enabled or disabled fips mode using REST API.
2. Fips mode in get response is correct.
3. Could update cluster fips mode from undefined to enabled using REST API.
4. Could create cluster with "Auto Detect" fips mode on UI.
5. cluster4 fips mode changes from "Auto Detect" to "Enabled" after the fips enabled host1 is added to it.
6. When adding the fips disabled host2 to cluster4, host2 becomes nonoperational and cluster4's fips mode is still "Enabled".
7. It fails to update cluster4 fips mode when hosts are not in maintenance mode, saying "Cannot update Cluster FIPS mode unless all Hosts attached to this Cluster are in Maintenance."
8. Could update cluster4 fips mode to "Disabled" when all hosts attached to it are moved to maintenance.
9. When activating all hosts in cluster4 after fips mode is updated to "Disabled", the fips mode disabled host2 is up, while the fips mode enabled host1 is nonoperational.

All tests work as expected, so moving this bug to verified.
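
The behaviour observed in results 5 to 9 can be summarised as a small toy model. This is only a sketch of the policy as seen from the outside, not ovirt-engine code: the Cluster class and its method names are hypothetical, and the case where a FIPS-disabled host is the first to join an "Auto Detect" cluster is an assumption, since the test above only covered a FIPS-enabled first host.

```python
class Cluster:
    """Toy model of the cluster FIPS policy; not ovirt-engine code."""

    def __init__(self, fips_mode="undefined"):
        self.fips_mode = fips_mode  # "undefined", "enabled" or "disabled"
        self.hosts = {}             # host name -> [fips_enabled, status]

    def activate(self, name, fips_enabled=None):
        """Add or re-activate a host and return its resulting status."""
        if fips_enabled is None:
            fips_enabled = self.hosts[name][0]
        host_mode = "enabled" if fips_enabled else "disabled"
        if self.fips_mode == "undefined":
            # Auto detect: adopt the mode of the first host that comes up.
            # (The "disabled" case is an assumption, see the lead-in.)
            self.fips_mode = host_mode
        status = "up" if host_mode == self.fips_mode else "nonoperational"
        self.hosts[name] = [fips_enabled, status]
        return status

    def maintenance(self, name):
        """Move a host into maintenance."""
        self.hosts[name][1] = "maintenance"

    def set_fips_mode(self, fips_mode):
        """Change the cluster mode; only allowed with all hosts in maintenance."""
        if any(status != "maintenance" for _, status in self.hosts.values()):
            raise RuntimeError(
                "Cannot update Cluster FIPS mode unless all Hosts "
                "attached to this Cluster are in Maintenance."
            )
        self.fips_mode = fips_mode
```

Replaying steps 5 to 9 against this model gives the same sequence as the results: host1 comes up and the cluster becomes "enabled", host2 is nonoperational, the update is refused until both hosts are in maintenance, and after switching to "disabled" the two hosts swap status.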

Comment 11 Sandro Bonazzola 2021-03-18 15:13:04 UTC
This bugzilla is included in oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

