This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2247593 - Live Migration fails after volume hotplug
Summary: Live Migration fails after volume hotplug
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.14.1
Assignee: Alex Kalenyuk
QA Contact: Jenia Peimer
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-11-02 07:17 UTC by Ying Cui
Modified: 2023-12-14 16:16 UTC (History)

Fixed In Version: CNV v4.14.1.rhel9-11
Doc Type: Known Issue
Doc Text:
Live migration cannot be enabled for a virtual machine instance (VMI) after a hotplug volume has been added and removed. (BZ#2247593)
Clone Of:
Environment:
Last Closed: 2023-12-14 16:16:32 UTC
Target Upstream Version:
Embargoed:


Attachments
live-migration-failure-script.sh (1.61 KB, text/plain), 2023-11-02 07:20 UTC, Ying Cui


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt kubevirt pull 10689 | 0 | None | open | cgroupsv2: reconstruct device allowlist/drop internal device allow list state | 2023-11-06 13:29:46 UTC
Red Hat Issue Tracker | CNV-34724 | 0 | None | None | None | 2023-12-14 16:16:31 UTC
Red Hat Issue Tracker | CNV-34761 | 0 | None | None | None | 2023-11-02 07:34:40 UTC

Description Ying Cui 2023-11-02 07:17:07 UTC
Original issue in Jira: https://issues.redhat.com/browse/CNV-34724. Moving it to Bugzilla because we are still using Bugzilla to report bugs for CNV 4.14.0. We will use Jira for all components starting with CNV 4.15.


Description of problem:

Live migration is no longer possible after a VMI has a hotplug volume added and removed. 

While this has nothing to do directly with hypershift/kubevirt, this issue was discovered while testing hypershift/kubevirt due to our usage of hotplug to provide volumes to the worker node VMs. Once we use kubevirt-csi to hotplug a volume and then we remove that volume, we noticed the VMIs failed to live migrate later on.

It is trivial to reproduce this error outside of hypershift/kubevirt using a Fedora VM.


Version-Release number of selected component (if applicable):

CNV 4.14


How reproducible:

100%


Steps to Reproduce:

See the live-migration-failure-script.sh script attached to this issue to reproduce this easily. Below are the general steps that script performs.

This was reproduced using ODF 4.13 on OCP 4.14.0 with the latest CNV 4.14.0 pre-release.

1. Create a VM and start it.
2. Live migrate the VMI to prove it is live migratable.
3. Add a hotplug volume (RWX) to the VMI.
4. Remove the hotplug volume from the VMI.
5. Live migration is now permanently broken for the VMI.
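
The steps above can be sketched as a small Python helper that builds the corresponding `virtctl` invocations. This is not the attached live-migration-failure-script.sh, whose contents are not reproduced here. The VM name `test-vm` and namespace `default` match the log entry in this report, while the volume name `hotplug-rwx` is a hypothetical placeholder for an existing RWX DataVolume or PVC. Creating the VM and waiting for each migration or hotplug operation to finish are left out.

```python
import shlex
import subprocess


def reproduction_commands(vm="test-vm", volume="hotplug-rwx", namespace="default"):
    """Return the virtctl commands for steps 1-5, in order.

    `volume` is a hypothetical name for an existing RWX DataVolume or PVC.
    """
    ns = ["--namespace", namespace]
    return [
        ["virtctl", "start", vm, *ns],                                    # step 1 (VM already created)
        ["virtctl", "migrate", vm, *ns],                                  # step 2: baseline migration
        ["virtctl", "addvolume", vm, f"--volume-name={volume}", *ns],     # step 3: hotplug
        ["virtctl", "removevolume", vm, f"--volume-name={volume}", *ns],  # step 4: unplug
        ["virtctl", "migrate", vm, *ns],                                  # step 5: fails on affected builds
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(reproduction_commands())
```

By default the helper only prints the commands. A real run would need to wait between steps until the migration has completed and the volume is reported as attached or detached.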



Actual results:

Live migration fails after adding and removing an RWX hotplug volume.



Expected results:

Live migration should continue to work after hotplug



Additional info:

The qemu log on the source pod reports the following after the migration fails.

{code:java}
#2023-10-30T19:53:14.880648Z qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
#2023-10-30T19:53:15.053395Z qemu-kvm: Unable to read from socket: Bad file descriptor
#2023-10-30T19:53:15.053477Z qemu-kvm: Unable to read from socket: Bad file descriptor
#2023-10-30T19:53:15.053500Z qemu-kvm: Unable to read from socket: Bad file descriptor
{code}
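
When checking whether another VMI hits the same failure, the qemu log on the source pod can be scanned for the `bdrv_inactivate_all() failed` line. The following is a minimal sketch; the regular expression is an assumption based only on the lines quoted in this report, and the function name `find_inactivate_failures` is made up for illustration.

```python
import re

# Matches lines such as:
# 2023-10-30T19:53:14.880648Z qemu-kvm: <function>: bdrv_inactivate_all() failed (-1)
INACTIVATE_FAILED = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+Z) qemu-kvm: "
    r".*bdrv_inactivate_all\(\) failed \((?P<rc>-?\d+)\)"
)


def find_inactivate_failures(log_text):
    """Return (timestamp, return code) for each bdrv_inactivate_all() failure."""
    return [(m["ts"], int(m["rc"])) for m in INACTIVATE_FAILED.finditer(log_text)]


# Two of the lines quoted in this report.
sample = (
    "#2023-10-30T19:53:14.880648Z qemu-kvm: "
    "qemu_savevm_state_complete_precopy_non_iterable: "
    "bdrv_inactivate_all() failed (-1)\n"
    "#2023-10-30T19:53:15.053395Z qemu-kvm: "
    "Unable to read from socket: Bad file descriptor\n"
)

print(find_inactivate_failures(sample))
# prints [('2023-10-30T19:53:14.880648Z', -1)]
```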

The libvirt logs on the source only indicate that the migration failed due to an unexpected error.

{code:java}
{"component":"virt-launcher","kind":"","level":"error","msg":"Live migration failed.","name":"test-vm","namespace":"default","pos":"live-migration-source.go:718","reason":"error encountered during MigrateToURI3 libvirt api call: virError(Code=1, Domain=7, Message='internal error: client socket is closed')","timestamp":"2023-10-31T20:03:26.521711Z","uid":"388fd212-9187-488b-9989-43d2f19368f1"}
{code}
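
The virt-launcher entry above is a single JSON object, so the libvirt error can be pulled out of the `reason` field programmatically. This is a minimal sketch: the `virError(...)` pattern is inferred from this one log line and may not cover other error formats, and the function name `summarize` is hypothetical.

```python
import json
import re

# The virt-launcher log entry quoted in this report.
LOG_LINE = '''{"component":"virt-launcher","kind":"","level":"error","msg":"Live migration failed.","name":"test-vm","namespace":"default","pos":"live-migration-source.go:718","reason":"error encountered during MigrateToURI3 libvirt api call: virError(Code=1, Domain=7, Message='internal error: client socket is closed')","timestamp":"2023-10-31T20:03:26.521711Z","uid":"388fd212-9187-488b-9989-43d2f19368f1"}'''

# Pattern inferred from the entry above (an assumption, not a documented format).
VIR_ERROR = re.compile(r"virError\(Code=(\d+), Domain=(\d+), Message='(.*)'\)")


def summarize(line):
    """Extract the VMI identity and the embedded libvirt error from one log line."""
    entry = json.loads(line)
    m = VIR_ERROR.search(entry.get("reason", ""))
    return {
        "vmi": f'{entry["namespace"]}/{entry["name"]}',
        "msg": entry["msg"],
        "libvirt_code": int(m.group(1)) if m else None,
        "libvirt_domain": int(m.group(2)) if m else None,
        "libvirt_message": m.group(3) if m else None,
    }


print(summarize(LOG_LINE))
```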

Comment 1 Ying Cui 2023-11-02 07:20:54 UTC
Created attachment 1996690
live-migration-failure-script.sh

Comment 2 Jenia Peimer 2023-12-05 10:48:30 UTC
Verified on CNV-v4.14.1.rhel9-62

