This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 1943613 - Report a better error message to users if they attempt to live migrate a vm after a neutron network mtu change without first hard rebooting it.
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-26 15:36 UTC by Jakub Libosvar
Modified: 2025-01-18 02:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-01-18 02:56:30 UTC
Target Upstream Version:
Embargoed:


Attachments
nova compute logs from target node (1.21 MB, text/plain)
2021-03-26 15:39 UTC, Jakub Libosvar
nova compute logs from source node (1.06 MB, text/plain)
2021-03-26 15:40 UTC, Jakub Libosvar


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 791235 0 None NEW WIP: Regression test for changing the MTU during a live migration 2024-05-13 15:28:47 UTC
OpenStack gerrit 852367 0 None NEW libvirt: Check if VIF MTU matches network MTU 2024-05-13 15:26:39 UTC
Red Hat Issue Tracker OSP-1429 0 None None None 2025-01-18 02:56:29 UTC
Red Hat Issue Tracker OSP-33494 0 None None None 2025-01-18 02:58:09 UTC

Description Jakub Libosvar 2021-03-26 15:36:45 UTC
Description of problem:
A VM cannot be live migrated after the MTU of the network its port is attached to has been changed.

Version-Release number of selected component (if applicable):
openstack-nova-compute-20.4.1-1.20200917173450.el8ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have a VM on a network A
2. openstack network set --mtu 1442 a
3. Live migrate the VM

Actual results:
Live migration fails because the interface MTU on the target node differs from the one on the source node.

Expected results:
Succeeds with the new MTU

Additional info:
2021-03-26 15:35:13.615 7 ERROR nova.virt.libvirt.driver [-] [instance: a79d5ec4-d768-4dd0-8c08-bc6f4b07e321] Live Migration failure: unsupported configuration: Target network card MTU 1442 does not match source 1450: libvirt.libvirtError: unsupported configuration: Target network card MTU 1442 does not match source 1450
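
For illustration only, here is a minimal sketch of how the raw libvirt message in the log line above could be turned into a more actionable error. The helper name `explain_mtu_mismatch` is hypothetical and is not part of nova; the only thing taken from this report is the wording of the libvirt message.

```python
import re

# Pattern copied from the libvirt message quoted in the log line above.
_MTU_MISMATCH = re.compile(
    r"Target network card MTU (\d+) does not match source (\d+)"
)


def explain_mtu_mismatch(libvirt_message):
    """Return a friendlier message, or None if this is not an MTU mismatch.

    Hypothetical helper for illustration; not part of nova.
    """
    match = _MTU_MISMATCH.search(libvirt_message)
    if match is None:
        return None
    target_mtu, source_mtu = (int(group) for group in match.groups())
    return (
        f"Live migration refused: the guest interface was created with MTU "
        f"{source_mtu}, but the network MTU is now {target_mtu}. "
        f"Hard reboot the instance to pick up the new MTU, then retry."
    )
```

Any message that does not match the pattern yields None, so a caller could fall back to the original libvirt text.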

Comment 1 Jakub Libosvar 2021-03-26 15:39:33 UTC
Created attachment 1766669
nova compute logs from target node

Comment 2 Jakub Libosvar 2021-03-26 15:40:56 UTC
Created attachment 1766671
nova compute logs from source node

Comment 6 smooney 2021-04-07 19:43:12 UTC
libvirt does not allow the MTU to be modified on a running VM, so nova cannot update the MTU when it is updated in neutron and cannot update it during a live migration.
As such, the current procedure is expected to fail. We have determined that the current behaviour is correct and the initial bug report was invalid.
This BZ has been kept open to track improving the error message and possibly enhancing the documentation related to live migrations and MTU changes.
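
Because the MTU is fixed in the domain XML of a running guest, a mismatch could in principle be detected before the migration is attempted. The following is a minimal sketch, assuming the guest XML carries libvirt's `<mtu size='...'/>` element under each interface. The helper name and the sample XML are made up for illustration, and this is not nova's code.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed-down domain XML for illustration.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mtu size='1450'/>
    </interface>
  </devices>
</domain>
"""


def mismatched_vif_mtus(domain_xml, network_mtu):
    """Return the MTUs pinned in the domain XML that differ from network_mtu.

    Hypothetical helper; assumes every interface is on the same network.
    """
    root = ET.fromstring(domain_xml)
    mismatched = []
    for mtu in root.findall("./devices/interface/mtu"):
        size = int(mtu.get("size"))
        if size != network_mtu:
            mismatched.append(size)
    return mismatched
```

A non-empty result would be the point at which to tell the user that a hard reboot is needed before live migrating.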

Comment 7 smooney 2021-04-26 12:39:34 UTC
By the way, for reference, we added the MTU to the XML in
https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877

to resolve https://bugs.launchpad.net/nova/+bug/1747496

so we cannot remove setting it in the XML or we would break jumbo frames.

If we were to remove setting it, then we would need to enhance ovs, ovn or the neutron l2 agent to manage the MTU.

We could do this by defining a new Dynamic MTU extension in neutron that is only reported when it is enabled.
The extension would work as follows: backends that support it commit to taking over management of the
interface MTU, including updating it if the network MTU changes. If nova sees the extension reported, it will
not generate the mtu elements and will delegate the management to neutron.


This would not resolve the issue for 16 or 17, but it would resolve the issue in OSP 18 or later.

For now we can keep this bug for the better error reporting, but I think this would be a viable path forward in the long term.
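
The nova side of that proposal could look roughly like the sketch below. The extension alias `dynamic-mtu` and both function names are invented for illustration; the extension is only a proposal in this comment, and this is not how nova generates its XML today.

```python
# Hypothetical alias, invented for illustration of the proposal above.
DYNAMIC_MTU_ALIAS = "dynamic-mtu"


def should_pin_mtu_in_xml(enabled_extension_aliases):
    """Return True if nova should keep writing the mtu element itself."""
    return DYNAMIC_MTU_ALIAS not in enabled_extension_aliases


def interface_mtu_element(network_mtu, enabled_extension_aliases):
    """Return the mtu element for the guest interface, or None.

    None means MTU management is delegated to the neutron backend.
    """
    if not should_pin_mtu_in_xml(enabled_extension_aliases):
        return None
    return f"<mtu size='{network_mtu}'/>"
```

Keying the decision on a reported extension keeps the current behaviour, and therefore jumbo frame support, for any backend that does not opt in.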

Comment 8 Jakub Libosvar 2022-01-26 22:22:55 UTC
(In reply to smooney from comment #7)

Wouldn't it be better to request an RFE to libvirt to be able to change MTU during live migration? With that, we can calculate that MTU is no longer valid and request a new MTU on the target node. If this can't be done, it doesn't make much sense to have an option to change MTU in the Neutron API as it breaks other features.

Comment 9 Artom Lifshitz 2022-02-01 14:57:30 UTC
(In reply to Jakub Libosvar from comment #8)
> Wouldn't it be better to request an RFE to libvirt to be able to change MTU
> during live migration? With that, we can calculate that MTU is no longer
> valid and request a new MTU on the target node. If this can't be done, it
> doesn't make much sense to have an option to change MTU in the Neutron API
> as it breaks other features.

We could, but Nova doesn't want to support that. Supporting changing the MTU for a running instance is a much larger problem (detecting network-vif-changed events, somehow handling changing the XML of a running instance by either unplugging/replugging or rebooting or something else, etc.), and we'd rather explicitly refuse it than implement just this tiny subset dealing with live migration. So in this case I think Neutron should make the MTU field read-only to avoid getting into this mess altogether.

