Bug 1931376
| Summary: | VMs disconnected from nmstate-defined bridge after CNV-2.5.4->CNV-2.6.0 upgrade | |||
|---|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Inbar Rose <irose> | |
| Component: | Networking | Assignee: | Petr Horáček <phoracek> | |
| Status: | CLOSED ERRATA | QA Contact: | Meni Yakove <myakove> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 2.6.0 | CC: | cnv-qe-bugs, danken, fdeutsch, pelauter, phoracek, sgordon, vindicators | |
| Target Milestone: | --- | Keywords: | Automation, Regression | |
| Target Release: | 2.6.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | kubernetes-nmstate-handler-container-v2.6.0-23 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1936432 (view as bug list) | Environment: | ||
| Last Closed: | 2021-03-10 11:23:40 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1932247 | |||
| Bug Blocks: | 1936432 | |||
|
Comment 4
Sylvain Réault
2021-02-22 14:57:33 UTC
Inbar, were those logs taken after the upgrade finished?

Both of those VMs, vm-upgrade-a and vm-upgrade-b, are on the same host. They report correct IP addresses. The bridge they are connected to is available on their node and seems healthy.

The one issue I see on your cluster is that the VMs are not getting the MAC address they were allocated by KubeMacPool. This is caused by the NetworkAttachmentDefinition missing the `cnv-tuning` plugin. Please adjust your NetworkAttachmentDefinitions to mimic https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/vm_networking/virt-attaching-vm-multiple-networks.html#virt-creating-bridge-nad-cli_virt-attaching-multiple-networks.

I would need access to the cluster to perform further debugging and see where the traffic gets stuck.

(In reply to Sylvain Réault from comment #4)
> Hello,
>
> We see the same results when using a "macvtap" device. I used the
> updates-testing repos.
>
> On my other servers this problem does not appear, but I do not use the
> updates-testing repos.
>
> Sylvain

Thanks for reporting this. Unfortunately, macvtap is not supported in OpenShift Virtualization. If you have issues with it, please open an issue on KubeVirt's GitHub: https://github.com/kubevirt/kubevirt/issues.

(In reply to Petr Horáček from comment #5)
> Inbar, were those logs taken after the upgrade finished?
>
> Both of those VMs, vm-upgrade-a and vm-upgrade-b, are on the same host. They
> report correct IP addresses. The bridge they are connected to is available
> on their node and seems healthy.
>
> The one issue I see on your cluster is that the VMs are not getting the MAC
> address they were allocated by KubeMacPool. This is caused by the
> NetworkAttachmentDefinition missing the `cnv-tuning` plugin. Please adjust
> your NetworkAttachmentDefinitions to mimic
> https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/vm_networking/virt-attaching-vm-multiple-networks.html#virt-creating-bridge-nad-cli_virt-attaching-multiple-networks
>
> I would need access to the cluster to perform further debugging and see
> where the traffic gets stuck.

I enabled cnv-tuning and will run the tests again; hopefully that solves the issue.

nmstate-handler version is: v2.6.0-21
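For reference, a bridge NetworkAttachmentDefinition that includes the `cnv-tuning` plugin, following the documentation linked above, would look roughly like the sketch below. The metadata name, the br1test bridge name, and the optional resourceName annotation are illustrative here, not taken from the affected cluster:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: br1test-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1test
spec:
  # The "cnv-tuning" entry in the plugin chain is what was missing from the
  # NetworkAttachmentDefinitions on the affected cluster.
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "br1test-network",
      "plugins": [
        { "type": "cnv-bridge", "bridge": "br1test" },
        { "type": "cnv-tuning" }
      ]
    }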
After the nmstate-handler pods restarted, I lost the veth interface for the bridge.
Before restarting the pods:
5: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel master br1test state UP mode DEFAULT group default qlen 1000
88: br1test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
90: vethca31f423@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1test state UP mode DEFAULT group default
After restarting the pods:
5: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel master br1test state UP mode DEFAULT group default qlen 1000
88: br1test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
And no connectivity between the VMs.
The code from the pod:
def _bring_slave_up_if_not_in_desire(self):
    """
    When slave been included in master, automactially set it as state UP
    if not defiend in desire state
    """
    for iface in self._ifaces.values():
        if iface.is_up and iface.is_master:
            cur_iface = self.current_ifaces.get(iface.name)
            for slave_name in iface.slaves:
                if cur_iface and slave_name in cur_iface.slaves:
                    # Nmstate should bring up the port interface if it has
                    # been added to the state not in all transactions
                    continue
                slave_iface = self._ifaces[slave_name]
                if not slave_iface.is_desired and not slave_iface.is_up:
                    slave_iface.mark_as_up()
                    slave_iface.mark_as_changed()
I can confirm this. While the veths stayed attached on a simple bridge without any NIC attached, once the bridge is attached to the host's physical NIC, the veths get disconnected.
Working NNCP:
interfaces:
- bridge:
    options:
      stp:
        enabled: false
  ipv4:
    dhcp: false
    enabled: false
  ipv6:
    enabled: false
  name: br1
  state: up
  type: linux-bridge
Failing one:
interfaces:
- bridge:
    options:
      stp:
        enabled: false
    port:
    - name: ens9
  ipv4:
    dhcp: false
    enabled: false
  ipv6:
    enabled: false
  name: br1
  state: up
  type: linux-bridge
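For context, these `interfaces` snippets are the `desiredState` of a NodeNetworkConfigurationPolicy; the failing case corresponds to a policy shaped roughly like the sketch below. The policy name is made up, and the apiVersion may differ between kubernetes-nmstate releases:

apiVersion: nmstate.io/v1beta1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br1-ens9            # illustrative name
spec:
  desiredState:
    interfaces:
    - name: br1
      type: linux-bridge
      state: up
      ipv4:
        dhcp: false
        enabled: false
      ipv6:
        enabled: false
      bridge:
        options:
          stp:
            enabled: false
        port:
        - name: ens9        # attaching the host NIC is what triggers the veth disconnect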
Failed to verify the latest fix.
Failed to apply the NNCP.
nmstate-handler version is: v2.6.0-22
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==0.3.4', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 67, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 267, in apply
    args.save_to_disk,
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 289, in apply_state
    save_to_disk=save_to_disk,
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 71, in apply
    _apply_ifaces_state(plugins, net_state, verify_change, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 104, in _apply_ifaces_state
    plugin.apply_changes(net_state, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/nm/plugin.py", line 174, in apply_changes
    nm_applier.apply_changes(self.context, net_state, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/nm/applier.py", line 84, in apply_changes
    for iface in net_state.ifaces.all_ifaces.values():
AttributeError: 'Ifaces' object has no attribute 'all_ifaces'
Code from the nmstate-handler pod:
/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py
def _bring_slave_up_if_not_in_desire(self):
    """
    When slave been included in master, automactially set it as state UP
    if not defiend in desire state
    """
    for iface in self._ifaces.values():
        if iface.is_desired and iface.is_up and iface.is_master:
            cur_iface = self.current_ifaces.get(iface.name)
            for slave_name in iface.slaves:
                if cur_iface and slave_name in cur_iface.slaves:
                    # Nmstate should bring up the port interface if it has
                    # been added to the state not in all transactions
                    continue
                slave_iface = self._ifaces[slave_name]
                if not slave_iface.is_desired and not slave_iface.is_up:
                    slave_iface.mark_as_up()
                    slave_iface.mark_as_changed()

def _remove_unknown_interface_type_slaves(self):
    """
    When master containing slaves with unknown interface type or down
    state, they should be removed from master slave list before verifying.
    """
    for iface in self._ifaces.values():
        if iface.is_up and iface.is_master and iface.slaves:
            for slave_name in iface.slaves:
                slave_iface = self._ifaces[slave_name]
                if (
                    slave_iface.type == InterfaceType.UNKNOWN
                    or slave_iface.state != InterfaceState.UP
                ):
                    iface.remove_slave(slave_name)
Verified. nmstate-handler version is: v2.6.0-23

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799