Bug 1931376
Summary: | VMs disconnected from nmstate-defined bridge after CNV-2.5.4 -> CNV-2.6.0 upgrade | |
---|---|---|---
Product: | Container Native Virtualization (CNV) | Reporter: | Inbar Rose <irose>
Component: | Networking | Assignee: | Petr Horáček <phoracek>
Status: | CLOSED ERRATA | QA Contact: | Meni Yakove <myakove>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 2.6.0 | CC: | cnv-qe-bugs, danken, fdeutsch, pelauter, phoracek, sgordon, vindicators
Target Milestone: | --- | Keywords: | Automation, Regression
Target Release: | 2.6.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | kubernetes-nmstate-handler-container-v2.6.0-23 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
Cloned To: | 1936432 (view as bug list) | Environment: |
Last Closed: | 2021-03-10 11:23:40 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1932247 | |
Bug Blocks: | 1936432 | |
Comment 4
Sylvain Réault
2021-02-22 14:57:33 UTC

Hello,

We see the same results with the "macvtap" peripheral. I used the updates-testing repos.

On my other servers this problem does not appear, but I do not use the updates-testing repos there.

Sylvain

---

Comment 5
Petr Horáček

Inbar, were those logs taken after the upgrade finished?

Both of those VMs, vm-upgrade-a and vm-upgrade-b, are on the same host. They report correct IP addresses. The bridge they are connected to is available on their node and seems healthy.

The one issue I see on your cluster is that the VMs are not getting the MAC addresses they were allocated by KubeMacPool. This is caused by the NetworkAttachmentDefinition missing the `cnv-tuning` plugin. Please adjust your NetworkAttachmentDefinitions to mimic https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/vm_networking/virt-attaching-vm-multiple-networks.html#virt-creating-bridge-nad-cli_virt-attaching-multiple-networks.

I would need access to the cluster to perform further debugging and see where the traffic gets stuck.

---

(In reply to Sylvain Réault from comment #4)
> We see the same results with the "macvtap" peripheral. I used the
> updates-testing repos.

Thanks for reporting this. Unfortunately, macvtap is not supported in OpenShift Virtualization. If you have issues with it, please open an issue on KubeVirt's GitHub: https://github.com/kubevirt/kubevirt/issues.

---

(In reply to Petr Horáček from comment #5)
> This is caused by the NetworkAttachmentDefinition missing the
> `cnv-tuning` plugin.

I enabled cnv-tuning. I will run the tests again; hopefully that solves the issue.

---

nmstate-handler version is: v2.6.0-21

After the nmstate-handler pods restarted, I lost the veth interface on the bridge.

Before the pod restart:

```
5: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel master br1test state UP mode DEFAULT group default qlen 1000
88: br1test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
90: vethca31f423@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1test state UP mode DEFAULT group default
```

After the pod restart:

```
5: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel master br1test state UP mode DEFAULT group default qlen 1000
88: br1test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
```

And there is no connectivity between the VMs.

The code from the pod:

```python
def _bring_slave_up_if_not_in_desire(self):
    """
    When a slave has been included in a master, automatically set it to
    state UP if it is not defined in the desired state.
    """
    for iface in self._ifaces.values():
        if iface.is_up and iface.is_master:
            cur_iface = self.current_ifaces.get(iface.name)
            for slave_name in iface.slaves:
                if cur_iface and slave_name in cur_iface.slaves:
                    # Nmstate should bring up the port interface if it has
                    # been added to the state, not in all transactions
                    continue
                slave_iface = self._ifaces[slave_name]
                if not slave_iface.is_desired and not slave_iface.is_up:
                    slave_iface.mark_as_up()
                    slave_iface.mark_as_changed()
```

---

I can confirm. While the veths stayed attached on a simple bridge without any NIC attached, when the bridge is attached to the host's physical NIC, the veths get disconnected.

Sound NNCP:

```yaml
interfaces:
- bridge:
    options:
      stp:
        enabled: false
  ipv4:
    dhcp: false
    enabled: false
  ipv6:
    enabled: false
  name: br1
  state: up
  type: linux-bridge
```

Failing one:

```yaml
interfaces:
- bridge:
    options:
      stp:
        enabled: false
    port:
    - name: ens9
  ipv4:
    dhcp: false
    enabled: false
  ipv6:
    enabled: false
  name: br1
  state: up
  type: linux-bridge
```

---

Failed to verify the latest fix: applying the NNCP fails.

nmstate-handler version is: v2.6.0-22

```
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==0.3.4', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 67, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 267, in apply
    args.save_to_disk,
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 289, in apply_state
    save_to_disk=save_to_disk,
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 71, in apply
    _apply_ifaces_state(plugins, net_state, verify_change, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 104, in _apply_ifaces_state
    plugin.apply_changes(net_state, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/nm/plugin.py", line 174, in apply_changes
    nm_applier.apply_changes(self.context, net_state, save_to_disk)
  File "/usr/lib/python3.6/site-packages/libnmstate/nm/applier.py", line 84, in apply_changes
    for iface in net_state.ifaces.all_ifaces.values():
AttributeError: 'Ifaces' object has no attribute 'all_ifaces'
```

Code from the nmstate-handler pod (`/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py`):

```python
def _bring_slave_up_if_not_in_desire(self):
    """
    When a slave has been included in a master, automatically set it to
    state UP if it is not defined in the desired state.
    """
    for iface in self._ifaces.values():
        if iface.is_desired and iface.is_up and iface.is_master:
            cur_iface = self.current_ifaces.get(iface.name)
            for slave_name in iface.slaves:
                if cur_iface and slave_name in cur_iface.slaves:
                    # Nmstate should bring up the port interface if it has
                    # been added to the state, not in all transactions
                    continue
                slave_iface = self._ifaces[slave_name]
                if not slave_iface.is_desired and not slave_iface.is_up:
                    slave_iface.mark_as_up()
                    slave_iface.mark_as_changed()

def _remove_unknown_interface_type_slaves(self):
    """
    When a master contains slaves with an unknown interface type or a down
    state, they should be removed from the master's slave list before
    verifying.
    """
    for iface in self._ifaces.values():
        if iface.is_up and iface.is_master and iface.slaves:
            for slave_name in iface.slaves:
                slave_iface = self._ifaces[slave_name]
                if (
                    slave_iface.type == InterfaceType.UNKNOWN
                    or slave_iface.state != InterfaceState.UP
                ):
                    iface.remove_slave(slave_name)
```

---

Verified.

nmstate-handler version is: v2.6.0-23

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update) and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799
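The behavioral difference between the two pasted versions of `_bring_slave_up_if_not_in_desire` is the added `iface.is_desired` guard on the master. A minimal toy sketch of why that guard matters; the `Iface` class and `bring_slaves_up` helper below are hypothetical stand-ins for illustration, not the real libnmstate types:

```python
from dataclasses import dataclass, field


@dataclass
class Iface:
    """Toy stand-in modeling only the attributes the method touches."""
    name: str
    is_up: bool = True
    is_desired: bool = False
    is_master: bool = False
    slaves: list = field(default_factory=list)
    changed: bool = False

    def mark_as_up(self):
        self.is_up = True

    def mark_as_changed(self):
        self.changed = True


def bring_slaves_up(ifaces, current_ifaces, require_desired_master):
    """Sketch of _bring_slave_up_if_not_in_desire; the guard added in
    v2.6.0-22 corresponds to require_desired_master=True."""
    for iface in ifaces.values():
        master_ok = iface.is_up and iface.is_master
        if require_desired_master:
            master_ok = master_ok and iface.is_desired
        if master_ok:
            cur_iface = current_ifaces.get(iface.name)
            for slave_name in iface.slaves:
                if cur_iface and slave_name in cur_iface.slaves:
                    # Port already attached in the current state: leave it be.
                    continue
                slave_iface = ifaces[slave_name]
                if not slave_iface.is_desired and not slave_iface.is_up:
                    slave_iface.mark_as_up()
                    slave_iface.mark_as_changed()


def build_state():
    # A bridge that is up but NOT part of the desired state (i.e. not
    # mentioned in the NNCP being applied), with a down veth port that is
    # absent from the current state's slave list.
    return {
        "br1test": Iface("br1test", is_master=True, slaves=["veth0"]),
        "veth0": Iface("veth0", is_up=False),
    }


buggy = build_state()
bring_slaves_up(buggy, {}, require_desired_master=False)

fixed = build_state()
bring_slaves_up(fixed, {}, require_desired_master=True)

print(buggy["veth0"].changed)  # → True: the old code touches the undesired bridge's port
print(fixed["veth0"].changed)  # → False: the fixed code leaves it alone
```

With the guard, a bridge that is up but absent from the desired state is no longer reprocessed, so its ports are not re-marked as changed on every nmstate-handler transaction.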