Description of problem:

During controller replacement, from /var/log/messages:

Aug 6 08:13:50 undercloud-0 ovs-vswitchd[15336]: ovs|154705|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:50 undercloud-0 ovs-vswitchd[15336]: ovs|154706|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:51 undercloud-0 ovs-vswitchd[15336]: ovs|154707|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:51 undercloud-0 ovs-vswitchd[15336]: ovs|154708|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:52 undercloud-0 ovs-vswitchd[15336]: ovs|154709|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:52 undercloud-0 ovs-vswitchd[15336]: ovs|154710|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:53 undercloud-0 ovs-vswitchd[15336]: ovs|154711|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:53 undercloud-0 ovs-vswitchd[15336]: ovs|154712|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:54 undercloud-0 ovs-vswitchd[15336]: ovs|154713|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:54 undercloud-0 ovs-vswitchd[15336]: ovs|154714|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported

[stack@undercloud-0 ~]$ sudo ovs-vsctl show
65e5a710-98d0-4f97-805d-48d4612e3d78
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port int-br-ctlplane
            Interface int-br-ctlplane
                type: patch
                options: {peer=phy-br-ctlplane}
        Port "tap22cabec5-39"
            tag: 1
            Interface "tap22cabec5-39"
                type: internal
    Bridge br-ctlplane
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
        Port "eth0"
            Interface "eth0"
        Port phy-br-ctlplane
            Interface phy-br-ctlplane
                type: patch
                options: {peer=int-br-ctlplane}
    ovs_version: "2.11.0"

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
[stack@undercloud-0 ~]$ sudo cat /var/log/messages | grep "ERR|Dropped 1 log messages in last 1 seconds" | wc -l
47995

[stack@undercloud-0 ~]$ sudo tail -n 10 /var/log/openvswitch/ovs-vswitchd.log
2019-08-06T12:20:30.186Z|155504|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:30.186Z|155505|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:31.186Z|155506|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:31.186Z|155507|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:32.186Z|155508|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:32.186Z|155509|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:33.186Z|155510|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:33.186Z|155511|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:34.188Z|155512|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:34.188Z|155513|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported

[stack@undercloud-0 ~]$ sudo rpm -qa "*switch*"
openvswitch2.11-2.11.0-14.el8fdp.x86_64
network-scripts-openvswitch2.11-2.11.0-14.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-18.el8fdp.noarch
puppet-vswitch-10.4.1-0.20190614170518.ee3e6e1.el8ost.noarch
rhosp-openvswitch-2.11-0.3.el8ost.noarch
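To see which ports are triggering the error, a rough one-liner based on the log format above (the log path may differ per deployment):

$ sudo grep 'failed to get ifindex for' /var/log/openvswitch/ovs-vswitchd.log | sed 's/.*failed to get ifindex for \(.*\): Operation not supported/\1/' | sort | uniq -c | sort -rn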
It looks related to internal ports being used in a network namespace. I reproduced this on RHEL 8 outside of OpenStack:

ovs-vsctl add-br mybr
ovs-vsctl add-port mybr myport -- set Interface myport type=internal
ip netns add myns
ip link set dev myport netns myns
grep myport /var/log/openvswitch/ovs-vswitchd.log

It doesn't happen for internal ports in the root namespace. I'm resetting the assignee for this bug. The impact is an error message in ovs-vswitchd.log every second.
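Cleanup for the reproducer, assuming the same names as above:

ip netns del myns
ovs-vsctl del-br mybr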
Is there a temporary workaround to disable this message in the log until it's formally fixed? It generates a lot of noise.
Sorry, I responded with a private comment. I suggest trying to disable the netdev_tc_offload logs with 'ovs-appctl vlog/set netdev_offload_tc:OFF'. It will need to be done on each boot, though.
root.billing.ru ~ > ovs-appctl vlog/set netdev_offload_tc:OFF
no destination, level, or module "netdev_offload_tc"
ovs-appctl: ovs-vswitchd: server returned an error
root.billing.ru ~ > ovs-appctl vlog/set netdev_tc_offloads:OFF
root.billing.ru ~ >
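The logging module name differs between OVS versions, so if one spelling is rejected as above, the registered modules can be listed to find the right one:

ovs-appctl vlog/list | grep -i offload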
*** Bug 1833754 has been marked as a duplicate of this bug. ***
This bug is related to logging and looking up ifindex information. It can generate a lot of logging errors and consume some disk space, but logrotate will ensure the logs don't get too large. It should not impact data traffic.

The original issue is due to backporting an OVS TC/flower hardware offload feature with a bad structure; this has been reverted and the feature moved to a later OVS release. The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July).

There is a workaround available (see comment #3) to turn off TC offload logging; this is not required when OVS offload is not being used. The command will need to be run on each start/boot of openvswitch:

ovs-appctl vlog/set netdev_tc_offloads:OFF
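One possible way to persist this across service restarts is a systemd drop-in. This is only a sketch, assuming the service is named openvswitch.service and that ovs-vswitchd is already answering ovs-appctl by the time ExecStartPost runs:

$ sudo mkdir -p /etc/systemd/system/openvswitch.service.d
$ cat << EOF | sudo tee /etc/systemd/system/openvswitch.service.d/disable-tc-logging.conf
[Service]
# re-apply the log level whenever openvswitch is (re)started
ExecStartPost=/usr/bin/ovs-appctl vlog/set netdev_tc_offloads:OFF
EOF
$ sudo systemctl daemon-reload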
There is a workaround available (see comment #37) to turn off TC offload logging; this is not required when OVS offload is not being used. The command will need to be run on each start/boot of openvswitch:

ovs-appctl vlog/set netdev_tc_offloads:OFF
(In reply to Anita Tragler from comment #49)
> This bug is related to logging and looking up ifindex information. It could
> generate a lot of logging errors and consume some disk space; logrotate
> will ensure the logs don't get too large. It should not impact data traffic.
> The original issue is due to backporting an OVS TC/flower hardware offload
> feature with a bad structure; this has been reverted and the feature moved
> to a later OVS release.
> The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July).
>
> There is a workaround available (see comment #3) to turn off TC offload
> logging; this is not required when OVS offload is not being used.
>
> The command will need to be run on each start/boot of openvswitch:
>
> ovs-appctl vlog/set netdev_tc_offloads:OFF

Hi Anita,

Thanks for your feedback. I believe the issue here is that when you have a large number of ports, the print operation for each and every interface, multiple times, gets fairly expensive. What we are generally seeing is that at some stage Open vSwitch is no longer able to keep up with the requests coming from Neutron, particularly in environments with distributed virtual routing enabled. At that point it stops updating flows and ports, which causes a loss of traffic to the tenant VMs on the hypervisor.

Disabling the netdev_tc_offloads logging does indeed improve the situation, and in most cases we have verified that the impact of the issue is mitigated by this change. But to be clear, this definitely, 100% is causing traffic outages to VMs, and the only way it gets resolved is by running:

ovs-appctl vlog/set netdev_tc_offloads:OFF

So I'm glad to hear that the fix will be released in z12.
I'm using something like this in the meantime.

$ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.service
[Unit]
Description=Hack to work around OVS tc logging
Requires=openvswitch.service
After=openvswitch.service

[Service]
Type=oneshot
ExecStart=/bin/ovs-appctl vlog/set netdev_tc_offloads:off
User=root
Group=systemd-journal
EOF

$ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.timer
[Unit]
Description=Run hack to work around OVS tc logging

[Timer]
OnBootSec=5min
OnUnitActiveSec=1h

[Install]
WantedBy=timers.target
EOF

$ sudo systemctl daemon-reload
$ sudo systemctl start ovs-disable-tc-logging-hack.timer
$ sudo systemctl enable ovs-disable-tc-logging-hack.timer
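To confirm the level actually got applied after the unit runs, the current vlog levels can be listed:

$ sudo ovs-appctl vlog/list | grep netdev_tc_offloads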
Closing EOL. OSP 15 has been retired as of Sept 19, 2020.