Bug 1737982
Summary: | [OSP 13-15-16] dump_create: failed to get ifindex for tap* : Operation not supported | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Artem Hrechanychenko <ahrechan> | |
Component: | openvswitch | Assignee: | Assaf Muller <amuller> | |
Status: | CLOSED EOL | QA Contact: | Eran Kuris <ekuris> | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 15.0 (Stein) | CC: | aconole, agurenko, akaris, amasolov, apevec, asimonel, atragler, bcafarel, bfournie, bjacot, bshephar, cfontain, chrisbro, chris.smart, chrisw, coldford, cschimid, david_paterson, dhill, dsedgmen, dvd, ebarrera, fhallal, fiezzi, igortiunov, irichart, j.beisiegel, jhoylaer, jlibosva, juanluis.alarcon, kfida, mgarciac, michele, mvalsecc, nsatsia, rbarrott, rhos-maint, rszmigie, scohen, stchen, tredaelli, tvainio, yocha | |
Target Milestone: | --- | Keywords: | ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | openvswitch2.11-2.11.0-49.el8fdp | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1812613 1823804 (view as bug list) | Environment: | ||
Last Closed: | 2020-09-30 19:44:49 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1823804 |
Description
Artem Hrechanychenko
2019-08-06 12:18:27 UTC
[stack@undercloud-0 ~]$ sudo cat /var/log/messages | grep "ERR|Dropped 1 log messages in last 1 seconds" |wc -l 47995 [stack@undercloud-0 ~]$ sudo tail -n 10 /var/log/openvswitch/ovs-vswitchd.log 2019-08-06T12:20:30.186Z|155504|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate 2019-08-06T12:20:30.186Z|155505|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported 2019-08-06T12:20:31.186Z|155506|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate 2019-08-06T12:20:31.186Z|155507|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported 2019-08-06T12:20:32.186Z|155508|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate 2019-08-06T12:20:32.186Z|155509|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported 2019-08-06T12:20:33.186Z|155510|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate 2019-08-06T12:20:33.186Z|155511|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported 2019-08-06T12:20:34.188Z|155512|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate 2019-08-06T12:20:34.188Z|155513|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported [stack@undercloud-0 ~]$ sudo rpm -qa "*switch*" openvswitch2.11-2.11.0-14.el8fdp.x86_64 network-scripts-openvswitch2.11-2.11.0-14.el8fdp.x86_64 openvswitch-selinux-extra-policy-1.0-18.el8fdp.noarch puppet-vswitch-10.4.1-0.20190614170518.ee3e6e1.el8ost.noarch rhosp-openvswitch-2.11-0.3.el8ost.noarch It looks related to internal ports being used in network namespace. I reproduced this on RHEL8 outside of openstack: ovs-vsctl add-br mybr ovs-vsctl add-port mybr myport -- set Interface myport type=internal ip net add myns ip l s dev myport netns myns grep myport /var/log/openvswitch/ovs-vswitchd.log It doesn't happen for internal ports in the root namespace. I'm resetting assignee for this bug. The impact is an error message in ovs-vswitchd.log every second. Is there a temporary workaround to disable this message in the log until it's formally fixed? Generates lots of noise. Sorry, I responded with a private comment. I suggest to try to disable the netdev_tc_offload logs using 'ovs-appctl vlog/set netdev_offload_tc:OFF' It will need to be done for each boot, though. root.billing.ru ~ > ovs-appctl vlog/set netdev_offload_tc:OFF no destination, level, or module "netdev_offload_tc" ovs-appctl: ovs-vswitchd: server returned an error root.billing.ru ~ > ovs-appctl vlog/set netdev_tc_offloads:OFF root.billing.ru ~ > *** Bug 1833754 has been marked as a duplicate of this bug. *** This bug is related to logging and looking up ifindex information. It could generate a lot of logging errors and consume some disk space, log rotate will ensure logs don't get too large. It should not impact data traffic. The original issue is due to backporting a OVS TC/flwoer hardware offload feature with bad structure and this has been reverted, feature moved to later OVS release. The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July) There is a workaround available see comment #3, to turn off TC offload logging, this is not required when OVS offload is not being used. command will need to be run on each start / boot of openvswitch ovs-appctl vlog/set netdev_tc_offloads:OFF There is a workaround available see comment #37, to turn off TC offload logging, this is not required when OVS offload is not being used. command will need to be run on each start / boot of openvswitch ovs-appctl vlog/set netdev_tc_offloads:OFF (In reply to Anita Tragler from comment #49) > This bug is related to logging and looking up ifindex information. It could > generate a lot of logging errors and consume some disk space, log rotate > will ensure logs don't get too large. It should not impact data traffic. > The original issue is due to backporting a OVS TC/flwoer hardware offload > feature with bad structure and this has been reverted, feature moved to > later OVS release. > The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July) > > There is a workaround available see comment #3, to turn off TC offload > logging, this is not required when OVS offload is not being used. > > command will need to be run on each start / boot of openvswitch > > ovs-appctl vlog/set netdev_tc_offloads:OFF Hi Anita, Thanks for your feedback. I believe the issue here is that when you have a large number of ports, the print operation for each and every interface multiple times gets fairly expensive. The issue we are generally seeing with this is that at some stage, Openvswitch is no longer able to keep up with the requests coming from Neutron - particularly in environments with distributed virtual routing enabled. At that stage, it stops updating flows and ports which causes a loss of traffic to the tenant VM's on the hypervisor. Disabling the netdev_tc_offloads logging does indeed improve the issue, and in most cases we have verified that the impact of the issue is mitigated by this change. But to be clear, this definitely, 100% is causing traffic outages to VM's. And the only way it gets resolved is by ovs-appctl vlog/set netdev_tc_offloads:OFF So I'm glad to hear that the fix will be released in z12. I'm using something like this in the meantime. $ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.service [Unit] Description=Hack to work around OVS tc logging [Service] Type=oneshot ExecStart=/bin/ovs-appctl vlog/set netdev_tc_offloads off User=root Group=systemd-journal Requires=openvswitch.service After=openvswitch.service EOF $ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.timer [Unit] Description=Run hack to work around OVS tc logging [Timer] OnBootSec=5min OnUnitActiveSec=1h [Install] WantedBy=timers.target EOF $ sudo systemctl daemon-reload $ sudo systemctl start ovs-disable-tc-logging-hack.timer $ sudo systemctl enable ovs-disable-tc-logging-hack.timer Closing EOL, OSP 15 has been retired as of Sept 19, 2020 |