Description of problem:

During controller replacement, from /var/log/messages:

Aug 6 08:13:50 undercloud-0 ovs-vswitchd[15336]: ovs|154705|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:50 undercloud-0 ovs-vswitchd[15336]: ovs|154706|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:51 undercloud-0 ovs-vswitchd[15336]: ovs|154707|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:51 undercloud-0 ovs-vswitchd[15336]: ovs|154708|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:52 undercloud-0 ovs-vswitchd[15336]: ovs|154709|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:52 undercloud-0 ovs-vswitchd[15336]: ovs|154710|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:53 undercloud-0 ovs-vswitchd[15336]: ovs|154711|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:53 undercloud-0 ovs-vswitchd[15336]: ovs|154712|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
Aug 6 08:13:54 undercloud-0 ovs-vswitchd[15336]: ovs|154713|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Aug 6 08:13:54 undercloud-0 ovs-vswitchd[15336]: ovs|154714|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported

[stack@undercloud-0 ~]$ sudo ovs-vsctl show
65e5a710-98d0-4f97-805d-48d4612e3d78
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port int-br-ctlplane
            Interface int-br-ctlplane
                type: patch
                options: {peer=phy-br-ctlplane}
        Port "tap22cabec5-39"
            tag: 1
            Interface "tap22cabec5-39"
                type: internal
    Bridge br-ctlplane
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
        Port "eth0"
            Interface "eth0"
        Port phy-br-ctlplane
            Interface phy-br-ctlplane
                type: patch
                options: {peer=int-br-ctlplane}
    ovs_version: "2.11.0"

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
[stack@undercloud-0 ~]$ sudo cat /var/log/messages | grep "ERR|Dropped 1 log messages in last 1 seconds" | wc -l
47995

[stack@undercloud-0 ~]$ sudo tail -n 10 /var/log/openvswitch/ovs-vswitchd.log
2019-08-06T12:20:30.186Z|155504|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:30.186Z|155505|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:31.186Z|155506|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:31.186Z|155507|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:32.186Z|155508|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:32.186Z|155509|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:33.186Z|155510|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:33.186Z|155511|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported
2019-08-06T12:20:34.188Z|155512|netdev_tc_offloads(revalidator57)|ERR|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2019-08-06T12:20:34.188Z|155513|netdev_tc_offloads(revalidator57)|ERR|dump_create: failed to get ifindex for tap22cabec5-39: Operation not supported

[stack@undercloud-0 ~]$ sudo rpm -qa "*switch*"
openvswitch2.11-2.11.0-14.el8fdp.x86_64
network-scripts-openvswitch2.11-2.11.0-14.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-18.el8fdp.noarch
puppet-vswitch-10.4.1-0.20190614170518.ee3e6e1.el8ost.noarch
rhosp-openvswitch-2.11-0.3.el8ost.noarch
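To see which ports are triggering the error, a rough one-liner based on the log format above (the log path may differ per deployment):

$ sudo grep 'failed to get ifindex for' /var/log/openvswitch/ovs-vswitchd.log | sed 's/.*failed to get ifindex for \(.*\): Operation not supported/\1/' | sort | uniq -c | sort -rn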
It looks related to internal ports being used in a network namespace. I reproduced this on RHEL 8 outside of OpenStack:

ovs-vsctl add-br mybr
ovs-vsctl add-port mybr myport -- set Interface myport type=internal
ip netns add myns
ip link set dev myport netns myns
grep myport /var/log/openvswitch/ovs-vswitchd.log

It doesn't happen for internal ports in the root namespace. I'm resetting the assignee for this bug. The impact is an error message in ovs-vswitchd.log every second.
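Cleanup for the reproducer, assuming the same names as above:

ip netns del myns
ovs-vsctl del-br mybr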
Is there a temporary workaround to disable this message in the log until it's formally fixed? It generates a lot of noise.
Sorry, I responded with a private comment. I suggest trying to disable the netdev_tc_offload logs with 'ovs-appctl vlog/set netdev_offload_tc:OFF'. It will need to be done on each boot, though.
root.billing.ru ~ > ovs-appctl vlog/set netdev_offload_tc:OFF
no destination, level, or module "netdev_offload_tc"
ovs-appctl: ovs-vswitchd: server returned an error
root.billing.ru ~ > ovs-appctl vlog/set netdev_tc_offloads:OFF
root.billing.ru ~ >
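The logging module name differs between OVS versions, so if one spelling is rejected as above, the registered modules can be listed to find the right one:

ovs-appctl vlog/list | grep -i offload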
*** Bug 1833754 has been marked as a duplicate of this bug. ***
This bug is related to logging and looking up ifindex information. It can generate a lot of logging errors and consume some disk space, but logrotate will ensure the logs don't get too large. It should not impact data traffic.

The original issue is due to backporting an OVS TC/flower hardware offload feature with a bad structure; this has been reverted and the feature moved to a later OVS release. The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July).

There is a workaround available (see comment #3) to turn off TC offload logging; this is not required when OVS offload is not being used. The command will need to be run on each start/boot of openvswitch:

ovs-appctl vlog/set netdev_tc_offloads:OFF
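One possible way to persist this across service restarts is a systemd drop-in. This is only a sketch, assuming the service is named openvswitch.service and that ovs-vswitchd is already answering ovs-appctl by the time ExecStartPost runs:

$ sudo mkdir -p /etc/systemd/system/openvswitch.service.d
$ cat << EOF | sudo tee /etc/systemd/system/openvswitch.service.d/disable-tc-logging.conf
[Service]
# re-apply the log level whenever openvswitch is (re)started
ExecStartPost=/usr/bin/ovs-appctl vlog/set netdev_tc_offloads:OFF
EOF
$ sudo systemctl daemon-reload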
There is a workaround available (see comment #37) to turn off TC offload logging; this is not required when OVS offload is not being used. The command will need to be run on each start/boot of openvswitch:

ovs-appctl vlog/set netdev_tc_offloads:OFF
(In reply to Anita Tragler from comment #49)
> This bug is related to logging and looking up ifindex information. It could
> generate a lot of logging errors and consume some disk space; logrotate
> will ensure the logs don't get too large. It should not impact data traffic.
> The original issue is due to backporting an OVS TC/flower hardware offload
> feature with a bad structure; this has been reverted and the feature moved
> to a later OVS release.
> The fix is planned for OSP 13z12 (June) and OSP 16.1 GA (July).
>
> There is a workaround available (see comment #3) to turn off TC offload
> logging; this is not required when OVS offload is not being used.
>
> The command will need to be run on each start/boot of openvswitch:
>
> ovs-appctl vlog/set netdev_tc_offloads:OFF

Hi Anita,

Thanks for your feedback. I believe the issue here is that when you have a large number of ports, the print operation for each and every interface, multiple times, gets fairly expensive. What we are generally seeing is that at some stage Open vSwitch is no longer able to keep up with the requests coming from Neutron, particularly in environments with distributed virtual routing enabled. At that point it stops updating flows and ports, which causes a loss of traffic to the tenant VMs on the hypervisor.

Disabling the netdev_tc_offloads logging does indeed improve the situation, and in most cases we have verified that the impact of the issue is mitigated by this change. But to be clear, this definitely, 100% is causing traffic outages to VMs, and the only way it gets resolved is by running:

ovs-appctl vlog/set netdev_tc_offloads:OFF

So I'm glad to hear that the fix will be released in z12.
I'm using something like this in the meantime.

$ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.service
[Unit]
Description=Hack to work around OVS tc logging
Requires=openvswitch.service
After=openvswitch.service

[Service]
Type=oneshot
ExecStart=/bin/ovs-appctl vlog/set netdev_tc_offloads:off
User=root
Group=systemd-journal
EOF

$ cat << EOF | sudo tee /etc/systemd/system/ovs-disable-tc-logging-hack.timer
[Unit]
Description=Run hack to work around OVS tc logging

[Timer]
OnBootSec=5min
OnUnitActiveSec=1h

[Install]
WantedBy=timers.target
EOF

$ sudo systemctl daemon-reload
$ sudo systemctl start ovs-disable-tc-logging-hack.timer
$ sudo systemctl enable ovs-disable-tc-logging-hack.timer
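To confirm the level actually got applied after the unit runs, the current vlog levels can be listed:

$ sudo ovs-appctl vlog/list | grep netdev_tc_offloads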
Closing EOL. OSP 15 has been retired as of Sept 19, 2020.