Description of problem: I am testing new FDP 20B in osp16 and I have two testcase failing in the regression: * test_cold_migration * test_live_migration_block Checking openvswitch logs, we can see that it is having to parse messages: 2020-02-26T08:25:54.665Z|01034|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:1f,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:25:54.665Z|01036|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:26:19.898Z|01038|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.186,dst=10.10.121.166,ttl=64,tp_src=55673,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:08:02:04,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.192.25,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:26:19.898Z|01040|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.120,dst=10.10.121.166,ttl=64,tp_src=48491,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:41:37:ab,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.193.102,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:26:26.869Z|09326|netdev_dpdk|ERR|Failed to create mempool "ovse0f776c100021580262144" with a request of 262144 mbufs 2020-02-26T08:26:26.869Z|09327|netdev_dpdk|ERR|Failed to create memory pool for netdev vhu554e78da-97, with MTU 1500 on socket 0: File name too long 2020-02-26T08:26:26.869Z|09328|dpif_netdev|ERR|Failed to set interface vhu554e78da-97 new configuration 2020-02-26T08:26:43.468Z|01042|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.120,dst=10.10.121.166,ttl=64,tp_src=48491,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:41:37:ab,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.193.102,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:27:06.888Z|01049|dpif_netdev(revalidator50)|ERR|Dropped 5 log messages in last 23 seconds (most recently, 22 seconds ago) due to excessive rate 2020-02-26T08:27:06.888Z|01050|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.165,dst=10.10.121.166,ttl=64,tp_src=53165,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:3c:0d:f5,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.195.11,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:28:05.910Z|01057|dpif_netdev(revalidator50)|ERR|Dropped 5 log messages in last 45 seconds (most recently, 14 seconds ago) due to excessive rate 2020-02-26T08:28:05.910Z|01058|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:29:21.914Z|01061|dpif_netdev(revalidator50)|ERR|Dropped 1 log messages in last 45 seconds (most recently, 45 seconds ago) due to excessive rate 2020-02-26T08:30:06.933Z|01067|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.165,dst=10.10.121.166,ttl=64,tp_src=53165,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:3c:0d:f5,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.195.11,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:31:42.862Z|01072|dpif_netdev(revalidator50)|ERR|Dropped 3 log messages in last 96 seconds (most recently, 96 seconds ago) due to excessive rate 2020-02-26T08:31:42.862Z|01073|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.165,dst=10.10.121.166,ttl=64,tp_src=53165,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:3c:0d:f5,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.195.11,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:31:57.861Z|01075|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),tunnel(tun_id=0x64,src=10.10.121.165,dst=10.10.121.166,ttl=64,tp_src=53165,tp_dst=4789,flags(key)),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(7),packet_type(ns=0,id=0),eth(src=fa:16:3e:3c:0d:f5,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=169.254.195.11,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no) 2020-02-26T08:32:01.924Z|01412|dpdk|ERR|VHOST_CONFIG: recvmsg failed 2020-02-26T08:34:07.861Z|01086|dpif_netdev(revalidator50)|ERR|Dropped 9 log messages in last 126 seconds (most recently, 79 seconds ago) due to excessive rate 2020-02-26T08:34:07.861Z|01087|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:1f,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:34:18.884Z|01089|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:38:30.488Z|01091|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:42:35.889Z|01093|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:1f,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:42:35.889Z|01094|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:44:36.927Z|01097|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:1f,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) 2020-02-26T08:44:38.808Z|01099|dpif_netdev(revalidator50)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=d0:07:ca:34:e9:20,dst=01:00:5e:00:00:01),eth_type(0x8100),vlan(vid=125,pcp=0),encap(eth_type(0x0800),ipv4(src=10.20.155.5,dst=224.0.0.1,proto=2,tos=0xc0,ttl=1,frag=no)) Version-Release number of selected component (if applicable): OSP16 RHOS_TRUNK-16.0-RHEL-8-20200224.n.0 I replaced in overcloud image openvswitch: ir_tripleo_undercloud_images_packages: http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/openvswitch2.11/2.11.0/47.el8fdp/x86_64/openvswitch2.11-2.11.0-47.el8fdp.x86_64.rpm,http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/openvswitch2.11/2.11.0/47.el8fdp/x86_64/python3-openvswitch2.11-2.11.0-47.el8fdp.x86_64.rpm ir_tripleo_undercloud_images_remove_no_deps_packages: openvswitch2.11 How reproducible: Execute testcases test_cold_migration or test_live_migration_block from nfv-tempest-plugin repo Steps to Reproduce: 1. 2. 3. Actual results: cold and live migration fails Expected results: cold and live migration should work Additional info:
I have checked that those log files are not related with FDP, they are already present in OSP16 RHOS_TRUNK-16.0-RHEL-8-20200224.n.0 without changing ovs. http://cougar11.scl.lab.tlv.redhat.com/DFG-nfv-16-director-3cont-2comp-ipv4-vxlan-dpdk-sriov-ctlplane-dataplane-bonding-hybrid/64 About live migration testcase failing in fdp20b there was a bz that has already been fixed https://bugzilla.redhat.com/show_bug.cgi?id=1798996 About those errors in log messages, not sure if something is not working, all of our testcases are working. In any case I think it should be analyzed and fixed.
I finally looked at this - this is an unfortunate consequence of the way openvswitch works with IGMP. See commit c645550bb249 ("odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP."), which means that all IGMP messages will trigger this error message (and force installing a flow that always shunts to userspace). Also see an explanation at https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343665.html There's not currently a good way to reflect this to the user without blasting an ugly looking error message. I think we don't want to turn off the error messages for the revalidator (because they are useful), and we don't want to lose that this event happened. A good resolution would be to actually handle the IGMP fields in the datapath, but that's a larger RFE.