Description of problem:
The flow-based ping responder fails because the packet state goes to -trk and the security groups drop the packet. In the ODL router the ping responder is flow based (table 21 in [1]), and the packet created by the ping responder seems to have an invalid conntrack state. The issue can be reproduced with [1]. [2] shows the stats.

[1] https://gist.github.com/aswinsuryan/dd967b2bd404cd4e0671b2a3fb5b358e
[2] https://gist.github.com/aswinsuryan/4c76be32141bf986f6d0302828c87431

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Try the flows in the reproducer [1] in OVS 2.9 with the kmod installed.

Actual results:
Ping is not successful and ct_state is -trk.

Expected results:
Ping should be successful and ct_state should be +trk+est.

Additional info:
This is caused by ICMP actions not being supported by the kernel datapath, causing them to occur in userspace. I'll try to explain what's happening, but it's tricky because some bits occur in the kernel and some in userspace.

1) echo request ingresses OVS bridge
   - ct(commit) action causes a ct lookup in the kernel and a commit to conntrack

2) echo request is modified into a reply (set_field:0->icmp_type)
   - flow slow pathed to userspace
   - action not supported by the kernel datapath so it must be sent to userspace
   - userspace happily does the ICMP packet modifications

3) ct_clear action occurs
   - since we changed the tuple in #2 conntrack state is no longer valid for the frame so it's cleared

4) ct(table=3) action occurs and triggers a recirculation
   - since the kernel datapath is in use, but flow is currently executing in userspace, conntrack actions are sent _independently_ to the kernel. The flow is forked and recirculated.
   - a copy of the packet is sent to the kernel to perform the ct() action
   - a copy of the packet is sent to the kernel to perform the recirc action

5) kernel performs recirc action
   - this continues execution at table=3, which was specified by ct(table=3) action
   - no ct() lookup actually occurs in the kernel, because it was done independently in #4 above. As such, the packet is ct_state=-trk.

__WORKAROUND__

A workaround is to, in step 5, do another ct(table=xxx) action to force a ct lookup in the kernel. This works because the flow is currently executing in the kernel and not userspace.

I have verified this works using the OVS testsuite.
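For illustration only, the workaround might look roughly like the following in the pipeline. This is a minimal sketch and not taken from the reproducer: table 4, the priorities, and the final NORMAL action are hypothetical placeholders, and the real ODL pipeline would continue with its own tables and matches. The only part that reflects the workaround itself is the second ct() action issued from table 3 when the packet arrives untracked.

```
# Table 3 is the target of the earlier ct(table=3) action. The echo reply
# arrives here with ct_state=-trk, so send it through conntrack again.
# Table 4 is a hypothetical continuation table chosen for this sketch.
table=3, priority=100, ct_state=-trk, icmp, actions=ct(table=4)

# After the second lookup the packet should be tracked. The action here is
# a placeholder for whatever the pipeline does with tracked traffic.
table=4, priority=100, ct_state=+trk, icmp, actions=NORMAL
```

The match in table 4 is deliberately limited to +trk. Which additional state bits (for example +est) end up set depends on the conntrack entry committed in step 1 and is not asserted here.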
(In reply to Eric Garver from comment #5)
> __WORKAROUND__
>
> A workaround is to, in step 5, do another ct(table=xxx) action to force a ct
> lookup in the kernel. This works because the flow is currently executing in
> the kernel and not userspace.
>
> I have verified this works using the OVS testsuite.

Eric,
Does this workaround need to be done in the pipeline, or in the OVS code base?
(In reply to Aswin Suryanarayanan from comment #6)
> Eric,
> Does this workaround need to be done in the pipeline, or in the OVS code base?

Pipeline.
Closing this bug: it was opened 2 years ago, has had no recent activity and no PM or customer request, and was originally aimed at ODL. Feel free to reopen it if interest appears again.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days