Bug 1554233
| Summary: | Flow based ping responder fails since packet state goes to -trk | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Aswin Suryanarayanan <asuryana> |
| Component: | openvswitch | Assignee: | Rashid Khan <rkhan> |
| Status: | CLOSED CANTFIX | QA Contact: | Ofer Blaut <oblaut> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 13.0 (Queens) | CC: | amuller, apevec, bcafarel, chrisw, egarver, fhallal, mkolesni, rhos-maint, rkhan, srevivo |
| Target Milestone: | --- | Keywords: | Triaged, ZStream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-01-07 16:18:16 UTC | Type: | Bug |
| Bug Blocks: | 1534886 | ||
Description
Aswin Suryanarayanan
2018-03-12 07:34:58 UTC
This is caused by ICMP actions not being supported by the kernel datapath, which forces them to be executed in userspace. I'll try to explain what's happening, but it's tricky because some bits occur in the kernel and some in userspace.
1) echo request ingresses OVS bridge
- ct(commit) action causes a ct lookup in the kernel and a commit to conntrack
2) echo request is modified into a reply (set_field:0->icmp_type)
- flow slow pathed to userspace
- action not supported by the kernel datapath so it must be sent to userspace
- userspace happily does the ICMP packet modifications
3) ct_clear action occurs
- since we changed the tuple in #2, the conntrack state is no longer valid for the frame, so it's cleared
4) ct(table=3) action occurs and triggers a recirculation
- since the kernel datapath is in use, but the flow is currently executing in userspace, conntrack actions are sent _independently_ to the kernel. The flow is forked and recirculated.
- a copy of the packet is sent to the kernel to perform the ct() action
- a copy of the packet is sent to the kernel to perform the recirc action
5) kernel performs recirc action
- this continues execution at table=3, which was specified by ct(table=3) action
- no ct() lookup actually occurs in the kernel, because it was done independently in #4 above. As such, the packet is ct_state=-trk.
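The five steps above can be traced with a small toy model. This is only an illustrative sketch of the explanation in this comment, not OVS code: the `Packet` class, the `tracked` flag, and the function name are hypothetical, and the model simply assumes (as described in step 4) that the ct() lookup and the recirculation operate on two independent copies of the packet.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for a frame; not an OVS data structure."""
    icmp_type: int  # 8 = echo request, 0 = echo reply
    tracked: bool   # True ~ ct_state=+trk, False ~ ct_state=-trk


def trace_ping_responder():
    """Walk steps 1-5 and return (ct_copy, recirc_copy)."""
    # 1) Echo request ingresses; ct(commit) does a conntrack lookup in the kernel.
    pkt = Packet(icmp_type=8, tracked=True)

    # 2) set_field:0->icmp_type is not supported by the kernel datapath,
    #    so the flow is slow pathed and userspace rewrites the packet.
    pkt = replace(pkt, icmp_type=0)

    # 3) ct_clear: the tuple changed, so the conntrack state is dropped.
    pkt = replace(pkt, tracked=False)

    # 4) ct(table=3) issued from userspace: the ct() action and the recirc
    #    action are sent to the kernel independently, each on its own copy.
    ct_copy = replace(pkt, tracked=True)  # the lookup result stays on this copy
    recirc_copy = pkt                     # this copy never sees the lookup

    # 5) The kernel resumes at table=3 with the recirc copy, still untracked.
    return ct_copy, recirc_copy


if __name__ == "__main__":
    _, at_table_3 = trace_ping_responder()
    print("ct_state at table=3:", "+trk" if at_table_3.tracked else "-trk")
```

The point of the model is only that the copy which continues at table=3 is not the copy the conntrack lookup was performed on.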
__WORKAROUND__
A workaround is to, in step 5, do another ct(table=xxx) action to force a ct lookup in the kernel. This works because the flow is currently executing in the kernel and not userspace.
I have verified this works using the OVS testsuite.
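As a rough sketch of what the workaround could look like in the pipeline, the snippet below builds two flow strings in ovs-ofctl syntax. The table numbers other than table=3, the priorities, and the exact match fields are assumptions made for illustration; the original comment only says to add another ct(table=xxx) action in step 5. Address rewriting for the reply is not shown because the bug text does not include it.

```python
from __future__ import annotations


def responder_flows(reply_table: int = 3, recheck_table: int = 4) -> list[str]:
    """Return illustrative flow strings: responder flow plus the extra ct() hop.

    recheck_table and the priorities are hypothetical values, not taken
    from the bug report.
    """
    return [
        # Steps 1-4: commit, turn the echo request into a reply, clear the
        # conntrack state, then recirculate through ct() into reply_table.
        "table=0,priority=100,icmp,icmp_type=8,"
        "actions=ct(commit),set_field:0->icmp_type,ct_clear,"
        f"ct(table={reply_table})",
        # Workaround for step 5: the packet shows up as -trk, so issue ct()
        # again; this time the flow is executing in the kernel, so the
        # lookup takes effect before matching continues in recheck_table.
        f"table={reply_table},priority=100,icmp,ct_state=-trk,"
        f"actions=ct(table={recheck_table})",
    ]


if __name__ == "__main__":
    for flow in responder_flows():
        print(flow)
```

Each printed string could then be installed with `ovs-ofctl add-flow` on the bridge in question, with the flows that match on the tracked state placed in the table the second ct() action jumps to.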
(In reply to Eric Garver from comment #5)
> __WORKAROUND__
>
> A workaround is to, in step 5, do another ct(table=xxx) action to force a ct
> lookup in the kernel. This works because the flow is currently executing in
> the kernel and not userspace.
>
> I have verified this works using the OVS testsuite.

Eric,
Does this workaround need to be done in the pipeline, or in the OVS code base?

(In reply to Aswin Suryanarayanan from comment #6)
> Eric,
> Does this workaround need to be done in the pipeline, or in the OVS code base?

Pipeline.
Closing this bug, as it was opened 2 years ago, has had no recent activity and no PM or customer request, and was originally aimed at ODL. Feel free to reopen it if interest appears again.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.