Bug 1805592
| Summary: | [OVN2.13][20.B]After guest send a malformed arp, it can't ping other host through a logical router | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | ying xu <yinxu> |
| Component: | ovn2.13 | Assignee: | OVN Team <ovnteam> |
| Status: | CLOSED NOTABUG | QA Contact: | Jianlin Shi <jishi> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | FDP 20.A | CC: | ctrautma, dcbw, fhallal, gcerami, jishi, nusiddiq, ralongi, tredaelli |
| Target Milestone: | --- | Flags: | gcerami: needinfo? |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1775525 | Environment: | |
| Last Closed: | 2020-07-15 07:13:26 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1775525 | ||
| Bug Blocks: | 1791190 | ||
Description
ying xu
2020-02-21 05:20:33 UTC
I wouldn't say the packet is malformed. The packet is still valid, but it spoofs the MAC of an existing host on the other switch.

Anyway, I was unable to reproduce the bug in an environment with Fedora and the upstream version of ovs/ovn.

Is there a way to access the lab with this environment and launch the test case you're mentioning?

Thanks.

(In reply to Gabriele Cerami from comment #1)
> I wouldn't say the packet is malformed. Packet is still valid but it's
> spoofing the mac using one of an existing host on the other switch.
>
> Anyway, I was unable to reproduce the bug in an environment with Fedora and
> the upstream version of ovs/ovn.
>
> Is there a way to access the lab with this environment and launch the test
> case you're mentioning ?
>
> Thanks.

If you want the environment, you can ping me on IRC (id: yinxu) and I will set it up for you.

Thanks Ying for the environment provided.
I was able to modify the test to get a step-by-step analysis before and after sending the malformed packet.
At a high level, when vm2 sends ICMP requests to the other hosts, those hosts receive the packets and reply correctly, but the replies get swallowed.
A trace on the return path (the ICMP replies from the other hosts) shows the problem:
ovn-trace --friendly-names s2 'inport == "hv1_vm00_vnet1" && icmp4.type == 0 && eth.src == 00:de:ad:01:00:01 && ip4.src == 172.16.102.11 && eth.dst == 00:de:ad:ff:01:02 && ip4.dst == 172.16.103.13 && ip.ttl == 64'
# icmp,reg14=0x2,vlan_tci=0x0000,dl_src=00:de:ad:01:00:01,dl_dst=00:de:ad:ff:01:02,nw_src=172.16.102.11,nw_dst=172.16.103.13,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
[...]
ingress(dp="r1", inport="r1_s2")
--------------------------------
0. lr_in_admission (ovn-northd.c:7932): eth.dst == 00:de:ad:ff:01:02 && inport == "r1_s2", priority 50, uuid 34c199c5
    next;
1. lr_in_lookup_neighbor (ovn-northd.c:7981): 1, priority 0, uuid e94a4498
    reg9[3] = 1;
    next;
2. lr_in_learn_neighbor (ovn-northd.c:7987): reg9[3] == 1 || reg9[2] == 1, priority 100, uuid a3439552
    next;
9. lr_in_ip_routing (ovn-northd.c:7556): ip4.dst == 172.16.103.0/24, priority 49, uuid 31af4d7c
    ip.ttl--;
    reg8[0..15] = 0;
    reg0 = ip4.dst;
    reg1 = 172.16.103.1;
    eth.src = 00:de:ad:ff:01:03;
    outport = "r1_s3";
    flags.loopback = 1;
    next;
10. lr_in_ip_routing_ecmp (ovn-northd.c:9530): reg8[0..15] == 0, priority 150, uuid 5fb207d4
    next;
12. lr_in_arp_resolve (ovn-northd.c:10010): ip4, priority 0, uuid cbfdafed
    get_arp(outport, reg0);
    /* MAC binding to 00:de:ad:01:00:01. */
    next;
The malformed packet succeeds in its original intent: poisoning the ARP table of the router.
The get_arp function returns a MAC that is not the original MAC of vm2, but the MAC sent in the malformed packet.
The packet is then dropped, because the next table has no match for the packet contents.
Looking at the database in /var/lib/ovn/ovnsb_db.db, I can see the MAC binding for the port is added as:
{"_date":1594675403735,"MAC_Binding":{"3311fdca-910e-4d43-8db9-f4692e57b607":{"ip":"172.16.103.13","logical_port":"r1_s3","mac":"00:00:00:00:00:02","datapath":["uuid","6987a142-b710-4e26-a350-ce943626af44"]}},"_comment":"ovn-controller: registering chassis 'hv0'"}
I can then see the update to the DB with the new MAC binding caused by the malformed packet:
{"_date":1594679859567,"MAC_Binding":{"3311fdca-910e-4d43-8db9-f4692e57b607":{"mac":"00:de:ad:01:00:01"}},"_comment":"ovn-controller: registering chassis 'hv0'"}
Sending an ARP from vm2 with the correct format and MAC updates the entry back to the correct binding and fixes the situation.
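
The change in the binding can also be spotted mechanically. The following is a minimal sketch, not a supported tool: it replays the two records quoted above (copied verbatim) and reports any row whose "mac" column changes. The function name mac_binding_changes is hypothetical, and the handling of a null row as a deletion is an assumption about the on-disk log format.

```python
import json

# The two MAC_Binding records quoted above, copied verbatim from the thread.
LOG = '''
{"_date":1594675403735,"MAC_Binding":{"3311fdca-910e-4d43-8db9-f4692e57b607":{"ip":"172.16.103.13","logical_port":"r1_s3","mac":"00:00:00:00:00:02","datapath":["uuid","6987a142-b710-4e26-a350-ce943626af44"]}},"_comment":"ovn-controller: registering chassis 'hv0'"}
{"_date":1594679859567,"MAC_Binding":{"3311fdca-910e-4d43-8db9-f4692e57b607":{"mac":"00:de:ad:01:00:01"}},"_comment":"ovn-controller: registering chassis 'hv0'"}
'''


def mac_binding_changes(lines):
    """Replay MAC_Binding updates; return (ip, port, old_mac, new_mac) tuples."""
    rows = {}      # row UUID -> merged column values seen so far
    changes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for uuid, columns in record.get("MAC_Binding", {}).items():
            if columns is None:
                # Assumption: a null row means the row was deleted.
                rows.pop(uuid, None)
                continue
            row = rows.setdefault(uuid, {})
            old_mac = row.get("mac")
            row.update(columns)  # later records carry only the changed columns
            new_mac = row.get("mac")
            if old_mac is not None and new_mac != old_mac:
                changes.append((row.get("ip"), row.get("logical_port"), old_mac, new_mac))
    return changes


changes = mac_binding_changes(LOG.splitlines())
for ip, port, old_mac, new_mac in changes:
    print(f"{ip} on {port}: {old_mac} -> {new_mac}")
# prints: 172.16.103.13 on r1_s3: 00:00:00:00:00:02 -> 00:de:ad:01:00:01
```

The second record carries only the "mac" column, so the sketch merges it into the earlier row to recover the IP and logical port it belongs to.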
If we want to avoid this, we need to trust only the initial binding information for that port and add rules that block ARP packets from unknown MACs for that specific port.

We already support blocking ARP packets from unknown MACs. For that you need to set the port_security column of the logical switch port. You can set it as:

ovn-nbctl lsp-set-port-security <port_name> "MAC IP1 .."

In many deployments, the port_security column will be the same as the "addresses" column.

In your testing, is port_security set? If not, please set it and try again.

Thanks Numan! Port security was not set. I ran that command and now the ARP is ignored, and ICMP packets continue to flow normally even after sending the malformed packet. So the test case probably needs an update.

Thanks all.

Just to make things clearer for everyone: this is not a bug in OVN. Without port security active, this is the correct behaviour. If we want to protect the router from an ARP poisoning attack, the best mitigation technique is to activate port security. So the test needs an update.
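
To illustrate why port security changes the outcome, here is a toy model of the behaviour described in this thread. It is only a sketch under stated assumptions and is not how OVN implements port security: the ToyPort class and its methods are hypothetical, the check is reduced to "the ARP sender (MAC, IP) pair must be in the allowed list", and the MAC and IP values are the ones seen in the records above.

```python
VM2_IP = "172.16.103.13"
VM2_MAC = "00:00:00:00:00:02"       # binding first recorded for vm2
SPOOFED_MAC = "00:de:ad:01:00:01"   # MAC carried by the spoofed ARP


class ToyPort:
    """Hypothetical model of a switch port in front of a router ARP table."""

    def __init__(self, port_security=None):
        # None models an unset port_security column: every ARP is accepted.
        self.port_security = port_security
        self.mac_binding = {}  # IP -> MAC, as learned by the router

    def receive_arp(self, sender_mac, sender_ip):
        """Learn the sender binding unless port security rejects the ARP."""
        if self.port_security is not None:
            if (sender_mac, sender_ip) not in self.port_security:
                return False  # dropped, binding left untouched
        self.mac_binding[sender_ip] = sender_mac
        return True


# Without port security, the spoofed ARP overwrites the binding.
open_port = ToyPort()
open_port.receive_arp(VM2_MAC, VM2_IP)
open_port.receive_arp(SPOOFED_MAC, VM2_IP)

# With port security set to the real MAC and IP, the spoofed ARP is ignored.
secured_port = ToyPort(port_security=[(VM2_MAC, VM2_IP)])
secured_port.receive_arp(VM2_MAC, VM2_IP)
accepted = secured_port.receive_arp(SPOOFED_MAC, VM2_IP)

print(open_port.mac_binding[VM2_IP])     # 00:de:ad:01:00:01
print(secured_port.mac_binding[VM2_IP])  # 00:00:00:00:00:02
print(accepted)                          # False
```

In the unsecured case the router ends up resolving vm2's IP to the spoofed MAC, which matches the get_arp result in the trace; in the secured case the original binding survives.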