Bug 709418 - NAT interface rules doesn't exclude multicast/broadcast networks
Summary: NAT interface rules doesn't exclude multicast/broadcast networks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Laine Stump
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1033020
TreeView+ depends on / blocked
 
Reported: 2011-05-31 17:05 UTC by Brian J. Murrell
Modified: 2014-01-04 22:07 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1033020 (view as bug list)
Environment:
Last Closed: 2013-11-17 20:36:04 UTC
Embargoed:


Attachments (Terms of Use)
exempts multicast from being natted (3.67 KB, patch)
2011-10-05 17:47 UTC, Brian J. Murrell
no flags Details | Diff
[1/2] [0.10.2-18.el6_4.5] util/viriptables: add/remove rules that short-circuit masquerading (6.28 KB, patch)
2013-05-28 02:41 UTC, Laszlo Ersek
no flags Details | Diff
[2/2] [0.10.2-18.el6_4.5] bridge driver: don't masquerade local subnet broadcast/multicast packets (4.99 KB, patch)
2013-05-28 02:41 UTC, Laszlo Ersek
no flags Details | Diff

Description Brian J. Murrell 2011-05-31 17:05:07 UTC
Description of problem:
The rules that are installed on the host to NAT for the guests excludes the virtual network from being NATted but it doesn't exclude the multicast networks and the result is that network traffic destined for local network multicast gets it's source address translated.  These should not be translated any more than the current exclusions for the "local network".

Version-Release number of selected component (if applicable):
0.8.3

How reproducible:
100%

Steps to Reproduce:
1. create a virtual network of machines bridging and NATting to the host's interface
2. generate multicast traffic on one VM and expect to receive it on another
3.
  
Actual results:
If you tcpdump on virbr0 on host, you will see the source addresses being NATted.

Expected results:
There should be no NATting for local multicast.

Additional info:

Comment 1 Brian J. Murrell 2011-10-05 17:45:27 UTC
Any chance of getting this fixed soon?  I'm on 0.8.8 now and this bug is still there.  Unfortunately I've wasted a day at least chasing a problem with another software package that ended up being caused by this NATting (masquerading, technically) of multicast packets.

It's quite simple.  Adding the following rule solves the problem:

# iptables -t nat -I POSTROUTING -d 224.0.0.0/4 -j RETURN

I'll attach a patch (compiles but completely untested) to see if it's the correct approach.  Feel free to take it over as a starting place if you want.

Comment 2 Brian J. Murrell 2011-10-05 17:47:17 UTC
Created attachment 526549 [details]
exempts multicast from being natted

Comment 3 Dave Allan 2011-10-05 19:22:00 UTC
Brian, many thanks for the patch.  Would you mind submitting it to the upstream libvir-list for discussion?  Also, 0.8.8 is fairly dated at this point, would you mind rebasing the patch to the git head and confirming that it solves the problem on your system?  Doing both of those things will give it the best chance of being accepted upstream quickly.

Comment 4 Brian J. Murrell 2011-10-05 20:18:49 UTC
(In reply to comment #3)
> Brian, many thanks for the patch.

Welcome.

> Would you mind submitting it to the upstream
> libvir-list for discussion?  Also, 0.8.8 is fairly dated at this
> point, would you mind rebasing the patch to the git head and confirming that it
> solves the problem on your system?

I'm afraid I'm not sure I have the time for a discussion and re-working with git head.  0.8.8 is what is in my current/stable O/S release here so that's what I use.  I really only use libvirt and moreso, multicast with it in my work tasks, which I am now behind on due to having to debug this and work up the patch.  I guess this is just a long winded way of saying I don't think I have any more time (in the near future at least) to devote to this.

> Doing both of those things will give it the
> best chance of being accepted upstream quickly.

Which is a good thing, I will most certainly agree.  I just don't have the time to debate the patch and run it through iterations of re-work against a code branch that I don't actually even use (yet).

Sorry I can't be more proactive with this but getting my day job done has to come first.

Comment 5 Laine Stump 2011-11-14 18:07:46 UTC
Just for future reference about this - the provided patch doesn't put anything specific to each network in the rule it inserts, but creates one of these identical rules for each network. It also doesn't remove these rules when a network is stopped, or the rules reloaded (e.g. due to libvirtd restarting). That means that the rule list will grow without bounds until the host is rebooted.

Before this patch could be posted upstream, at least those two issues would need to be addressed (unfortunately I don't have the time to address them right now).

Comment 6 Laszlo Ersek 2013-05-05 20:08:46 UTC
A similar problem is caused for the 255.255.255.255 broadcast destination address.

OVMF (Open Virtual Machine Firmware -- UEFI for virtual machines) has a built-in DHCP client. When the VM / firmware starts, it sends out a DHCP DISCOVER message. dnsmasq on virbr0 responds with a DHCP OFFER to 255.255.255.255. The MASQ rules translate the DHCP OFFER's source port to >= 1024, hence OVMF considers the DHCP OFFER packet invalid, and the DHCP conversation never succeeds.

Chain POSTROUTING (policy ACCEPT 73 packets, 5689 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535
       1      362 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535  <--- this one
       0        0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24

Adding the rule below at position #2 (1-based) fixes it:

Chain POSTROUTING (policy ACCEPT 79 packets, 6075 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535
       1      362 ACCEPT      udp  --  *      *       192.168.122.0/24     255.255.255.255                             <---- here
       1      362 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535
       0        0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24

More detailed description:

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/1506/focus=2640

Found in:
- libvirt-0.10.2-18.el6_4.4.x86_64

Comment 7 Laszlo Ersek 2013-05-28 02:33:24 UTC
Posted upstream series (v1):
http://www.redhat.com/archives/libvir-list/2013-May/msg01872.html

Comment 9 Laszlo Ersek 2013-05-28 02:41:40 UTC
Created attachment 753703 [details]
[1/2] [0.10.2-18.el6_4.5] util/viriptables: add/remove rules that short-circuit masquerading


The functions
- iptablesAddForwardDontMasquerade(),
- iptablesRemoveForwardDontMasquerade
handle exceptions in the masquerading implemented in the POSTROUTING chain
of the "nat" table. Such exceptions should be added as chronologically
latest, logically top-most rules.

The bridge driver will call these functions beginning with the next patch:
some special destination IP addresses always refer to the local
subnetwork, even though they don't match any practical subnetwork's
netmask. Packets from virbrN targeting such IP addresses are never routed
outwards, but the current rules treat them as non-virbrN-destined packets
and masquerade them. This causes problems for some receivers on virbrN.

Signed-off-by: Laszlo Ersek <lersek>
---
 src/libvirt_private.syms |    2 +
 src/util/iptables.c      |  101 ++++++++++++++++++++++++++++++++++++++++++++++
 src/util/iptables.h      |   10 +++++
 3 files changed, 113 insertions(+), 0 deletions(-)

Comment 10 Laszlo Ersek 2013-05-28 02:41:55 UTC
Created attachment 753704 [details]
[2/2] [0.10.2-18.el6_4.5] bridge driver: don't masquerade local subnet broadcast/multicast packets


Packets sent by guests on virbrN, *or* by dnsmasq on the same, to
- 255.255.255.255/32 (netmask-independent local network broadcast
  address), or to
- 224.0.0.0/24 (local subnetwork multicast range)
are never forwarded, hence it is not necessary to masquerade them.

In fact we must not masquerade them: translating their source addresses or
source ports (where applicable) may confuse receivers on virbrN.

One example is the DHCP client in OVMF (= UEFI firmware for virtual
machines):

  http://thread.gmane.org/gmane.comp.bios.tianocore.devel/1506/focus=2640

It expects DHCP replies to arrive from remote source port 67. Even though
dnsmasq conforms to that, the destination address (255.255.255.255) and
the source address (eg. 192.168.122.1) in the reply allow the UDP
masquerading rule to match, which rewrites the source port to or above
1024. This prevents the DHCP client in OVMF from accepting the packet.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=709418

(I read attachment 526549 [details] in that BZ by Brian J. Murrell
<brian.ca>, and Laine Stump's review thereof, before starting
this patch.)

Signed-off-by: Laszlo Ersek <lersek>
---
 src/network/bridge_driver.c |   74 ++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 70 insertions(+), 4 deletions(-)

Comment 11 Laszlo Ersek 2013-05-28 02:51:50 UTC
Chain POSTROUTING (policy ACCEPT 386 packets, 25238 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
       0        0 RETURN     all  --  *      *       192.168.122.0/24     224.0.0.0/24        
       1      362 RETURN     all  --  *      *       192.168.122.0/24     255.255.255.255     
       0        0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
       4      304 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
       4      240 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24

Comment 12 Brian J. Murrell 2013-05-28 07:24:14 UTC
(In reply to Laszlo Ersek from comment #10)
> Created attachment 753704 [details]
> [2/2] [0.10.2-18.el6_4.5] bridge driver: don't masquerade local subnet
> broadcast/multicast packets
> 
> 
> Packets sent by guests on virbrN, *or* by dnsmasq on the same, to
> - 255.255.255.255/32 (netmask-independent local network broadcast
>   address), or to
> - 224.0.0.0/24 (local subnetwork multicast range)
> are never forwarded, hence it is not necessary to masquerade them.

Multicast is much wider than just 224.0.0.0/24.  Per my patch in  attachment 526549 [details], it's in fact 224.0.0.0/4.  All multicast needs to be exempt from NAT.

Quoting from Wikipedia article http://en.wikipedia.org/wiki/Multicast_address:

IPv4 multicast addresses are defined by the leading address bits of 1110, originating from the classful network design of the early Internet when this group of addresses was designated as Class D. The Classless Inter-Domain Routing (CIDR) prefix of this group is 224.0.0.0/4.

Indeed, Wikipedia is not authoritative, but it does provide a good starting point to finding authoritative information if you want.

I suppose the authoritative doc is at http://www.iana.org/assignments/multicast-addresses/multicast-addresses.xml or somesuch.  It does not explicitly say 224.0.0.0/4 but it certainly implies it in discussing the entire space from 224.0.0.0 through 239.255.255.255.

Comment 13 Laszlo Ersek 2013-05-28 09:22:29 UTC
(In reply to Brian J. Murrell from comment #12)
> (In reply to Laszlo Ersek from comment #10)
> > Created attachment 753704 [details]
> > [2/2] [0.10.2-18.el6_4.5] bridge driver: don't masquerade local subnet
> > broadcast/multicast packets
> >
> >
> > Packets sent by guests on virbrN, *or* by dnsmasq on the same, to
> > - 255.255.255.255/32 (netmask-independent local network broadcast
> >   address), or to
> > - 224.0.0.0/24 (local subnetwork multicast range)
> > are never forwarded, hence it is not necessary to masquerade them.
>
> Multicast is much wider than just 224.0.0.0/24.  Per my patch in
> attachment 526549 [details], it's in fact 224.0.0.0/4.
>
> Quoting from Wikipedia article
> http://en.wikipedia.org/wiki/Multicast_address:

Yes, I consulted the same WP article. I changed the mask to /24 on purpose:
<http://en.wikipedia.org/wiki/Multicast_address#Local_subnetwork>:

> Addresses in the range 224.0.0.0 to 224.0.0.255 are individually assigned
> by IANA and designated for multicasting on the local subnetwork only. For
> example, the Routing Information Protocol (RIPv2) uses 224.0.0.9, Open
> Shortest Path First (OSPF) uses 224.0.0.5 & 224.0.0.6, and Zeroconf mDNS
> uses 224.0.0.251. Routers must not forward these messages outside the
> subnet in which they originate.

Your comment 0 too says

> network traffic destined for local network multicast gets it's source
> address translated


From comment 12,

> All multicast needs to be exempt from NAT.

That would conflict with the preexistent reasoning about source port
translation in networkAddMasqueradingIptablesRules()
[src/network/bridge_driver.c]:

> * The sport mappings are required, because default IPtables
> * MASQUERADE maintain port numbers unchanged where possible.
> *
> * NFS can be configured to only "trust" port numbers < 1023.
> *
> * Guests using NAT thus need to be prevented from having port
> * numbers < 1023, otherwise they can bypass the NFS "security"
> * check on the source port number.

In my series I don't want to let UDP or TCP packets escape virbrN without
source port translation, even if they are not directly related to NFS.

> Indeed, Wikipedia is not authoritative, but it does provide a good
> starting point to finding authoritative information if you want.

Agreed.

I propose to review / merge this series as I posted it to
<libvir-list>. With those bits in place it should be simple for
you to patch and argue for something like

  diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
  index 2b3b250..be9979f 100644
  --- a/src/network/bridge_driver.c
  +++ b/src/network/bridge_driver.c
  @@ -1542,7 +1542,7 @@ networkRefreshDaemons(struct network_driver *driver)
       }
   }

  -static const char networkLocalMulticast[] = "224.0.0.0/24";
  +static const char networkLocalMulticast[] = "224.0.0.0/4";
   static const char networkLocalBroadcast[] = "255.255.255.255/32";

   static int

Personally I won't push for that, because I don't have a multicast use case,
and because (as said above) I'd perceive such a change to conflict with the
source port mapping goal. Of course, if you convince the upstream reviewers
regarding the /4 mask, I won't oppose it!

Thanks,
Laszlo

Comment 14 Laszlo Ersek 2013-09-23 14:06:19 UTC
Refreshed the series for upstream:
https://www.redhat.com/archives/libvir-list/2013-September/msg01269.html

(No other changes.)

Comment 16 Cole Robinson 2013-11-17 20:36:04 UTC
These changes appear to be upstream now.

Comment 17 purpleidea 2014-01-04 22:07:35 UTC
Looks like: 51e184e9821c3740ac9b52055860d683f27b0ab6


Note You need to log in before you can comment on or make changes to this bug.