Bug 1448987

Summary: NM does not use new route when adding host route for DHCP server
Product: Red Hat Enterprise Linux 7 Reporter: Jonathan Maxwell <jmaxwell>
Component: NetworkManager Assignee: Beniamino Galvani <bgalvani>
Status: CLOSED ERRATA QA Contact: Desktop QE <desktop-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 7.3 CC: aloughla, atragler, bgalvani, fgiudici, lrintel, rkhan, sukulkar, thaller, vbenes
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: NetworkManager-1.8.0-3.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 09:28:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                                      Flags
[PATCH] dhcp: don't add route to DHCP4 server    none
Routed DHCP test script                          none

Description Jonathan Maxwell 2017-05-09 02:09:08 UTC
Description of problem:

As per: https://bugzilla.redhat.com/show_bug.cgi?id=983325

NetworkManager adds a host route for the DHCP server. This is usually not a problem as long as the customer has only one router per subnet, the default router. But if there are two routers on the same subnet and a new route has been defined for the subnet that the DHCP server is on, the host route can cause problems, depending on the customer's external network configuration.

Take my lab system, where 10.7.1.66 is the DHCP server:

[root@unused ~]# ip addr

2: eno16780032: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:a5:70:8a brd ff:ff:ff:ff:ff:ff
    inet 10.7.1.233/24 brd 10.7.1.255 scope global dynamic eno16780032
       valid_lft 259138sec preferred_lft 259138sec
    inet6 fe80::250:56ff:fea5:708a/64 scope link 
       valid_lft forever preferred_lft forever

[root@unused ~]# ip route 
default via 10.7.1.254 dev eno16780032  proto static  metric 100 
10.7.1.66 via 10.7.1.254 dev eno16780032  proto dhcp  metric 100 << added by NM
10.7.1.0/24 dev eno16780032  proto kernel  scope link  src 10.7.1.233 
[root@unused ~]# 

Now if we add a new route for the 10.0.0.0/8 subnet:

# ip route add 10.0.0.0/8 via 10.7.1.2 dev eno16780032

Now see how IP addresses in 10.7.1.x, e.g. 10.7.1.1, resolve. The DHCP server 10.7.1.66 would resolve the same way if the host route were not there:

# ip route get 10.7.1.1
10.7.1.1 via 10.7.1.2 src 10.7.1.233

But if we restart NM, it still adds a route for 10.7.1.66 via 10.7.1.254 instead of via the new router 10.7.1.2. This can be a problem for customers who expect the traffic to go through the new route.

Version-Release number of selected component (if applicable):

RHEL7

How reproducible:

Always; see above.

Actual results:

When the routing table looks like:

# ip route 
default via 10.7.1.254 dev eno16780032  proto static  metric 100 
10.0.0.0/8 via 10.7.1.2 dev eno16780032
10.7.1.66 via 10.7.1.254 dev eno16780032  proto dhcp  metric 100 
10.7.1.0/24 dev eno16780032  proto kernel  scope link  src 10.7.1.233 

Then, after restarting NM, the DHCP server host route still goes via 10.7.1.254 instead of 10.7.1.2:

# ip route 
default via 10.7.1.254 dev eno16780032  proto static  metric 100 
10.0.0.0/8 via 10.7.1.2 dev eno16780032
10.7.1.66 via 10.7.1.254 dev eno16780032  proto dhcp  metric 100 
10.7.1.0/24 dev eno16780032  proto kernel  scope link  src 10.7.1.233 

Expected results:

After restarting, NM is expected to add the host route for 10.7.1.66 via the new gateway 10.7.1.2, matching what "ip route get" reports, e.g.:

# ip route get 10.7.1.1
10.7.1.1 via 10.7.1.2 src 10.7.1.233

default via 10.7.1.254 dev eno16780032  proto static  metric 100 
10.0.0.0/8 via 10.7.1.2 dev eno16780032
10.7.1.66 via 10.7.1.2 dev eno16780032  proto dhcp  metric 100 
10.7.1.0/24 dev eno16780032  proto kernel  scope link  src 10.7.1.233 

Additional info:

A customer reported this after trying to kickstart RHEL7 through Anaconda. The route added by NM breaks communication because the new route they defined for the DHCP server network is not used.

Comment 2 Beniamino Galvani 2017-05-09 12:54:21 UTC
Hi,

it's wrong that NM adds such a route because the DHCP server is already
reachable through the existing routes pushed by the server. Also, the
way in which the route gateway is chosen is wrong because it doesn't
consider the static routes, but only the default gateway. I think that
the fix for bug 983325 was implemented in the wrong way.

I wonder if that fix should exist at all, considering that:

 - dhclient script doesn't implement such logic at all (it has some
   code to add direct routes to the *gateway* if not directly
   reachable)

 - if needed, the server can be configured to push the missing route
   to be reached by the client in the classless-routes options

So, in my opinion that code should be removed or at least fixed to add
the route only when strictly necessary.
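
To illustrate the gateway-selection problem described above, here is a minimal Python sketch. It is not NetworkManager's code. It contrasts "always use the default gateway" with a longest-prefix-match lookup over all routes. The route table mirrors the one in the description; the DHCP server address 10.9.0.5 and the function names are assumptions made for illustration (a server reached through the 10.0.0.0/8 static route), and metrics and policy routing are ignored.

```python
import ipaddress

# Route table modelled on the report: (destination, gateway or None for on-link).
ROUTES = [
    ("0.0.0.0/0", "10.7.1.254"),   # default route
    ("10.0.0.0/8", "10.7.1.2"),    # static route added by the administrator
    ("10.7.1.0/24", None),         # directly connected subnet
]

DHCP_SERVER = "10.9.0.5"  # hypothetical routed DHCP server, not from the report


def default_gateway(routes):
    """Return the gateway of the default route, ignoring every other route."""
    for dest, gw in routes:
        if ipaddress.ip_network(dest).prefixlen == 0:
            return gw
    return None


def lookup_gateway(routes, addr):
    """Return (network, gateway) of the most specific route matching addr."""
    ip = ipaddress.ip_address(addr)
    best = None
    for dest, gw in routes:
        net = ipaddress.ip_network(dest)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gw)
    return best


print(default_gateway(ROUTES))                 # 10.7.1.254
print(lookup_gateway(ROUTES, DHCP_SERVER)[1])  # 10.7.1.2
```

With only the default gateway considered, the host route points at 10.7.1.254 even though the static route says the server's network is reached via 10.7.1.2. The attached patch takes the simpler approach of not adding the host route at all.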

Comment 3 Jonathan Maxwell 2017-05-09 22:26:14 UTC
(In reply to Beniamino Galvani from comment #2)
> Hi,
> 
> it's wrong that NM adds such a route because the DHCP server is already
> reachable through the existing routes pushed by the server. Also, the
> way in which the route gateway is chosen is wrong because it doesn't
> consider the static routes, but only the default gateway. I think that
> the fix for bug 983325 was implemented in the wrong way.
> 
> I wonder if that fix should exist at all, considering that:
> 
>  - dhclient script doesn't implement such logic at all (it has some
>    code to add direct routes to the *gateway* if not directly
>    reachable)
> 
>  - if needed, the server can be configured to push the missing route
>    to be reached by the client in the classless-routes options
> 

They are already using the option above to push classless static routes.

> So, in my opinion that code should be removed or at least fixed to add
> the route only when strictly necessary.

Hi,

Thanks for looking into this.

I agree. That makes sense. Otherwise customers with more than one router on a subnet will have problems.

Comment 4 Beniamino Galvani 2017-05-10 16:07:57 UTC
Created attachment 1277653 [details]
[PATCH] dhcp: don't add route to DHCP4 server

Comment 5 Beniamino Galvani 2017-05-10 16:10:18 UTC
Created attachment 1277654 [details]
Routed DHCP test script

For future reference: a simple script to test scenarios like this.

Comment 6 Thomas Haller 2017-05-10 19:54:06 UTC
(In reply to Beniamino Galvani from comment #4)
> Created attachment 1277653 [details]
> [PATCH] dhcp: don't add route to DHCP4 server

lgtm

Comment 7 Francesco Giudici 2017-05-16 08:33:34 UTC
I agree with comment #2.
The route to the DHCP server should be managed in the DHCP server configuration, via the classless-routes option.

Patch lgtm
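
As an illustration of what the classless-routes option carries on the wire, the following Python sketch encodes routes in the RFC 3442 format (DHCP option 121): one octet of prefix length, then only the significant octets of the destination, then the four-octet router address. The function name is hypothetical, and the routes are the ones from the description, used here only as sample input.

```python
import ipaddress


def encode_classless_routes(routes):
    """Encode (destination, router) pairs as an RFC 3442 option 121 payload."""
    payload = bytearray()
    for dest, router in routes:
        net = ipaddress.ip_network(dest)
        # Only the octets that carry prefix bits are transmitted.
        significant = (net.prefixlen + 7) // 8
        payload.append(net.prefixlen)
        payload += net.network_address.packed[:significant]
        payload += ipaddress.ip_address(router).packed
    return bytes(payload)


payload = encode_classless_routes([("10.0.0.0/8", "10.7.1.2")])
print(list(payload))  # [8, 10, 10, 7, 1, 2]
```

Per RFC 3442, a client that receives this option ignores the Router option, so a server using it would normally include the default route as well (encoded as prefix length 0 followed by the router address).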

Comment 9 errata-xmlrpc 2017-08-01 09:28:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2299