Bug 1394500
| Summary: | NetworkManager doesn't honor ip address order | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Tuomo Soini <tis> | ||||||
| Component: | NetworkManager | Assignee: | Beniamino Galvani <bgalvani> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | urgent | ||||||||
| Version: | 7.3 | CC: | antaylor, atragler, bgalvani, c.handel, derfian, fedora, fgiudici, fweimer, ilmis, jan.engels, lrintel, mabrown, marco.mauritsch, msugaya, rkhan, rmanes, rsawhill, sukulkar, thaller, vasmith, vbenes, villapla | ||||||
| Target Milestone: | rc | Keywords: | ZStream | ||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | NetworkManager-1.8.0-0.2.git20170215.1d40c5f4.el7 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 1421193 1513009 (view as bug list) | Environment: | |||||||
| Last Closed: | 2017-08-01 09:19:37 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1421193 | ||||||||
| Attachments: | 
 | ||||||||
| 
        
          Description
        
        
          Tuomo Soini
        
        
        
        
        
          2016-11-12 18:10:15 UTC
        
       Proposed fix in branch bg/ipv4-order-rh1394500. >> ip4-config: don't change order of addresses in the same subnet the addresses get sorted in order to be shown in the UI in a consistent way. The sort order should only matter when exposing them to the client. It's not clear to me, that the client-related order is the same that you need internally. I think instead we should sort the addresses in NMIPxConfig via nm_ip4_config_addresses_sort(). Instead, sort them in get_property() before creating the variant for PROP_ADDRESSES and PROP_ADDRESS_DATA. Maybe that result variant should be cached and reused too. And on every modification of priv->addresses, the cached variant would be thrown away. Same for IPv6. Yes, probably the client-exposed sort-order should be extended to consider SECONDARY addresses -- meaning: their original sort order in face of same subnet -- exactly like your patch does. I just think, we should no longer sort the addresses in NMIP4Config inplace, but only for creating PROP_ADDRESS*. >> ip4-config: keep track of primary / secondary addresses >> platform: fix the order of addition of primary and secondary IPv4 addresses + but probably it's not a good + * idea doing it here since it would cause the execution of + * handlers whenever you call a platform operation, you must anticipate that changed-signals will be emitted and that the platform cache changes (invalidating all previous pointers). In practice that is not true for getters, like nm_platform_link_get(), but that is not guaranteed. Especially nm_platform_ip4_address_delete() very much invokes handlers right away. + We could resync the platform + * cache to know what happened, whether resync is enough, depends on whether kernel sends the RTM_DELADDR message for the other address right away with the netlink ACK or delayed. If delayed, there is race, and you never know that a RTM_DELADDR is still about to be sent. if you no longer nm_ip4_config_addresses_sort(), priv-addresses are in the order that you want. I don't think you need to track IFA_F_SECONDARY in NMIP4Config. It's a redundant setting. Just consider the sort order in nm_platform_ip4_address_sync(), instead of relying on a IFA_F_SECONDARY flag. That patch doesn't correct the problem. Created attachment 1220768 [details]
Revert 719742513705877f60a8b1cf69f393e5f4eed30f to fix ipv4 address order
This revert patch with original commit id fixes the ipv4 address order problem.
This is modified from original commit revert by leaving added header in place.
Changing defined order of ip addresses ANY WAY is a very very bad idea. (In reply to Tuomo Soini from comment #4) > That patch doesn't correct the problem. Did you take all the 3 patches in that branch? Well - with the info you gave I wouldn't have guessed I needed three commits from that branch. You didn't even tell which git tree to check :) Tested, those three commits applied fixes the issue. (In reply to Tuomo Soini from comment #8) > Well - with the info you gave I wouldn't have guessed I needed three commits > from that branch. > You didn't even tell which git tree to check :) > Tested, those three commits applied fixes the issue. Because I didn't expect that anyone except other developers would try building it :) Anyway, thanks for testing. (In reply to Thomas Haller from comment #3) > >> ip4-config: don't change order of addresses in the same subnet > > the addresses get sorted in order to be shown in the UI in a consistent way. > The sort order should only matter when exposing them to the client. > It's not clear to me, that the client-related order is the same that you > need internally. > > I think instead we should sort the addresses in NMIPxConfig via > nm_ip4_config_addresses_sort(). Instead, sort them in get_property() before > creating the variant for PROP_ADDRESSES and PROP_ADDRESS_DATA. > Maybe that result variant should be cached and reused too. And on every > modification of priv->addresses, the cached variant would be thrown away. Makes sense. > Same for IPv6. > Yes, probably the client-exposed sort-order should be extended to consider > SECONDARY addresses -- meaning: their original sort order in face of same > subnet -- exactly like your patch does. I just think, we should no longer > sort the addresses in NMIP4Config inplace, but only for creating > PROP_ADDRESS*. Ok. > >> ip4-config: keep track of primary / secondary addresses > >> platform: fix the order of addition of primary and secondary IPv4 addresses > > > + but probably it's not a good > + * idea doing it here since it would cause the execution > of > + * handlers > > whenever you call a platform operation, you must anticipate that > changed-signals will be emitted and that the platform cache changes > (invalidating all previous pointers). In practice that is not true for > getters, like nm_platform_link_get(), but that is not guaranteed. > Especially nm_platform_ip4_address_delete() very much invokes handlers right > away. Understood. > > + We could resync the platform > + * cache to know what happened, > > whether resync is enough, depends on whether kernel sends the RTM_DELADDR > message for the other address right away with the netlink ACK or delayed. If > delayed, there is race, and you never know that a RTM_DELADDR is still > about to be sent. > > > if you no longer nm_ip4_config_addresses_sort(), priv-addresses are in the > order that you want. I don't think you need to track IFA_F_SECONDARY in > NMIP4Config. It's a redundant setting. Just consider the sort order in > nm_platform_ip4_address_sync(), instead of relying on a IFA_F_SECONDARY flag. Yeah, the order determines also the primary / secondary flag, but that information is implicit and so sooner or later we must extract it again by scanning the list. I can remove that commit and gather that information when needed. How about the new version of the branch? (IPv6 is still to do) maybe replace:
         nm_clear_g_variant (&priv->address_data_variant);
         nm_clear_g_variant (&priv->addresses_variant);
         _notify (config, PROP_ADDRESS_DATA);
         _notify (config, PROP_ADDRESSES);
with
         _notify_addresses (self)
and maybe combine the two cases
     case PROP_ADDRESS_DATA:
     case PROP_ADDRESSES:
in one. Always create priv->address_data_variant and priv->addresses_variant together. Realistically, the only user of these properties is the D-Bus glue code, and it will always fetch them both anyway. So treating them separately is more lines of code, and more runtime-overhead.
>> platform: fix the order of addition of primary and secondary IPv4 addresses
this looks good.
I pushed a fixup for a suggested optimization :)
(In reply to Thomas Haller from comment #12) > maybe replace: > > nm_clear_g_variant (&priv->address_data_variant); > nm_clear_g_variant (&priv->addresses_variant); > _notify (config, PROP_ADDRESS_DATA); > _notify (config, PROP_ADDRESSES); > > with > > _notify_addresses (self) Ok. > and maybe combine the two cases > > case PROP_ADDRESS_DATA: > case PROP_ADDRESSES: > > in one. Always create priv->address_data_variant and priv->addresses_variant > together. Realistically, the only user of these properties is the D-Bus glue > code, and it will always fetch them both anyway. So treating them separately > is more lines of code, and more runtime-overhead. Makes sense. > >> platform: fix the order of addition of primary and secondary IPv4 addresses > > this looks good. > > I pushed a fixup for a suggested optimization :) I added a fix, but looks good, thanks! Repushed to a different branch bg/ip-order-rh1394500 (because now it changes also IPv6). lgtm *** Bug 1399464 has been marked as a duplicate of this bug. *** Branch merged to master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=207a7470641f35867278994299a83621062e5663 While a future fix for NetworkManager might resolve the issue reported in this BZ, I'd like to point out that since before the RHEL6 days, the initscripts documentation (/usr/share/doc/initscripts*/sysconfig.txt) has said something like: > IPADDRn= ... > The "n" is expected to be consecutive positive integers starting from 0. > It can be omitted if there is only one address being configured. I.e., to anyone looking for an immediate solution to this issue: append a "0" index to your IPADDR= and PREFIX=/NETMASK= directives. For example, to fix the ifcfg-eth0 file from the BZ description comment #0: > IPADDR0=192.168.2.2 > PREFIX0=24 > IPADDR1=192.168.2.1 > PREFIX1=24 Ryan: you didn't understand at all what this bug was about. Configuring like you said doesn't resolve the bug. (In reply to Tuomo Soini from comment #22) > Ryan: you didn't understand at all what this bug was about. > Configuring like you said doesn't resolve the bug. Hmm. Curious. Perhaps you're right that I don't understand -- I'm open to that possibility; however, I helped a customer with this issue last week and after they implemented the change based on my above explanation, their issue was resolved. Good luck. (In reply to Ryan Sawhill from comment #23) > (In reply to Tuomo Soini from comment #22) > > Ryan: you didn't understand at all what this bug was about. > > Configuring like you said doesn't resolve the bug. > > Hmm. Curious. Perhaps you're right that I don't understand -- I'm open to > that possibility; however, I helped a customer with this issue last week and > after they implemented the change based on my above explanation, their issue > was resolved. Good luck. Ryan, my systems were already configured like your workaround suggests - I still encountered this bug. > # grep IPADDR /etc/sysconfig/network-scripts/ifcfg-enp7s0 > IPADDR0=x.118 > IPADDR1=x.104 > IPADDR2=x.103 Sorry Ryan but as Karl said, your workaround doesn´t resolve the bug:
# grep IPADDR /etc/sysconfig/network-scripts/ifcfg-eno16777984
IPADDR0=x.x.x.190
IPADDR1=x.x.x.159
IPADDR2=x.x.x.214
# ip addr
eno16777984: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether x:x:x:x:x:x brd ff:ff:ff:ff:ff:ff
    inet x.x.x.159/x brd x.x.x.x scope global eno16777984
       valid_lft forever preferred_lft forever
    inet x.x.x.190/x brd x.x.x.x scope global secondary eno16777984
       valid_lft forever preferred_lft forever
    inet x.x.x.214/x brd x.x.x.x scope global secondary eno16777984
       valid_lft forever preferred_lft forever
*** Bug 1416516 has been marked as a duplicate of this bug. *** I wonder if current patch handles this case correctly: IPADDR=192.168.1.1 PREFIX=24 IPADDR1=10.5.9.1 PREFIX1=24 This still means 192.168.1.1 MUST be the first ip added, so sorting network order must not happen. And similar issue with ipv6 too, order of primary ip must not be touched, and even order of netoworks can not be sorted. (In reply to Tuomo Soini from comment #38) > I wonder if current patch handles this case correctly: Yes, it does. Addresses from configuration are applied in the order they appear. Thank you Beniamino for confirmation. I verified this by backporting from upstream nm-1-4 branch all the changes related to this. Hello all, It appears this bug was copied: https://bugzilla.redhat.com/show_bug.cgi?id=1421193 And that bug has been closed/errata: https://rhn.redhat.com/errata/RHBA-2017-0377.html NetworkManager 1.4.0-17.el7_3 is the latest update. Andrew This is not fixed yet. Latest update changed ipv6 address ordering. Now First added ipv6 address is source for ipv6 connections. I explain in more detail: IPV6INIT=yes IPV6_AUTOCONF=no IPV6_DEFROUTE=yes IPV6_FAILURE_FATAL=yes IPV6ADDR=2001:db8:e:10::4/64 IPV6ADDR_SECONDARIES="2001:db8:e:10::57/64 2001:db8:e:10::30/64" IPV6_DEFAULTGW="2001:db8:e:10:.1" IPV6_PRIVACY=no This causes 2001:db8:e:10::30 to be the default source ipv6 address. When 2001:db8:e:10::4 should be the source. Please, fix ipv6 part properly. Before the fix in this bz was applied, IPv6 addresses in configuration were sorted lexically and then added to interface, which is clearly wrong. The fix removed the sorting so that now the order is consistent. As you have noticed, differently from IPv4, IPv6 addresses added later appear on the top of "ip addr" output because kernel consider them with greater priority. So, if you add IPv6 addresses (X, Y, Z), Z will be considered primary. I consider this a bug, but at the same time I see that both init-scripts and NM had this behavior for long time (RHEL6 and probably before) and so I'm not sure if we want to deliberately change it. If we really want to change behavior, the attached patch restores the correct order of IPv6 addresses according to NM semantics (first in configuration is preferred). Created attachment 1274606 [details]
[PATCH] core: apply IPv6 addresses in reverse order
There is real difference in rhel6 and rhel7. In rhel6 last added ipv6 address is last ip in ip addr show listing. And that's used as source ip. Now Last added ip is first in list. And that's used as source ip. Like my example config clearly shows, networkmanager add ipv6 addresses in wrong order now. Sorry, I did more testing. rhel6 works exactly same, I had my own fix for the order problem there (changed preferred_lft to 0 for secondary addresses). But yes, your patch seem to fix the ordering for ipv6 addresses to the way which is documented. I tried again and these are the results on different configurations:
 RHEL6 with initscripts and NM, as well as RHEL 7.0, 7.1, 7.2:
 addresses are in reverse configuration order:
     inet6 2001:db8:e:10::30/64 scope global
     inet6 2001:db8:e:10::57/64 scope global
     inet6 2001:db8:e:10::4/64 scope global
 RHEL7.3 with NM before the fix: NM sorts the addresses before applying
 them, breaking user order:
     inet6 2001:db8:e:10::57/64 scope global
     inet6 2001:db8:e:10::30/64 scope global
     inet6 2001:db8:e:10::4/64 scope global
 RHEL7.3 with latest NM (1.4.0-19): after the fix in this bz the
 original configuration order is restored:
     inet6 2001:db8:e:10::30/64 scope global
     inet6 2001:db8:e:10::57/64 scope global
     inet6 2001:db8:e:10::4/64 scope global
So, the fix seems correct in the sense that it restores the established behavior. We should now decide if that behavior is correct and we want to change it.
Yes, but with the attached patch, order will change to:
    inet6 2001:db8:e:10::4/64 scope global
    inet6 2001:db8:e:10::57/64 scope global
    inet6 2001:db8:e:10::30/64 scope global
Which is the order what users will expect. Primary (IPV6ADDR=) ip is expected to be the source ip server uses.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2299 |