Description of problem:
On a system with channel bonding, pppd selects the wrong proxy ARP interface for ppp clients. It should select the bonded interface but does not: it checks for interfaces that are up, but uses the last matching one seen (assuming the netmask is the same). As a result the bond0 interface is seen but discarded in favour of eth0.

Clients have no Internet connection when eth0 is used. Deleting the eth0 entry for the client from the arp table and adding a bond0 entry then allows Internet traffic.

The problem is that with the incorrect arp table entry, the upstream router sends arp queries for the client IP/MAC address but the FC3 server never sends an arp reply (tcpdump shows this). Traffic from the client can therefore get out, but replies never get back to the client. After making the arp table change described above, tcpdump shows arp replies being sent out.

Running something like 'ping -I eth0 141.163.1.250' on the FC3 server causes an error (Destination Host Unreachable), whereas using bond0 it works.

Note that on an FC2 server this worked fine; bond0 was always used. Unfortunately the FC2 server is live, so I cannot test with it.

Version-Release number of selected component (if applicable):
kernel-2.6.10-1.741_FC3
ppp-2.4.2-6.4.FC3

How reproducible:
Configure the system for channel bonding. Connect to the system using ppp. The log shows that the eth0 interface has been selected for proxyarp; the arp table shows that the ppp client has an entry with the eth0 interface. The client has no Internet, or local, connectivity.

Steps to Reproduce:
1.
2.
3.

Actual results:
PPP client has no network connectivity.

Expected results:
Client should see the Internet.

Additional info:
Done some testing with the PPP source RPM - note that setting SYSDEBUG (or was it DEBUGSYS) causes the pppd daemon to die (signal 11). Obviously a bug in the debugging code!?

Modified pppd to send out more info about the interfaces looked at.
Log shows:
=========================================================
Jan 28 14:49:01 fred pppd[1913]: proxy arp: scanning 7 interfaces for IP 141.163.108.30
Jan 28 14:49:01 fred pppd[1913]: proxy arp: examining interface lo
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface bond0
Jan 28 14:49:02 fred pppd[1913]: proxy arp: interface addr 141.163.109.250 maskffff
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface eth0
Jan 28 14:49:02 fred pppd[1913]: proxy arp: interface addr 141.163.109.250 maskffff
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface eth1
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface eth2
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface eth3
Jan 28 14:49:02 fred pppd[1913]: proxy arp: examining interface ppp0
Jan 28 14:49:02 fred pppd[1913]: found interface eth0 for proxy arp
Jan 28 14:49:02 fred pppd[1913]: local IP address 192.168.108.20
Jan 28 14:49:02 fred pppd[1913]: remote IP address 141.163.108.30
============================================================

As can be seen, bond0 is determined to be suitable, but the code continues looking. It then sees eth0, and since that is the last one selected it uses that one.
'ifconfig' shows:
==============================================
bond0     Link encap:Ethernet  HWaddr 00:08:02:E6:57:1B
          inet addr:141.163.109.250  Bcast:141.163.109.255  Mask:255.255.254.0
          inet6 addr: fe80::208:2ff:fee6:571b/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:66629 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31094 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5146109 (4.9 MiB)  TX bytes:4948431 (4.7 MiB)

eth0      Link encap:Ethernet  HWaddr 00:08:02:E6:57:1B
          inet addr:141.163.109.250  Bcast:141.163.109.255  Mask:255.255.254.0
          inet6 addr: fe80::208:2ff:fee6:571b/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:51545 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31087 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4240959 (4.0 MiB)  TX bytes:4948509 (4.7 MiB)
===================================================

As can be seen, the netmask is the same, hence both interfaces are deemed by pppd to be 'useable'.
As far as I can see, line 229 of pppd/sys-linux.c should include IFF_SLAVE. That is:

#define FLAGS_MASK (IFF_UP | IFF_BROADCAST | \
        IFF_POINTOPOINT | IFF_LOOPBACK | IFF_NOARP | IFF_SLAVE)

This then causes pppd to skip 'slave' interfaces and use the master interface. After testing this, pppd correctly used the bond0 interface and the client had network connectivity.
A quick test with an FC2 server whilst no-one was using it showed:
=======================================================
Jan 28 16:06:32 barney pppd[8248]: proxy arp: scanning 7 interfaces for IP 141.163.106.1
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface lo
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface eth0
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface eth1
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface eth2
Jan 28 16:06:32 barney pppd[8248]: proxy arp: interface addr 141.163.107.250 mask ffff
Jan 28 16:06:32 barney pppd[8248]: found interface to be used eth2
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface eth3
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface bond0
Jan 28 16:06:32 barney pppd[8248]: proxy arp: interface addr 141.163.107.250 mask ffff
Jan 28 16:06:32 barney pppd[8248]: found interface to be used bond0
Jan 28 16:06:32 barney pppd[8248]: proxy arp: examining interface ppp0
Jan 28 16:06:32 barney pppd[8248]: found interface bond0 for proxy arp
====================================================

As can be seen, the bond0 interface is not looked at until after the eth ones. As such bond0 is selected and the clients are happy.

Could it be that the ordering (?) of the interfaces has changed in the kernel somewhere/somehow between FC2 and FC3?

The FC2 server has rpms kernel-2.6.10-1.9_FC2 and ppp-2.4.2-3.FC2.1

John.
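The ordering effect seen in the two logs can be reproduced with a simplified model. This is a sketch of the behaviour described in this report (the last interface on the matching subnet wins), not pppd's actual code; struct iface, pick_last_match and the skip_slaves switch are all hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one entry in the interface list. */
struct iface {
    const char *name;
    uint32_t addr;   /* IPv4 address, host byte order */
    uint32_t mask;   /* netmask, host byte order */
    int slave;       /* nonzero if the interface has IFF_SLAVE set */
};

/*
 * Returns the name of the LAST interface whose subnet contains peer,
 * or NULL if none matches. With skip_slaves set, bonding slaves are
 * ignored, which models the effect of the proposed IFF_SLAVE change.
 */
const char *pick_last_match(const struct iface *ifs, size_t n,
                            uint32_t peer, int skip_slaves)
{
    const char *chosen = NULL;
    for (size_t i = 0; i < n; i++) {
        if (skip_slaves && ifs[i].slave)
            continue;                       /* bonding slave: ignore */
        if (((ifs[i].addr ^ peer) & ifs[i].mask) != 0)
            continue;                       /* peer not on this subnet */
        chosen = ifs[i].name;               /* no break: later match wins */
    }
    return chosen;
}
```

In this model, a list ordered bond0, eth0 (as in the FC3 log) yields eth0 unless slaves are skipped, while a list in which bond0 comes last (as in the FC2 log) yields bond0 either way. Skipping slaves makes the result independent of the order in which the interfaces are listed.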
I have upgraded one of our VPN/PPTP servers from FC3 to FC5. This problem seems to have gone away now. I can see that the relevant part of the pppd code is still the same, but the order of interfaces returned by the kernel may well have changed. The ppp daemon is reporting that it is using the bonded interface (bond0) for the clients. Close this bugzilla report if you wish to. John.
Fedora Core 3 is now maintained by the Fedora Legacy project for security updates only. If this problem is a security issue, please reopen and reassign to the Fedora Legacy product. If it is not a security issue and hasn't been resolved in the current FC5 updates or in the FC6 test release, reopen and change the version to match. Thank you!
This seems to be fixed in FC5 (see comment 3). I'll close the bug report.