See attached tcpdump. See broken UDP checksum on the DHCP Request packets.

14:52:42.807355 IP (tos 0x0, ttl 64, id 38089, offset 0, flags [none], proto 17, length: 349) 0.0.0.0.bootpc > 255.255.255.255.bootps: [bad udp cksum fe01!] BOOTP/DHCP, Request from 00:0a:95:c4:f1:c8, length: 321, xid:0x46e052a6, secs:10, flags: [none] (0x0000)
    Client Ethernet Address: 00:0a:95:c4:f1:c8
    Vendor-rfc1048:
        DHCP:REQUEST
        MSZ:548
        SID:192.168.2.1
        RQ:192.168.2.92
        LT:6000
        PR:SM+DG+NS+HN+DN+BR+YD+YS+NTP
        VC:"Linux 2.6.10-1.1127_FC4.dwmw2 ppc"
        CID:[ether]00:0a:95:c4:f1:c8
Created attachment 110724: pcap file showing broken packets.
Dave, does this appear to be only on PPC? It may be a missing htonl() somewhere... I'm fairly sure that ethereal thinks the packets are OK on i386. Peter: did you see anything in the DHCP code when you were looking through it that might be wrong?
It's unlikely to be hton[sl] because that's a NOP on PPC anyway. But it does look a bit like an endianness problem. The difference between the checksum and what it _ought_ to be is always 0x1fe afaict -- we added 2 to the wrong byte. For example, 'Checksum: 0x49f9 (incorrect, should be 0x4bf5)' This happens on the DHCP Request packets but not DHCP Discover.
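To illustrate where a figure like 0x1fe could come from: if a lone byte is added to a 16-bit sum as the low octet when it belongs in the high octet, the sum is off by (b << 8) - b. The helper below is purely hypothetical (the name is made up for this note and is not from the DHCP code), and it only shows the arithmetic, not the actual bug.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the DHCP code: returns how far a
 * 16-bit sum is off when byte b is added as the low octet (b) instead of
 * the high octet (b << 8). */
static uint16_t misplaced_byte_error(uint8_t b)
{
    uint16_t as_high = (uint16_t)(b << 8); /* where the byte belongs */
    uint16_t as_low  = b;                  /* where it was added */
    return (uint16_t)(as_high - as_low);
}
```

For b = 2 this gives 0x0200 - 0x0002 = 0x01fe, matching the constant offset described above.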
Created attachment 110920: checksum patch

The checksum routine is still not ideal -- it won't handle being given a buffer which isn't aligned. Can you guarantee that your buffers will always be correctly aligned?
That patch is purely empirical, btw. I haven't checked that it's correct. Now I'm trying to work out WTF NetworkManager is doing to the interface to break IPv6. IPv6 is _automatic_ -- the kernel picks up the addresses for itself. But after NetworkManager brings up the interface, it seems to be hidden from the IPv6 stack - it's even absent from /proc/sys/net/ipv6/conf/ . If we can just get IPv6 working, I could start to use NetworkManager.
The patch seems to be correct. From what I can find, if the packet length is not even, you have to copy the last byte into a zeroed 16-bit value and checksum that. Previously, of course, the code was using u_char and probably picking up some other random field next to the UDP header.
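For reference, here is a minimal sketch of the standard Internet checksum (RFC 1071 style) that pads an odd trailing byte with zero and reads the buffer one byte at a time, so it does not care about alignment or host endianness. This is not the patch from the attachment; the function name inet_checksum is made up for illustration, and a real UDP checksum would also have to cover the pseudo-header, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only, not the attached patch. Computes the 16-bit ones'
 * complement checksum over len bytes, treating the data as big-endian
 * 16-bit words. Intended for packet-sized buffers (the 32-bit
 * accumulator is not guarded against overflow on very large inputs). */
static uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    /* Add full 16-bit words, built byte by byte so that unaligned
     * buffers and either host byte order work. */
    while (len >= 2) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }

    /* Odd length: the last byte is the high octet of a word whose
     * low octet is zero padding. */
    if (len == 1)
        sum += (uint32_t)p[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The return value is the numeric value of the checksum field, so the caller would store it high byte first in the packet. Reading byte by byte is slower than word-at-a-time summing, but it sidesteps both the alignment question and the endianness question.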
WRT the IPv6 stuff, is there no way to bump the kernel to tell it to send those packets again? There's no guarantee that the user actually has a connection when the device is brought up. On wireless, for example, you don't have a connection until NetworkManager actually gets one for you, and not all drivers do netif_carrier_on(), so how does the kernel think it can get the timing of this right? NetworkManager is the one here that knows when you're connected, by a fairly complicated set of heuristics; the kernel has no clue and probably doesn't want to.
Fix added to CVS for the checksum stuff, and also we now flush only "scope global" and "scope site" addresses with /sbin/ip.
Your further patch applied, should be fixed now. Can you confirm?
Yeah, s'fine now. Thanks.