Created attachment 409949 [details]
gets rid of a useless wait

Description of problem:
Before trying to get a lease, dhclient waits between 0 and 4 seconds for no good reason. In June 2006 the ISC promised to get rid of this wait but never did:
https://lists.isc.org/mailman/htdig/dhcp-users/2006-June/thread.html#928

The one-line patch attached simply gets rid of this wait.

Version-Release number of selected component (if applicable):
At least 3 and 4.

Steps to Reproduce:
1. Connect your network cable
2. Be patient
3. Be more patient
FWIW this patch has been filed in ISC's (hidden?) bug tracker: [ISC-Bugs #21219]
Thanks. I know about that delay. I thought it was reasonable, but when Ted Lemon and David W. Hankins say it can be safely removed why not. It really looks like a way to save some booting time :-)

RFC 3315 (DHCP for IPv6) defines delay before sending first Solicit, Confirm, Information-request message. These delays are set to 1 second, so compromise can be to use 1 second delay also for dhclient for IPv4.
dhcp-4.1.1-20.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/dhcp-4.1.1-20.fc13
(In reply to comment #2)
> but when Ted Lemon and David W. Hankins say it can be safely
> removed why not. It really looks like a way to save some
> booting time :-)

I think the only case where clients would really start running dhclient at the same time is the case where they are all directly linked to the same switch/router, and this switch (not the clients) has a power outage. So this would never be more than a few hundred clients potentially synchronized.

> RFC 3315 (DHCP for IPv6) defines delay before sending first
> Solicit, Confirm, Information-request message.

Thanks for the reference.

> These delays are set to 1 second,

In the reference I read: "delayed by a RANDOM amount of time BETWEEN 0 and 1 second".

> so compromise can be to use 1 second delay also for dhclient for IPv4.

Your current patch is either 0 or 1 second delay. And never in between. I just had a better look at the cur_time macro and realized it is rounding to the previous second. So cur_time + random() % 1 in your patch will make half the clients start as soon as they can, and the other half start all synchronized on the next second.

What about this:

  tv.tv_sec = cur_time + 1; /* convert lower rounding to upper rounding */
  tv.tv_usec = random() * 1000000; /* stagger clients in case of

This would add a random delay between 0 and 2 seconds: 0-1 seconds of (unfortunate) rounding + 0-1 seconds to de-synchronize clients.

The "exact" RFC solution would be this:

  tv.tv_sec = cur_tv.tv_sec;
  tv.tv_usec = cur_tv.tv_usec + random() * 1000000;
  if (tv.tv_usec >= 1000000) {
      tv.tv_usec -= 1000000;
      tv.tv_sec++;
  }

But this would bypass the cur_time macro, which is here for some reason I guess (see RELNOTES)
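To make the "between 0 and 2 seconds" arithmetic concrete, here is a minimal sketch. It assumes, as described in the comment above, that cur_time holds the current time truncated to whole seconds. The helper name actual_wait_usec and its parameters are hypothetical and not part of dhclient. The usec argument stands for a microsecond offset already reduced to the range 0..999999 (for example random() % 1000000; a plain random() * 1000000 would not stay in that range, since random() returns a large integer).

```c
#include <sys/time.h>

/* Hypothetical helper, not dhclient code: returns how many microseconds
 * really elapse between the true current time `now` and a timeout that was
 * built from the truncated second (now.tv_sec, which is what cur_time is
 * assumed to hold) plus `add_sec` seconds and `usec` microseconds.
 * A negative result means the timeout is already in the past. */
static long long actual_wait_usec(struct timeval now, long add_sec, long usec)
{
    struct timeval tv;

    tv.tv_sec = now.tv_sec + add_sec;   /* cur_time + add_sec */
    tv.tv_usec = usec;                  /* offset in 0..999999 */

    return (long long)(tv.tv_sec - now.tv_sec) * 1000000LL
         + ((long long)tv.tv_usec - (long long)now.tv_usec);
}
```

With add_sec = 1, a client that starts just before a second boundary and draws a small offset waits almost nothing, while a client that starts right on a boundary and draws a large offset waits almost 2 seconds. With add_sec = 0 and usec = 0, the result is never positive, so the timeout fires as soon as possible.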
(In reply to comment #4)
> (In reply to comment #2)
> > but when Ted Lemon and David W. Hankins say it can be safely
> > removed why not. It really looks like a way to save some
> > booting time :-)
>
> I think the only case where clients would really start running
> dhclient at the same time is the case where they are all directly
> linked to the same switch/router, and this switch (not the
> clients) has a power outage. So this would never be more than a
> few hundred clients potentially synchronized.
>

When the switch/router has a power outage, the clients do not (re)start their dhclient. A client needs to contact the server (switch/router) either when the client itself is (re)starting or when it needs to renew its lease. So when the server (switch/router) has a power outage, only clients that (re)start at that moment or need to renew their lease at that moment notice it.

> > RFC 3315 (DHCP for IPv6) defines delay before sending first
> > Solicit, Confirm, Information-request message.
>
> Thanks for the reference.
>
> > These delays are set to 1 second,
>
> In the reference I read: "delayed by a RANDOM amount of time
> BETWEEN 0 and 1 second".
>

Sorry, I meant *max* delay(s).

> The "exact" RFC solution would be this:
> tv.tv_sec = cur_tv.tv_sec;
> tv.tv_usec = cur_tv.tv_usec + random() * 1000000;
> if (tv.tv_usec >= 1000000) {
> tv.tv_usec -= 1000000;
> tv.tv_sec++;
> }
> But this would bypass the cur_time macro, which is here for
> some reason I guess (see RELNOTES)

If we want to leave some delay there, we can do it exactly as it is done in dhc6.c:

  tv.tv_sec = cur_tv.tv_sec;
  tv.tv_usec = cur_tv.tv_usec;
  tv.tv_usec += (random() % 100) * 10000;
  if (tv.tv_usec >= 1000000) {
      tv.tv_sec += 1;
      tv.tv_usec -= 1000000;
  }

But I think there's so much randomness that we can remove the delay altogether. So the patch will look like:

- tv.tv_sec = cur_time + random() % 5;
+ tv.tv_sec = cur_time;
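For readers who want to try the dhc6.c-style variant in isolation, the following is a rough standalone sketch of the snippet quoted above. The function name dhc6_style_start is hypothetical, and the rnd parameter stands in for the value returned by random() so that the arithmetic can be checked with fixed inputs. The expression (rnd % 100) * 10000 yields one of 100 offsets, from 0 to 990 ms in 10 ms steps.

```c
#include <sys/time.h>

/* Hypothetical helper mirroring the dhc6.c-style snippet quoted in the
 * thread; it is not taken from the dhclient sources. `now` plays the role
 * of cur_tv and `rnd` the role of random()'s return value (non-negative). */
static struct timeval dhc6_style_start(struct timeval now, long rnd)
{
    struct timeval tv = now;

    /* Add 0, 10 ms, 20 ms, ..., 990 ms. */
    tv.tv_usec += (rnd % 100) * 10000;

    /* Carry into tv_sec so that tv_usec stays within 0..999999. */
    if (tv.tv_usec >= 1000000) {
        tv.tv_sec += 1;
        tv.tv_usec -= 1000000;
    }
    return tv;
}
```

In real use the call would look like dhc6_style_start(now, random()), with now filled in by gettimeofday(). Unlike cur_time + random() % 5, the start time keeps sub-second resolution, so clients are spread across the second instead of lining up on second boundaries.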
(In reply to comment #5)
> When the switch/router has a power outage,
> the clients do not (re)start their dhclient.

When the switch/router has a power outage, all clients running NetworkManager will restart their dhclient (at the same time). This is (unfortunately) the default behaviour with at least Fedora 10, 11 and 12. It's not dhclient's fault, but this is what happens. More about this here:
http://thread.gmane.org/gmane.linux.network.networkmanager.devel/15570/
Yeah, but that's a bug in NM and should be fixed in NM. We shouldn't try to work around their bug here in dhclient.
(In reply to comment #7)
> Yeah, but that's a bug in NM and should be fixed in NM. We shouldn't try to
> work around their bug here in dhclient.

I do not think this is a bug but a feature that should be optional. Suppose a switch has a power failure for a long time, longer than DHCP leases. Wouldn't you be happy that all nodes linked to it are back in business as soon as the power is back on? (possibly hammering the DHCP server a bit). You can easily find other use cases where this is a desired feature, obviously starting with a laptop.

Granted, the "brief switch outage" is not one of them. Even there it is not so bad since NM's off/on will not harm existing connections (as opposed to Windows but I digress).

I guess any DHCP server would be able to support a few hundred simultaneous requests without problem anyway so this discussion is probably just for the record.
dhcp-4.1.1-21.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.
https://admin.fedoraproject.org/updates/dhcp-4.1.1-21.fc13