Red Hat Bugzilla – Bug 170764
dhclient spews requests on disk error
Last modified: 2007-11-30 17:07:21 EST
+++ This bug was initially created as a clone of Bug #159929 +++
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050513
Description of problem:
this is partly conjecture, since the computers this happens on can't be used and
need to be reinstalled with fresh hard drives. it has happened twice so far.
the issue is that dhclient sends a DHCP request, gets it acknowledged and
accepts the given IP. it will then presumably try to write the data to disk,
but fail (on the console, disk I/O errors scroll by at a terrifying speed). it
will then IMMEDIATELY send another request.
on our system this means 300+ completed DHCP transactions per second per such
host. since each transaction results in six lines of output, our log on the
server gets swamped and fills the disk rather quickly (half a gigabyte per hour,
which is more than our total log volume for a whole day). when the log disk
fills up, we lose logging for other services as well, so this is a real problem.
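
To put the figures above side by side, here is a rough back-of-the-envelope calculation. It assumes a single failing host, reads "half a gigabyte" as 5 * 10**8 bytes, and ignores all other log traffic; the variable names are only illustrative.

```python
# Figures taken from the description above.
TRANSACTIONS_PER_SECOND = 300   # completed DHCP transactions per failing host
LINES_PER_TRANSACTION = 6       # server log lines per transaction
LOG_BYTES_PER_HOUR = 0.5e9      # assumption: "half a gigabyte" = 5 * 10**8 bytes

lines_per_second = TRANSACTIONS_PER_SECOND * LINES_PER_TRANSACTION
bytes_per_second = LOG_BYTES_PER_HOUR / 3600
bytes_per_line = bytes_per_second / lines_per_second

print(lines_per_second)         # 1800
print(round(bytes_per_second))  # 138889
print(round(bytes_per_line))    # 77
```

Roughly 77 bytes per log line is a plausible syslog line length, so the reported rates are at least consistent with each other.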
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. you need: one computer, one hammer.
2. open case. whack harddisk with hammer. be careful not to whack too hard.
3. boot computer.
Actual Results: the DHCP server is flooded by requests
Expected Results: the computer should wait a minimum of a second between requests.
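
A minimal sketch of the expected behaviour, in Python rather than dhclient's own C, and not taken from the dhclient source. `RequestPacer` and `wait_turn` are hypothetical names; the only figure from this report is the one-second minimum gap.

```python
import time

MIN_INTERVAL = 1.0  # seconds; the minimum gap asked for in this report


class RequestPacer:
    """Enforce a minimum gap between successive requests (hypothetical helper)."""

    def __init__(self, min_interval=MIN_INTERVAL,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the pacing can be tested
        self.sleep = sleep
        self.last_sent = None   # time of the previous request, if any

    def wait_turn(self):
        """Block until at least min_interval has passed since the last request."""
        now = self.clock()
        if self.last_sent is not None:
            remaining = self.min_interval - (now - self.last_sent)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_sent = now
```

A client would call `wait_turn()` immediately before sending each request, so that even a tight failure loop produces at most one request per second.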
-- Additional comment from firstname.lastname@example.org on 2005-06-09 10:27 EST --
Oh dear! Sorry to hear about these problems you are having with dhclient.
From a quick look at the code, it seems dhclient does not actually test
if the write of a bound/renewed lease succeeds or not.
What can make it cycle from the bound/renew state to the initial state is
if execution of dhclient-script fails - so the "I/O errors" observed could
have been for reads of /sbin/dhclient-script . Do you have any record of these
I/O error messages? If so, it would be useful to append an example to this bug.
If the hard disk failed after dhclient had started but before it bound or
renewed a lease, dhclient would be unable to run /sbin/dhclient-script,
and could cycle infinitely - I agree this behaviour should be corrected.
Did you actually verify that the hard disk had failed?
I'd just like to make sure that this problem was actually due to a hard disk
failure and not problems with dhclient-script.
Presumably the hard disk on which the /var/log partition resides was not the
one which failed? Else you would not have had the log fill up - or were
you using remote logging? Some examples of the log messages generated by
dhclient would be useful to append to this bug.
If dhclient detects an unrecoverable dhclient-script failure, it should
probably just exit. I'll work on implementing this. But the further
information described above would be much appreciated in resolving this bug.
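
The idea of exiting on an unrecoverable script failure can be sketched as follows. This is an illustration in Python, not the actual dhclient patch. `run_hook` and `on_lease_bound` are hypothetical names, and treating "the script could not be executed at all" as the unrecoverable case is an assumption.

```python
import subprocess
import sys


def run_hook(argv):
    """Run the hook script and classify the outcome (hypothetical helper).

    Returns "ok", "failed" (script ran but exited non-zero), or
    "unrecoverable" (script could not be executed at all).
    """
    try:
        completed = subprocess.run(argv, check=False)
    except OSError:
        # Assumption: an exec failure (missing file, I/O error reading the
        # script, permission problem) will not fix itself on retry.
        return "unrecoverable"
    return "ok" if completed.returncode == 0 else "failed"


def on_lease_bound(argv):
    """Return True if the hook succeeded; exit if it can never succeed."""
    outcome = run_hook(argv)
    if outcome == "unrecoverable":
        # Exit rather than fall back to the initial state and re-request.
        sys.exit("cannot execute %s; giving up" % argv[0])
    return outcome == "ok"
```

A caller would pass the script path as `argv[0]`, for example `on_lease_bound(["/sbin/dhclient-script"])`. An ordinary non-zero exit is kept separate from the unrecoverable case, so only the latter ends the process.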
-- Additional comment from email@example.com on 2005-06-09 11:00 EST --
the log disk in question is on our DHCP server (which also happens to be our
central logging host), the messages are from dhcpd running there, not the client
machine. it's just the normal sequence of DHCPREQUEST DHCPOFFER DHCPACK.
I'm afraid we don't have any record of the messages on the failing computers.
the console messages zoom by so fast it's impossible to read them, and after
powering down, the hard drive won't even spin up anymore.
(I'm impressed by the speedy response :)
-- Additional comment from firstname.lastname@example.org on 2005-06-09 20:32 EST --
This bug is now fixed with dhcp-3.0.1-40_EL3, which should be in
RHEL-3-U6, but which meanwhile can be downloaded from:
Please try it out and let me know of any issues - thank you.
ISC bug 14894 raised and patch submitted - accepted for inclusion
in next upstream DHCP release.
-- Additional comment from email@example.com on 2005-07-05 11:33 EST --
FWIW, the test RPM has been running in one of our labs (40 student workstations)
for almost a month without any issues. I haven't brought out the hammer, though :-)
fixed with dhcp-3.0.1-40_EL4+
It occurs to me that the problem with this bug, and the seemingly related bug
162080 I reported in June, is that fixing the client side is only half a
solution. Sure, the dhclient has to be fixed not to misbehave in this way.
However, the dhcp server should similarly be fixed to detect a misbehaving
client and stop talking to it for a while. Otherwise, there's a potential
denial of service attack problem, in which a misbehaving client can tie up a
server or fill its /var/log partition with junk messages. Any chance of getting
the server side fixed to handle this better as well?
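
To illustrate the server-side idea, here is a minimal sketch of per-client throttling: count each client's requests in a short window and ignore the client for a while once it exceeds a limit. This is not an existing dhcpd feature as far as this report shows. `ClientThrottle`, its parameters, and the default limits are all hypothetical.

```python
import time
from collections import defaultdict, deque


class ClientThrottle:
    """Ignore clients that send too many requests in a short window."""

    def __init__(self, max_requests=5, window=1.0, mute_for=60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests  # allowed requests per window
        self.window = window              # window length in seconds
        self.mute_for = mute_for          # how long to ignore an offender
        self.clock = clock
        self.recent = defaultdict(deque)  # client id -> recent request times
        self.muted_until = {}             # client id -> time the mute ends

    def allow(self, client_id):
        """Return True if this request should be answered (and logged)."""
        now = self.clock()
        until = self.muted_until.get(client_id)
        if until is not None and now < until:
            return False
        stamps = self.recent[client_id]
        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        stamps.append(now)
        if len(stamps) > self.max_requests:
            self.muted_until[client_id] = now + self.mute_for
            stamps.clear()
            return False
        return True
```

A server loop would call `allow(client_id)` with, say, the client's hardware address before replying. When it returns False the server would skip both the reply and the per-transaction log lines, which is what keeps /var/log from filling.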
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.