+++ This bug was initially created as a clone of Bug #147375 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Description of problem:
On reboot or on service start/restart the dhcpd startup will emit an error message like the one below:

Starting dhcpd: *** glibc detected *** free(): invalid pointer: 0x0867da10 ***
Internet Systems Consortium DHCP Server V3.0.1
Copyright 2004 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/
[FAILED]

It does not necessarily happen all the time but it may also take several invocations of the startup to get dhcp started.

Version-Release number of selected component (if applicable):
dhcp-3.0.1-30_FC3

How reproducible:
Sometimes

Steps to Reproduce:
1. service dhcpd restart - it may work or it may take several attempts to get the service started. It will also happen if dhcpd is invoked directly.
2.
3.

Actual Results:
# service dhcpd start
Starting dhcpd: *** glibc detected *** free(): invalid pointer: 0x083bca10 ***
Internet Systems Consortium DHCP Server V3.0.1
Copyright 2004 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/
[FAILED]

Expected Results:
# service dhcpd start
Starting dhcpd: [ OK ]

Additional info:
Fedora Core 3 - All current updates for installed software
AMD 2700+ - 1GB Ram
240GB Disk (160GB SATA - 80 GB IDE)
1 DVD writer, 1 CD/RW Writer
Running as NAT firewall/workstation to home network.

glibc-2.3.4-2.fc3
glibc-kernheaders-2.4-9.1.87
kernel-2.6.10-1.760_FC3

I have not yet tried to rebuild SRPM on this system to see if it makes any difference.

-- Additional comment from jvdias on 2005-02-07 13:56 EST --
Is this an AMD 64-bit or a 32-bit machine? I am not able to reproduce this problem on an i386 platform.
Please append the complete output of the dhcpd run attempt when it fails - try this:

# cd /var/lib/dhcp
# script /tmp/dhcpd.log
# ulimit -c unlimited
# dhcpd -d -f -tf /tmp/dhcpd.trace.log
  (wait for problem, press CTRL-C)
# exit
# ls -l core*
# gzip core*

and then append the /tmp/dhcpd.*.log and any core.*.gz files to this bug or send them to me - thanks.

-- Additional comment from wrsturm on 2005-02-07 14:18 EST --
Created an attachment (id=110739)
dhcpd core file

-- Additional comment from wrsturm on 2005-02-07 14:19 EST --
Created an attachment (id=110740)
dhcp script log

-- Additional comment from wrsturm on 2005-02-07 14:20 EST --
Created an attachment (id=110741)
dhcp trace log

-- Additional comment from jvdias on 2005-02-07 14:37 EST --
Thanks! I am now investigating as top priority.
Does the system have an AMD 64-bit or AMD 32-bit CPU?
Does it have more than one CPU / hyperthreading enabled?

-- Additional comment from jvdias on 2005-02-07 15:28 EST --
Please can you append the output of these commands to this bug:

# uname -a
# rpm -q dhcp --queryformat '%{ARCH} %{BUILDHOST}\n'

Thank you!

-- Additional comment from jvdias on 2005-02-07 15:53 EST --
The core file you sent is a 32-bit core file, but it is from an executable which was linked to the glibc '32-bit compatibility mode' /lib/tls/libc.so.6 from glibc32-2.3.3-68, which is only installed on 64 bit systems.

I've searched the AMD website for '2700+' but can find no data as to whether the processor is 64-bit or 32-bit.

If you have a 64-bit machine ('uname -m' outputs x86_64), then you should install the dhcp-3.0.1-30_FC3.x86_64.rpm, not the 32-bit dhcp-3.0.1-30_FC3.i386.rpm - this incompatibility could be the source of your problem.
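
The architecture check described above can be sketched in shell. This is only an illustration, not part of the dhcp package: the function name expected_pkg_arch is hypothetical, and the mapping of every i?86 machine type to the i386 package is an assumption based on the two dhcp builds named in this report (i386 and x86_64).

```shell
#!/bin/sh
# expected_pkg_arch MACHINE
# Map a 'uname -m' value to the dhcp package architecture one would
# expect to be installed. Hypothetical helper; the i?86 -> i386 mapping
# is an assumption based on the package names quoted in this bug.
expected_pkg_arch() {
    case "$1" in
        x86_64) echo x86_64 ;;
        i?86)   echo i386 ;;
        *)      echo unknown ;;
    esac
}

# Usage on a live system (needs rpm and an installed dhcp package):
#   want=$(expected_pkg_arch "$(uname -m)")
#   have=$(rpm -q dhcp --queryformat '%{ARCH}\n')
#   [ "$want" = "$have" ] || echo "dhcp is $have, machine suggests $want"
```

On the reporter's machine, where 'uname -m' prints i686, the installed i386 package is what this sketch would expect, so no mismatch would be reported.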
Please also do an 'rpm -qf `readlink /lib/tls/libc.so.6`' and tell me which package is output - if glibc32-2.3.3-68, this could also be a problem because glibc was upgraded to 2.3.4-2 while the compatibility library is at glibc-2.3.3, and dhcp-3.0.1-30 was compiled for glibc-2.3.4.

-- Additional comment from wrsturm on 2005-02-07 16:42 EST --
It is a 32-bit, single-CPU machine with no hyperthreading.

uname -a
2.6.10-1.760_FC3 #1 Wed Feb 2 00:14:23 EST 2005 i686 athlon i386 GNU/Linux

It is an AMD Athlon 32-bit chip that, in marketese, performs as fast as an equivalent Intel running at a clock speed of 2700 MHz. The actual clock speed is 2170.352 MHz.

rpm -q dhcp --queryformat '%{ARCH} %{BUILDHOST}\n'
i386 tweety.build.redhat.com

uname -m
i686

A slightly different variant of the command you sent:

rpm -qf /lib/`readlink /lib/tls/libc.so.6`
glibc-2.3.4-2.fc3

-- Additional comment from wrsturm on 2005-02-07 17:12 EST --
Here is /proc/cpuinfo:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 8
model name      : AMD Athlon(tm) XP 2700+
stepping        : 1
cpu MHz         : 2170.352
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips        : 4292.60

-- Additional comment from jvdias on 2005-02-07 19:51 EST --
I've tried running dhcpd in playback mode with the trace file, using your exact configuration and lease file, hundreds of times in a loop on an Intel i686 system, with no exit / core generated. I'm looking for an Athlon system on which to test, so far without success - so it looks like yours is the only system on which this problem can be reproduced at the moment.

Please try the following:

1. Replace the glibc-2.3.4-2.fc3.i686.rpm with glibc-2.3.4-2.fc3.i386.rpm:

   # rpm -Uvh --force glibc-2.3.4-2.fc3.i386.rpm

   (both the i386 and i686 RPMs can be downloaded from:
   ftp://download.fedora.redhat.com/pub/fedora/linux/core/updates/3 )

   Reboot and see if the problem still occurs, repeating steps in Comment #1.
   If it does not, it would seem there is a problem with the i686 optimized glibc on Athlon and I will take this up with the glibc / anaconda developers.
   If it does, then there is a dhcpd problem - you can go ahead and put back the i686 glibc:

   # rpm -Uvh --force glibc-2.3.4-2.fc3.i686.rpm

   The core file appears to be corrupt - I cannot obtain any useful data from it on our systems here:
   "
   $ gdb /usr/sbin/dhcpd core.31535
   GNU gdb Red Hat Linux (6.1post-1.20040607.43rh)
   ...
   Core was generated by `dhcpd -d -f -tf /tmp/dhcpd.trace.log'.
   Program terminated with signal 6, Aborted.
   Loaded symbols for /usr/sbin/dhcpd
   ...
   #0 0x00ee27a2 in ?? ()
   (gdb) where
   #0 0x00ee27a2 in ?? ()
   #1 0x00991955 in ?? ()
   #2 0x00000000 in ?? ()
   (gdb) quit
   "
   If the gdb "where" command for the core file shows anything different on your system, please append it to this bug.
   The core was generated by an abort in glibc, which is part of new memory validation routines with which there may be a problem on the Athlon.

2. If you still get an exit with core dump, please download this source RPM:
   http://people.redhat.com/~jvdias/DHCP/FC3/dhcp-3.0.1-30_FC3.src.rpm
   and build it with:

   # rpmbuild --rebuild dhcp-3.0.1-30_FC3.src.rpm

   This will build an unstripped debugging version of DHCP. Install the RPMS produced in /usr/src/redhat/RPMS/i386 and reproduce the problem.
   The core file should then not be corrupt, and running the gdb 'where' command on it will tell us what is causing the problem - please append the core file or the output of the gdb 'where' command generated as above.

Thank You!

-- Additional comment from jvdias on 2005-02-07 20:26 EST --
I've finally found a dual processor Athlon on which I can reproduce the problem. It would appear to be a glibc bug.
You needn't gather the information requested in the above comment.
Installing the i386 glibc may prevent the problem from occurring.
I'm continuing to investigate - thanks.

-- Additional comment from jvdias on 2005-02-08 18:48 EST --
I've found the problem. It was a memory corruption issue latent in all previous dhcp versions, which just happened to trigger the new glibc / gcc 'FORTIFY_SOURCE' runtime memory validation checks ONLY on the Athlon FC3 platform - weird!
But genuine problems were found and are fixed with dhcp-3.0.1-32_FC3, which can be downloaded from:
http://people.redhat.com/~jvdias/DHCP/FC3/3.0.1-32_FC3/i386
Please test this version and let me know if it fixes the problem - it certainly does on the machine on which I was able to reproduce it.
Thank you!

-- Additional comment from wrsturm on 2005-02-08 19:58 EST --
That seems to have done it. Tried a few restarts (4), a reboot, then a bunch more restarts (10) without an issue.

Thanks.

-- Additional comment from jvdias on 2005-02-10 10:27 EST --
I contacted the upstream ISC DHCP maintainer on this issue, and ISC have agreed to fix this in the next release.

But they pointed out that the subnet declaration:

subnet 68.145.239.64 netmask 255.255.255.255 {}

is what causes the problem, as a 32-bit netmask was never envisioned to be used here (it is not forbidden in the documentation, but it doesn't make any sense).

I think what you are trying to achieve is to get DHCP to ignore the interface with address 68.145.239.64? This would be achieved by omitting the 68.145.239.64 subnet declaration altogether. Yes, dhcpd will emit a message about "No subnet declaration for xxxx (68.145.239.64)", but this message is harmless - that interface will still be ignored.
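
To check an existing configuration for the kind of declaration ISC pointed at, a grep along the following lines may help. This is a minimal sketch, assuming each subnet declaration starts on its own line with the netmask on the same line; find_host_subnets is a hypothetical helper name, not something shipped with dhcp.

```shell
#!/bin/sh
# find_host_subnets FILE
# Print, with line numbers, every subnet declaration in FILE that uses
# a 32-bit netmask (255.255.255.255). Hypothetical helper; it only
# matches declarations written on a single line.
find_host_subnets() {
    grep -nE '^[[:space:]]*subnet[[:space:]]+[0-9.]+[[:space:]]+netmask[[:space:]]+255\.255\.255\.255' "$1"
}

# Usage (the configuration path is an assumption and may differ):
#   find_host_subnets /etc/dhcpd.conf
```

Any line it prints is a candidate for removal as described above. grep exits non-zero when nothing matches, so the function can also be used as a condition in a script.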

By default, dhcpd will bind to address 0.0.0.0 (the "ANY" address) on the interface for which it has a subnet declaration. Using the 'local-address' option makes it bind to a specific address - so you could specify 'local-address 10.0.0.1;' and dhcpd would bind ONLY to address 10.0.0.1 on the 10.0.0.0/24 interface.

-- Additional comment from wrsturm on 2005-02-10 21:15 EST --
Yep. That's what I was trying to do. I have made the changes here and will 'live' with the error message (until I forget why I did this). Any day now. :-)

-- Additional comment from marius.andreiana on 2005-08-20 03:20 EST --
Closing as errata
fixed with dhcp-3.0.1-40_EL4+
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0114.html