Bug 147375 - Dhcpd will fail to start up, exits with glibc error
Summary: Dhcpd will fail to start up, exits with glibc error
Alias: None
Product: Fedora
Classification: Fedora
Component: dhcp
Version: 3
Hardware: i386 Linux
Target Milestone: ---
Assignee: Jason Vas Dias
QA Contact:
Depends On:
Blocks: 170767 170769
Reported: 2005-02-07 18:32 UTC by Warren Sturm
Modified: 2007-11-30 22:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2005-08-20 07:20:08 UTC

Attachments
dhcpd core file (41.39 KB, application/octet-stream)
2005-02-07 19:18 UTC, Warren Sturm
dhcp script log (685 bytes, text/plain)
2005-02-07 19:19 UTC, Warren Sturm
dhcp trace log (2.08 KB, text/plain)
2005-02-07 19:20 UTC, Warren Sturm

Description Warren Sturm 2005-02-07 18:32:03 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
On reboot or on service start/restart the dhcpd startup will emit an
error message like the one below:

Starting dhcpd: *** glibc detected *** free(): invalid pointer:
0x0867da10 ***
Internet Systems Consortium DHCP Server V3.0.1
Copyright 2004 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

It does not happen every time, but it may take several invocations of
the startup script to get dhcpd started.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. service dhcpd restart  - it may work or it may take several attempts
to get the service started.  It will also happen if dhcpd is invoked

Actual Results:  # service dhcpd start
Starting dhcpd: *** glibc detected *** free(): invalid pointer:
0x083bca10 ***
Internet Systems Consortium DHCP Server V3.0.1
Copyright 2004 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

Expected Results:  # service dhcpd start
Starting dhcpd:                                            [  OK  ]

Additional info:

Fedora Core 3 - All current updates for installed software

AMD 2700+ - 1GB Ram 240GB Disk (160GB SATA - 80 GB IDE)
1 DVD writer, 1 CD/RW Writer

Running as NAT firewall/workstation to home network.


I have not yet tried to rebuild SRPM on this system to see if it makes
any difference.

Comment 1 Jason Vas Dias 2005-02-07 18:56:25 UTC
Is this an AMD 64-bit or a 32-bit machine?

I am not able to reproduce this problem on an i386 platform.

Please append the complete output of the dhcpd run attempt when 
it fails - try this: 

# cd /var/lib/dhcp
# script /tmp/dhcpd.log
# ulimit -c unlimited
# dhcpd -d -f -tf /tmp/dhcpd.trace.log
  ( wait for problem, press CTRL-C)
# exit
# ls -l core*
# gzip core*

and then append the /tmp/dhcpd.*.log and any core.*.gz files to this
bug or send them to me - thanks.

Comment 2 Warren Sturm 2005-02-07 19:18:09 UTC
Created attachment 110739
dhcpd core file

Comment 3 Warren Sturm 2005-02-07 19:19:27 UTC
Created attachment 110740
dhcp script log

Comment 4 Warren Sturm 2005-02-07 19:20:01 UTC
Created attachment 110741
dhcp trace log

Comment 5 Jason Vas Dias 2005-02-07 19:37:23 UTC
Thanks! I am now investigating as top priority. 
Does the system have an AMD 64-bit or AMD 32-bit CPU ?
Does it have more than one CPU / hyperthreading enabled ?

Comment 6 Jason Vas Dias 2005-02-07 20:28:37 UTC
Please can you append the output of these commands to this bug:

# uname -a
# rpm -q dhcp --queryformat '%{ARCH} %{BUILDHOST}\n'

Thank you!

Comment 7 Jason Vas Dias 2005-02-07 20:53:24 UTC
The core file you sent is a 32-bit core file, but it is from an
executable which was linked to the glibc '32-bit compatibility mode'
/lib/tls/libc.so.6 from  glibc32-2.3.3-68, which is only installed 
on 64 bit systems.  
I've searched the AMD website for '2700+' but can find no data as
to whether the processor is 64-bit or 32-bit .
If you have a 64-bit machine ('uname -m' outputs x86_64), then you
should install the dhcp-3.0.1-30_FC3.x86_64.rpm, not the 32-bit
dhcp-3.0.1-30_FC3.i386.rpm - this incompatibility could be the source
of your problem. 
Please also do an 'rpm -qf `readlink /lib/tls/libc.so.6`' and tell me
which package is output - if  glibc32-2.3.3-68, this could also be 
a problem because glibc was upgraded to 2.3.4-2 while the
compatibility library is at glibc-2.3.3, and dhcp-3.0.1-30 was
compiled for  glibc-2.3.4. 
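
The architecture comparison suggested above could be sketched roughly as
follows. This is only an illustration: the function name arch_matches is
hypothetical, and it treats all 32-bit x86 variants as one family, which
is a simplification (an i686 package would not run on a true i386 CPU).

```shell
# Hypothetical helper: compare a machine type (as printed by `uname -m`)
# with a package architecture (as printed by rpm's %{ARCH} query tag).
# Assumption: all 32-bit x86 names are treated as mutually compatible.
arch_matches() {
    machine=$1    # e.g. i686 or x86_64
    pkg_arch=$2   # e.g. i386, athlon, x86_64 or noarch
    case "$machine:$pkg_arch" in
        x86_64:x86_64)          echo match ;;     # native 64-bit package
        i?86:i?86|i?86:athlon)  echo match ;;     # 32-bit package on 32-bit machine
        *:noarch)               echo match ;;     # architecture-independent package
        *)                      echo mismatch ;;  # e.g. i386 package on x86_64
    esac
}
```

On a live system it might be called as
arch_matches "$(uname -m)" "$(rpm -q dhcp --queryformat '%{ARCH}')".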

Comment 8 Warren Sturm 2005-02-07 21:42:25 UTC
It is a 32-bit, single-CPU machine with no hyperthreading.

uname -a
2.6.10-1.760_FC3 #1 Wed Feb 2 00:14:23 EST 2005 i686 athlon i386 GNU/Linux
It is an AMD Athlon 32-bit chip that, in marketing terms, performs as
fast as an equivalent Intel chip running at a clock speed of 2700 MHz.
The actual clock speed is 2170.352 MHz.

rpm -q dhcp --queryformat '%{ARCH} %{BUILDHOST}\n'
i386 tweety.build.redhat.com

uname -m

A slightly different variant of the command you sent:
rpm -qf /lib/`readlink /lib/tls/libc.so.6`

Comment 9 Warren Sturm 2005-02-07 22:12:08 UTC
Here is /proc/cpuinfo:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 8
model name      : AMD Athlon(tm) XP 2700+
stepping        : 1
cpu MHz         : 2170.352
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips        : 4292.60

Comment 10 Jason Vas Dias 2005-02-08 00:51:07 UTC
I've tried running dhcpd in playback mode with the trace file and 
using your exact configuration and lease file hundreds of times in 
a loop with no exit / core generated, on an Intel i686 system .
I'm looking for an Athlon system on which to test, so far without
success - so it looks like yours is the only system on which this
problem can be reproduced at the moment.
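
For anyone wanting to repeat this kind of test, an intermittent failure
like this one can be measured with a small loop. The sketch below is an
assumption about how one might do it, not the script actually used; the
function name count_failures is hypothetical, and it relies on the
command returning a non-zero exit status when dhcpd aborts.

```shell
# Hypothetical helper: run a command a given number of times and print
# how many of those runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift                      # remaining arguments are the command to run
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, count_failures 20 service dhcpd restart would print the
number of failed restarts out of 20 attempts.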

Please try the following:
1. Replace the glibc-2.3.4-2.fc3.i686.rpm with 
   glibc-2.3.4-2.fc3.i386.rpm :
   # rpm -Uvh --force glibc-2.3.4-2.fc3.i386.rpm 
   (both the i386 and i686 RPMs can be downloaded from:
   Reboot and see if the problem still occurs, repeating steps in
   Comment #1 .
   If it does not, it would seem there is a problem with the i686
   optimized glibc on Athlon and I will take this up with the 
   glibc / anaconda developers .
   If it does, then there is a dhcpd problem - you can go ahead and
   put back the i686 glibc:

   # rpm -Uvh --force glibc-2.3.4-2.fc3.i686.rpm 
   The core file appears to be corrupt - I cannot obtain any 
   useful data from it on our systems here:
$ gdb /usr/sbin/dhcpd core.31535
GNU gdb Red Hat Linux (6.1post-1.20040607.43rh) 
Core was generated by `dhcpd -d -f -tf /tmp/dhcpd.trace.log'.
Program terminated with signal 6, Aborted.
Loaded symbols for /usr/sbin/dhcpd
#0  0x00ee27a2 in ?? ()
(gdb) where
#0  0x00ee27a2 in ?? ()
#1  0x00991955 in ?? ()
#2  0x00000000 in ?? ()
(gdb) quit
    If the gdb "where" command for the core file shows anything
    different on your system, please append it to this bug.
    The core was generated by an abort in glibc, which is part
    of new memory validation routines with which there may be 
    a problem on the Athlon .

2.  If you still get an exit with core dump, please download this
    source RPM :
    and build it with:
    # rpmbuild --rebuild dhcp-3.0.1-30_FC3.src.rpm
    This will build an unstripped debugging version of DHCP .
    Install the RPMS produced in /usr/src/redhat/RPMS/i386 
    and reproduce the problem. 
    The core file should then not be corrupt and doing a 'gdb where'
    will tell us what is causing the problem - please append the
    core file or output of gdb 'where' command generated as above.

Thank You!

Comment 11 Jason Vas Dias 2005-02-08 01:26:46 UTC
I've finally found a dual-processor Athlon on which I can reproduce
the problem. It would appear to be a glibc bug. You needn't gather
the information requested in the above comment. Installing the
i386 glibc may prevent the problem from occurring. I'm continuing
to investigate - thanks.

Comment 12 Jason Vas Dias 2005-02-08 23:48:13 UTC
I've found the problem. It was a memory corruption issue latent in
all previous dhcp versions, which just happened to trigger the new
glibc / gcc 'FORTIFY_SOURCE' runtime memory validation checks ONLY
on the Athlon FC3 platform - weird! But genuine problems were found
and are fixed with dhcp-3.0.1-32_FC3, which can be downloaded from :
Please test this version and let me know if it fixes the problem -
it certainly does on the machine on which I was able to reproduce it.
Thank you!

Comment 13 Warren Sturm 2005-02-09 00:58:25 UTC
That seems to have done it.  Tried a few restarts (4), a reboot, then a
bunch more restarts (10) without an issue.  Thanks.

Comment 14 Jason Vas Dias 2005-02-10 15:27:48 UTC
 I contacted the upstream ISC DHCP maintainer on this issue, and ISC
 have agreed to fix this in the next release .

 But they pointed out that the subnet declaration:
   subnet netmask {}
 is what causes the problem, as a 32-bit netmask was never 
 envisioned to be used here (but it is not forbidden in the
 documentation - it just doesn't make any sense) .
 I think what you are trying to achieve is to get DHCP to ignore
 the interface with address ?  This would be 
 achieved by omitting the subnet declaration altogether.
 Yes, dhcp will emit a message about 
  "No subnet declaration for xxxx (" 
 but this message is harmless - that interface will still be ignored.
 By default, dhcpd will bind to address (the "ANY") address
 on the interface for which it has a subnet declaration. Using the
 'local-address' option makes it bind to a specific address - so
 you could specify
 and dhcpd would bind ONLY to address on the 10.0.0/24 
 interface .

Comment 15 Warren Sturm 2005-02-11 02:15:42 UTC
Yep.  That's what I was trying to do.  I have made the changes here and
will 'live' with the error message (until I forget why I did this).

Any day now.  :-)  

Comment 16 Marius Andreiana 2005-08-20 07:20:08 UTC
Closing as errata
