Bug 506626 - dhclient segfaults in add_timeout() after DHCPDECLINE
Summary: dhclient segfaults in add_timeout() after DHCPDECLINE
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: dhcp
Version: 11
Hardware: i386
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: David Cantrell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-06-18 02:04 UTC by Warren Togami
Modified: 2009-08-06 02:55 UTC

Fixed In Version: 4.1.0-22.fc11
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-07-11 17:04:28 UTC
Type: ---
Embargoed:


Attachments
dhclient (533.59 KB, application/octet-stream)
2009-06-26 02:57 UTC, David Cantrell
fragment of /var/log/messages showing dhclient crash (3.95 KB, text/plain)
2009-06-26 21:59 UTC, John Heidemann

Description Warren Togami 2009-06-18 02:04:15 UTC
dhcp-4.1.0-20.fc11.i586
dracut a57acf272116828b93c684deb7c7c5ccba7ee46f

dhclient in a dracut initrd is segfaulting.  Somehow the initrd gets wedged into a state where the segfault is 100% reproducible, even in the subsequent emergency shell with a plain 'dhclient eth0'.  Any idea what could be going on here?

Program terminated with signal 11, Segmentation fault.
#0  0x0806fd4d in add_timeout (when=0x4a3986b2, where=0x8056a20 <send_discover>, what=0x95fc5d0, ref=0, unref=0) at dispatch.c:143
143        q -> when . tv_sec = when -> tv_sec;
(gdb) bt
#0  0x0806fd4d in add_timeout (when=0x4a3986b2, where=0x8056a20 <send_discover>, what=0x95fc5d0, ref=0, unref=0) at dispatch.c:143
#1  0x08056dd7 in state_init (cpp=0x95fc5d0) at dhclient.c:1213
#2  0x080580f8 in bind_lease (client=0x95fc5d0) at dhclient.c:1494
#3  0x08058b72 in dhcpack (packet=0x95fcef8) at dhclient.c:1460
#4  0x080547cf in dhcp (packet=0x95fcef8) at dhclient.c:1693
#5  0x0807ae63 in do_packet (interface=0x95d4608, packet=0xbfdf25a4, len=300, from_port=17152, from={len = 4, iabuf = "\254\37d\376a)\n\b\2\0\0\0\0\0\0"}, hfrom=0xbfdf35ba) at options.c:3732
#6  0x0806db8f in got_one (h=0x95d4608) at discover.c:1393
#7  0x080a763e in omapi_one_dispatch (wo=0x0, t=0xbfdf39c8) at dispatch.c:473
#8  0x0806fed0 in dispatch () at dispatch.c:92
#9  0x0805aa2c in main (argc=2, argv=0xbfdf3c54) at dhclient.c:971
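
A detail in the backtrace above may be worth a second look: the when argument, 0x4a3986b2, is an odd value for a pointer in this process (compare 0x95fc5d0 and 0xbfdf25a4 in the other frames), but it is a plausible Unix timestamp. The Python sketch below only performs that conversion. The helper name as_unix_time is made up for illustration, and the reading that a time value, rather than a valid pointer to a struct timeval, reached add_timeout() is an inference from the numbers, not something confirmed in this report.

```python
from datetime import datetime, timezone


def as_unix_time(value: int) -> datetime:
    """Interpret an integer as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


# `when` argument taken from frame #0 of the backtrace (hypothetical reading:
# treat it as a time value instead of an address).
when_arg = 0x4A3986B2
print(hex(when_arg), "->", as_unix_time(when_arg).isoformat())
```

The result is a time on 2009-06-18 (UTC), the same day this report was filed. That would be consistent with the faulting statement at dispatch.c:143, which dereferences when to read tv_sec, being handed a number that is not a usable address.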

Comment 1 Warren Togami 2009-06-18 02:17:28 UTC
sh-4.0# dhclient -v eth0
Internet Systems Consortium DHCP Client 4.1.0
Copyright 2004-2008 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

Listening on LPF/eth0/52:54:00:12:34:56
Sending on   LPF/eth0/52:54:00:12:34:56
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 5
DHCPOFFER from 172.31.100.254
DHCPREQUEST on eth0 to 255.255.255.255 port 67
DHCPACK from 172.31.100.254
DHCPDECLINE on eth0 to 255.255.255.255 port 67
dhclient[616]: segfault at 4a39a266 ip 0806fd4d sp bfa4eed0 error 4 in dhclient[8048000+84000]

DHCPDECLINE?  Haven't seen that before.

Identical dhcpd.conf as a working configuration.  Something else in the initrd image is causing it to DHCPDECLINE and segfault?
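
The kernel's segfault line above carries more information than it first appears. As a rough aid for reading it, the Python sketch below splits the line into its fields and decodes the error value using the usual x86 page-fault error code bits (bit 0: page was present, bit 1: access was a write, bit 2: fault came from user mode). The function name parse_segfault and the dictionary keys are made up for illustration, and the regular expression is only an assumption based on the line shown here.

```python
import re

# Pattern for the kernel's user-space segfault message as it appears in this log.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line: str) -> dict:
    """Split a segfault log line into numeric fields and decode the error bits."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    addr, ip, err, base = (int(m[k], 16) for k in ("addr", "ip", "err", "base"))
    return {
        "pid": int(m["pid"]),
        "fault_address": addr,          # address whose access faulted
        "ip": ip,                       # instruction pointer at the fault
        "offset_in_object": ip - base,  # ip relative to the mapping's start
        "page_present": bool(err & 1),  # False: no page mapped at that address
        "write": bool(err & 2),         # False: the access was a read
        "user_mode": bool(err & 4),     # True: fault happened in user space
    }


LINE = ("dhclient[616]: segfault at 4a39a266 ip 0806fd4d sp bfa4eed0 "
        "error 4 in dhclient[8048000+84000]")
for key, value in parse_segfault(LINE).items():
    print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Read this way, error 4 is a user-mode read from an address with no page mapped, and the ip value 0806fd4d is the same address gdb reports for frame #0 in add_timeout() in the description.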

Comment 2 Warren Togami 2009-06-18 13:50:32 UTC
Ah, there was a bug in dhclient-script causing it to exit non-zero.

So the bug is something like, "dhclient segfaults after DHCPDECLINE after dhclient-script returns non-zero"?

Comment 3 David Cantrell 2009-06-18 14:44:02 UTC
DECLINE in dhclient-script will be handled by the default case, which logs a message about an unhandled state and then calls exit_with_hooks with exit code 1.

Can you set ulimit to unlimited for core files, run dhclient again, and send me the core file?

Comment 4 Warren Togami 2009-06-18 16:52:51 UTC
http://wtogami.fedorapeople.org/temp/dhclient-segfault-core
This is the core of the above backtrace.

Comment 5 David Cantrell 2009-06-26 02:57:31 UTC
Created attachment 349502
dhclient

Warren, can you try out the attached dhclient and see if it solves the problem you are seeing?  If it crashes, can you attach a core dump?

Thanks.

Comment 6 John Heidemann 2009-06-26 21:00:53 UTC
David, the dhclient in comment #5 Works For Me (also f11.i586).
Although should there, perhaps also, be a patch to dhclient-script to make it return 0?

Also, this bug is reasonably serious: it breaks networking with NetworkManager for me.  (For some reason, though, ifup eth0 doesn't segfault.)

Comment 7 David Cantrell 2009-06-26 21:41:49 UTC
(In reply to comment #6)
> David, the dhclient in comment #5 Works For Me (also f11.i586).
> Although should there, perhaps also, be a patch to dhclient-script to make it
> return 0?

Why?

> Also, this bug is reasonably serious: it breaks networking with NetworkManager
> for me.  (For some reason, though, ifup eth0 doesn't segfault.)  

Really?  I'm not having any problems here.  Is it broken in F-11 or rawhide?  I'll get this patch in to rawhide, but let me know if it's in F-11 and I'll roll an update there.  Surprised I haven't heard from anyone else.

Comment 8 John Heidemann 2009-06-26 21:55:31 UTC
Re: comment #7:

>> Although should there, perhaps also, be a patch to dhclient-script to make it
>> return 0?
>
>Why?

Comment #2 said
"...there was a bug in dhclient-script causing it to exit non-zero..."

I assume if dhclient-script has a bug it should get fixed.

Although later comments suggest returning 1 is a feature, I guess.

And regardless, having dhclient not segfault is a Good Thing.

>> Also, this bug is reasonably serious: it breaks networking with NetworkManager
>> for me.  (For some reason, though, ifup eth0 doesn't segfault.)  
>
>Really?  I'm not having any problems here.  Is it broken in F-11 or rawhide? 
>I'll get this patch in to rawhide, but let me know if it's in F-11 and I'll
>roll an update there.  Surprised I haven't heard from anyone else.  

I have a fresh install of F11 on an EEE PC 1000 netbook.  Although ifup brings up the ethernet, the NetworkManager menu fails to bring up either wireless or wired (I'll attach part of /var/log/messages).

My other laptop running x86_64 upgraded to f11 from f10 doesn't have this problem, so not clear how general it is.

Comment 9 John Heidemann 2009-06-26 21:59:31 UTC
Created attachment 349615
fragment of /var/log/messages showing dhclient crash

Comment 10 Warren Togami 2009-06-26 22:13:44 UTC
This isn't a standard dhclient-script.  The bug was in an obscure part of dhclient where it crashed instead of handling an error condition.

Comment 11 David Cantrell 2009-06-26 23:54:45 UTC
John & Warren,

Based on Warren's core dump and John's log file, it looks like you are both seeing the exact same problem, just in different places (Warren in the dracut stuff, John in everyday use on an Eee PC).  The similarity is that both are i586 instances (correct me if I'm wrong).  x86_64 appears to work just fine.  I'm not sure why we're getting NULL passed to add_timeout() and only on i586, but what I'll do is roll an F-11 update and a new build for rawhide that includes the patch I used to build the new dhclient in this BZ.

Thanks for the reports.  I'll leave this BZ open until I publish the update for F-11.

Comment 12 David Cantrell 2009-06-26 23:56:47 UTC
Also, for clarification, the dhclient-script that comes with dhclient is only used when you run dhclient directly (or via ifup).  NetworkManager provides its own dhclient-script replacement (/usr/libexec/nm-dhcp-client.action).  In either case, the problem was in dhclient and not the script executed by dhclient.

Comment 13 Fedora Update System 2009-06-27 00:27:19 UTC
dhcp-4.1.0-21.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/dhcp-4.1.0-21.fc11

Comment 14 Fedora Update System 2009-06-30 21:32:36 UTC
dhcp-4.1.0-22.fc11 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update dhcp'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-7128

Comment 15 Fedora Update System 2009-07-11 17:04:17 UTC
dhcp-4.1.0-22.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Artur Roszczyk 2009-07-30 23:59:23 UTC
Unfortunately I think that this has not been completely fixed. I've just installed new Fedora 11 on my laptop (fujitsu amilo, cd 1.6ghz 1gb ram). During the LiveCD session everything was OK: I was able to connect via wifi and ethernet. After installation, however, my network manager didn't work. Whenever I tried to connect, I got disconnected, but if I then ran the following in a console:

killall -9 dhclient
dhclient -v eth0

I got connected.

In logs I found this:

Jul 31 01:33:13 sevos-laptop kernel: dhclient[3913]: segfault at 4a722dba ip 0806f9cd sp bfbe1bd0 error 4 in dhclient[8048000+84000]

I am a bit confused. I tried installing versions 23, 22 and 20 of dhcp and dhclient, but none of them made a difference.

Sorry if I posted in the wrong place. It is my first message on this bug tracker. I hope that it can be resolved. I want to switch my office to Fedora ;)

Greetings,
Artur Roszczyk

full log below:
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) starting connection 'Auto eth0'
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  (eth0): device state change: 3 -> 4 (reason 0)
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 1 of 5 (Device Prepare) scheduled...
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 1 of 5 (Device Prepare) started...
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 2 of 5 (Device Configure) scheduled...
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 1 of 5 (Device Prepare) complete.
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 2 of 5 (Device Configure) starting...
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  (eth0): device state change: 4 -> 5 (reason 0)
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 2 of 5 (Device Configure) successful.
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 3 of 5 (IP Configure Start) scheduled.
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 2 of 5 (Device Configure) complete.
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 3 of 5 (IP Configure Start) started...
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  (eth0): device state change: 5 -> 7 (reason 0)
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Beginning DHCP transaction.
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  dhclient started with pid 3913
Jul 31 01:33:10 sevos-laptop NetworkManager: <info>  Activation (eth0) Stage 3 of 5 (IP Configure Start) complete.
Jul 31 01:33:10 sevos-laptop dhclient: Internet Systems Consortium DHCP Client 4.1.0
Jul 31 01:33:10 sevos-laptop dhclient: Copyright 2004-2008 Internet Systems Consortium.
Jul 31 01:33:10 sevos-laptop dhclient: All rights reserved.
Jul 31 01:33:10 sevos-laptop dhclient: For info, please visit http://www.isc.org/sw/dhcp/
Jul 31 01:33:10 sevos-laptop dhclient: 
Jul 31 01:33:10 sevos-laptop dhclient: Listening on LPF/eth0/00:03:0d:47:e2:18
Jul 31 01:33:10 sevos-laptop dhclient: Sending on   LPF/eth0/00:03:0d:47:e2:18
Jul 31 01:33:10 sevos-laptop dhclient: Sending on   Socket/fallback
Jul 31 01:33:13 sevos-laptop dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
Jul 31 01:33:13 sevos-laptop dhclient: DHCPOFFER from 192.168.0.6
Jul 31 01:33:13 sevos-laptop dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Jul 31 01:33:13 sevos-laptop dhclient: DHCPACK from 192.168.0.6
Jul 31 01:33:13 sevos-laptop dhclient: DHCPDECLINE on eth0 to 255.255.255.255 port 67
Jul 31 01:33:13 sevos-laptop kernel: dhclient[3913]: segfault at 4a722dba ip 0806f9cd sp bfbe1bd0 error 4 in dhclient[8048000+84000]
Jul 31 01:33:13 sevos-laptop NetworkManager: <WARN>  dhcp_watch_cb(): dhcp client died abnormally
Jul 31 01:33:13 sevos-laptop NetworkManager: <info>  (eth0): device state change: 7 -> 9 (reason 17)
Jul 31 01:33:13 sevos-laptop NetworkManager: <info>  Marking connection 'Auto eth0' invalid.
Jul 31 01:33:13 sevos-laptop NetworkManager: <info>  Activation (eth0) failed.
Jul 31 01:33:13 sevos-laptop NetworkManager: <info>  (eth0): device state change: 9 -> 3 (reason 0)
Jul 31 01:33:13 sevos-laptop NetworkManager: <info>  (eth0): deactivating device (reason: 0).
Jul 31 01:33:13 sevos-laptop NetworkManager: <WARN>  check_one_route(): (eth0) error -34 returned from rtnl_route_del(): Sucess#012
Jul 31 01:33:13 sevos-laptop avahi-daemon[1374]: Withdrawing address record for 192.168.0.3 on eth0.
Jul 31 01:33:13 sevos-laptop avahi-daemon[1374]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.3.
Jul 31 01:33:13 sevos-laptop avahi-daemon[1374]: Interface eth0.IPv4 no longer relevant for mDNS.
Jul 31 01:33:14 sevos-laptop ntpd[1667]: Deleting interface #18 eth0, 192.168.0.3#123, interface stats: received=175, sent=176, dropped=0, active_time=12598 secs

Comment 17 David Cantrell 2009-08-06 02:55:22 UTC
(In reply to comment #16)
> Unfortunately I think that this has not been completely fixed. I've just
> installed new Fedora 11 on my laptop (fujitsu amilo, cd 1.6ghz 1gb ram).

Did you also install all of the Fedora updates, specifically the dhcp-4.1.0-22.fc11 update?  The fix for this issue is available as an update to Fedora 11.  It's not part of the base install.

