Bug 491157

Summary: dhclient wlan0 fails after yum update from 2.6.27.12-170.2.5 to 2.6.27.19-170.2.35
Product: [Fedora] Fedora
Reporter: Frank Middleton <f.middleton>
Component: dhcp
Assignee: David Cantrell <dcantrell>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 10
CC: dcantrell, matt, rrauenza, wwoods
Hardware: x86_64
OS: Linux
Fixed In Version: 4.0.0-35.fc10
Doc Type: Bug Fix
Last Closed: 2009-05-25 21:21:30 UTC

Description Frank Middleton 2009-03-19 16:04:57 UTC
User-Agent:       Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.9.1b3) Gecko/20090305 Firefox/3.1b3

dhclient eth0 works, but dhclient wlan0 no longer works when accessing the same AP (D-Link DI-624). Manually assigning an IP address to wlan0 and editing the routing tables and resolv.conf allows WiFi to work perfectly. There are 3 other DHCP clients (a webcam and 2 VoIP boxes) on the network and they all connect with no problems. This has only been a problem since doing a yum update on 2/23, so this seems to be a regression; I asked on the b43 and Fedora mailing lists and got no useful feedback.


Reproducible: Always

Steps to Reproduce:
0. Make sure NM isn't running
1. Attach to the AP
2. Run wpa_supplicant if needed
3. Run dhclient wlan0
Actual Results:  
# dhclient -v -d wlan0
Internet Systems Consortium DHCP Client 4.0.0
Copyright 2004-2007 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

Listening on LPF/wlan0/00:90:96:7b:45:f1
Sending on   LPF/wlan0/00:90:96:7b:45:f1
Sending on   Socket/fallback
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 8
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 12
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 15
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 19
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 7
No DHCPOFFERS received.
No working leases in persistent database - sleeping.
etc... 

default gateway removed from routing tables (this is already reported in another bug)

Expected Results:  
DHCPOFFERS, an assigned IP address, updated routing tables and resolv.conf

dmesg shows nothing unusual

# yum install dhcpcd
...
Package 12:dhclient-4.0.0-33.fc10.x86_64 already installed and latest version
Nothing to do 

I have no other WiFi-enabled computer, so I can't run tcpdump to see if anything is actually being sent. dhclient takes the interface down and up, so it is very hard to snoop on the same machine. The AP/router is configured to offer specific and different addresses based on MAC to the wired and wireless DHCP requests. This may not be a problem with dhclient, although it certainly seems that way. The computer is a single-core 1.8 GHz AMD Athlon with 0.5 GB memory.

Comment 1 David Cantrell 2009-04-17 01:31:50 UTC
I'm curious if the client is actually reaching whatever system is the DHCP server.  Where is the DHCP server running?

If the DHCP server is running on the D-Link device, I have very little confidence that it's working properly.  Every D-Link device I've owned (hub, switch, router, wireless adapter, Ethernet adapter, and so on) has failed in bizarre and unpredictable ways.  I no longer trust anything with a D-Link name on it.

If you can run dhcpd on another host on your network and see if it answers the client's requests, that could help pinpoint the problem.

Comment 2 Frank Middleton 2009-04-17 13:20:15 UTC
Yes, D-Link is junk (anyone want an 8-2=6 port hub?). However there are a lot of them out there! As it happens, I just got a Netgear WPN824NA to replace the D-Link (for other reasons), so I could try it with that, and I'll post here when I have done so. I plan to set up for network booting, and for that I need a DHCP server that can deal with PXE, so I will eventually do that - if it would help, I could bump up the priority of that project.

However - I'm not sure what the test with an alternate DHCP server would prove. If it works (and I bet it does), then there's still a problem with the D-Link. If it doesn't work, then I suppose you might suspect the client. But I did also buy a Netgear WG511 cardbus card (Atheros AR2414) and DHCP fails with that, too. I don't know if you can use a second WiFi device in the same machine to snoop on the other, but if it is possible, it might be worth a try.

Obviously, I think this is a regression. dhclient was working fine until I did the yum update back in February. The Linksys webcam still successfully gets its DHCP address over WiFi. The WG511 doesn't get its IP address, so it isn't a MAC address issue. If I knew how, I would try using an older version to see if that fixes the problem. If you like I can trash the D-Link now and you can close this bug (assuming dhclient works with the Netgear), or we can work on it while I still have the D-Link running. I am keen to get rid of it because it keeps rebooting and D-Link support has been utterly incapable of fixing the problem.

Comment 3 David Cantrell 2009-04-17 19:26:08 UTC
I regularly test dhclient in Fedora with soho routers, dnsmasq, and dhcpd running on different Fedora or RHEL boxes to ensure things still work properly.  Because this is the first report I've heard of there being a possible regression in the F-10 dhclient, I'm inclined to place the blame on either the D-Link device or your environment (firewall, etc?).

If dhclient is able to get a lease from the Netgear device, then I would say the problem is the D-Link device at that point.  If that's the case, there won't be anything I can really do since the problem is on the device side.

Additionally, you say you tried a Netgear WG511 cardbus card in the same system (I think it's the same system) and tried to get a lease from the D-Link device and that failed too.  That really leads me to think the D-Link device is just non-functional or at least not RFC compliant.  Your webcam working or even a Windows system working is not necessarily a good indicator, as DHCP clients on those devices rarely follow the RFCs completely.  Windows specifically has loose compliance policies to ensure Windows can work with on-market broken devices (such as your D-Link router...maybe).

So, what I'd like you to try is the Netgear device as the DHCP server and see if the Fedora system can get a lease from it.  If that works, I'll say the D-Link router is the issue and we'll note it here and close the bug as cantfix.

Thanks.

Comment 4 Frank Middleton 2009-04-19 14:59:00 UTC
It didn't work using any permutation of the Netgear AP, D-Link AP, B43 interface, and WG511 card.

After some very patient experimenting I found the reason seems to be that running dhclient always does the equivalent of ifconfig wlanX down. If you carefully run dhclient wlanX *first* and only then do ifconfig wlanX up, iwconfig wlanX essid ... (and wpa_supplicant if needed), in that order, an IP address may eventually get assigned, so dhclient does work, with both APs. I tested this by alternating wlan0 and wlan1 between the D-Link and Netgear, which are on different subnets.

This experimentation is made difficult because the wlan status as reported by iwconfig seems to lag the status in /var/log/messages by quite a long time. I suspect that this may be significant...

However, this is a regression because you used to be able to start the interface first and then run dhclient, and it would reliably get an IP address every time.

This is very reproducible and consistent behavior, so much so, I would guess everyone must (now) be doing it in that non-intuitive order. Anyway, it isn't the AP. Since there is a rather ugly (and slow) workaround, I suppose you could close this as wontfix, but the man pages should be changed to document the requirement that dhclient be started before the interfaces.
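
The ordering described above can be sketched as a small shell helper. This is a minimal sketch, not a tested fix: the function name wlan_dhcp_workaround, the run wrapper, and the DRY_RUN switch are hypothetical, and starting dhclient with -nw (so that it does not block while waiting for a lease) is an assumption, since the comment does not say how dhclient was kept running. If WPA is in use, wpa_supplicant would be started after the iwconfig step.

```shell
#!/bin/sh
# Sketch of the "dhclient first, interface second" ordering.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# wlan_dhcp_workaround IFACE ESSID   (hypothetical helper name)
wlan_dhcp_workaround() {
    iface=$1
    essid=$2

    # 1. Start dhclient first; its script takes the interface down.
    #    -nw (do not wait for a lease before daemonizing) is an assumption.
    run dhclient -nw "$iface"

    # 2. Only then bring the interface up.
    run ifconfig "$iface" up

    # 3. Associate with the access point.
    run iwconfig "$iface" essid "$essid"
}
```

With DRY_RUN=1 set, calling wlan_dhcp_workaround wlan0 myessid prints the three commands in order without touching the interface.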

Comment 5 David Cantrell 2009-04-20 20:48:19 UTC
I have not seen any other reports of this issue, probably because most people with wireless devices are using NetworkManager.  However, I would like dhclient to work with wireless devices without requiring NetworkManager.

How about the following change to /sbin/dhclient-script:

Index: dhclient-script
===================================================================
RCS file: /cvs/pkgs/rpms/dhcp/F-10/dhclient-script,v
retrieving revision 1.6
diff -u -p -r1.6 dhclient-script
--- dhclient-script	17 Apr 2009 21:10:07 -0000	1.6
+++ dhclient-script	20 Apr 2009 20:47:03 -0000
@@ -220,7 +220,6 @@ dhconfig() {
         # IP address changed.  Bringing down the interface will delete all
         # routes, and clear the ARP cache.
         ip -family inet addr flush dev ${interface} >/dev/null 2>&1
-        ip -family inet link set dev ${interface} down
     fi
 
     if [ "${reason}" = "BOUND" ] || [ "${reason}" = "REBOOT" ] ||

Does that make things work as they used to?
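
To see whether a given dhclient-script still contains the line that the diff above removes, a fixed-string grep is enough. A minimal sketch: the helper name has_link_down is hypothetical, and the default path is the /sbin/dhclient-script named above.

```shell
#!/bin/sh
# has_link_down [FILE]   (hypothetical helper name)
# Succeeds if FILE still contains the "link set ... down" line that the
# patch removes; fails if the line is absent (i.e. the file looks patched).
has_link_down() {
    script=${1:-/sbin/dhclient-script}
    # -F: match the text literally, so ${interface} is not treated as a pattern.
    grep -F -q -e 'ip -family inet link set dev ${interface} down' "$script"
}
```

For example, has_link_down && echo "unpatched" reports whether the installed script would still take the interface down.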

Comment 6 Frank Middleton 2009-04-20 21:46:50 UTC
Yes, it would seem that it does! I can live with applying this patch if the fix breaks nm - if I wanted a nm gui I'd use mswindows :-) 

Thank you so much! You and the b43 guys have done a great job.

Comment 7 David Cantrell 2009-04-20 21:50:43 UTC
It shouldn't break NM because it provides its own replacement for dhclient-script (/usr/libexec/nm-dhcp-client.action).

I'll generate an update for dhcp to include this fix.  Thanks for the feedback.

Comment 8 Fedora Update System 2009-04-21 00:08:29 UTC
dhcp-4.0.0-35.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/dhcp-4.0.0-35.fc10

Comment 9 Fedora Update System 2009-04-22 01:10:43 UTC
dhcp-4.0.0-35.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update dhcp'.
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-3863

Comment 10 Frank Middleton 2009-04-22 15:00:31 UTC
# yum --enablerepo=updates-testing update dhcp
Loaded plugins: refresh-packagekit
updates-testing                          | 2.3 kB     00:00     
updates-testing/primary_db               | 428 kB     00:02     
Setting up Update Process
No Match for argument: dhcp
Package(s) dhcp available, but not installed.
No Packages marked for Update
# yum list dhclient
Loaded plugins: refresh-packagekit
Installed Packages
dhclient.x86_64      

Should dhcp be installed?

# yum list  dhcp
Loaded plugins: refresh-packagekit
Available Packages
dhcp.x86_64                                       12:4.0.0-33.fc10                                       updates
# yum --enablerepo=updates-testing list dhcp
Loaded plugins: refresh-packagekit
Available Packages
dhcp.x86_64                                   12:4.0.0-34.fc10               

dhcp-4.0.0-35.fc10 seems to be missing anyway...
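
The yum output above shows dhclient installed and dhcp not installed, so the build to compare against the Fixed In Version (4.0.0-35.fc10) is that of dhclient. A minimal sketch of such a check follows; the helper name has_fix is hypothetical, and it only understands build strings of the form 4.0.0-N.fc10.

```shell
#!/bin/sh
# has_fix VERSION-RELEASE   (hypothetical helper name)
# Returns 0 if a 4.0.0-N.fc10 build has N >= 35, 1 if N < 35,
# and 2 if the string is not in that form.
has_fix() {
    case $1 in
        4.0.0-*.fc10) ;;
        *) return 2 ;;
    esac
    rel=${1#4.0.0-}     # strip the version prefix
    rel=${rel%.fc10}    # strip the dist tag
    case $rel in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$rel" -ge 35 ]
}
```

One way to feed it the installed build is has_fix "$(rpm -q --qf '%{VERSION}-%{RELEASE}' dhclient)".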

Comment 11 Fedora Update System 2009-05-25 21:21:25 UTC
dhcp-4.0.0-35.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Matt Ford 2009-05-26 18:15:13 UTC
I too have spotted the regression.  However the latest version of the RPM does *not* fix the issue.

My home router (a draytek) and my local coffee bar are now no longer able to serve me DHCP addresses.  However the more expensive routers at work are still able to serve addresses.

After a fresh install of F10 all DHCP requests work fine (at home and in coffee shop).  Running a yum update then breaks things for me.

I see bad behaviour from both manual dhcp requests and those that are generated by NetworkManager.

I don't see anything in the logs (with higher verbosity) that would help explain this behaviour.

What can I do to help debug this?

I also see the DHCP issue with F11 preview.

Comment 13 Matt Ford 2009-05-26 18:25:42 UTC
I'm running a Samsung NC10 (if that is any help)

Comment 14 Matt Ford 2009-05-26 18:54:15 UTC
Scouring the web I see a bunch of people having issues with dhclient and F10.  I'll try and post some links later.  Suggested causes include SELinux (I see no evidence of this myself) and incorrect MTUs from bad routers...

Comment 15 Matt Ford 2009-05-27 07:59:01 UTC
The issue doesn't seem to be with dhclient (at least not directly).  Rolling back the dhclient package to F10's original doesn't fix the issue.   Things look to break with the kernel upgrade as the original bug reporter suggested.  The NC10 uses the ath5k wireless drivers.
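
Since the kernel and the wireless driver are now the suspects, it may help for each report to record both. A minimal sketch, assuming the usual sysfs layout in which /sys/class/net/IFACE/device/driver is a symlink to the bound driver; the helper name wlan_driver and its optional second argument (an alternate sysfs root) are hypothetical.

```shell
#!/bin/sh
# wlan_driver IFACE [SYSFS_ROOT]   (hypothetical helper name)
# Prints the name of the kernel driver bound to IFACE, e.g. ath5k.
wlan_driver() {
    iface=$1
    sysfs=${2:-/sys}
    link=$sysfs/class/net/$iface/device/driver
    # The driver entry is a symlink; its last path component is the driver name.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

For example, echo "kernel $(uname -r), driver $(wlan_driver wlan0)" gives a one-line summary to paste into a comment.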

Comment 16 Rich Rauenzahn 2010-04-22 04:43:19 UTC
I'm having this problem with FC12.

Sometimes the wifi works, sometimes it gets into this mode in which it never gets a DHCP response. I've gone through three usb wifi dongles, the latest being 

T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ff(vend.) Sub=ff Prot=ff MxPS=64 #Cfgs=  1
P:  Vendor=0ace ProdID=1215 Rev=48.10
S:  Manufacturer=ZyDAS
S:  Product=USB2.0 WLAN
C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 4 Cls=ff(vend.) Sub=00 Prot=00 Driver=zd1211rw
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=125us
E:  Ad=04(O) Atr=03(Int.) MxPS=  64 Ivl=125us

This is in a corporate environment -- I'm not sure what kind of access points they are using.

$ sudo dhclient -v -d wlan0
Internet Systems Consortium DHCP Client 4.1.1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/wlan0/00:02:72:8a:07:b3
Sending on   LPF/wlan0/00:02:72:8a:07:b3
Sending on   Socket/fallback
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 7
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 9
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 14
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 17
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 12
DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 2
No DHCPOFFERS received.
No working leases in persistent database - sleeping.

The lease time is 1800, so I don't think it is a case of running out of IPs to assign.

$ sudo iwlist wlan0 scan
wlan0     Scan completed :
          Cell 01 - Address: 00:0B:0E:45:85:00
                    Channel:6
                    Frequency:2.437 GHz (Channel 6)
                    Quality=23/100  Signal level=23/100  
                    Encryption key:off
                    ESSID:"XXXXXX"
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s
                              11 Mb/s; 12 Mb/s; 18 Mb/s
                    Bit Rates:24 Mb/s; 36 Mb/s; 48 Mb/s; 54 Mb/s
                    Mode:Master
                    Extra:tsf=000000011b905b88
                    Extra: Last beacon: 825ms ago
                    IE: Unknown: 0005766D616972
                    IE: Unknown: 010882848B0C12961824
                    IE: Unknown: 030106
                    IE: Unknown: 0706555320010B16
                    IE: Unknown: 0B0500000F0000
                    IE: Unknown: 43020000
                    IE: Unknown: 2A0100
                    IE: Unknown: 32043048606C
                    IE: Unknown: DD22000B0E0200000000160C02A104A20BA30CA412A516A618A624A730AB48AE60B46CB8
                    IE: Unknown: DD2E000B0E03001C5F756ABE00C86148CF119BC39E1B5E8BF12786EC33FE1AB8223104133D23C04A524AF70C1C527BA1
                    IE: Unknown: DD180050F2020101010003A4000027A4000042435E0062322F00