Bug 626427
Summary: | Frequent dhclient DHCPREQUEST messages in syslog | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jay Turner <jturner> | ||||||||||||
Component: | NetworkManager | Assignee: | Dan Williams <dcbw> | ||||||||||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||||||||
Severity: | low | Docs Contact: | |||||||||||||
Priority: | medium | ||||||||||||||
Version: | 6.0 | CC: | borgan, ddumas, jklimes, jpopelka, ovasik, rkhan, srevivo, tpelka | ||||||||||||
Target Milestone: | rc | ||||||||||||||
Target Release: | --- | ||||||||||||||
Hardware: | All | ||||||||||||||
OS: | Linux | ||||||||||||||
Whiteboard: | |||||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||
Clone Of: | Environment: | ||||||||||||||
Last Closed: | 2013-02-19 18:37:53 UTC | Type: | --- | ||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||
Documentation: | --- | CRM: | |||||||||||||
Verified Versions: | Category: | --- | |||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
Embargoed: | |||||||||||||||
Bug Depends On: | |||||||||||||||
Bug Blocks: | 756082 | ||||||||||||||
Attachments: |
|
Description
Jay Turner
2010-08-23 14:31:42 UTC
Can you attach /var/lib/dhclient/dhclient-*eth0.lease /var/lib/dhclient/dhclient-*wlan0.lease and the packet capture from wireshark. thanks You can also attach - the relevant parts of the log file: cat /var/log/messages | grep dhclient - output of 'ip a' - ifcfg-* files from /etc/sysconfig/network-scripts I'm asking for all those files because I'm not able to reproduce it at the moment. I have machine with 3 interfaces, attached to 3 networks. All have dynamically assigned addresses via dhcpd, each with different renewal time. During renewal they are all sending DHCPREQUEST to correct dhcpd server address via correct (looking via wireshark) interface. ... Aug 23 17:40:59 dhcp-24-166 dhclient: DHCPREQUEST on eth1 to 192.168.0.217 port 67 Aug 23 17:40:59 dhcp-24-166 dhclient: DHCPACK from 192.168.0.217 Aug 23 17:40:59 dhcp-24-166 dhclient: bound to 192.168.0.10 -- renewal in 181 seconds. Aug 23 17:41:43 dhcp-24-166 dhclient: DHCPREQUEST on eth2 to 192.168.1.217 port 67 Aug 23 17:41:43 dhcp-24-166 dhclient: DHCPACK from 192.168.1.217 Aug 23 17:41:43 dhcp-24-166 dhclient: bound to 192.168.1.111 -- renewal in 602 seconds. ... I wonder whether this has something to do with your wireless interface (wlan0). Created attachment 440420 [details]
Wireshark capture
DHCP packets during failed attempts to renew the lease on the wlan0 interface.
Created attachment 440421 [details]
eth0 lease file
Created attachment 440422 [details]
wlan0 lease file
There are 2 other wlan0 lease files, but this is the one currently being utilized:
root 3283 0.0 0.0 9004 1448 ? S 07:38 0:00 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-9ddebcda-4e51-4623-b8fa-6d3dd7cabc7f-wlan0.lease -cf /var/run/nm-dhclient-wlan0.conf wlan0
Created attachment 440423 [details]
'cat /var/log/messages | grep dhclient'
# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:1d:ba:aa:ab:ca brd ff:ff:ff:ff:ff:ff inet 10.11.231.238/22 brd 10.11.231.255 scope global eth0 inet6 fe80::21d:baff:feaa:abca/64 scope link valid_lft forever preferred_lft forever 3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000 link/ether 00:21:5d:eb:42:60 brd ff:ff:ff:ff:ff:ff inet 10.11.4.245/22 brd 10.11.7.255 scope global wlan0 inet6 fe80::221:5dff:feeb:4260/64 scope link valid_lft forever preferred_lft forever 4: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN link/ether f2:83:d2:70:81:f3 brd ff:ff:ff:ff:ff:ff # cat /etc/sysconfig/network-scripts/ifcfg-eth0 # Marvell Technology Group Ltd. 88E8055 PCI-E Gigabit Ethernet Controller DEVICE=eth0 BOOTPROTO=dhcp HWADDR=00:1D:BA:AA:AB:CA ONBOOT=yes There is not an ifcfg file for wlan0 on this machine. Great. Thank you. I would like to see what happens when the wlan0 gets (not RENEW, but INIT or INIT-REBOOT state). From the log it seems that the 1.1.1.1 address of the dhcp server was told the client from outside. You are using NetworkManager, don't you? So 1) Click on NM applet, click on Disconnect below wlan0 2) Empty the wlan0's lease file '/var/lib/dhclient/dhclient-9ddebcda-4e51-4623-b8fa-6d3dd7cabc7f-wlan0.lease' 3) Start wireshark to see what happens on wlan0 4) 'Turn on' the wlan0 by clicking on wlan0 below --Available-- 5) Attach the packet capture here thanks Something else interesting to note is that after some period of time of looking for a dhcp reply from 1.1.1.1 (via eth0), the processes times out and instead sends a request to 255.255.255.255 . . . according to wireshark, that request does indeed go out via wlan0. Very odd. Created attachment 440443 [details]
wireshark capture of INIT process for wlan0
Some additional information which will hopefully help narrow things further. During renewal, the first part of this if/else clause is getting executed: if (destination.sin_addr.s_addr != INADDR_BROADCAST && fallback_interface) result = send_packet (fallback_interface, (struct packet *)0, &client -> packet, client -> packet_length, from, &destination, (struct hardware *)0); else /* Send out a packet. */ result = send_packet (client -> interface, (struct packet *)0, &client -> packet, client -> packet_length, from, &destination, (struct hardware *)0); So the renewal is going out via fallback_interface. I am not familiar enough with dhcp internals to know if this is supposed to be happening and if so, what determines the fallback interface. (In reply to comment #9) > Something else interesting to note is that after some period of time of looking > for a dhcp reply from 1.1.1.1 (via eth0), the processes times out and instead > sends a request to 255.255.255.255 . . . according to wireshark, that request > does indeed go out via wlan0. Very odd. That's correct. In RENEWING state, client periodically *unicasts* a DHCPREQUEST message to the server (1.1.1.1 in our case) that assigned it its lease, requesting an extension. If seven-eighths of the lease time expires and the client still receives no response, the client enters the REBINDING state and begins *broadcasting* DHCPREQUEST messages to locate some other DHCP server that will renew its lease. I guess in that REBINDING state client finally renews it's lease for wlan0. Am I right? In that case I would rather move this BZ to rhel-6.1.0 because (in that case) I don't see it as a blocker and don't want to break anything at the very last moment. Thanks for that code snippet. I admit I don't have any idea why the renewal is going out via fallback_interface. I'll continue to investigate this and probably ask upstream whether there are any known consequences. I knew about the unicast/multicast timeout. The part I found odd was that the multicast went out the correct device. Anyway, yes, once the multicast occurs, the interface's lease is renewed and the cycle starts again, but depending on the lease length, that is quite a bit of failed traffic on the interface. Per comment 15, moving to 6.1 Jay, can you show ('ip route show') me your IP routing table when both interfaces are up? thanks Jiri, this occurred to me the other day and I failed to make a note of it in the bug. I think it explains why the request goes out the "wrong" interface. I am not in the office today so will have to wait to get you the routing table from there, but here at the house, my dhcp server is 192.168.0.1. When I have both wired and wireless up, the default route is out via eth0 (just as at the office where 1.1.1.1 hits the default route which is eth0 as opposed to wlan0.) That having been said, in the case of my home office, the same dhcp server is serving both wired and wireless, so not sure why the unicast address would fail and then the multicast request (through the exactly same interface) would succeed. Dhcp uses raw sockets ("LPF" - linux packet filter) for receiving and transmitting local subnet broadcast DHCP packets and a normal BSD socket (the 'fallback interface') for any unicast traffic that needs to be routed. This 'fallback interface' is a standard berkeley UDP socket, so the destination mac address and interface to go through is determined by the operating system. Information how to send a packet to reach the destination (server) is stored in routing table. And I think the "problem" here's the missing info in your routing table. One think I still don't understand is the weird 1.1.1.1 address of the dhcp server. Looking into packet captures I know that the wlan0 (10.11.4.254) is connected to 10.11.4.0/22 network. The dhcp server seems to be directly on that network because the offers and acks come directly from 1.1.1.1 and there's no relay agent address filled. But how can a host with 1.1.1.1 address be on 10.11.4.0/22 network ? If it's really on 10.11.4.0/22 I think it should have 10.11.4.1 address or so. Anyway, this dhcp server sends (in the lease for client) address (10.11.7.254) of default router, but the client is not using it. We know that because when it wants to send packet directly to 1.1.1.1 it sends it via eth0, i.e. to router (it's address can be found in your routing table) on network (10.11.228.0/22) to which the eth0 is connected. The reason why it is so is I think in the fact that when interface gets (or renews/rebinds) it's address with default router, this router address is stored in routing table. When another interface gets it's address with router address (different than the first router) it overwrites the default router in routing table. And so it changes whenever some interface rebinds it's address. I'm still a network newbie but I think this has no solution. The traffic just goes out through the default router (i.e. through the interface which is connected to the same network as the router). And when you are connected to more networks and each have different default router, that the system cannot say which one is the 'really default'. I will poke around at this some more next week when I am actually in the office. The 1.1.1.1 thing confuses me as well. Also interesting to note that at this time only wireless appears to be impacted by this issue. I have a workstation at the office where one NIC is on rdu.redhat.com (eth0 - primary) while another is on test.redhat.com (eth1 - secondary.) While I do see the eth1 renewal requests go out via eth0, they are satisfied and the things continue normally. This could very well be something specific to my laptop. I will talk to some other folks in the RDU office and see if they are experiencing anything similar. Are there any news, Jay ? Sorry for the delay in responding. I am still seeing this behavior (at least as of a few days ago.) I have not had a chance to try and reproduce on another laptop. All I know is in comment #19. The only idea I have is that NetworkManager may have anything to do with this, because from log in comment #6 it seems that both interfaces are NM controlled. So let's pass this to NM for now. Dan, Jirka: can you look at this, maybe you know what's going on here. And if not, reassign it back to dhcp. Thank you. I can't think of anything specific here. Though perhaps NetworkManager crashed and left a dhclient process running which is now continually trying to renew the address on the interface? If you run into this again please do a "ps aux" and filter for (NetworkManager || dhclient). If we can get the logs mentioned in comment 28, please re-open. |