Bug 2474
Summary: | pump's CPU usage shoots up | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | ecarter |
Component: | pump | Assignee: | Erik Troan <ewt> |
Status: | CLOSED RAWHIDE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.0 | CC: | duanev, gregpublic, kanellis, maurice |
Hardware: | i386 | ||
OS: | Linux | ||
Doc Type: | Bug Fix | ||
Last Closed: | 2000-02-03 15:42:47 UTC | ||
Description
ecarter
1999-05-01 21:02:19 UTC
How I fixed my system: First I had to make ifup and ifdown in /etc/sysconfig/network-scripts use dhcpcd instead of pump when DHCP is being used. Also, the version of dhcpcd included in RH 6 is broken, so I upgraded that package. Patches for ifup and ifdown and the dhcpcd rpm I used are on anonymous ftp at 150.135.194.252.

Note: I am not the only one who is going to be having problems using pump instead of dhcpcd. Cox cable modems in Phoenix, AZ require that the DHCP client use a hostname parameter. It does not appear that pump has this capability.

*** Bug 504 has been marked as a duplicate of this bug. ***

Tried a DHCPCD connection by cable modem using DHCPCD V.70 as included in Red Hat 5.2 AXP. Failed. Tried with V.65-4 from 5.1 and it works just dandy.

------- Additional Comments From dkl 12/29/98 17:15 -------
We would need to get a cable modem in to accurately test this bug. We will reopen it when we get one.

This issue has been forwarded to a developer for further consideration.

Have you tested it with the latest pump from the errata? I fixed something similar to this...

If you still see this, could you start pump, attach an strace to it, and see what happens when the CPU usage goes crazy? I'd like to see what strace says.

------- Email Received From Edward Carter <ecarter.edu> 06/24/99 02:23 -------
My Cox at Home cable modem needs my machine name on all DHCP inquiries. So when pump requests a renewal, it goes into an endless cycle. (These notes are against the latest pump from the errata, pump-0.6.7-1.i386.rpm.) So I added the BOOTP_OPTION_HOSTNAME option to the following three queries: DHCP_TYPE_DISCOVER, DHCP_TYPE_REQUEST and DHCP_TYPE_RELEASE, similar to the change made to DHCP_TYPE_OFFER.
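For illustration only: a minimal sketch of what adding a host name option to an outgoing DHCP message involves, assuming the standard option encoding from RFC 2132 (option code 12, one length byte, then the name bytes). The function name `append_hostname_option`, the constant `OPT_HOSTNAME` and the buffer layout are hypothetical and are not taken from pump's source; pump's BOOTP_OPTION_HOSTNAME presumably refers to the same option code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host Name option code from RFC 2132; assumed to match BOOTP_OPTION_HOSTNAME. */
#define OPT_HOSTNAME 12

/*
 * Append a host name option (code, length, name bytes) at offset `used`
 * in `opts`, which has room for `cap` bytes in total.
 * Returns the new used length, or 0 if the name is empty, longer than
 * 255 bytes, or does not fit in the remaining space.
 */
static size_t append_hostname_option(unsigned char *opts, size_t used,
                                     size_t cap, const char *hostname)
{
    size_t n;

    assert(opts != NULL && hostname != NULL);
    n = strlen(hostname);
    if (n == 0 || n > 255)
        return 0;
    if (used > cap || cap - used < n + 2)
        return 0;

    opts[used++] = OPT_HOSTNAME;        /* option code */
    opts[used++] = (unsigned char)n;    /* length of the name, no NUL */
    memcpy(opts + used, hostname, n);   /* name bytes */
    return used + n;
}
```

A caller would invoke this once for every message type that must carry the name (discover, request and release in the report above) before writing the end-of-options marker.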
The following two changes are also needed:

Machine name added to ifcfg-eth1: HOST=xx99999-x

Use the HOST macro in /sbin/ifup: if /sbin/pump -i $DEVICE -h $HOST

Also, there's an overflow in handleTransaction(), at the nextTimeout *= 2 line; changed it to: nextTimeout = (NUM_RETRIES-tries)*2;

Finally, there's a memory leak in dhcpRelease(): free( intf->hostname ); free( intf->domain );

How much strace data would you like? I got 28M of it. I've placed the first 7000 lines or so in a file at: ftp://ftp.io.com/pub/usr/duanev/pump.txt This is with the errata'ed version from Red Hat (pump-0.7.0 is what the directory says - I wish there was a version number *inside* the .c file - then I could be more sure). ls -l gives: 45295 Jul 27 17:58 pump.c

Hey! My date is way off! 5 hours slow to be exact. And something in cron keeps setting it way off. I've been meaning to fix it but it hasn't been a priority until now. I'll bet that when the time jumps around pump can think brand new leases are up immediately. Adding test code to check this and force lease renewals in the main loop to be at least 30 minutes apart. I'll keep y'all informed.

*** Bug 4306 has been marked as a duplicate of this bug. ***

Pump works OK when I first boot. But hours later "top" reports that it's eating up all my CPU and system resources. I have upgraded to the latest version and have the same results. I am connecting to a 1Mbit ISP called Sympatico in Toronto, Canada.

------- Additional Comments From duanev 08/25/99 09:58 -------
probably a duplicate of # 2474

Progress report (or lack thereof). This sucker is difficult to recreate. I haven't been able to cause pump to hang by manually changing the system date (and I deliberately have not fixed my jumping date problem yet). But what I do know is the pump problem occurs when pump fails to wake up in time to renew the lease. By then my ISP has revoked my IP number and all subsequent renew operations fail. Pump then stupidly sits in a loop forever trying to renew anyway.
So three things to fix once all is known:

1) pump should react correctly on renew failures (bring down the interface and fire it up with a new lease maybe?),
2) to make it more robust pump should have a backup plan for catching lease expiration (like waking up more often to see if the "state" of the system has changed), and
3) fix why pump is occasionally missing the expiration time.

Ok, I've got a patch for #3, it seems stable enough to ignore #2, and I'll leave #1 up to someone who knows the DHCP protocol better than I do.

I know about elapsed time vs. wall-clock time and that was the problem here. There are two very different types of time on computer systems - but most everyone thinks there is only one.

Wall-clock time is what you get with date(1) and time(2). It needs to track GMT, adjusted by your local offset, and is expected to speed up or slow down to sync with whatever external synchronization source you choose. This can be a GPS reference via ntpdate or it can be your watch via your fingers and date(1), but it is expected that this time WILL JUMP periodically to match the source.

Elapsed time is the precise number of seconds (or milliseconds) that have elapsed since some time in the past when your app started its timer. This time is expected to REMAIN STABLE no matter if the clock crystals in the computer drift or not.

These two applications don't mix: elapsed-time people hate the jumps in wall-clock time, and wall-clock-time people hate the drift in elapsed time. pump was using wall-clock time but needed to use elapsed time. When the date jumped (wildly on my system) pump would occasionally botch the lease renewal and lose the IP. Then it would try forever to renew the lease to no avail.

ftp://ftp.io.com/pub/usr/duanev/pump-0.7.0-djv.patch

I've also added a fair number of new debug messages and made the --status output easier to awk or grep|cut. (Everyone is going to use pump to extract the dynamic IP address, right?) Take what you need, dump the rest.
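For illustration only: a sketch of timing a lease against elapsed time rather than wall-clock time, assuming a present-day POSIX system that provides `clock_gettime` with `CLOCK_MONOTONIC`. This is not what the patch above does (its contents are not shown in this report), and the names `monotonic_seconds`, `lease_timer`, `lease_timer_start` and `lease_renewal_due` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Seconds on a clock that setting the date with date(1) does not move.
 * Returns -1 if the clock cannot be read.
 */
static long monotonic_seconds(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long)ts.tv_sec;
}

/* Lease timing kept entirely in elapsed (monotonic) seconds. */
struct lease_timer {
    long start;        /* monotonic time when the lease was obtained */
    long renew_after;  /* seconds after `start` at which to renew */
};

static void lease_timer_start(struct lease_timer *t, long now,
                              long renew_after_secs)
{
    assert(t != NULL);
    t->start = now;
    t->renew_after = renew_after_secs;
}

/* Nonzero once at least `renew_after` seconds have elapsed since `start`. */
static int lease_renewal_due(const struct lease_timer *t, long now)
{
    assert(t != NULL);
    return now - t->start >= t->renew_after;
}
```

A main loop would call `lease_timer_start(&t, monotonic_seconds(), renew_secs)` when a lease arrives and later poll `lease_renewal_due(&t, monotonic_seconds())`. Because both readings come from the same monotonic clock, a five-hour jump in the system date does not change the result.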
I see a similar thing if the network goes completely down and thus pump is not able to renew the lease. pump-0.7.2-2. Also, pump stops logging after a while. Restarting syslog does not help. Restarting pump does.

This should all be fixed in pump 0.7.6, which will make it onto our ftp site next week. I'll have this on ftp://people.redhat.com/ewt/ later this afternoon for testing.
Thanks for the good comments and the patch; sorry it took so long to get this taken care of.