Description of problem:
I have a fresh install of rh9 from official media. after the install, at the console, I ran 'up2date -u -f' as root. this started the process of updating the system with the latest rpms. after approximately 20 minutes, the power LED on the case of the machine begins to flash as though APM has kicked in. shortly after that, right in the middle of up2date, the machine becomes unresponsive: the keyboard does not respond, and it is unpingable from the LAN. I have to hit the reset button on the machine.

Version-Release number of selected component (if applicable):
kernel-2.4.20-20.9

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
thinking that APM might be the issue, I rebooted the machine after a crash and went to bed. when I got up in the morning, everything was fine, but once I started heavy activity on the machine, the green light on the case began to flash, and the machine locked up again.

I rebooted again, disabled the APM features in the BIOS, and started up2date. again, I had the same lockup problem. I altered the grub.conf file to pass "apm=off" on the kernel command line, and tried again. again, after a short time, the machine locked up.

knowing that I had a short window within which to operate, I rebooted again and this time ran 'up2date -u -f kernel'. this update was successful, and I rebooted the machine via the LAN. I thought all was well. I logged in at the console and again ran 'up2date -u', hoping the updated kernel would resolve the issue, but alas, it locked up while doing something with glibc-common. I am not sure if it was 'installing' at the time or not, but hopefully since I turned on transactions it will be fine.

I have done some testing, and if I do not run 'up2date -u', but rather stay logged in remotely and hit return at a shell prompt now and again (i.e., machine quite idle), it is stable. the green light does not flash on the box, and all seems well.
it has been up for over an hour now without an issue.

I suspect APM because the green light on the case flashes. I suspect the kernel because this only happens when the machine is under load (load average above 1.00). note that when the green light flashes on the box, the console keyboard is not responsive, nor is the network interface, yet the BIOS is responsive enough to reboot if I hit the reset button.

well, the machine just now rebooted due to a power fluctuation, so the uptime is a little over an hour without an issue. if I were to run up2date, it would take only about 20 minutes for the machine to lock up.
I thought I had solved this problem by installing up2date-gnome, since it seemed to proceed normally for quite a while.. but it locked up eventually :(

it seems that it *always* locks up while running up2date. it seems I can do anything else I want on the machine without trouble, but if I run up2date, the machine goes down soon after, usually midway through an install or something, which is annoying. the machine is not really doing anything all that amazing right now, as it just runs a DNS cache for the LAN.. so I can reboot it a lot, and it doesn't really impact public services.

[..much time passes..]

ok, quark is up and seems stable. I basically upgraded the system in a series of transactions. I upgraded glibc and glibc-common, then proceeded to start a new transaction that would upgrade everything else. it crashed during this 2nd phase, but when I rebooted again and ran up2date via strace, I had no problems, and it upgraded everything as it was supposed to.

it would be nice to know how exactly the machine could have been put into sleep mode (or whatever mode it was in) even though I have APM turned off in the BIOS, and passed 'apm=off' on the command line.

bottom line is that my machine is stable now (or at least it seems to be), but I'd still like to know where/why up2date seems to lock up, especially since it is a total "denial of service".. I had quick access to the reset button, fortunately, but what if I hadn't?

note that I am changing the component for this bug to 'up2date', when what I really want is to select both, since I suspect it is some sort of odd interaction.
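One way to narrow down the APM question is to confirm that 'apm=off' actually reached the running kernel and that the APM driver did not register. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that an active APM driver exposes /proc/apm and that the boot parameters are readable from /proc/cmdline. It says nothing about power management done by the BIOS itself, which the kernel parameter does not control.

```shell
#!/bin/sh
# Sketch: report whether the running kernel was booted with apm=off and
# whether /proc/apm exists. Function names are hypothetical.

# cmdline_has_apm_off CMDLINE
# Prints "yes" if CMDLINE contains the standalone parameter apm=off, else "no".
cmdline_has_apm_off() {
    case " $1 " in
        *" apm=off "*) echo yes ;;
        *)             echo no ;;
    esac
}

# report_apm_state [CMDLINE_FILE] [APM_FILE]
# Defaults to /proc/cmdline and /proc/apm; other paths can be passed for testing.
report_apm_state() {
    cmdline_file=${1:-/proc/cmdline}
    apm_file=${2:-/proc/apm}

    if [ -r "$cmdline_file" ]; then
        echo "apm=off on kernel command line: $(cmdline_has_apm_off "$(cat "$cmdline_file")")"
    else
        echo "cannot read $cmdline_file"
    fi

    # Assumption: the APM driver creates this file only when it is active.
    if [ -e "$apm_file" ]; then
        echo "$apm_file present: APM driver appears to be active"
    else
        echo "$apm_file absent: APM driver does not appear to be active"
    fi
}

report_apm_state
```

If this reports apm=off and no /proc/apm while the LED still flashes under load, the kernel's APM driver is probably not what is suspending the machine.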
[~] [9:02pm] [quark] % cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 5
model           : 8
model name      : AMD-K6(tm) 3D processor
stepping        : 12
cpu MHz         : 350.802
cache size      : 64 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow k6_mtrr
bogomips        : 699.59
What do you mean by the "machine just rebooted due to a power fluctuation"? If the machine just rebooted out-of-the-blue, it could be some kind of power supply or motherboard defect that could also be what's causing the up2date lockups. stracing up2date might have changed timings enough to prevent you from hitting whatever hardware problem is causing the lockup. (I've seen that kind of thing with hardware failures before.) I'm not saying I'm 100% confident it's hardware, but when I've seen machines act like this, it's usually if not always hardware... (And if the power is really fluctuating to the point that you can see it in any nearby lights or the like, and the machine rebooted at the same time, it would be a good idea to put it on a UPS.)
very simply put, I have a UPS, but I have *way* too much stuff plugged into it, and I sometimes see "spontaneous" reboots of several machines connected to it. I do not think that is causing my problem, though. note that quark has been up and stable for 24 hours now, so I am tempted to close this bug.
quark is one of my oldest machines, so I am not surprised it has issues. if it stays up for another 24 hours, I'll close this.
either way, this would be a kernel bug, not an up2date one (if not just a hardware issue). reassigning to kernel.
well, I don't know for sure what the heck the problem is, but quark has been stable for quite some time.. it has rebooted a couple of times, so the uptime is only three days, but I have not noticed the same symptoms that I opened this ticket about. I agree that this is a kernel issue, not an up2date one. I feel strongly that the power fluctuations and "bad hardware" are not the issue, especially since the machine has been fine for weeks now.. then again, I'm not exactly pushing quark above 1.00, so who knows? would it be helpful if I attached dmesg output, or other stuff? I suppose I could go out of my way to run some sort of stress on the machine and see if I can get it to break, but I'd rather not :)
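If a deliberate stress run is ever attempted, it may help to record the load average to disk while it runs, so that the last sample before a lockup survives the reset. The sketch below is only an illustration under stated assumptions: the function names and the log path are hypothetical, it reads /proc/loadavg, and it calls sync after every sample so the entry is not lost in the page cache.

```shell
#!/bin/sh
# Sketch: append timestamped 1-minute load averages to a log file,
# flushing to disk after each sample. Function names are hypothetical.

# load1_from LINE
# Prints the first field of a /proc/loadavg line (the 1-minute load average).
load1_from() {
    echo "${1%% *}"
}

# log_load LOGFILE INTERVAL_SECONDS COUNT
# Writes COUNT samples to LOGFILE, sleeping INTERVAL_SECONDS between them.
log_load() {
    logfile=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ -r /proc/loadavg ]; then
            line=$(cat /proc/loadavg)
        else
            line="unavailable"
        fi
        echo "$(date '+%Y-%m-%d %H:%M:%S') load1=$(load1_from "$line")" >> "$logfile"
        sync    # push the sample to disk before the next one
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example (hypothetical path): one sample every 10 seconds for an hour.
# log_load /var/tmp/loadavg.log 10 360
```

Started in the background before the heavy job, this leaves a record of how high the load got and when the samples stopped, which is the kind of detail that could be attached alongside dmesg output.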
I'm not sure it is related, but I have a rather similar problem. If I let my laptop idle while running GNOME, the hard-drive LED will start blinking, my keyboard stops working, and I have to reset the machine... It only happens when I'm using GNOME, though. I tried to reproduce it by running just TWM and XMMS, but it didn't happen. So I guess it only happens when the CPU is heavily used (Evolution + Rhythmbox + GNOME use more CPU than TWM + XMMS). I have reproduced the problem using rhythmbox-xine, rhythmbox-gstreamer, or xmms, and with OSS or ALSA.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/