Created attachment 368985 [details] syslog from a normal boot

Description of problem:
A test of the current rawhide anaconda images, which use 2.6.31.5-127.fc12.i686, shows that on my test hardware the installation makes progress only if I keep pounding on the keyboard to get things "unstuck". This was definitely not the case with images from the end of October, when the 2.6.31.5-96.fc12.i686 kernel was used. Moreover, a "keyboard beep" becomes a siren a few seconds long. An attempt to reboot gets stuck on "waiting for mdraid sets to become clean", and at that point any further progress becomes impossible.

In syslog one will find

<4>Fast TSC calibration failed
<6>TSC: PIT calibration matches PMTIMER. 1 loops

instead of

<4>Fast TSC calibration using PIT

With 'nohz=off' added to the boot options these nasties go away.

Version-Release number of selected component (if applicable):
anaconda 12.46
2009-11-10 images

How reproducible:
every time

Additional info:
Acer TravelMate 230 laptop used in testing
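For anyone comparing syslogs from several boots, the check described above can be sketched in Python. This is only an illustration: the function name `tsc_calibration_status` is hypothetical, the message strings are taken from the report, and whether the calibration result is actually related to the hang is not established in this bug.

```python
import re

# Message texts as quoted in the report (the "<N>" prefix is the log priority).
FAST_OK = "Fast TSC calibration using PIT"
FAST_FAILED = "Fast TSC calibration failed"
PIT_FALLBACK = re.compile(r"TSC: PIT calibration matches PMTIMER\. (\d+) loops")


def tsc_calibration_status(log_text):
    """Classify a syslog/dmesg dump by its TSC calibration messages.

    Returns a (status, loops) tuple where status is one of
    "fast", "fallback" or "unknown". This is a hypothetical helper,
    not part of any Fedora or kernel tool.
    """
    if FAST_FAILED in log_text:
        # The fast path failed; look for the slower PIT/PMTIMER calibration.
        match = PIT_FALLBACK.search(log_text)
        loops = int(match.group(1)) if match else None
        return ("fallback", loops)
    if FAST_OK in log_text:
        return ("fast", None)
    return ("unknown", None)
```

Feeding it the text of each attached syslog gives a quick way to see which boots took the fallback path.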
Created attachment 368986 [details] syslog from a boot with added 'nohz=off'
These messages likely have nothing to do with any problem in anaconda, and are more likely caused by a change in kernel versions. Reassigning.
can you please provide the same log from the last working kernel for comparison? thanks. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
Vedran: why did you silently change my settings on this bug? mjg has told me in the past that he is interested in timer bugs and wishes them to be assigned to him, it _is_ a regression as far as the user's concerned (it did not happen in kernel 96). severity is debatable, but I don't see that you have the right to override my call with no justification.
> can you please provide the same log from the last working kernel for
> comparison?

Not from the same hardware, unfortunately. It will likely look similar to what you see with 'nohz=off', which is attached. The attached outputs come from testing anaconda installation images, which I try to do from time to time. The laptop in question currently still runs Fedora 10 (where 'clocksource=jiffies' is required or you are practically a goner; cf. the still-NEW bug 476609). I overwrote the older images, and they are no longer on the mirrors either.

As for a "debatable severity", this is not a very big deal for _me_. It is likely a different story for a "newbie" attempting to install on similarly affected hardware. At the end of October this "just worked".
output from different hardware is worthless. I wanted to see the log from the affected system with -96 because it may _not_ be identical to the log with nohz=off ; it's not like we suddenly enabled the tickless timer between -96 and -127 or anything, so I want to know what _else_ has changed which has suddenly exposed this issue on your machine. The logs might tell us that.
> output from different hardware is worthless.

Well, yes. So I did not provide it. :-)

> ... it may _not_ be identical ...

I agree, only I do not have a way to provide it. Still, both kernels were 2.6.31.5-<something>. I know that for sure because I happen to have logs from different hardware, so this is a reference point.
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
I just got some .iso files and found that 2.6.31.5-127.fc12, which has this bug, was used on the distribution images for Fedora 12. Oh, great! Some people will have extra "fun" when trying installs/updates. http://fedoraproject.org/wiki/Bugs/Common does not mention the issue.
well, yeah, we'd already decided on the final package set when you reported this, it was far too late to change anything. this isn't a 'common' bug, since it's precisely hardware-specific, and it's not considered a blocker (there have been issues of this kind with every release since the tickless timer was enabled by default).
Current Fedora 12 kernels, and 2.6.31.6-166.fc12.i686 in particular, suffer from the same affliction. The need for 'nohz=off' replaces the requirement for 'clocksource=jiffies' with Fedora 10 kernels (cf. bug 476609).
I have encountered the same bug on a Dell Optiplex 760 PC. The computer seems to be running OK with the nohz=off kernel argument.
aram: please file a new bug. Each instance of this problem is hardware-specific and needs to be tracked separately.
I checked whether it is possible to drop nohz=off with the latest updates kernel, kernel-2.6.32.9-67.fc12.i686, on the hardware from this report (Acer TravelMate 230 notebook). If anything, it got even worse. The boot moved forward until the initramfs was loaded. At that moment everything stopped and nothing happened for many minutes, until I lost patience and powered down the whole thing. With nohz=off, luckily, it does boot.
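When comparing boots with and without the workaround, it helps to confirm which options the running kernel was actually started with. A minimal Python sketch follows; the helper names `boot_option` and `read_cmdline` are hypothetical, and treating the last occurrence of a repeated option as the effective one is an assumption of this sketch. /proc/cmdline is the standard place where Linux exposes the boot command line.

```python
def boot_option(cmdline, name):
    """Return the value of a `name=value` token on a kernel command line.

    Returns None if the option is absent. If the option is repeated,
    the last occurrence is returned (an assumption of this sketch).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel."""
    with open(path) as f:
        return f.read().strip()
```

On a running system, `boot_option(read_cmdline(), "nohz")` returns the string "off" when the workaround is active and None when the option was not given.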
The other workaround is to hit enter a few times when it seems to freeze, as discovered by someone on the forums. http://forums.fedoraforum.org/showthread.php?t=242122 While, in theory, a relatively unimportant bug, it's already driven one person away on the testing list. In my case, it only affected one machine, an Acer 4720z with an integrated Intel Mobile GM965/960, the same or nearly the same card as the person on the forums.
Doesn't have to be enter, any key will do (I use space bar). The graphics card has nothing to do with it. Please file a new bug for each system affected by this problem, the fix can be different for every system even if the symptom is the same.
(In reply to comment #15)
> The other workaround is to hit enter a few times when it seems to freeze, as
> discovered by someone on the forums.

It does not help very much on my machine. I can move an installation along if I constantly generate keyboard interrupts, but it would take a few days to complete that way, the results would be unusable, and the system clock would be entirely on the wild side. In any case, as noted in comment #14, kernel-2.6.32.9-67.fc12.i686 does not improve the situation. Without 'nohz=off' my Acer simply does not boot; it gets stuck inside the initramfs.
'nohz=off' is still required with current Fedora 13 kernels, more specifically 2.6.33.6-147.2.4.fc13.i686. Strictly speaking, it is now sometimes possible to "somewhat" boot without this parameter, even without pounding on the keyboard too much, but then X will not start, or the machine will decide that it is overheating and shut off, or both, or something else of that sort. Even without an automatic shutdown, one can hear the fans from time to time trying to commit suicide by over-revving. In any case the boot is not reliable and is prone to hangs in udev. Nothing of that sort happens if 'nohz=off' is used.

The following fragment from dmesg (from a boot with 'nohz=off') is possibly related:

ACPI: Core revision 20091214
Enabling APIC mode: Flat. Using 1 I/O APICs
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 0) ...
....... works.

I am afraid that I already have the latest BIOS I was able to find for that particular mobo.
i'm starting to wonder if we need to file these upstream or pay someone to look at them or something, they don't seem to get any traction :( mine's been open for ages now.
(In reply to comment #19)
> i'm starting to wonder if we need to file these upstream or pay someone to look
> at them or something, they don't seem to get any traction :( mine's been open
> for ages now.

Which bug is that?
https://bugzilla.redhat.com/show_bug.cgi?id=516870
Created attachment 439524 [details] dmesg from boot of 2.6.33.6-147.2.4.fc13.i686 without 'nohz=off'

The dmesg output here is from a boot to runlevel 1. When trying to boot to runlevel 3 or 5, something invariably gets firmly stuck and the boot never finishes. During these attempts the fans hit maximum revs, quickly and intermittently, even while still booting, and a "scientific" test of holding fingers behind the fan exhaust indicates that really hot air is indeed being expelled. That never happens when 'nohz=off' is used. Even if the laptop does boot that way, shutting it down cleanly is really difficult and likely impossible without quite a few extra keyboard interrupts.

/sys/devices/system/clocksource/clocksource0/{available,current}_clocksource both give 'acpi_pm', regardless of whether the machine was booted with or without 'nohz=off'.
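The two sysfs files mentioned above can also be read programmatically, which is convenient when collecting the same data from several boots or machines. The following Python sketch assumes only the sysfs path quoted in this comment; the function name `read_clocksources` and its `base` parameter are hypothetical and exist so the directory can be overridden.

```python
import os

# Directory quoted in the comment above.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (available, current) clocksources read from sysfs.

    `available` is a list of names, `current` is a single name or None.
    Hypothetical helper for illustration only.
    """
    def read_words(name):
        with open(os.path.join(base, name)) as f:
            return f.read().split()

    available = read_words("available_clocksource")
    current = read_words("current_clocksource")
    return available, (current[0] if current else None)
```

On the laptop from this report the expected result, per the comment above, would be a single available source that is also the current one.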
Can you try 2.6.36-rc2 from rawhide?
(In reply to comment #23)
> Can you try 2.6.36-rc2 from rawhide?

In three attempts to boot 2.6.36-0.7.rc2.git0.fc15.i686 without 'nohz=off', two got stuck early, in "Starting udev", and one later, in "Retrigger failed udev events". In that latter case it was possible to force progress with keyboard interrupts (although the start of various services was really slow). A 'reboot', in the one case when I reached a shell prompt, did not really move anywhere without constant "help" from the keyboard, and it actually powered the laptop down instead of restarting it. dmesg from the one case when the boot finished is attached (although the only notable thing seems to be the infamous "rcu_dereference_check() without protection").

Booting the same kernel with 'nohz=off' does not show any of the symptoms above, and 'reboot' really acts as a reboot.

Does plymouth have different requirements for 2.6.36? With 'rhgb quiet' the graphics background is successively erased by what looks like scrolling blocks of black-on-black text. It is just cosmetic, but it does not happen when booting a "regular" 2.6.33.6-147.2.4.fc13.i686.
Created attachment 440409 [details] dmesg from 2.6.36-0.7.rc2.git0.fc15.i686 without 'nohz=off'
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
(In reply to comment #26)
> This message is a reminder that Fedora 12 is nearing its end of life.

Well, comment 24 described problems with 2.6.36-0.7.rc2.git0.fc15.i686. The last "release" kernel I had an opportunity to try was 2.6.33.6-147.2.4.fc13.i686, and it required 'nohz=off' or weird things happened. The machine that displayed this behaviour is currently "out in the field" and should return at the end of December.
I had been using nohz=off on a Dell Optiplex 760 PC since 2010 but it no longer prevented stuttering boots with a new EL6.3 kernel. I found that the problem went away after I updated the BIOS. nohz=off is no longer required on that computer.
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19