|Summary:||CPU lockup during boot|
|Product:||[Fedora] Fedora||Reporter:||Bruno Wolff III <bruno>|
|Component:||kernel||Assignee:||Kernel Maintainer List <kernel-maint>|
|Status:||CLOSED RAWHIDE||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||rawhide||CC:||aquini, awilliam, bruno, gansalmon, itamar, jistone, jonathan, kernel-maint, loganjerry, madhu.chinakonda, redhat, tflink|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2011-07-22 22:04:19 UTC||Type:||---|
Description Bruno Wolff III 2011-06-19 14:49:13 UTC
Comment 1 Bruno Wolff III 2011-06-19 14:50:26 UTC
Created attachment 505470 [details] /proc/cpuinfo
Comment 2 Bruno Wolff III 2011-06-19 14:51:51 UTC
Created attachment 505471 [details] lspci -vvv output
Comment 3 Bruno Wolff III 2011-06-19 15:03:12 UTC
I have also filed a kernel.org bug for this issue: https://bugzilla.kernel.org/show_bug.cgi?id=37872
Comment 4 Bruno Wolff III 2011-06-29 11:48:33 UTC
This is still happening with kernel-PAE-3.0-0.rc5.git0.1.fc16.i686.
Comment 5 Bruno Wolff III 2011-07-11 13:36:22 UTC
I am still seeing this with kernel-PAE-3.0-0.rc6.git6.1.fc16.i686. I am proposing as an alpha blocker, but it may be that the set of hardware affected is small.
Comment 6 Bruno Wolff III 2011-07-11 14:45:07 UTC
There is a suggested patch to try in the kernel bugtracker.
Comment 7 Bruno Wolff III 2011-07-11 14:51:04 UTC
Here is the patch that might fix things:

--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -7750,6 +7750,9 @@ static void init_cfs_rq(struct cfs_rq *c
 #endif
 #endif
 	cfs_rq->min_vruntime = (u64)(-(1LL << 20));
+#ifndef CONFIG_64BIT
+	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
+#endif
 }
 
 static void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq)
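For anyone wondering why a missing initialization could hang a 32-bit boot, here is a minimal user-space sketch of the idea. It assumes that 32-bit kernels guard the 64-bit min_vruntime with a second copy and that readers retry until the two values agree. The struct and function names below are hypothetical stand-ins, not the kernel's own, and the retry loop is bounded so the sketch terminates where the real loop would keep spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the scheduler's struct cfs_rq. */
struct cfs_rq_sketch {
    uint64_t min_vruntime;
    uint64_t min_vruntime_copy; /* assumed to exist only on 32-bit builds */
};

/* Initialization as it looked before the patch: the copy stays at zero. */
void init_without_fix(struct cfs_rq_sketch *rq)
{
    rq->min_vruntime = (uint64_t)(-(1LL << 20));
    rq->min_vruntime_copy = 0;
}

/* Initialization with the patch applied: the copy starts out in sync. */
void init_with_fix(struct cfs_rq_sketch *rq)
{
    rq->min_vruntime = (uint64_t)(-(1LL << 20));
    rq->min_vruntime_copy = rq->min_vruntime;
}

/*
 * Reader side: accept the value only when it matches the copy.
 * The retry count is bounded here purely for illustration; an
 * unbounded loop would never exit if nothing updates the copy.
 */
bool read_min_vruntime(const struct cfs_rq_sketch *rq, int max_tries,
                       uint64_t *out)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        uint64_t copy = rq->min_vruntime_copy;
        /* a real kernel reader would need a read barrier here */
        uint64_t value = rq->min_vruntime;
        if (value == copy) {
            *out = value;
            return true;
        }
    }
    return false; /* value and copy never agreed */
}
```

With init_without_fix() the reader never sees a match and gives up, which is the user-space analogue of the lockup. With init_with_fix() the first attempt succeeds.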
Comment 8 Adam Williamson 2011-07-11 15:10:39 UTC
bruno: are you set up to test the patch yourself, or would it help for someone to build a patched kernel for you to try?

--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 9 Bruno Wolff III 2011-07-11 16:01:14 UTC
I think I can do it myself. I don't do a lot of kernel building, but I have successfully done it in the past. I seem to have the build started, but it will take a while to finish. Since I can't test it until I get home from work anyway, taking a while to run should be OK.
Comment 10 Adam Williamson 2011-07-11 18:55:20 UTC
just in case, the Short Idiot's Guide To Kernel Patch Testing, Written By A Short Idiot:

fedpkg co kernel
cd kernel
cp /path/to/patch.patch .
nano kernel.spec
(bump baserelease: I usually add 0.1, so it is 'newer' than the current official build but will be superseded by the next official build)
(add the patch at the end of the big list of patches, probably around Patch12000 - e.g.:)
Patch12000: patch.patch
(find the big list of ApplyPatch statements and add one to the bottom:)
ApplyPatch patch.patch
(optionally, add a bit to the changelog)
(save and exit)
fedpkg srpm
mock -r fedora-rawhide-x86_64 /path/to/kernel.src.rpm
go get dinner and wait =)
Comment 11 Bruno Wolff III 2011-07-11 19:03:47 UTC
I got it started a couple of hours ago. The ApplyPatch stuff did throw me off a bit. I ended up doing it more with fedpkg, like I would for a regular package. I just did a local build, not a mock build. It's still running, so that's good. My memory from the last time is that it took about 12 hours to do a build. So I am hoping it's done before I need to sleep.
Comment 12 Bruno Wolff III 2011-07-11 23:21:28 UTC
I applied the patch to the -rc6.git6 kernel and I was able to boot both machines that had been locking up into graphical desktops.
Comment 13 Bruno Wolff III 2011-07-15 12:31:06 UTC
The fix has been posted to lkml, but is not yet in Linus' tree.
Comment 14 Adam Williamson 2011-07-15 17:23:40 UTC
Discussed at the 2011-07-15 blocker review meeting. Given the severity of the impact, and Paul McKenney's suggestion that several users are affected by this (http://lkml.org/lkml/2011/7/12/298), this was accepted as an Alpha blocker under the criterion "The installer must boot (if appropriate) and run on all primary architectures from default live image, DVD, and boot.iso install media" (or "In most cases (see Blocker_Bug_FAQ), a system installed according to any of the above criteria (or the appropriate Beta or Final criteria, when applying this criterion to those releases) must boot to the 'firstboot' utility on the first boot after installation, without unintended user intervention. This includes correctly accessing any encrypted partitions when the correct passphrase is supplied. The firstboot utility must be able to create a working user account"; both subsume the idea that the system must boot).
Comment 15 Bruno Wolff III 2011-07-15 18:19:50 UTC
The fix is now in Linus' tree. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c64be78ffb415278d7d32d6f55de95c73dcc19a4 So it should show up in the next rawhide kernel update that uses an upstream update.
Comment 16 Bruno Wolff III 2011-07-15 21:51:14 UTC
Note that kernel-3.0-0.rc7.git1.1.fc16 is using a kernel tree from a couple of days ago and doesn't have the fix in it. No need to waste time testing the fix with that kernel.
Comment 17 Bruno Wolff III 2011-07-16 01:32:10 UTC
Somehow it didn't manage to get into kernel-3.0-0.rc7.git3.1.fc16 either. I double checked patch-3.0-rc7-git3.bz2 and it wasn't in there.
Comment 18 Josh Stone 2011-07-19 03:59:57 UTC
It's in the next snapshot:

$ git describe --contains c64be78ffb415278d7d32d6f55de95c73dcc19a4
v3.0-rc7-git4~4

I built a local kernel with rc7-git6, and was finally able to boot my i686 VM.
Comment 19 Bruno Wolff III 2011-07-22 05:27:20 UTC
I tried out 3.0-0.rc7.git10.1.fc16 on one of the two machines I saw the problem on and things are working. I hope to test the other machine in the morning.
Comment 20 Bruno Wolff III 2011-07-22 11:06:10 UTC
I tested the second system and things are now working there as well.
Comment 21 Bruno Wolff III 2011-07-22 13:50:11 UTC
kernel-3.0-0.rc7.git10.1.fc16 is in rawhide this morning. I think this can probably be closed now.
Comment 22 Tim Flink 2011-07-22 22:04:19 UTC
As the latest kernel has been tested and verified to fix this issue, I am closing the bug.