Bug 449346
Summary: | SMP 32bit RHEL5u1 and RHEL5u2 HVM domain might stop booting when start udev service | | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Adam Stokes <astokes>
Component: | kernel-xen | Assignee: | Rik van Riel <riel>
Status: | CLOSED ERRATA | QA Contact: | Martin Jenner <mjenner>
Severity: | high | |
Priority: | high | |
Version: | 5.4 | CC: | clalance, cward, dwa, dzickus, jburke, joe.jin, mathieu-acct, qcai, riel, rodney.mckee, tao, tn, xen-maint, yongkang.you
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Doc Type: | Bug Fix | Story Points: | ---
: | 480689 513395 | |
Last Closed: | 2009-09-02 08:35:10 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | Category: | ---
oVirt Team: | --- | Cloudforms Team: | ---
Bug Blocks: | 460955, 480689, 513395 | |

Description
Adam Stokes
2008-06-02 10:42:31 UTC
Adam, which timer mode did you use for the patch to fix your problem?

Rik, on my Caneland platform, I get a much higher hang percentage than 20% when booting an SMP PAE RHEL5u1/2 guest. A 32e guest doesn't have this issue. I remember both timer_mode 1 and 2 would help booting.

It looks like we will also need the following two changesets from xen-unstable, in addition to the changeset above:

changeset: 18554:1420a6649cfa
user: Keir Fraser <keir.fraser>
date: Fri Sep 26 17:09:36 2008 +0100
summary: hvm: Default timer_mode=1 (do not delay virtual time for missed

changeset: 16764:3f26758bcc02
user: Keir Fraser <keir.fraser>
date: Fri Jan 18 22:27:51 2008 +0000
summary: xend: Handle unspecified timer_mode domain platform parameter.

Other timer changesets we may need for HVM time to run correctly:

changeset: 18729:16eede823854
user: Keir Fraser <keir.fraser>
date: Tue Oct 28 10:36:22 2008 +0000
summary: hvm: Do not mess with APIC timer deadline if in one-shot mode.

changeset: 18695:6f74549ac4c5
user: Keir Fraser <keir.fraser>
date: Wed Oct 22 12:08:16 2008 +0100
summary: x86, hvm: Allow 100us periodic virtual timers

changeset: 18694:71c15dfaa12b
user: Keir Fraser <keir.fraser>
date: Wed Oct 22 12:04:32 2008 +0100
summary: Port HPET device model to vpt timer subsystem

changeset: 17716:6c4cab061af4
user: Keir Fraser <keir.fraser>
date: Sat May 24 09:27:03 2008 +0100
summary: hvm: Build guest timers on monotonic system time.

changeset: 16690:01adaec882d4
user: Keir Fraser <keir.fraser>
date: Tue Jan 08 14:31:23 2008 +0000
summary: hvm: time: Fixes to 'SYNC' (no_missed_ticks_pending) timer handling.

changeset: 16689:66db23ecd562
user: Keir Fraser <keir.fraser>
date: Tue Jan 08 13:57:45 2008 +0000
summary: hvm: hpet: Fix per-timer enable/disable.

Related to bug 307201?

Some more HVM HPET candidates:

changeset: 17017:209512f6d89c
user: Keir Fraser <keir.fraser>
date: Mon Feb 11 14:45:29 2008 +0000
summary: x86 hvm: Allow HPET to be configured as a per-domain config option.

changeset: 16707:51aa2f884f64
user: Keir Fraser <keir.fraser>
date: Fri Jan 11 11:01:36 2008 +0000
summary: hvm: hpet: Tidy up hpet_to_ns_limit calculation.

changeset: 16697:1b2be7cf0b7b
user: Keir Fraser <keir.fraser>
date: Wed Jan 09 10:32:13 2008 +0000
summary: hvm: hpet: Clamp period to sane values to prevent excessive looping in

changeset: 16693:9ff64d045e61
user: Keir Fraser <keir.fraser>
date: Tue Jan 08 16:20:04 2008 +0000
summary: hvm: hpet: Fix overflow when converting to nanoseconds.

changeset: 16689:66db23ecd562
user: Keir Fraser <keir.fraser>
date: Tue Jan 08 13:57:45 2008 +0000
summary: hvm: hpet: Fix per-timer enable/disable.

changeset: 16486:c00f31f27de6
user: Keir Fraser <keir.fraser>
date: Wed Nov 28 13:13:51 2007 +0000
summary: hvm: Fix 2 type mismatches in vlapic.h and hpet.c for 32-bit build Xen

changeset: 16404:ae6f4c7f15cb
user: Keir Fraser <keir.fraser>
date: Wed Nov 21 09:49:09 2007 +0000
summary: hvm: Do not crash guest if it does an unaligned access to an HPET

Created attachment 331184
patch 1/3 of the tools bits

I am building a kernel now with a bunch of backported patches, CVS branch private-bz449346-branch, http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1682472

Created attachment 331185
patch 2/3 of the tools bits

Created attachment 331186
patch 3/3 of the tools bits
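
For anyone testing the timer_mode setting mentioned in the comments above, here is a minimal sketch of a guest configuration fragment. It assumes an xend that already carries the timer_mode changesets listed in this bug (xm config files are plain Python assignments). The guest name, memory size and VCPU count are hypothetical placeholders, and the mode names are the ones used by upstream Xen, so check them against the Xen version actually installed.

```python
# HVM guest config fragment (xm config files use Python syntax).
# Everything except timer_mode is a hypothetical placeholder.
name = "rhel5u1-pae-test"   # hypothetical guest name
builder = "hvm"
memory = 1024               # MiB, placeholder
vcpus = 2                   # the hang was reported for SMP guests
pae = 1                     # 32-bit PAE guest, as in the report

# Timer modes as named in upstream Xen (verify for your version):
#   0 = delay_for_missed_ticks
#   1 = no_delay_for_missed_ticks
#   2 = no_missed_ticks_pending
#   3 = one_missed_tick_pending
timer_mode = 1
```

Whether a given RHEL 5 xen package honours this option depends on which of the changesets above were backported.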
This bug can be caused by a combination of two main factors:
- while doing disk IO, one VCPU of an HVM guest can miss timer ticks
- Xen did not re-deliver those missed timer ticks later on, causing clock skew between VCPUs inside an HVM guest

Both of these issues should be resolved with the backport of the AIO disk handling code and upstream Xen 'no missed-tick accounting' timer code. Please test the test RPMs from http://people.redhat.com/riel/.xenaiotime/ and let us know if those (experimental!) test packages resolve the issue.

*** Bug 490760 has been marked as a duplicate of this bug. ***

*** Bug 499276 has been marked as a duplicate of this bug. ***

in kernel-2.6.18-146.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.

~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Tried this issue on RHEL 5.4 beta. With a RHEL5u1 PAE image, guest boot still stopped a few times at "waiting for driver initialization", but I don't see any issue with a RHEL5u4 PAE image.

*** Bug 461640 has been marked as a duplicate of this bug. ***

*** Bug 307201 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
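
The two factors named in the root-cause comment above (a VCPU missing timer ticks during disk IO, and those ticks never being re-delivered) can be illustrated with a toy model. This is only a sketch of the general idea, not Xen's timer code: the function name, the two-VCPU setup and the 50-tick stall are all hypothetical, and it does not reproduce the semantics of any specific timer_mode.

```python
def count_ticks(total_ticks, blocked, redeliver):
    """Toy model: return the tick count seen by each of two VCPUs.

    blocked   -- dict mapping VCPU index to the set of tick numbers during
                 which that VCPU cannot take the timer interrupt
                 (for example because it is stalled on disk IO)
    redeliver -- if True, missed ticks are queued and delivered once the
                 VCPU runs again; if False, they are silently lost
    """
    counts = [0, 0]
    pending = [0, 0]
    for tick in range(total_ticks):
        for vcpu in range(2):
            if tick in blocked.get(vcpu, ()):
                if redeliver:
                    pending[vcpu] += 1   # remember the missed tick
                continue                 # VCPU does not see this tick now
            counts[vcpu] += 1 + pending[vcpu]
            pending[vcpu] = 0
    return counts


# Hypothetical scenario: VCPU 1 is stalled for 50 ticks out of 1000.
stall = {1: set(range(100, 150))}

dropped = count_ticks(1000, stall, redeliver=False)
replayed = count_ticks(1000, stall, redeliver=True)

print("missed ticks lost:       ", dropped, "skew =", dropped[0] - dropped[1])
print("missed ticks redelivered:", replayed, "skew =", replayed[0] - replayed[1])
```

In this model, losing the missed ticks leaves VCPU 1 fifty ticks behind VCPU 0 for the rest of the run, while re-delivering them brings both counters back in step. That per-VCPU divergence is the kind of clock skew the comment describes.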