Bug 710609 - Kernel trace on m2.4xlarge or m2.2xlarge instances in EC2
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Frantisek Hrbata
QA Contact: Red Hat Kernel QE team
Keywords: EC2, ZStream
Depends On: 709856
Blocks:
Reported: 2011-06-03 16:32 EDT by Ken Reilly
Modified: 2013-01-09 18:55 EST
CC List: 24 users

See Also:
Fixed In Version: kernel-2.6.32-131.3.1.el6
Doc Type: Bug Fix
Doc Text:
Xen guests cannot make use of all CPU features, and advertising some of them to a guest can even be risky. One such feature is CONSTANT_TSC. This feature prevents the TSC (Time Stamp Counter) from being marked as unstable, which allows the sched_clock_stable option to be enabled. Enabling sched_clock_stable is problematic for Xen PV guests because the sched_clock() function has been overridden with the xen_sched_clock() function, which is not synchronized between virtual CPUs. This update provides a patch that sets all x86_power features to 0, which fixes this issue and also guards against other potentially dangerous assumptions the kernel could make based on those features.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-06-15 12:09:39 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
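
The Doc Text above names CONSTANT_TSC as a feature that is risky to advertise to a Xen PV guest. As a minimal sketch, and not part of the original report, the following Python helper lists which TSC-related flags a guest kernel advertises in /proc/cpuinfo. The function names tsc_flags and read_cpuinfo are hypothetical; the flag spellings constant_tsc and nonstop_tsc are the ones Linux prints in the flags line. This only reports what the kernel advertises and should not be read as a test for the fix itself.

```python
def tsc_flags(cpuinfo_text):
    """Return, for each 'flags' line, the TSC-related flags it advertises."""
    # Flags of interest; both names are as printed by Linux in /proc/cpuinfo.
    watched = {"constant_tsc", "nonstop_tsc"}
    result = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Each logical CPU has one line whose key is exactly "flags".
        if sep and key.strip() == "flags":
            result.append(sorted(watched & set(value.split())))
    return result


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo text; the default path is the usual Linux location."""
    with open(path) as handle:
        return handle.read()
```

On a Linux guest, tsc_flags(read_cpuinfo()) gives one list per logical CPU, which makes it easy to compare what different kernels or instance types report.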


Attachments
tier1 & tier2 kernel qe tests (117.97 KB, application/x-bzip)
2011-06-15 12:46 EDT, wes hayutin

Description Ken Reilly 2011-06-03 16:32:51 EDT
This bug has been copied from bug #709856 and has been proposed
to be backported to 6.1 z-stream (EUS).
Comment 5 Chris Morgan 2011-06-09 09:18:45 EDT
So is this an ack from kernel QE?
Comment 6 Qixiang Wan 2011-06-09 09:29:42 EDT
(In reply to comment #5)
> So is this an ack from kernel QE?

No, this is from Virt QE. We ran these tests from Xen userspace with a RHEL6.1 guest to check for regressions. Kernel QE is updating the kernel Tier1/2 test results in https://errata.devel.redhat.com/errata/show/11253 . Both sides should pass before this bug is verified.
Comment 10 wes hayutin 2011-06-13 22:34:49 EDT
Have any tier1, tier2 tests been run against the new kernel?  Are there any beaker tests that may test the specific issue of the guest crashing?

If there are any I will run them in the ec2 env.  Thanks
Comment 11 Dayong Tian 2011-06-13 22:49:18 EDT
Kernel Tier1 tests passed; Tier2 tests are still running.
We chose some tests in Tier2 that are specific to this bug. The following tests were included:
/kernel/power-management/multi-thread-gettimeofday
/kernel/power-management/multi-thread-clock_gettime
/kernel/power-management/time-warp-test
/kernel/power-management/clock_gettime
/kernel/power-management/diff-clock-source
/kernel/stress/racer
/kernel/vm/193695
/kernel/distribution/ltp/20100831
/kernel/misc/autotest_r5278
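
Several of the tests listed above exercise clock reads from multiple threads. As a rough illustration of that idea only, and not a reimplementation of any of the Beaker tests, the sketch below counts how often a clock reading is lower than the previous reading seen by any thread. The function name count_time_warps and its parameters are hypothetical. Python threads are not pinned to CPUs, so this is far weaker than a native test that compares clocks across virtual CPUs.

```python
import threading
import time


def count_time_warps(clock=time.monotonic_ns, threads=4, samples=10000):
    """Count readings that are lower than the previous reading from any thread."""
    lock = threading.Lock()
    state = {"last": None, "warps": 0}

    def worker():
        for _ in range(samples):
            # Read and compare under one lock so readings are totally ordered.
            with lock:
                now = clock()
                if state["last"] is not None and now < state["last"]:
                    state["warps"] += 1
                state["last"] = now

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return state["warps"]
```

Passing the clock as a parameter keeps the check independent of any one time source, so the same loop can be pointed at a different clock function or at a fake clock for testing.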
Comment 12 Igor Zhang 2011-06-13 23:36:30 EDT
I reproduced the bug in-house twice.
Host:
RHEL5.3 and kernel 2.6.18-128.1.10.el5

Guest:
RHEL6.1 and kernel 2.6.32-131.3.1.el6

See the log rhel6u1_x86_64_pv_install.log in
https://beaker.engineering.redhat.com/jobs/95043

And rhel6u1_i386_pv_install.log in
https://beaker.engineering.redhat.com/jobs/94623


At the same time, we found user-space packages xen and xen-libs in RHEL5.3 don't support RHEL6.1 installation as a guest. Then I retested under another configuration:
Host:
RHEL5.6 and kernel 2.6.18-238.12.1.el5

Guest:
RHEL6.1 and kernel 2.6.32-131.4.1.el6

The jobs for Intel Nehalem and for Intel systems without the nonstop_tsc flag are still queued. The jobs that have finished passed our regression tests. For instance:
https://beaker.engineering.redhat.com/jobs/96462
Comment 13 Andrew Jones 2011-06-14 05:07:18 EDT
(In reply to comment #12)
> At the same time, we found user-space packages xen and xen-libs in RHEL5.3
> don't support RHEL6.1 installation as a guest.

The best config for testing would be 5.3 kernel-xen and 5.6/7 xen userspace, and 2.6.32-131.4.1.el6 for the guest kernel.
Comment 16 errata-xmlrpc 2011-06-15 12:09:39 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0874.html
Comment 17 wes hayutin 2011-06-15 12:46:13 EDT
Created attachment 504907
tier1 & tier2 kernel qe tests

All tier1 & tier2 kernel QE tests pass.

Executed in EC2 us-east-1 on an m2.2xlarge instance.
