470035 – xm dmesg printk spam -- Domain attempted WRMSR 00000000000000e8 from 00000016:3d0e9470 to 00000000:00000000

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel-xen
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Chris Lalancette
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	477647
Depends On:
Blocks:	488928

Reported:	2008-11-05 12:31 UTC by Jeff Layton
Modified:	2018-10-20 02:14 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-09-02 08:04:16 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
Allow dom0 to write to the APERF/MPERF MSR (1.17 KB, patch) 2009-01-16 14:18 UTC, Chris Lalancette	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2009:1243	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.4 kernel security and bug fix update	2009-09-01 08:53:34 UTC

Description Jeff Layton 2008-11-05 12:31:54 UTC

I'm seeing a large number of messages pop up on my dom0 serial console with 2.6.18-121.el5.jtltest.53xen:

(XEN) printk: 7 messages suppressed.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 00000029:d7ca940f to 00000000:00000000.
(XEN) printk: 1 messages suppressed.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 0000002a:1b018f99 to 00000000:00000000.
(XEN) printk: 7 messages suppressed.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 0000002b:7ce56d5c to 00000000:00000000.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e7 from 00000038:b3032339 to 00000000:00000000.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 0000002b:c04c7db3 to 00000000:00000000.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e7 from 00000039:04aa3d86 to 00000000:00000000.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 0000002c:24d0e568 to 00000000:00000000.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e7 from 00000039:96d2e77e to 00000000:00000000.
(XEN) printk: 12 messages suppressed.
(XEN) traps.c:1761:d0 Domain attempted WRMSR 00000000000000e8 from 0000002c:83fb2b6e to 00000000:00000000.
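For anyone triaging similar console output, here is a minimal sketch of decoding one of these lines. It assumes the layout is `WRMSR <msr> from <hi>:<lo> to <hi>:<lo>`, with the pair before "to" being the current MSR contents and the pair after it the value the domain tried to write; that reading is an assumption based on the lines above, not something confirmed in this report. The names `wrmsr_event` and `parse_wrmsr_line` are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One decoded "Domain attempted WRMSR" record. */
struct wrmsr_event {
    uint64_t msr;       /* MSR index, e.g. 0xe8 */
    uint64_t old_value; /* hi:lo pair before "to" (assumed: current contents) */
    uint64_t new_value; /* hi:lo pair after "to" (assumed: requested contents) */
};

/* Parse one console line. Returns 1 on success, 0 if the line is not a
 * WRMSR record (for example a "messages suppressed" line). */
int parse_wrmsr_line(const char *line, struct wrmsr_event *ev)
{
    static const char tag[] = "Domain attempted WRMSR ";
    uint64_t msr;
    uint32_t old_hi, old_lo, new_hi, new_lo;
    const char *p;

    if (line == NULL || ev == NULL)
        return 0;

    p = strstr(line, tag);
    if (p == NULL)
        return 0;

    if (sscanf(p + strlen(tag),
               "%" SCNx64 " from %" SCNx32 ":%" SCNx32
               " to %" SCNx32 ":%" SCNx32,
               &msr, &old_hi, &old_lo, &new_hi, &new_lo) != 5)
        return 0;

    /* Recombine the 32-bit halves into 64-bit values. */
    ev->msr = msr;
    ev->old_value = ((uint64_t)old_hi << 32) | old_lo;
    ev->new_value = ((uint64_t)new_hi << 32) | new_lo;
    return 1;
}
```

Feeding each line of `xm dmesg` output through `parse_wrmsr_line` and counting events per `msr` would show which registers are involved; in the excerpt above only 0xe7 and 0xe8 appear.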


...it may have started with earlier kernels -- I'm not sure. I've just noticed it. It's a 2 x dual core CPU box. cpuinfo from one of the cores is below:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
stepping	: 11
cpu MHz		: 1998.000
cache size	: 4096 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 7484.56
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual
power management:

...other info available upon request. I can also probably provide access to the box if needed.

Comment 1 Chris Lalancette 2008-11-05 13:30:50 UTC

Jeff,
    Hm, is this all of the time, or just on bootup of a dom0 or of a PV guest?  It's common for a guest (including dom0) to do this sort of poking around at boot time, but it shouldn't happen after that.  If it's happening more often than that, though, something else might be going on.

Chris Lalancette

Comment 2 Jeff Layton 2008-11-05 13:37:11 UTC

I see it pop up pretty regularly. My guests have been active for quite a while now and I still see the message pop up at least a few times every minute.

Comment 4 Issue Tracker 2008-11-12 22:02:06 UTC

I don't see these messages on a 5.2 system. Again, I'm not sure if this is
even a problem (it's not an artefact from increased debugging in beta
kernels, is it?), but I don't see this happening before 5.3. 

Internal Status set to 'Waiting on SEG'

This event sent from IssueTracker by streeter 
 issue 237281

Comment 5 Bill Burns 2008-11-13 15:51:43 UTC

Seems that something is different, we are looking into it...

Comment 6 RHEL Program Management 2008-12-01 20:21:17 UTC

This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 13 Bill Burns 2008-12-22 18:27:27 UTC

*** Bug 477647 has been marked as a duplicate of this bug. ***

Comment 14 Chris Lalancette 2009-01-16 13:17:10 UTC

OK.  I'm pretty sure I know what this is now, thanks to jlayton allowing me to poke around on his box.  For 5.3, we updated the acpi-cpufreq kernel module to use two new Intel MSRs, namely MSR_IA32_APERF and MSR_IA32_MPERF.  The problem is, the hypervisor doesn't know anything about these MSRs.  So the dom0 is trying to measure the frequency by doing:

rdmsr(MSR_IA32_APERF)
rdmsr(MSR_IA32_MPERF)

And then trying to reset the state of those two MSRs with:

wrmsr(MSR_IA32_APERF, 0)
wrmsr(MSR_IA32_MPERF, 0)

It's the latter two writes that are probably spewing all of the messages.  This should be easy to fix by allowing the dom0 to actually perform those wrmsr calls; teaching the hypervisor about the two MSRs should be a fairly simple patch.  I'll try to come up with something.
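To make the read-then-reset sequence concrete, here is a rough user-space sketch. It is not the acpi-cpufreq code: `sim_aperf`, `sim_mperf`, `sim_rdmsr`, `sim_wrmsr` and `measure_khz` are hypothetical stand-ins for the real MSR accessors, and the percent-based arithmetic is just one plausible way to turn the two counters into a frequency estimate. The MSR indices 0xe7 and 0xe8 match the ones in the console messages.

```c
#include <stdint.h>

#define MSR_IA32_MPERF 0xe7u /* counts at a fixed reference rate */
#define MSR_IA32_APERF 0xe8u /* counts at the actual delivered rate */

/* Hypothetical simulated counters so the sketch runs without MSR access. */
uint64_t sim_aperf, sim_mperf;

uint64_t sim_rdmsr(uint32_t msr)
{
    if (msr == MSR_IA32_APERF)
        return sim_aperf;
    if (msr == MSR_IA32_MPERF)
        return sim_mperf;
    return 0;
}

void sim_wrmsr(uint32_t msr, uint64_t value)
{
    if (msr == MSR_IA32_APERF)
        sim_aperf = value;
    else if (msr == MSR_IA32_MPERF)
        sim_mperf = value;
}

/* Estimate the average frequency (kHz) since the last reset, then zero
 * both counters. The two zeroing writes are the kind of WRMSR that shows
 * up in the console messages when they are not permitted. */
uint64_t measure_khz(uint64_t max_khz)
{
    uint64_t aperf = sim_rdmsr(MSR_IA32_APERF);
    uint64_t mperf = sim_rdmsr(MSR_IA32_MPERF);
    uint64_t percent;

    sim_wrmsr(MSR_IA32_APERF, 0);
    sim_wrmsr(MSR_IA32_MPERF, 0);

    /* Scale both counters down together so aperf * 100 cannot overflow. */
    while (aperf > UINT64_MAX / 100) {
        aperf >>= 1;
        mperf >>= 1;
    }
    if (mperf == 0)
        return max_khz; /* no reference ticks recorded: fall back */

    percent = aperf * 100 / mperf;
    return max_khz * percent / 100;
}
```

If the zeroing writes are silently dropped, the counters simply keep accumulating, so every later measurement reflects the average since boot rather than since the previous sample.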

Chris Lalancette

Comment 15 Chris Lalancette 2009-01-16 14:18:58 UTC

Created attachment 329207
Allow dom0 to write to the APERF/MPERF MSR

OK, I tested out the attached patch on jlayton's failing box, and it did indeed quiet down the messages.  I'll try to float this patch upstream and see what kind of response I get.
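The attachment itself is not reproduced in this report. As an illustration only, the general idea of letting a privileged domain write specific MSRs can be sketched as an allow-list check in front of the write. The function names below are hypothetical, and the drop-and-log behaviour for other MSRs is an assumption modelled on the console messages in the description, not taken from the patch.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MSR_IA32_MPERF 0xe7u
#define MSR_IA32_APERF 0xe8u

/* Hypothetical allow-list: MSRs the privileged domain may write. */
bool dom0_may_write_msr(uint32_t msr)
{
    switch (msr) {
    case MSR_IA32_MPERF:
    case MSR_IA32_APERF:
        return true;
    default:
        return false;
    }
}

/* Apply the write if the MSR is allow-listed. Otherwise leave *current
 * untouched and log the attempt when it would have changed the value
 * (assumed behaviour). Returns true if the write was applied. */
bool filter_dom0_wrmsr(uint32_t msr, uint64_t *current, uint64_t requested)
{
    if (dom0_may_write_msr(msr)) {
        *current = requested;
        return true;
    }
    if (*current != requested)
        fprintf(stderr,
                "Domain attempted WRMSR %016" PRIx64
                " from %08" PRIx32 ":%08" PRIx32
                " to %08" PRIx32 ":%08" PRIx32 ".\n",
                (uint64_t)msr,
                (uint32_t)(*current >> 32), (uint32_t)*current,
                (uint32_t)(requested >> 32), (uint32_t)requested);
    return false;
}
```

With 0xe7 and 0xe8 on the list, the zeroing writes go through and no message is produced, which would be consistent with the messages going quiet on the test box.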

Chris Lalancette

Comment 16 Chris Lalancette 2009-01-22 08:30:20 UTC

FYI, this was committed in upstream Xen as xen-unstable changeset (c/s) 19055.

Chris Lalancette

Comment 17 Chris Lalancette 2009-01-23 10:27:09 UTC

I've uploaded a test kernel that contains this fix (along with several others)
to this location:

http://people.redhat.com/clalance/virttest

Could the original reporter try out the test kernels there, and report back whether
they fix the problem?

Thanks,
Chris Lalancette

Comment 18 Jeff Layton 2009-01-23 12:13:39 UTC

Booted to the new kernel and no longer see these messages. Looks like the patch works!

Comment 29 errata-xmlrpc 2009-09-02 08:04:16 UTC

An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html
