Bug 201799

Summary:

running 'dmidecode' in DomU throws thousands of xen printk's

Product:

[Fedora] Fedora

Reporter:

Matt C <wago>

Component:

xen

Assignee:

Steven Rostedt <srostedt>

Status:

CLOSED RAWHIDE

QA Contact:

Brian Brock <bbrock>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

CC:

bstein, katzj, nobody+bjmason, riel, wago

Target Milestone:

---

Target Release:

---

Hardware:

All

OS:

Linux

URL:

https://www.redhat.com/archives/fedora-xen/2006-August/msg00042.html

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2006-10-04 17:01:38 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

150224, 185595

Attachments:

Description	Flags
Linux kernel patch to catch unprivileged access to /dev/mem early	none

Description Matt C 2006-08-08 21:52:48 UTC

Description of problem:
Running the 'dmidecode' tool inside of a DomainU causes the hypervisor to throw
thousands of warnings to console. On a slow console (ie: RS232/9600), this can
throw CPU spinlock lockup errors as well.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.17-1.2157_FC5

How reproducible:
Boot paravirtualized DomainU system and run the dmidecode tool.

Steps to Reproduce:
1. build a fresh FC5 system
2. install and boot kernel-xen-2.6.17-1.2157_FC5
3. build a paravirtualized/nonprivileged virtual host image
4. boot the paravirtualized host
5. run the dmidecode tool as root in the virtual host
  
Actual results:
I basically get thousands of the following printk messages, starting with:

(XEN) DOM4: (file=mm.c, line=578) Non-privileged attempt to map I/O space 000000f0

and counting up and ending with:

(XEN) DOM4: (file=mm.c, line=578) Non-privileged attempt to map I/O space 000000ff

Then, some of the time, the kernel throws a spinlock timer at the end of all of
these printks:

Timer ISR/1: Time went backwards: delta=-46186990 delta_cpu=-42186990
shadow=2112479126776 off=466807746 processed=2112992120678
cpu_processed=2112988120678

BUG: spinlock lockup on CPU#0, swapper/0, c06e9284 (Not tainted)

<c04e74b0> __spin_lock_debug+0x7b/0x85 <c04e7514> _raw_spin_lock+0x5a/0x69

 <c061eedb> _spin_lock+0x6/0x8  <c0407d80> timer_interrupt+0x40/0x563
 <c0434533> ktime_get_ts+0x4b/0x51  <c0434484> ktime_get+0x10/0x3a
 <c0442fe3> handle_IRQ_event+0x40/0x82  <c04430b0> __do_IRQ+0x8b/0xdf
 <c0406336> do_IRQ+0x1a/0x25  <c054c937> evtchn_do_upcall+0x84/0xb8
 <c0404c71> hypervisor_callback+0x3d/0x48  <c04087da> safe_halt+0x1a/0x2f
 <c0402b98> kernel_thread_helper+0x0/0xb  <c04028a9> xen_idle+0x45/0x4d
 <c0402945> cpu_idle+0x94/0xad  <c07047cf> start_kernel+0x1c1/0x1c5
 0: 2112992120678
 1: 2112988120678
BUG: soft lockup detected on CPU#1!

<c0442e2d> softlockup_tick+0x9a/0xab <c040825d> timer_interrupt+0x51d/0x563

 <c0442fe3> handle_IRQ_event+0x40/0x82  <c04430b0> __do_IRQ+0x8b/0xdf
 <c0406336> do_IRQ+0x1a/0x25  <c054c937> evtchn_do_upcall+0x84/0xb8

<c0404c71> hypervisor_callback+0x3d/0x48 <c054c8b1> force_evtchn_callback+0xa/0xc

 <c04248cf> __do_softirq+0x67/0xee  <c0424995> do_softirq+0x3f/0x63
 <c040633b> do_IRQ+0x1f/0x25  <c054c937> evtchn_do_upcall+0x84/0xb8
 <c0404c71> hypervisor_callback+0x3d/0x48  <c04087da> safe_halt+0x1a/0x2f
 <c04028a9> xen_idle+0x45/0x4d  <c0402945> cpu_idle+0x94/0xad
BUG: soft lockup detected on CPU#0!

<c0442e2d> softlockup_tick+0x9a/0xab <c040825d> timer_interrupt+0x51d/0x563

 <c0434533> ktime_get_ts+0x4b/0x51  <c0442fe3> handle_IRQ_event+0x40/0x82
 <c04430b0> __do_IRQ+0x8b/0xdf  <c0406336> do_IRQ+0x1a/0x25

<c054c937> evtchn_do_upcall+0x84/0xb8 <c0404c71> hypervisor_callback+0x3d/0x48

 <c04087da> safe_halt+0x1a/0x2f  <c0402b98> kernel_thread_helper+0x0/0xb
 <c04028a9> xen_idle+0x45/0x4d  <c0402945> cpu_idle+0x94/0xad
 <c07047cf> start_kernel+0x1c1/0x1c5


Once, I even had the storage controller (3ware 9500) barf on me due to an I/O
timeout that it couldn't recover from. I didn't have my console capture setup
for that one, unfortunately.

Expected results:
It would be fine if either:
A. dmidecode found no valid data structures _and_ the hypervisor silently
handled this.
B. xen provided a fake smbios datastructure for dmidecode to decode

Additional info:
I assume that this is the smbios hardware address space that dmidecode is trying
to access via the guest kernel. I'm also guessing that the printk has interrupts
disabled or something similarly funky, so the flood of them causes the CPU
lockup messages as well as probably the I/O problems.

I saw some patches on the xen-devel list about adding real smbios data to the
HVM virtual hosts, but it seemed to be very HVM-specific code, and I'm running
paravirts. The thread from that is here:

http://lists.xensource.com/archives/html/xen-devel/2006-07/msg00416.html

As you can imagine, the whole system (all virts) is unuseable while this is
happening, and it runs for minutes at a time.

Comment 1 Rik van Riel 2006-08-10 15:53:48 UTC

With kernel 2.6.17-1.2519.el5xen on x86-64 I get only one "(XEN) DOM32:
(file=mm.c, line=578) Non-privileged attempt to map I/O space 000000f0" warning
by the hypervisor.

Stracing dmidecode shows this:

write(1, "# dmidecode 2.7\n", 16# dmidecode 2.7
)       = 16
open("/dev/mem", O_RDONLY)              = 3
brk(0)                                  = 0x60f000
brk(0x640000)                           = 0x640000
mmap(NULL, 65536, PROT_READ, MAP_SHARED, 3, 0xf0000) = -1 EFAULT (Bad address)
write(2, "/dev/mem: ", 10/dev/mem: )              = 10
write(2, "mmap: Bad address\n", 18mmap: Bad address
)     = 18
brk(0x630000)                           = 0x630000
exit_group(1)                           = ?
Process 23451 detached

I'll try to reproduce the problem with the latest FC6 Test2 kernel.

Comment 2 Rik van Riel 2006-08-10 17:23:53 UTC

OK, the latest rawhide and rhel5 kernels both show the better behaviour - only
one hypervisor message.

To get rid of it completely I guess we could disable mmaping /dev/mem in domU.

Comment 3 Matt C 2006-08-10 22:55:02 UTC

I don't think I was clear enough with my test case, sorry.

This happens with older dmidecode-2.2. The newer dmidecode-2.7 that you tested
seems to be more intelligent and aborts immediately if the smbios structure
isn't present. The older/broken 2.2 version scans through the entire smbios
memory range, structure or not. The problem is that the vhosts that I'm
supporting are based on RHEL3, which shipped with dmidecode-2.2.

I agree that disabling /dev/mem access in domU would be a good consideration. At
a minimum, just blocking the smbios region would be acceptable to me.

If you want to reproduce, just pull down and run:
http://download.savannah.gnu.org/releases/dmidecode/dmidecode-2.2.tar.gz

I found that if I run this repeatedly and concurrently on multiple vhosts, my
box will eventually lose access to it's local storage as well. I suspect that
this is an interaction with the 3w-9xxx driver.

Also, I'm testing on 32bit, not sure if that makes any difference here or not.
Thanks.

Comment 4 Rik van Riel 2006-08-11 15:41:42 UTC

Ahhhhhhhh ok.  Good point.

I'll work on a /dev/mem hack :)

Comment 5 Rik van Riel 2006-08-11 15:43:23 UTC

Ahhhhhhhh ok.  Good point.

Hacking /dev/mem is not the right fix though, because in some cases the guest
has access to actual IO devices.  Instead, we should limit the printing of the
message by the hypervisor, which should happen automagically by not having all
the debugging options on for the FC6 and RHEL5 final builds.

Comment 6 Matt C 2006-08-14 22:37:19 UTC

So, I'm not crazy about relying on disabling debug to make this go away. This'll
make it impossible for us to enable debug on our xen nodes in the future, since
these printks are aggressive enough to bring down our xen parent host when a
couple of virtual hosts run dmidecode concurrently.

I put together a quick-n-dirty patch that traps the /dev/mem access from
unprivileged domains earlier on to avoid the hypervisor failure. However, I'm
not sure if it solves the guest direct I/O access problem or not. Patch is will
be attached...

Comment 7 Matt C 2006-08-14 22:38:52 UTC

Created attachment 134173 [details]
Linux kernel patch to catch unprivileged access to /dev/mem early

Comment 8 Rik van Riel 2006-08-14 23:05:22 UTC

It really should be fixed on the hypervisor side, otherwise a malicious guest
could still bring down the host.

Rate-limiting the print on the hypervisor side should probably work just fine.

Comment 9 Rik van Riel 2006-09-19 20:36:04 UTC

Steven, this is one to verify with your ratelimit patch :)

It may already be fixed, but it may also be a different printk you need to add
ratelimiting to.

Comment 10 Stephen Tweedie 2006-10-04 17:01:38 UTC

Should be fixed in current rawhide via xen printk ratelimiting.  The console
output from both fc5 and rawhide guest dmidecode is minimised now.