Description of problem: Running the 'dmidecode' tool inside of a DomainU causes the hypervisor to throw thousands of warnings to console. On a slow console (ie: RS232/9600), this can throw CPU spinlock lockup errors as well. Version-Release number of selected component (if applicable): kernel-xen-2.6.17-1.2157_FC5 How reproducible: Boot paravirtualized DomainU system and run the dmidecode tool. Steps to Reproduce: 1. build a fresh FC5 system 2. install and boot kernel-xen-2.6.17-1.2157_FC5 3. build a paravirtualized/nonprivileged virtual host image 4. boot the paravirtualized host 5. run the dmidecode tool as root in the virtual host Actual results: I basically get thousands of the following printk messages, starting with: (XEN) DOM4: (file=mm.c, line=578) Non-privileged attempt to map I/O space 000000f0 and counting up and ending with: (XEN) DOM4: (file=mm.c, line=578) Non-privileged attempt to map I/O space 000000ff Then, some of the time, the kernel throws a spinlock timer at the end of all of these printks: Timer ISR/1: Time went backwards: delta=-46186990 delta_cpu=-42186990 shadow=2112479126776 off=466807746 processed=2112992120678 cpu_processed=2112988120678 BUG: spinlock lockup on CPU#0, swapper/0, c06e9284 (Not tainted) <c04e74b0> __spin_lock_debug+0x7b/0x85 <c04e7514> _raw_spin_lock+0x5a/0x69 <c061eedb> _spin_lock+0x6/0x8 <c0407d80> timer_interrupt+0x40/0x563 <c0434533> ktime_get_ts+0x4b/0x51 <c0434484> ktime_get+0x10/0x3a <c0442fe3> handle_IRQ_event+0x40/0x82 <c04430b0> __do_IRQ+0x8b/0xdf <c0406336> do_IRQ+0x1a/0x25 <c054c937> evtchn_do_upcall+0x84/0xb8 <c0404c71> hypervisor_callback+0x3d/0x48 <c04087da> safe_halt+0x1a/0x2f <c0402b98> kernel_thread_helper+0x0/0xb <c04028a9> xen_idle+0x45/0x4d <c0402945> cpu_idle+0x94/0xad <c07047cf> start_kernel+0x1c1/0x1c5 0: 2112992120678 1: 2112988120678 BUG: soft lockup detected on CPU#1! <c0442e2d> softlockup_tick+0x9a/0xab <c040825d> timer_interrupt+0x51d/0x563 <c0442fe3> handle_IRQ_event+0x40/0x82 <c04430b0> __do_IRQ+0x8b/0xdf <c0406336> do_IRQ+0x1a/0x25 <c054c937> evtchn_do_upcall+0x84/0xb8 <c0404c71> hypervisor_callback+0x3d/0x48 <c054c8b1> force_evtchn_callback+0xa/0xc <c04248cf> __do_softirq+0x67/0xee <c0424995> do_softirq+0x3f/0x63 <c040633b> do_IRQ+0x1f/0x25 <c054c937> evtchn_do_upcall+0x84/0xb8 <c0404c71> hypervisor_callback+0x3d/0x48 <c04087da> safe_halt+0x1a/0x2f <c04028a9> xen_idle+0x45/0x4d <c0402945> cpu_idle+0x94/0xad BUG: soft lockup detected on CPU#0! <c0442e2d> softlockup_tick+0x9a/0xab <c040825d> timer_interrupt+0x51d/0x563 <c0434533> ktime_get_ts+0x4b/0x51 <c0442fe3> handle_IRQ_event+0x40/0x82 <c04430b0> __do_IRQ+0x8b/0xdf <c0406336> do_IRQ+0x1a/0x25 <c054c937> evtchn_do_upcall+0x84/0xb8 <c0404c71> hypervisor_callback+0x3d/0x48 <c04087da> safe_halt+0x1a/0x2f <c0402b98> kernel_thread_helper+0x0/0xb <c04028a9> xen_idle+0x45/0x4d <c0402945> cpu_idle+0x94/0xad <c07047cf> start_kernel+0x1c1/0x1c5 Once, I even had the storage controller (3ware 9500) barf on me due to an I/O timeout that it couldn't recover from. I didn't have my console capture setup for that one, unfortunately. Expected results: It would be fine if either: A. dmidecode found no valid data structures _and_ the hypervisor silently handled this. B. xen provided a fake smbios datastructure for dmidecode to decode Additional info: I assume that this is the smbios hardware address space that dmidecode is trying to access via the guest kernel. I'm also guessing that the printk has interrupts disabled or something similarly funky, so the flood of them causes the CPU lockup messages as well as probably the I/O problems. I saw some patches on the xen-devel list about adding real smbios data to the HVM virtual hosts, but it seemed to be very HVM-specific code, and I'm running paravirts. The thread from that is here: http://lists.xensource.com/archives/html/xen-devel/2006-07/msg00416.html As you can imagine, the whole system (all virts) is unuseable while this is happening, and it runs for minutes at a time.
With kernel 2.6.17-1.2519.el5xen on x86-64 I get only one "(XEN) DOM32: (file=mm.c, line=578) Non-privileged attempt to map I/O space 000000f0" warning by the hypervisor. Stracing dmidecode shows this: write(1, "# dmidecode 2.7\n", 16# dmidecode 2.7 ) = 16 open("/dev/mem", O_RDONLY) = 3 brk(0) = 0x60f000 brk(0x640000) = 0x640000 mmap(NULL, 65536, PROT_READ, MAP_SHARED, 3, 0xf0000) = -1 EFAULT (Bad address) write(2, "/dev/mem: ", 10/dev/mem: ) = 10 write(2, "mmap: Bad address\n", 18mmap: Bad address ) = 18 brk(0x630000) = 0x630000 exit_group(1) = ? Process 23451 detached I'll try to reproduce the problem with the latest FC6 Test2 kernel.
OK, the latest rawhide and rhel5 kernels both show the better behaviour - only one hypervisor message. To get rid of it completely I guess we could disable mmaping /dev/mem in domU.
I don't think I was clear enough with my test case, sorry. This happens with older dmidecode-2.2. The newer dmidecode-2.7 that you tested seems to be more intelligent and aborts immediately if the smbios structure isn't present. The older/broken 2.2 version scans through the entire smbios memory range, structure or not. The problem is that the vhosts that I'm supporting are based on RHEL3, which shipped with dmidecode-2.2. I agree that disabling /dev/mem access in domU would be a good consideration. At a minimum, just blocking the smbios region would be acceptable to me. If you want to reproduce, just pull down and run: http://download.savannah.gnu.org/releases/dmidecode/dmidecode-2.2.tar.gz I found that if I run this repeatedly and concurrently on multiple vhosts, my box will eventually lose access to it's local storage as well. I suspect that this is an interaction with the 3w-9xxx driver. Also, I'm testing on 32bit, not sure if that makes any difference here or not. Thanks.
Ahhhhhhhh ok. Good point. I'll work on a /dev/mem hack :)
Ahhhhhhhh ok. Good point. Hacking /dev/mem is not the right fix though, because in some cases the guest has access to actual IO devices. Instead, we should limit the printing of the message by the hypervisor, which should happen automagically by not having all the debugging options on for the FC6 and RHEL5 final builds.
So, I'm not crazy about relying on disabling debug to make this go away. This'll make it impossible for us to enable debug on our xen nodes in the future, since these printks are aggressive enough to bring down our xen parent host when a couple of virtual hosts run dmidecode concurrently. I put together a quick-n-dirty patch that traps the /dev/mem access from unprivileged domains earlier on to avoid the hypervisor failure. However, I'm not sure if it solves the guest direct I/O access problem or not. Patch is will be attached...
Created attachment 134173 [details] Linux kernel patch to catch unprivileged access to /dev/mem early
It really should be fixed on the hypervisor side, otherwise a malicious guest could still bring down the host. Rate-limiting the print on the hypervisor side should probably work just fine.
Steven, this is one to verify with your ratelimit patch :) It may already be fixed, but it may also be a different printk you need to add ratelimiting to.
Should be fixed in current rawhide via xen printk ratelimiting. The console output from both fc5 and rawhide guest dmidecode is minimised now.