From: http://lists.xensource.com/archives/html/xen-devel/2011-11/msg00787.html On 11/14/11 11:15, Laszlo Ersek wrote: >> On Fri, Nov 11, 2011 at 05:25:52PM +0000, Niall Fleming wrote: > >>> Booted the physical machine, date tells me it's 17:15:45. Hwclock >>> agrees. >>> Booted a VM (using xl as xm seems to be labouring under the >>> impression that blockdev is missing - it isn't) >>> The login prompt displays the time as 17:17, which is the expected >>> behaviour. >>> Changed the time in dom0 - (date +%T -s 12:00:00), synced to hwclock. >>> Check that date and hwclock match - they do. >>> Destroy the VM and recreate. >>> The login prompt displays the time as 17:22. Unexpected! > > I shut the VM down (2.6.32-131.0.15.el6) before the systime/hwclock > change in dom0 (2.6.18-286.el5xen), and restarted it afterwards. The > guest entered an infinite loop during this boot :( In my test, I moved the clock from 11:28 to 11:05 (ie. backwards) too -- that may have an effect on some boot calibration loop in the guest. The xen-devel thread seems to imply that the hypervisor was not handling dom0 clock changes for a long time.
(In reply to comment #0) > The xen-devel > thread seems to imply that the hypervisor was not handling dom0 clock changes > for a long time. Disregard that, sorry. I misunderstood <http://lists.xensource.com/archives/html/xen-devel/2011-11/msg00848.html>. "3.1" in this case is the kernel's version number, not the hypervisor's. So it should be the responsibility of the dom0 kernel to notify the hv.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=fdb9eb9f155bfc0f8dc2fc88f90448b30c78ad97 perhaps
XENPF_settime seems to be supported by our hypervisor version, but our kernel version never calls it. I'm not sure we want to add a new feature ("new" in the RHEL-5 coordinate system).
The RHEL-5 kernel doesn't have the timer scaffolding on which the above pvops commit rests. IOW I'm not sure *where* in the dom0 kernel we should issue the hypercall.
It may not be the closest match, but I would just add an #ifdef CONFIG_XEN in do_settimeofday and see if it works...
(In reply to comment #5) > It may not be the closest match, but I would just add an #ifdef CONFIG_XEN in > do_settimeofday and see if it works... Thanks for the hint! do_settimeofday() in "arch/i386/kernel/time-xen.c" does call DOM0_SETTIME in dom0 (if the "independent_wallclock" kernel parameter is not set). DOM0_SETTIME is #defined as XENPF_settime in "include/xen/interface/dom0_ops.h". I missed it when I grepped for XENPF_settime. We already have linux-2.6.18-xen.hg changeset 832:c0f2f398aa3c (commit 17192205 / bug 494879). A possibly relevant changesets for "time-xen.c", newer than 832, seems to be <http://xenbits.xensource.com/linux-2.6.18-xen.hg/rev/1021>. The last hunk causes the generic warp_clock() to call, in the end, monotonic_reset(), added by the preceding hunks. I should also take a look at the guest's busy loop.
It seems that the busy-loop in the guest is temporary. I think it's related to the amount that I set the clock back with.
The "hwclock --systohc" host step doesn't appear to be necessary to cause the guest to spin.
Final messages before spin: (early) DMI not present or invalid. (early) e820 update range: 0000000000000000 - 0000000000001000 (early) (usable)(early) ==> (early) (reserved)(early) (early) e820 remove range: 00000000000a0000 - 0000000000100000 (early) (usable)(early) This is the stack during the spin: xen_start_kernel() -> x86_64_start_reservations() -> start_kernel() -> setup_arch() -> prefill_possible_map() -> printk(); message is "SMP: Allowing %d CPUs, %d hotplug CPUs\n" -> vprintk() -> release_console_sem() -> call_console_drivers() -> _call_console_drivers() -> __call_console_drivers() -> xenboot_write_console() -> raw_console_write() -> HYPERVISOR_console_io(CONSOLEIO_write, ... -> write_console() -> HYPERVISOR_sched_op(SCHEDOP_yield, NULL); write_console() goes like this: /* * Make sure the whole buffer is emitted, polling if * necessary. We don't ever want to rely on the hvc daemon * because the most interesting console output is when the * kernel is crippled. */ while (len) { int sent = __write_console(data, len); data += sent; len -= sent; if (unlikely(len)) HYPERVISOR_sched_op(SCHEDOP_yield, NULL); }
It reproduces on a RHEL-5 (-299), PV domU as well. I tried once with 4 VCPUs, and twice with 1 VCPU. Stacks: 4 VCPU case: PID: 1 TASK: ffff8800012467a0 CPU: 3 COMMAND: "swapper" #0 [ffff880002501a80] schedule at ffffffff80261d89 #1 [ffff880002501a88] thread_return at ffffffff80261e5c #2 [ffff880002501b08] __wake_up_common at ffffffff8028828c #3 [ffff880002501b58] _spin_lock_irq at ffffffff8026393d #4 [ffff880002501b68] wait_for_completion at ffffffff80261ffb #5 [ffff880002501bb8] call_usermodehelper_keys at ffffffff8029a2a6 #6 [ffff880002501cb8] kcons_write at ffffffff803bc220 #7 [ffff880002501ce8] __call_console_drivers at ffffffff8028dfc1 #8 [ffff880002501d08] release_console_sem at ffffffff80217b39 #9 [ffff880002501d38] vprintk at ffffffff8028e7c1 #10 [ffff880002501db8] printk at ffffffff8028e886 #11 [ffff880002501ea8] init_net_drop_monitor at ffffffff8077c553 #12 [ffff880002501ec8] init at ffffffff8075e7ed #13 [ffff880002501f48] kernel_thread at ffffffff8025fb2c 1 VCPU cases: PID: 227 TASK: ffff88003f1fb080 CPU: 0 COMMAND: "insmod" #0 [ffff88003f5a7ce0] xencons_ring_send at ffffffff803bc92c #1 [ffff88003f5a7d30] __xencons_tx_flush at ffffffff803bbfc8 #2 [ffff88003f5a7d50] kcons_write at ffffffff803bc220 #3 [ffff88003f5a7d80] __call_console_drivers at ffffffff8028dfc1 #4 [ffff88003f5a7da0] release_console_sem at ffffffff80217b39 #5 [ffff88003f5a7dd0] vprintk at ffffffff8028e7c1 #6 [ffff88003f5a7e50] printk at ffffffff8028e886 #7 [ffff88003f5a7f50] sys_init_module at ffffffff802a43ee #8 [ffff88003f5a7f80] system_call at ffffffff8025f106 RIP: 00000000004105fa RSP: 00007fff210bc288 RFLAGS: 00000202 RAX: 00000000000000af RBX: ffffffff8025f106 RCX: ffffffff8025f080 RDX: 0000000002c015c0 RSI: 0000000000010168 RDI: 0000000002c015e0 RBP: 0000000000010168 R8: 0000000000020010 R9: 0000000000000003 R10: 000000000040f180 R11: 0000000000000202 R12: 00007fff210bcf9d R13: 0000000000000003 R14: 0000000002c015e0 R15: ffff88003f231780 ORIG_RAX: 00000000000000af CS: e033 SS: e02b PID: 9 TASK: ffff88003fb427a0 CPU: 0 COMMAND: "xenwatch" #0 [ffff88003fb55b88] xencons_ring_send at ffffffff803bc92c #1 [ffff88003fb55bd8] __xencons_tx_flush at ffffffff803bbfc8 #2 [ffff88003fb55bf8] kcons_write at ffffffff803bc220 #3 [ffff88003fb55c28] __call_console_drivers at ffffffff8028dfc1 #4 [ffff88003fb55c48] release_console_sem at ffffffff80217b39 #5 [ffff88003fb55c78] vprintk at ffffffff8028e7c1 #6 [ffff88003fb55cf8] printk at ffffffff8028e886 #7 [ffff88003fb55de8] xlvbd_add at ffffffff8807630b #8 [ffff88003fb55e48] backend_changed at ffffffff88075efa #9 [ffff88003fb55e98] xenwatch_handle_callback at ffffffff803be418 #10 [ffff88003fb55ea8] xenwatch_thread at ffffffff803bf10d #11 [ffff88003fb55ee8] kthread at ffffffff802338c3 #12 [ffff88003fb55f48] kernel_thread at ffffffff8025fb2c The common part in all three cases: kcons_write at ffffffff803bc220 drivers/xen/console/console.c:159 __call_console_drivers at ffffffff8028dfc1 kernel/printk.c:389 release_console_sem at ffffffff80217b39 kernel/printk.c:906 vprintk at ffffffff8028e7c1 kernel/printk.c:430 kernel/printk.c:906 printk at ffffffff8028e886 kernel/printk.c:561 Common between the 1 VCPU cases: __xencons_ring_send at ffffffff803bc92c include/asm/mach-xen/asm/hypercall.h:264 include/xen/evtchn.h:112 drivers/xen/console/xencons_ring.c:61 drivers/xen/console/xencons_ring.c:81 __xencons_tx_flush at ffffffff803bbfc8 drivers/xen/console/console.c:394
Writing to the console in a PV domU actually kicks the "xenconsoled" daemon process in dom0. Changing the wall clock time in dom0 could definitely affect a process in dom0. It appears that flow control from the domU to xenconsoled is implemented with busy waits. Assuming that xenconsoled hangs temporarily, that could explain the domU spinning.
handle_io() in tools/console/daemon/io.c employs rate limiting based on gettimeofday(). /* Check if over allowance, and refuse to process events * from this guest for next time period */ /* If any domain has been rate limited, we need to work out what timeout to supply to select */ if (next_timeout) { long long duration = (next_timeout - now); tv.tv_sec = duration / 1000; tv.tv_usec = (duration - (tv.tv_sec * 1000)) * 1000; } ret = select(max_fd + 1, &readfds, &writefds, 0, next_timeout ? &tv : NULL); I'll reproduce the spinning and look into the xenconsoled process. It should be blocked in the select() call, after time moves backwards and the domain appears rate limited.
http://xenbits.xensource.com/hg/xen-unstable.hg/rev/6cd615adaac7
Confirmed, I straced xenconsoled: select(7, [6], [], NULL, NULL) = 1 (in [6]) select(7, [6], [], NULL, NULL) = 1 (in [6]) select(7, [6], [], NULL, NULL) = 1 (in [6]) select(7, [6], [], NULL, NULL) = 1 (in [6]) select(10, [6 8 9], [], NULL, NULL) = 2 (in [6 8]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [6]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [6]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [6]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9]) select(10, [6 8 9], [], NULL, NULL) = 1 (in [8]) select(10, [6 9], [], NULL, {723, 639000} The last arg is "timeout".
Created attachment 539243 [details] xenconsoled: replace gettimeofday with clock_gettime backport of 18168:6cd615adaac7 (comment 15)
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
Fix built into xen-3.0.3-139.el5
Steps to reproduce: - (kernel-xen version is not important, either in dom0 or domU; xen userspace version in dom0 is important) - boot dom0, - set the dom0 system time back 20 minutes (ie. into the past) using the "date" utility - start a new PV guest
- guest spins with unpatched xen package, progresses normally otherwise
(In reply to comment #25) > - guest spins with unpatched xen package, progresses normally otherwise Thanks, follow your steps, i reproduced with xen-3.0.3-132.el5 on dom0, rhel5.9 pv domU
(In reply to comment #24) > Steps to reproduce: > - (kernel-xen version is not important, either in dom0 or domU; xen userspace > version in dom0 is important) > - boot dom0, > - set the dom0 system time back 20 minutes (ie. into the past) using the > "date" > utility > - start a new PV guest Verified on xen-3.0.3-141.el5 Note: if you try to verify this problem by update xen rpm package, a restart of xend service is not enough, i verified by reboot dom0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0119.html