Bug 753796
| Summary: | xenconsoled is unprotected against time skew | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Laszlo Ersek <lersek> | ||||
| Component: | xen | Assignee: | Laszlo Ersek <lersek> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 5.8 | CC: | drjones, imammedo, leiwang, mrezanin, pbonzini, qguan, qwan, wshi, xen-maint | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | xen-3.0.3-139.el5 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 818049 (view as bug list) | Environment: | |||||
| Last Closed: | 2013-01-08 04:07:20 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 807971, 818049 | ||||||
| Attachments: |
|
||||||
|
Description
Laszlo Ersek
2011-11-14 14:18:44 UTC
(In reply to comment #0) > The xen-devel > thread seems to imply that the hypervisor was not handling dom0 clock changes > for a long time. Disregard that, sorry. I misunderstood <http://lists.xensource.com/archives/html/xen-devel/2011-11/msg00848.html>. "3.1" in this case is the kernel's version number, not the hypervisor's. So it should be the responsibility of the dom0 kernel to notify the hv. XENPF_settime seems to be supported by our hypervisor version, but our kernel version never calls it. I'm not sure we want to add a new feature ("new" in the RHEL-5 coordinate system).
The RHEL-5 kernel doesn't have the timer scaffolding on which the above pvops commit rests. IOW I'm not sure *where* in the dom0 kernel we should issue the hypercall. It may not be the closest match, but I would just add an #ifdef CONFIG_XEN in do_settimeofday and see if it works... (In reply to comment #5) > It may not be the closest match, but I would just add an #ifdef CONFIG_XEN in > do_settimeofday and see if it works... Thanks for the hint! do_settimeofday() in "arch/i386/kernel/time-xen.c" does call DOM0_SETTIME in dom0 (if the "independent_wallclock" kernel parameter is not set). DOM0_SETTIME is #defined as XENPF_settime in "include/xen/interface/dom0_ops.h". I missed it when I grepped for XENPF_settime. We already have linux-2.6.18-xen.hg changeset 832:c0f2f398aa3c (commit 17192205 / bug 494879). A possibly relevant changesets for "time-xen.c", newer than 832, seems to be <http://xenbits.xensource.com/linux-2.6.18-xen.hg/rev/1021>. The last hunk causes the generic warp_clock() to call, in the end, monotonic_reset(), added by the preceding hunks. I should also take a look at the guest's busy loop. It seems that the busy-loop in the guest is temporary. I think it's related to the amount that I set the clock back with. The "hwclock --systohc" host step doesn't appear to be necessary to cause the guest to spin. Final messages before spin:
(early) DMI not present or invalid.
(early) e820 update range: 0000000000000000 - 0000000000001000 (early) (usable)(early) ==> (early) (reserved)(early)
(early) e820 remove range: 00000000000a0000 - 0000000000100000 (early) (usable)(early)
This is the stack during the spin:
xen_start_kernel()
-> x86_64_start_reservations()
-> start_kernel()
-> setup_arch()
-> prefill_possible_map()
-> printk(); message is "SMP: Allowing %d CPUs, %d hotplug CPUs\n"
-> vprintk()
-> release_console_sem()
-> call_console_drivers()
-> _call_console_drivers()
-> __call_console_drivers()
-> xenboot_write_console()
-> raw_console_write()
-> HYPERVISOR_console_io(CONSOLEIO_write, ...
-> write_console()
-> HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
write_console() goes like this:
/*
* Make sure the whole buffer is emitted, polling if
* necessary. We don't ever want to rely on the hvc daemon
* because the most interesting console output is when the
* kernel is crippled.
*/
while (len) {
int sent = __write_console(data, len);
data += sent;
len -= sent;
if (unlikely(len))
HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
}
It reproduces on a RHEL-5 (-299), PV domU as well. I tried once with 4 VCPUs, and twice with 1 VCPU. Stacks:
4 VCPU case:
PID: 1 TASK: ffff8800012467a0 CPU: 3 COMMAND: "swapper"
#0 [ffff880002501a80] schedule at ffffffff80261d89
#1 [ffff880002501a88] thread_return at ffffffff80261e5c
#2 [ffff880002501b08] __wake_up_common at ffffffff8028828c
#3 [ffff880002501b58] _spin_lock_irq at ffffffff8026393d
#4 [ffff880002501b68] wait_for_completion at ffffffff80261ffb
#5 [ffff880002501bb8] call_usermodehelper_keys at ffffffff8029a2a6
#6 [ffff880002501cb8] kcons_write at ffffffff803bc220
#7 [ffff880002501ce8] __call_console_drivers at ffffffff8028dfc1
#8 [ffff880002501d08] release_console_sem at ffffffff80217b39
#9 [ffff880002501d38] vprintk at ffffffff8028e7c1
#10 [ffff880002501db8] printk at ffffffff8028e886
#11 [ffff880002501ea8] init_net_drop_monitor at ffffffff8077c553
#12 [ffff880002501ec8] init at ffffffff8075e7ed
#13 [ffff880002501f48] kernel_thread at ffffffff8025fb2c
1 VCPU cases:
PID: 227 TASK: ffff88003f1fb080 CPU: 0 COMMAND: "insmod"
#0 [ffff88003f5a7ce0] xencons_ring_send at ffffffff803bc92c
#1 [ffff88003f5a7d30] __xencons_tx_flush at ffffffff803bbfc8
#2 [ffff88003f5a7d50] kcons_write at ffffffff803bc220
#3 [ffff88003f5a7d80] __call_console_drivers at ffffffff8028dfc1
#4 [ffff88003f5a7da0] release_console_sem at ffffffff80217b39
#5 [ffff88003f5a7dd0] vprintk at ffffffff8028e7c1
#6 [ffff88003f5a7e50] printk at ffffffff8028e886
#7 [ffff88003f5a7f50] sys_init_module at ffffffff802a43ee
#8 [ffff88003f5a7f80] system_call at ffffffff8025f106
RIP: 00000000004105fa RSP: 00007fff210bc288 RFLAGS: 00000202
RAX: 00000000000000af RBX: ffffffff8025f106 RCX: ffffffff8025f080
RDX: 0000000002c015c0 RSI: 0000000000010168 RDI: 0000000002c015e0
RBP: 0000000000010168 R8: 0000000000020010 R9: 0000000000000003
R10: 000000000040f180 R11: 0000000000000202 R12: 00007fff210bcf9d
R13: 0000000000000003 R14: 0000000002c015e0 R15: ffff88003f231780
ORIG_RAX: 00000000000000af CS: e033 SS: e02b
PID: 9 TASK: ffff88003fb427a0 CPU: 0 COMMAND: "xenwatch"
#0 [ffff88003fb55b88] xencons_ring_send at ffffffff803bc92c
#1 [ffff88003fb55bd8] __xencons_tx_flush at ffffffff803bbfc8
#2 [ffff88003fb55bf8] kcons_write at ffffffff803bc220
#3 [ffff88003fb55c28] __call_console_drivers at ffffffff8028dfc1
#4 [ffff88003fb55c48] release_console_sem at ffffffff80217b39
#5 [ffff88003fb55c78] vprintk at ffffffff8028e7c1
#6 [ffff88003fb55cf8] printk at ffffffff8028e886
#7 [ffff88003fb55de8] xlvbd_add at ffffffff8807630b
#8 [ffff88003fb55e48] backend_changed at ffffffff88075efa
#9 [ffff88003fb55e98] xenwatch_handle_callback at ffffffff803be418
#10 [ffff88003fb55ea8] xenwatch_thread at ffffffff803bf10d
#11 [ffff88003fb55ee8] kthread at ffffffff802338c3
#12 [ffff88003fb55f48] kernel_thread at ffffffff8025fb2c
The common part in all three cases:
kcons_write at ffffffff803bc220 drivers/xen/console/console.c:159
__call_console_drivers at ffffffff8028dfc1 kernel/printk.c:389
release_console_sem at ffffffff80217b39 kernel/printk.c:906
vprintk at ffffffff8028e7c1 kernel/printk.c:430
kernel/printk.c:906
printk at ffffffff8028e886 kernel/printk.c:561
Common between the 1 VCPU cases:
__xencons_ring_send at ffffffff803bc92c include/asm/mach-xen/asm/hypercall.h:264
include/xen/evtchn.h:112
drivers/xen/console/xencons_ring.c:61
drivers/xen/console/xencons_ring.c:81
__xencons_tx_flush at ffffffff803bbfc8 drivers/xen/console/console.c:394
Writing to the console in a PV domU actually kicks the "xenconsoled" daemon process in dom0. Changing the wall clock time in dom0 could definitely affect a process in dom0. It appears that flow control from the domU to xenconsoled is implemented with busy waits. Assuming that xenconsoled hangs temporarily, that could explain the domU spinning. handle_io() in tools/console/daemon/io.c employs rate limiting based on gettimeofday().
/* Check if over allowance, and refuse to process events
* from this guest for next time period
*/
/* If any domain has been rate limited, we need to work
out what timeout to supply to select */
if (next_timeout) {
long long duration = (next_timeout - now);
tv.tv_sec = duration / 1000;
tv.tv_usec = (duration - (tv.tv_sec * 1000)) * 1000;
}
ret = select(max_fd + 1, &readfds, &writefds, 0, next_timeout ? &tv : NULL);
I'll reproduce the spinning and look into the xenconsoled process. It should be blocked in the select() call, after time moves backwards and the domain appears rate limited.
Confirmed, I straced xenconsoled:
select(7, [6], [], NULL, NULL) = 1 (in [6])
select(7, [6], [], NULL, NULL) = 1 (in [6])
select(7, [6], [], NULL, NULL) = 1 (in [6])
select(7, [6], [], NULL, NULL) = 1 (in [6])
select(10, [6 8 9], [], NULL, NULL) = 2 (in [6 8])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [6])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [6])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [6])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 1 (out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 8 9], [9], NULL, NULL) = 2 (in [8], out [9])
select(10, [6 8 9], [], NULL, NULL) = 1 (in [8])
select(10, [6 9], [], NULL, {723, 639000}
The last arg is "timeout".
Created attachment 539243 [details] xenconsoled: replace gettimeofday with clock_gettime backport of 18168:6cd615adaac7 (comment 15) This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release. Fix built into xen-3.0.3-139.el5 Steps to reproduce: - (kernel-xen version is not important, either in dom0 or domU; xen userspace version in dom0 is important) - boot dom0, - set the dom0 system time back 20 minutes (ie. into the past) using the "date" utility - start a new PV guest - guest spins with unpatched xen package, progresses normally otherwise (In reply to comment #25) > - guest spins with unpatched xen package, progresses normally otherwise Thanks, follow your steps, i reproduced with xen-3.0.3-132.el5 on dom0, rhel5.9 pv domU (In reply to comment #24) > Steps to reproduce: > - (kernel-xen version is not important, either in dom0 or domU; xen userspace > version in dom0 is important) > - boot dom0, > - set the dom0 system time back 20 minutes (ie. into the past) using the > "date" > utility > - start a new PV guest Verified on xen-3.0.3-141.el5 Note: if you try to verify this problem by update xen rpm package, a restart of xend service is not enough, i verified by reboot dom0 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0119.html |