If a timer interrupt is received between time_init() and init_workqueues(), and HYPERVISOR_shared_info->wc_version ticks over and causes clock_was_set() to be called from timer_interrupt(), the kernel can oops because clock_was_set() will attempt to defer work to keventd_wq, which has not yet been initialised.

Oops:

Checking if this processor honours the WP bit even in supervisor mode... Ok.
 printing eip:
c012a724
02669000 -> *pde = 00000000:00000000
Oops: 0000 [#1]
SMP
Modules linked in:
CPU:    0
EIP:    0061:[<c012a724>]    Not tainted VLI
EFLAGS: 00010046   (2.6.9-55.ELxenU)
EIP is at queue_work+0x21/0x53
eax: 00001004   ebx: 00000000   ecx: 00000000   edx: c029b600
esi: 00000000   edi: 00000000   ebp: c0338f6c   esp: c0338f58
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c0338000 task=c0297a40)
Stack: 006fae09 00000000 c012dff0 00000000 006fae09 c0338f6c c0338f6c 006fae09
       00000000 00000000 00000000 c010dbd0 00cf1f60 00000000 00000000 00000000
       c10000c0 c1000080 00000000 00000000 45cf494b 0000c74f 00000000 00000000
Call Trace:
 [<c012dff0>] clock_was_set+0x2d/0x180
 [<c010dbd0>] timer_interrupt+0x26e/0x3f5
 [<c01094aa>] handle_IRQ_event+0x44/0x85
 [<c0109a38>] do_IRQ+0x122/0x1b5
 =======================
 [<c01f7150>] evtchn_do_upcall+0x84/0xb8
 [<c0107538>] hypervisor_callback+0x2c/0x34
 [<c01020e6>] calibrate_delay+0xe3/0x1a8
 [<c02f36f3>] start_kernel+0x14f/0x1b6
Code: 89 fa 5b 5e 5f e9 73 ea 13 00 56 31 f6 53 89 c3 b8 00 f0 ff ff 21 e0 8b 48 10 f0 0f ba 2a 00 19 c0 85 c0 75 33 8d 83 04 10 00 00 <39> 83 04 10 00 00 8d 42 04 0f 44 ce 39 42 04 74 08 0f 0b 68 00
 <0>Kernel panic - not syncing: Fatal exception in interrupt

It happens quite rarely, but it can be triggered quite reliably by adding a very large (multiple tens of seconds) delay after or during calibrate_delay() to widen the window of opportunity.
The fix is in:

http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/cb040341e05a
http://xenbits.xensource.com/kernels/rhel4x.hg?rev/4c6e7201cfb7

The indirection via the workqueue is not strictly necessary in 2.6.9 since clock_was_set() will do this itself. It is there because we wanted the fix to apply to a wide variety of kernels.

Thanks,
Ian Campbell, XenSource.
In RHEL4 things look like this:

    void clock_was_set(void)
    {
            struct k_itimer *timr;
            struct timespec new_wall_to;
            LIST_HEAD(cws_list);
            unsigned long seq;

            if (unlikely(in_interrupt())) {
                    schedule_work(&clock_was_set_work);
                    return;
            }

This seems to indicate that your patch will result in the kernel doing exactly what it is doing today, but maybe a little more efficiently. Your patch also makes a possible (though highly unlikely in RHEL4) future change to clock_was_set() safe.

However, I do not understand how your patch fixes the bug. If schedule_work is called before keventd_wq is initialized, surely it does not matter whether schedule_work is called from clock_was_set or directly from timer_interrupt? What exactly is going on here?
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
The important bit of the patch is the addition of the "if (keventd_up())" check, which prevents clock_was_set() (or in this case schedule_work() directly) from being called before keventd_wq is set.

The workqueue added by the patch is indeed unnecessary for the 2.6.9 kernel since clock_was_set() does the same thing itself. It's just there so the patch is useful on a variety of kernels (e.g. in 2.6.21 clock_was_set() cannot be called from interrupt context and does not defer itself).

If you wanted, you could simplify to:

    if (keventd_up())
            clock_was_set();
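To make the ordering problem concrete, here is a minimal user-space model of the guard, not the actual patch. The names keventd_wq, keventd_up(), init_workqueues() and schedule_work() are borrowed from the kernel, but the bodies below are simplified stand-ins. The struct workqueue type, the timer_interrupt_wc_changed() function and the skipped_updates counter are hypothetical and exist only for this illustration. The model also assumes that a notification arriving before the workqueue exists is simply skipped.

```c
/* User-space model of the boot-time race. This is NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's workqueue structure. */
struct workqueue {
    int pending;                    /* number of queued work items */
};

static struct workqueue wq_storage;
static struct workqueue *keventd_wq = NULL; /* NULL until init_workqueues() */
static int skipped_updates = 0;             /* hypothetical, demo only */

/* Same idea as the kernel helper: has keventd_wq been set up yet? */
static bool keventd_up(void)
{
    return keventd_wq != NULL;
}

/* Stand-in for the later boot step that creates the workqueue. */
static void init_workqueues(void)
{
    wq_storage.pending = 0;
    keventd_wq = &wq_storage;
}

/* Stand-in for schedule_work(). The assert plays the role of the oops
 * in queue_work(): using the queue before it exists is fatal. */
static void schedule_work(void)
{
    assert(keventd_wq != NULL);
    keventd_wq->pending++;
}

/* Models the timer interrupt path when wc_version has changed.
 * The keventd_up() check is the important part of the fix. */
static void timer_interrupt_wc_changed(void)
{
    if (keventd_up())
        schedule_work();
    else
        skipped_updates++;          /* too early in boot: do not defer */
}
```

In this model, calling timer_interrupt_wc_changed() before init_workqueues() only bumps skipped_updates, whereas calling schedule_work() directly at that point would trip the assert, which corresponds to the oops in the report.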
Good point, doh! Thanks for this patch Ian, I'll try to get it folded into the RHEL 4.6 tree ASAP.
I have posted the patch on our internal kernel mailing list.
change QA contact
committed in stream U6 build 55.22. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0791.html