There are a couple of places in the -xen kernel where a struct xenbus_watch is allocated but not all fields are initialized. This can result in ->flags spuriously containing the XBWF_new_thread flag. A crash can then result from a race between this unexpected thread and the deregistration and freeing of the watch structure. This problem is present in the /proc/xen/xenbus code and in pcifront.

Below is a trace from a kdump taken at a customer site where /proc/xen/xenbus is used to register watches from userspace.

Call Trace:
[c01014a7] hypercall_page+0x4a7 (37: __HYPERVISOR_kexec_op)
c024c9a1 machine_kexec+0x21
c0147ba6 crash_kexec+0x66
c0106adf die+0x33f
c0118eb4 do_page_fault+0x7c4
c0105d43 error_code+0x2b
c024f47d xenwatch_handle_callback+0x1d
c0139d68 kthread+0xe8
c024f460 read_reply+0x100
c0139c80 keventd_create_kthread+0x70
c0103005 kernel_thread_helper+0x5

The associated panic message follows. It isn't as complete as the above backtrace, but it does show that it is the xenwatch_cb thread which has died:

<1>BUG: unable to handle kernel paging request at virtual address 2d343665
<1> printing eip:
<4>2d343665
<1>2bda8000 -> *pde = 00000001:24abf027
<1>074bf000 -> *pme = 00000000:00000000
<0>Oops: 0010 [#1]
<0>SMP
<1>last sysfs file: /devices/xen-backend/vbd-30-51952/statistics/wr_sect
<4>Modules linked in: fuse tun microcode bridge sunrpc ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables x_tables binfmt_misc dm_mirror dm_multipath dm_mod video thermal sbs processor backlight i2c_ec i2c_core fan container button battery asus_acpi ac parport_pc lp parport nvram sr_mod cdrom evdev e1000 sg serio_raw bnx2 ata_piix libata usbhid zlib_inflate pcspkr serial_core rtc ide_generic ide_disk megaraid_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd usbcore
<0>CPU: 0
<4>EIP: 0061:[<2d343665>] Not tainted VLI
<4>EFLAGS: 00010292 (2.6.18-53.1.4.el5.xs4.0.96.222.252xen #1)
<0>EIP is at 0x2d343665
<0>eax: ee814508 ebx: c0a38880 ecx: 00000002 edx: c8561cc0
<0>esi: ee814508 edi: c0a38880 ebp: ebe7bfc4 esp: ebe7bfb8
<0>ds: 007b es: 007b ss: 0069
<0>Process xenwatch_cb (pid: 26286, ti=ebe7a000 task=eac7a2f0 task.ti=ebe7a000)
<0>Stack: c024f47d fffffffc c0929f6c ebe7bfe4 c0139d68 c024f460 ffffffff ffffffff
<0> c0139c80 00000000 00000000 00000000 c0103005 c0929f64 00000000 00000000
<0> 00000000 00000000
<0>Call Trace:
<0> [<c010647a>] show_trace_log_lvl+0x1a/0x30
<0> [<c0106541>] show_stack_log_lvl+0xb1/0xe0
<0> [<c010671a>] show_registers+0x1aa/0x230
<0> [<c01068ea>] die+0x14a/0x370
<0> [<c0118eb4>] do_page_fault+0x7c4/0xefd
<0> [<c0105d43>] error_code+0x2b/0x30
<0> [<c0139d68>] kthread+0xe8/0xf0
<0> [<c0103005>] kernel_thread_helper+0x5/0x10
<0> =======================
<0>Code: Bad EIP value.
<0>EIP: [<2d343665>] 0x2d343665 SS:ESP 0069:ebe7bfb8

The fix is pretty simple: use kzalloc to ensure all fields are initialised to zero. There are two places where kmalloc was used instead, which are fixed by:

http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/43de9d7c3c63
http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/7c04748ed275

Both of these are applicable to 2.6.18-53.1.13.el5, and only the former is applicable to 2.6.9-67.0.4.EL.

FWIW the only place which uses XBWF_new_thread deliberately is setup_cpu_watcher, which uses a static watch structure and so isn't subject to the race.
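
For illustration only, the kmalloc versus kzalloc difference described above can be modelled in userspace C. This is a rough sketch, not the kernel code. `struct watch_sketch` and the helper names are hypothetical, the flag value of 1 is an assumption about the -xen headers, and the 0xff fill merely stands in for whatever stale data kmalloc happens to return.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value; the real definition lives in the -xen kernel headers. */
#define XBWF_new_thread 1UL

/* Hypothetical, simplified stand-in for struct xenbus_watch. */
struct watch_sketch {
    const char *node;
    void (*callback)(struct watch_sketch *w);
    unsigned long flags;
};

/*
 * Models kmalloc(): the memory is not cleared. The 0xff fill makes the
 * "stale contents" deterministic for this demonstration.
 */
struct watch_sketch *alloc_like_kmalloc(void)
{
    struct watch_sketch *w = malloc(sizeof(*w));
    assert(w != NULL);
    memset(w, 0xff, sizeof(*w));
    return w;
}

/* Models kzalloc(): every field starts out as zero. */
struct watch_sketch *alloc_like_kzalloc(void)
{
    struct watch_sketch *w = calloc(1, sizeof(*w));
    assert(w != NULL);
    return w;
}

/* The problematic pattern: some fields are set, ->flags is never touched. */
void init_some_fields(struct watch_sketch *w, const char *node)
{
    w->node = node;
    w->callback = NULL;
}

/* Non-zero if the watch would be dispatched in its own thread. */
int wants_new_thread(const struct watch_sketch *w)
{
    return (w->flags & XBWF_new_thread) != 0;
}
```

With the kmalloc-like allocator, `wants_new_thread()` reports the flag as set even though no caller asked for it. With the kzalloc-like allocator the flag stays clear, which is the behaviour the fix relies on.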
Created attachment 322593
Patch posted for 5.3 inclusion

This should be backported to 5.2 z-stream.