Bug 435273 - Uninitialised watch structure leading to kernel crash
Keywords:
Status: CLOSED DUPLICATE of bug 465849
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.1
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Assignee: Don Dutile (Red Hat)
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-02-28 14:01 UTC by Ian Campbell
Modified: 2008-11-06 16:55 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-11-06 16:55:56 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch posted for 5.3 inclusion (1023 bytes, patch)
2008-11-05 15:40 UTC, Don Dutile (Red Hat)

Description Ian Campbell 2008-02-28 14:01:25 UTC
There are a couple of places in the -xen kernel where a struct xenbus_watch is
allocated but not all of its fields are initialized.

As a result, ->flags can spuriously contain the XBWF_new_thread flag, so the
watch callback is run in a new thread that nothing expects. A crash can then
result from a race between this unexpected thread and the deregistration and
freeing of the watch structure. This problem is present in the
/proc/xen/xenbus code and in pcifront.

Below is a backtrace from a kdump taken at a customer site where
/proc/xen/xenbus is used to register watches from userspace:
	Call Trace:
	  [c01014a7] hypercall_page+0x4a7 (37: __HYPERVISOR_kexec_op)
	   c024c9a1  machine_kexec+0x21
	   c0147ba6  crash_kexec+0x66
	   c0106adf  die+0x33f
	   c0118eb4  do_page_fault+0x7c4
	   c0105d43  error_code+0x2b
	   c024f47d  xenwatch_handle_callback+0x1d
	   c0139d68  kthread+0xe8
	   c024f460  read_reply+0x100
	   c0139c80  keventd_create_kthread+0x70
	   c0103005  kernel_thread_helper+0x5
and the associated panic message, which isn't as complete as the above backtrace
but does show that it is the xenwatch_cb thread which has died:
	<1>BUG: unable to handle kernel paging request at virtual address 2d343665
	<1> printing eip:
	<4>2d343665
	<1>2bda8000 -> *pde = 00000001:24abf027
	<1>074bf000 -> *pme = 00000000:00000000
	<0>Oops: 0010 [#1]
	<0>SMP 
	<1>last sysfs file: /devices/xen-backend/vbd-30-51952/statistics/wr_sect
	<4>Modules linked in: fuse tun microcode bridge sunrpc ipt_REJECT xt_tcpudp
xt_state ip_conntrack nfnetlink iptable_filter 
ip_tables x_tables binfmt_misc dm_mirror dm_multipath dm_mod video thermal sbs
processor backlight i2c_ec i2c_core fan container 
button battery asus_acpi ac parport_pc lp parport nvram sr_mod cdrom evdev e1000
sg serio_raw bnx2 ata_piix libata usbhid zlib_inflate
 pcspkr serial_core rtc ide_generic ide_disk megaraid_sas sd_mod scsi_mod ext3
jbd ehci_hcd ohci_hcd uhci_hcd usbcore
	<0>CPU:    0
	<4>EIP:    0061:[<2d343665>]    Not tainted VLI
	<4>EFLAGS: 00010292   (2.6.18-53.1.4.el5.xs4.0.96.222.252xen #1) 
	<0>EIP is at 0x2d343665
	<0>eax: ee814508   ebx: c0a38880   ecx: 00000002   edx: c8561cc0
	<0>esi: ee814508   edi: c0a38880   ebp: ebe7bfc4   esp: ebe7bfb8
	<0>ds: 007b   es: 007b   ss: 0069
	<0>Process xenwatch_cb (pid: 26286, ti=ebe7a000 task=eac7a2f0 task.ti=ebe7a000)
	<0>Stack: c024f47d fffffffc c0929f6c ebe7bfe4 c0139d68 c024f460 ffffffff ffffffff 
	<0>       c0139c80 00000000 00000000 00000000 c0103005 c0929f64 00000000 00000000 
	<0>       00000000 00000000 
	<0>Call Trace:
	<0> [<c010647a>] show_trace_log_lvl+0x1a/0x30
	<0> [<c0106541>] show_stack_log_lvl+0xb1/0xe0
	<0> [<c010671a>] show_registers+0x1aa/0x230
	<0> [<c01068ea>] die+0x14a/0x370
	<0> [<c0118eb4>] do_page_fault+0x7c4/0xefd
	<0> [<c0105d43>] error_code+0x2b/0x30
	<0> [<c0139d68>] kthread+0xe8/0xf0
	<0> [<c0103005>] kernel_thread_helper+0x5/0x10
	<0> =======================
	<0>Code:  Bad EIP value.
	<0>EIP: [<2d343665>] 0x2d343665 SS:ESP 0069:ebe7bfb8

The fix is pretty simple: use kzalloc to ensure all fields are initialised to
zero. There are two places where kmalloc was used instead, which are fixed by
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/43de9d7c3c63
http://xenbits.xensource.com/staging/linux-2.6.18-xen.hg?rev/7c04748ed275
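For illustration only, here is a userspace sketch of the buggy and fixed allocation patterns. It is not the kernel code from the changesets above: struct demo_watch, DEMO_NEW_THREAD and the helper functions are hypothetical stand-ins, malloc/calloc stand in for kmalloc/kzalloc, and the memset with 0xa5 merely simulates stale heap contents so that the result is deterministic.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for XBWF_new_thread; the value is an assumption. */
#define DEMO_NEW_THREAD 1UL

/* Simplified stand-in for struct xenbus_watch (field set assumed). */
struct demo_watch {
    const char *node;
    void (*callback)(struct demo_watch *w);
    unsigned long flags;
};

/* Buggy pattern: allocate without zeroing and never assign ->flags. */
struct demo_watch *alloc_watch_buggy(const char *node,
                                     void (*cb)(struct demo_watch *))
{
    struct demo_watch *w = malloc(sizeof(*w));

    if (!w)
        return NULL;
    /* Simulate whatever the previous owner of this memory left behind. */
    memset(w, 0xa5, sizeof(*w));
    w->node = node;
    w->callback = cb;
    /* ->flags is left as it was: stale data. */
    return w;
}

/* Fixed pattern: zeroing allocation, so ->flags starts out as 0. */
struct demo_watch *alloc_watch_fixed(const char *node,
                                     void (*cb)(struct demo_watch *))
{
    struct demo_watch *w = calloc(1, sizeof(*w));

    if (!w)
        return NULL;
    w->node = node;
    w->callback = cb;
    return w;
}

/* Would the callback for this watch be dispatched to its own thread? */
int runs_in_new_thread(const struct demo_watch *w)
{
    return (w->flags & DEMO_NEW_THREAD) != 0;
}
```

With the simulated stale contents, runs_in_new_thread() reports true for the watch from alloc_watch_buggy() and false for the one from alloc_watch_fixed(). In the real kernel the stale value is whatever happened to be in the slab object, so the flag is only set some of the time.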

Both of these are applicable to 2.6.18-53.1.13.el5, and only the former is
applicable to 2.6.9-67.0.4.EL.

FWIW the only place which uses XBWF_new_thread deliberately is setup_cpu_watcher,
which uses a static watch structure and so isn't subject to the race.
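A rough sketch of why a static watch avoids both halves of the problem, again in userspace C with hypothetical names (demo_watch, DEMO_NEW_THREAD, cpu_demo_watch and plain_demo_watch are not taken from the kernel source): an object with static storage duration has every field not named in its initializer set to zero, and it is never freed, so a late-running thread cannot touch released memory.

```c
#include <stddef.h>

/* Hypothetical stand-in for XBWF_new_thread; the value is an assumption. */
#define DEMO_NEW_THREAD 1UL

/* Simplified stand-in for struct xenbus_watch (field set assumed). */
struct demo_watch {
    const char *node;
    void (*callback)(struct demo_watch *w);
    unsigned long flags;
};

/* Placeholder callback; a real one would handle the watch event. */
void demo_cpu_event(struct demo_watch *w)
{
    (void)w;
}

/* Static watch that asks for its own thread on purpose. */
struct demo_watch cpu_demo_watch = {
    .node     = "cpu",
    .callback = demo_cpu_event,
    .flags    = DEMO_NEW_THREAD,
};

/* Static watch with no flags named: C guarantees .flags and .callback are 0. */
struct demo_watch plain_demo_watch = {
    .node = "plain",
};

/* Would the callback for this watch be dispatched to its own thread? */
int wants_new_thread(const struct demo_watch *w)
{
    return (w->flags & DEMO_NEW_THREAD) != 0;
}
```

Here wants_new_thread() is true only for cpu_demo_watch, where the flag was set explicitly, and neither object has a lifetime that can end while a callback thread is still running.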

Comment 1 Don Dutile (Red Hat) 2008-11-05 15:40:11 UTC
Created attachment 322593
Patch posted for 5.3 inclusion

This should be backported to 5.2 z-stream

