Bug 435273 - Uninitialised watch structure leading to kernel crash
Status: CLOSED DUPLICATE of bug 465849
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Don Dutile
QA Contact: Martin Jenner
Reported: 2008-02-28 09:01 EST by Ian Campbell
Modified: 2008-11-06 11:55 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-11-06 11:55:56 EST

Attachments
Patch posted for 5.3 inclusion (1023 bytes, patch)
2008-11-05 10:40 EST, Don Dutile

Description Ian Campbell 2008-02-28 09:01:25 EST
There are a couple of places in the -xen kernel where a struct xenbus_watch is
allocated but not all fields are initialised.

This can result in ->flags spuriously containing the XBWF_new_thread flag. A
crash can result due to a race between this unexpected thread and deregistration
and freeing of the watch structure. This problem is present in the
/proc/xen/xenbus code and in pcifront.
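
A minimal userspace sketch of the failure mode described above. The names
demo_watch, DEMO_WF_NEW_THREAD and demo_alloc_stale are hypothetical stand-ins,
not the kernel's definitions, and the stale heap contents are simulated with
memset rather than taken from a real kmalloc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for XBWF_new_thread; the value is an assumption. */
#define DEMO_WF_NEW_THREAD 1UL

/* Hypothetical stand-in for struct xenbus_watch (fields are illustrative). */
struct demo_watch {
	const char *node;
	void (*callback)(struct demo_watch *w);
	unsigned long flags;
};

/*
 * Simulates a non-zeroing allocator: the returned memory is filled with
 * whatever byte pattern "stale" represents, as recycled heap memory might be.
 */
static void *demo_alloc_stale(size_t size, unsigned char stale)
{
	void *p = malloc(size);

	if (p)
		memset(p, stale, size);
	return p;
}

/* Buggy pattern: only some fields are assigned, flags keeps stale bits. */
static struct demo_watch *demo_watch_new_buggy(const char *node,
					       unsigned char stale)
{
	struct demo_watch *w = demo_alloc_stale(sizeof(*w), stale);

	if (!w)
		return NULL;
	w->node = node;
	w->callback = NULL;
	/* w->flags is never assigned here. */
	return w;
}

/* The dispatch decision that goes wrong when flags holds stale bits. */
static int demo_wants_new_thread(const struct demo_watch *w)
{
	return (w->flags & DEMO_WF_NEW_THREAD) != 0;
}
```

With a stale pattern of 0xFF, demo_wants_new_thread() reports that the watch
wants its own thread even though the caller never asked for one. With a
pattern of 0x00 it does not, which is why the problem only shows up
intermittently.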

Below is a backtrace from a kdump taken at a customer site where
/proc/xen/xenbus is used to register watches from userspace:
	Call Trace:
	  [c01014a7] hypercall_page+0x4a7 (37: __HYPERVISOR_kexec_op)
	   c024c9a1  machine_kexec+0x21
	   c0147ba6  crash_kexec+0x66
	   c0106adf  die+0x33f
	   c0118eb4  do_page_fault+0x7c4
	   c0105d43  error_code+0x2b
	   c024f47d  xenwatch_handle_callback+0x1d
	   c0139d68  kthread+0xe8
	   c024f460  read_reply+0x100
	   c0139c80  keventd_create_kthread+0x70
	   c0103005  kernel_thread_helper+0x5
and the associated panic message, which isn't as complete as the above backtrace
but does show that it is the xenwatch_cb thread which has died:
	<1>BUG: unable to handle kernel paging request at virtual address 2d343665
	<1> printing eip:
	<1>2bda8000 -> *pde = 00000001:24abf027
	<1>074bf000 -> *pme = 00000000:00000000
	<0>Oops: 0010 [#1]
	<1>last sysfs file: /devices/xen-backend/vbd-30-51952/statistics/wr_sect
	<4>Modules linked in: fuse tun microcode bridge sunrpc ipt_REJECT xt_tcpudp
xt_state ip_conntrack nfnetlink iptable_filter 
ip_tables x_tables binfmt_misc dm_mirror dm_multipath dm_mod video thermal sbs
processor backlight i2c_ec i2c_core fan container 
button battery asus_acpi ac parport_pc lp parport nvram sr_mod cdrom evdev e1000
sg serio_raw bnx2 ata_piix libata usbhid zlib_inflate
 pcspkr serial_core rtc ide_generic ide_disk megaraid_sas sd_mod scsi_mod ext3
jbd ehci_hcd ohci_hcd uhci_hcd usbcore
	<0>CPU:    0
	<4>EIP:    0061:[<2d343665>]    Not tainted VLI
	<4>EFLAGS: 00010292   (2.6.18-53.1.4.el5.xs4. #1) 
	<0>EIP is at 0x2d343665
	<0>eax: ee814508   ebx: c0a38880   ecx: 00000002   edx: c8561cc0
	<0>esi: ee814508   edi: c0a38880   ebp: ebe7bfc4   esp: ebe7bfb8
	<0>ds: 007b   es: 007b   ss: 0069
	<0>Process xenwatch_cb (pid: 26286, ti=ebe7a000 task=eac7a2f0 task.ti=ebe7a000)
	<0>Stack: c024f47d fffffffc c0929f6c ebe7bfe4 c0139d68 c024f460 ffffffff ffffffff 
	<0>       c0139c80 00000000 00000000 00000000 c0103005 c0929f64 00000000 00000000 
	<0>       00000000 00000000 
	<0>Call Trace:
	<0> [<c010647a>] show_trace_log_lvl+0x1a/0x30
	<0> [<c0106541>] show_stack_log_lvl+0xb1/0xe0
	<0> [<c010671a>] show_registers+0x1aa/0x230
	<0> [<c01068ea>] die+0x14a/0x370
	<0> [<c0118eb4>] do_page_fault+0x7c4/0xefd
	<0> [<c0105d43>] error_code+0x2b/0x30
	<0> [<c0139d68>] kthread+0xe8/0xf0
	<0> [<c0103005>] kernel_thread_helper+0x5/0x10
	<0> =======================
	<0>Code:  Bad EIP value.
	<0>EIP: [<2d343665>] 0x2d343665 SS:ESP 0069:ebe7bfb8

The fix is pretty simple: use kzalloc to ensure all fields are initialised to
zero. There are two places where kmalloc was used instead, which are fixed by

Both of these are applicable to 2.6.18-53.1.13.el5, and only the former is
applicable to 2.6.9-67.0.4.EL.

FWIW, the only place which uses XBWF_new_thread deliberately is
setup_cpu_watcher, which uses a static watch structure and so isn't subject to
the race.
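
A rough userspace sketch of the two safe patterns mentioned above, with calloc
standing in for kzalloc. All names here (sketch_watch, SKETCH_WF_NEW_THREAD,
sketch_watch_new, sketch_static_watch) are hypothetical and only illustrate the
idea; this is not the posted patch:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for XBWF_new_thread; the value is an assumption. */
#define SKETCH_WF_NEW_THREAD 1UL

/* Hypothetical stand-in for struct xenbus_watch (fields are illustrative). */
struct sketch_watch {
	const char *node;
	void (*callback)(struct sketch_watch *w);
	unsigned long flags;
};

/*
 * Fixed pattern: a zeroing allocation (calloc here, kzalloc in the kernel)
 * guarantees that every field not set explicitly, including flags, is 0.
 */
static struct sketch_watch *sketch_watch_new(const char *node,
					     void (*cb)(struct sketch_watch *))
{
	struct sketch_watch *w = calloc(1, sizeof(*w));

	if (!w)
		return NULL;
	w->node = node;
	w->callback = cb;
	return w;
}

/*
 * Static pattern: a statically allocated watch that sets the flag on
 * purpose. It is never freed, so a callback thread cannot outlive it.
 */
static struct sketch_watch sketch_static_watch = {
	.node = "cpu",
	.flags = SKETCH_WF_NEW_THREAD,
};

static int sketch_wants_new_thread(const struct sketch_watch *w)
{
	return (w->flags & SKETCH_WF_NEW_THREAD) != 0;
}
```

A watch from sketch_watch_new() never requests a new thread unless the caller
sets the flag explicitly, while sketch_static_watch requests one by design.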
Comment 1 Don Dutile 2008-11-05 10:40:11 EST
Created attachment 322593 [details]
Patch posted for 5.3 inclusion

This should be backported to the 5.2 z-stream.
