Bug 455025

Summary: [RHEL5.3][Kernel] kernel BUG at kernel/utrace.c:345!
Product: Red Hat Enterprise Linux 5
Reporter: Jeff Burke <jburke>
Component: kernel
Assignee: Anton Arapov <anton>
Status: CLOSED DUPLICATE
QA Contact: Martin Jenner <mjenner>
Severity: low
Priority: low
Version: 5.3
CC: arozansk, duck, dzickus, jmarchan, nobody, onestero, roland, vmayatsk
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
URL: http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=3587278
Doc Type: Bug Fix
Last Closed: 2008-11-05 12:53:25 UTC

Description Jeff Burke 2008-07-11 14:48:08 UTC
Description of problem:
 While running the kernel tier-one tests, scrashme triggered a kernel BUG at
 kernel/utrace.c:345.

Version-Release number of selected component (if applicable):
 2.6.18-96.el5 debug

How reproducible:
 Very intermittent

Steps to Reproduce:
1. Run scrashme test
  
Actual results:
------------[ cut here ]------------
kernel BUG at kernel/utrace.c:345!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /devices/pci0000:00/0000:00:10.0/0000:06:02.1/irq
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo
crypto_api cpufreq_ondemand dm_multipath video sbs backlight i2c_ec button
battery asus_acpi ac parport_pc lp parport ide_cd sr_mod k8_edac i2c_nforce2
edac_mc cdrom serio_raw k8temp e1000 i2c_core hwmon sg pcspkr dm_snapshot
dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase scsi_transport_sas
sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    2
EIP:    0060:[<c045735d>]    Not tainted VLI
EFLAGS: 00010202   (2.6.18-96.el5debug #1) 
EIP is at check_dead_utrace+0x123/0x14d
eax: 00000020   ebx: f083e794   ecx: c042fc59   edx: f1afb000
esi: de4a31e0   edi: 00000000   ebp: f083e78c   esp: f1afbf44
ds: 007b   es: 007b   ss: 0068
Process scrashme (pid: 18353, ti=f1afb000 task=de4a31e0 task.ti=f1afb000)
Stack: 00000020 f083e794 00000000 f083e794 f083e78c c04573e1 00000000 de4a31e0 
       de4a31e0 f083e78c de4a31e0 f1afbfa4 c0457415 00000000 00000010 c042910e 
       f7c8eca0 de4a3294 00000000 00000002 00000000 f7167cf8 00000000 bfbd92b8 
Call Trace:
 [<c04573e1>] remove_detached+0x5a/0x6d
 [<c0457415>] finish_report_death+0x21/0x24
 [<c042910e>] do_exit+0x733/0x7b8
 [<c0429209>] sys_exit_group+0x0/0xd
 [<c0404f7b>] syscall_call+0x7/0xb
 =======================
Code: 98 00 00 00 89 f0 e8 c7 87 fd ff 83 be 98 00 00 00 ff 75 1f b8 20 00 00 00
87 86 90 00 00 00 83 f8 10 c7 04 24 20 00 00 00 74 08 <0f> 0b 59 01 5a db 63 c0
b8 00 4a 73 c0 e8 c2 f8 1b 00 83 3c 24 
EIP: [<c045735d>] check_dead_utrace+0x123/0x14d SS:ESP 0068:f1afbf44

Expected results:
The test should pass without triggering a kernel BUG.

Additional info:
The kexec tools were enabled during this test run. There is a vmcore file
available for debugging.

Comment 3 Anton Arapov 2008-10-15 10:56:07 UTC
The vmcore is not available, and I have had no luck reproducing the bug.
I am stuck.

Comment 4 Oleg Nesterov 2008-10-24 13:46:28 UTC
Perhaps this relates to BUG 466774?

If the race does exist, do_wait() can win and set EXIT_DEAD before
check_dead_utrace() runs.
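
The suspected ordering can be sketched in user space. This is only a model
under stated assumptions, not the kernel code: struct model_task and all
model_* functions are hypothetical names, and it assumes that both the wait
path and check_dead_utrace() claim the task by atomically exchanging
exit_state with EXIT_DEAD and expect to find EXIT_ZOMBIE. The constants 16
and 32 appear to agree with the Code bytes in the i386 oops (b8 20 00 00 00,
then 87 86 90 00 00 00, then 83 f8 10, which read as an xchg of 0x20 followed
by a compare against 0x10). Under that reading, eax: 00000020 in the register
dump would be the old state, that is, EXIT_DEAD had already been set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed exit_state values; they match the 0x10 and 0x20 seen in the oops. */
enum { EXIT_ZOMBIE = 16, EXIT_DEAD = 32 };

/* Hypothetical stand-in for the single task field that matters here. */
struct model_task {
    atomic_int exit_state;
};

/* Models the reaper side: claim the zombie by swapping in EXIT_DEAD.
 * Returns true if this caller was the one that found EXIT_ZOMBIE. */
static bool model_wait_reap(struct model_task *t)
{
    return atomic_exchange(&t->exit_state, EXIT_DEAD) == EXIT_ZOMBIE;
}

/* Models the utrace side: the same exchange, but it expects to be first.
 * Returns true in the case where the kernel would hit the BUG. */
static bool model_check_dead_utrace(struct model_task *t)
{
    int old = atomic_exchange(&t->exit_state, EXIT_DEAD);

    /* The model only ever stores these two states. */
    assert(old == EXIT_ZOMBIE || old == EXIT_DEAD);
    return old != EXIT_ZOMBIE;
}

/* Replays one interleaving; wait_first selects which side runs first. */
static bool model_race_hits_bug(bool wait_first)
{
    struct model_task t;

    atomic_init(&t.exit_state, EXIT_ZOMBIE);
    if (wait_first)
        (void)model_wait_reap(&t);
    return model_check_dead_utrace(&t);
}
```

Calling model_race_hits_bug(false) models the normal order and reports no
BUG; model_race_hits_bug(true) lets the wait side win first, so the second
exchange sees EXIT_DEAD instead of EXIT_ZOMBIE. The interleaving is replayed
sequentially rather than with real threads so that the outcome is
deterministic.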

Comment 5 Jeff Burke 2008-10-29 02:21:17 UTC
The same issue occurred on PPC with kernel 2.6.18-120.el5, in RHTS Job 33958, Recipe 120752, on system ibm-qs22-01.lab.bos.redhat.com, while running /kernel/syscalls/scrashme/multiple.

kernel BUG in check_dead_utrace at kernel/utrace.c:345!
cpu 0x0: Vector: 700 (Program Check) at [c0000003ea41b920]
    pc: c0000000000b36f0: .check_dead_utrace+0x1d0/0x22c
    lr: c0000000000b36b8: .check_dead_utrace+0x198/0x22c
    sp: c0000003ea41bba0
   msr: 9000000000029032
  current = 0xc0000003fe19a470
  paca    = 0xc00000000052af00
    pid   = 23970, comm = scrashme
kernel BUG in check_dead_utrace at kernel/utrace.c:345!

Comment 6 Oleg Nesterov 2008-10-29 20:00:46 UTC
I am now more or less sure that this bug is a duplicate of bug 466774;
see the test case I sent.

Comment 7 Anton Arapov 2008-11-05 12:53:25 UTC

*** This bug has been marked as a duplicate of bug 466774 ***