Bug 249093

Summary: [HANG] kernel BUG at include/linux/tracehook.h
Product: [Fedora] Fedora
Reporter: Alexis Deruelle <alexis.deruelle>
Component: kernel
Assignee: Roland McGrath <roland>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 9
CC: dr.diesel, eharney, jean.visagie, kernel-maint
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: ---
Hardware: i686
OS: Linux
Fixed In Version: 2.6.25.6-55.fc9
Doc Type: Bug Fix
Last Closed: 2008-06-13 20:55:03 UTC
Attachments:
dmesg cpuinfo and lspci -vvvv output (flags: none)

Description Alexis Deruelle 2007-07-20 21:04:57 UTC
Description of problem:

Hard freeze

Version-Release number of selected component (if applicable):

2.6.20-1.2962.1.cfs.v19.fc6

How reproducible:

Run MS apps under cxoffice (CrossOver Office) for one or two days.

Caught the bug through netconsole:

------------[ cut here ]------------
kernel BUG at include/linux/tracehook.h:369!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /devices/pci0000:00/0000:00:1f.3/i2c-0/name
Modules linked in: snd_rtctimer netconsole autofs4 hidp rfcomm l2cap bluetooth 
sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables dm_multipath 
video sbs i2c_ec dock button battery asus_acpi backlight snd_pcm_oss 
snd_mixer_oss snd_pcm ide_cd snd_timer snd parport_pc soundcore iTCO_wdt 
i2c_i801 parport snd_page_alloc iTCO_vendor_support serio_raw floppy i2c_core 
e1000 cdrom pcspkr dm_snapshot dm_zero dm_mirror dm_mod usb_storage ata_piix 
libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<c0429576>]    Not tainted VLI
EFLAGS: 00210087   (2.6.20-1.2962.1.cfs.v19.fc6 #1)
EIP is at release_task+0x38/0x2dc
eax: d24d0030   ebx: d24d1870   ecx: c1507160   edx: dfd0cde4
esi: 00000001   edi: d24d1870   ebp: 00000000   esp: cb01fe68
ds: 007b   es: 007b   ss: 0068
Process EXCEL.EXE (pid: 17568, ti=cb01f000 task=d24d0030 task.ti=cb01f000)
Stack: 00000020 00000000 d24d0030 c15a1870 c042a8ce ddf7eb84 17768104 00000009 
       00000009 00000008 d24d0174 00000000 c6261040 00000000 cb01ffb8 cb01ffb8 
       c042a991 c7e770e4 00000009 cb01ffb8 c0431bc8 00000001 c1500288 cb01fefc 
Call Trace:
 [<c042a8ce>] do_exit+0x685/0x6db
 [<c042a991>] sys_exit_group+0x0/0xd
 [<c0431bc8>] get_signal_to_deliver+0x383/0x3a8
 [<c04035f4>] do_notify_resume+0x84/0x65b
 [<c04204d8>] update_curr+0x23a/0x25b
 [<c0420237>] update_stats_wait_end+0x81/0xaa
 [<c06238e5>] kprobe_flush_task+0x4b/0x80
 [<c0620095>] __sched_text_start+0x6ed/0x744
 [<c0622bd8>] do_page_fault+0x0/0x4da
 [<c0403ff6>] work_notifysig+0x13/0x19
 [<c062003b>] __sched_text_start+0x693/0x744
 =======================
Code: 28 05 00 00 00 74 07 89 f8 e8 fe 9e 02 00 8b 87 10 02 00 00 90 ff 48 04 
b8 00 ca 73 c0 e8 ca 80 1f 00 83 bf 20 01 00 00 20 74 04 <0f> 0b eb fe 83 bf 
28 05 00 00 00 74 28 8d b7 38 05 00 00 31 db 
EIP: [<c0429576>] release_task+0x38/0x2dc SS:ESP 0068:cb01fe68
 <1>Fixing recursive fault but reboot is needed!

Comment 1 Alexis Deruelle 2007-07-20 21:04:57 UTC
Created attachment 159698 [details]
dmesg cpuinfo and lspci -vvvv output

Comment 2 Alexis Deruelle 2007-07-24 13:44:26 UTC
Reproduced today with the mainline FC6 kernel (2.6.20-1.2962.fc6), but
unfortunately netconsole truncated the trace:

kernel BUG at include/linux/tracehook.h:369!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /power/state
Modules linked in: netconsole snd_rtctimer autofs4 hidp rfcomm l2cap bluetooth 
sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables dm_multipath 
video sbs i2c_ec dock button battery serio_raw



Comment 3 Chuck Ebbert 2007-07-24 21:51:36 UTC
Should be fixed in the new FC6 2.6.22 kernels.



Comment 4 Alexis Deruelle 2007-07-30 08:22:10 UTC
No hang this time; this is with 2.6.22.1-24.fc6:

Jul 30 10:08:29 kernel: kernel BUG at include/linux/tracehook.h:368!
Jul 30 10:08:29 kernel: invalid opcode: 0000 [#1]
Jul 30 10:08:29 kernel: SMP
Jul 30 10:08:29 kernel: last sysfs file: /power/state
Jul 30 10:08:29 kernel: Modules linked in: netconsole autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables 
x_tables dm_multipath video sbs button dock battery ac ipv6 lp snd_intel8x0 
snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd floppy i2c_i801 
soundcore parport_pc i2c_core ide_cd snd_page_alloc parport rtc_cmos e1000 
iTCO_wdt serio_raw cdrom iTCO_vendor_support dm_snapshot dm_zero dm_mirror 
dm_mod usb_storage ata_piix libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd 
ohci_hcd uhci_hcd
Jul 30 10:08:29  kernel: CPU:    0
Jul 30 10:08:29  kernel: EIP:    0060:[<c042a89d>]    Not tainted VLI
Jul 30 10:08:29  kernel: EFLAGS: 00210287   (2.6.22.1-24.fc6 #1)
Jul 30 10:08:29  kernel: EIP is at release_task+0x2e/0x324
Jul 30 10:08:29  kernel: eax: df802578   ebx: c6bca600   ecx: c14095e0   edx: 
d4ba64f4
Jul 30 10:08:29  kernel: esi: c6bca600   edi: 00000001   ebp: 00000000   esp: 
d4eaae68
Jul 30 10:08:29  kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Jul 30 10:08:29  kernel: Process EXCEL.EXE (pid: 9951, ti=d4eaa000 
task=d7f54c00 task.ti=d4eaa000)
Jul 30 10:08:30  kernel: Stack: 00000020 d7f54c00 00000000 c14a2000 c042bc95 
00000009 00000008 00005082
Jul 30 10:08:31  kernel:        d7f550e4 00000002 d7f54d44 00000000 d10e3ae4 
cd003380 00000000 d4eaafb8
Jul 30 10:08:31  kernel:        c042bd5c d10e3ae4 00000009 00000005 c0432312 
d7f54c48 8bcead13 00005a68
Jul 30 10:08:31  kernel: Call Trace:
Jul 30 10:08:31  kernel:  [<c042bc95>] do_exit+0x67a/0x6d4
Jul 30 10:08:31  kernel:  [<c042bd5c>] sys_exit_group+0x0/0xd
Jul 30 10:08:31  kernel:  [<c0432312>] get_signal_to_deliver+0x36d/0x391
Jul 30 10:08:31  kernel:  [<c04045c2>] do_notify_resume+0x8c/0x6c7
Jul 30 10:08:31  kernel:  [<c0420838>] update_stats_wait_end+0x84/0xad
Jul 30 10:08:31  kernel:  [<c062ea36>] __sched_text_start+0x686/0x718
Jul 30 10:08:31  kernel:  [<c0407bc4>] do_syscall_trace+0x2f/0xc2
Jul 30 10:08:31  kernel:  [<c040502a>] work_notifysig+0x13/0x19
Jul 30 10:08:31  kernel:  [<c0630033>] _write_unlock_irq+0x6/0x9
Jul 30 10:08:31  kernel:  =======================
Jul 30 10:08:31  kernel: Code: 89 c6 53 0f ae f0 89 f6 83 be 28 05 00 00 00 74 
07 89 f0 e8 49 f3 02 00 8b 86 10 02 00 00 90 ff 48 04 83 be 20 01 00 00 20 74 
04 <0f> 0b eb fe 83 be 28 05 00 00 00 74 27 8d 86 38 05 00 00 e8 01
Jul 30 10:08:31  kernel: EIP: [<c042a89d>] release_task+0x2e/0x324 SS:ESP 
0068:d4eaae68
Jul 30 10:08:31  kernel: Fixing recursive fault but reboot is needed!


Comment 5 Alexis Deruelle 2007-07-31 07:36:08 UTC
Hit this one more time today, with a slightly different trace though...

Jul 31 09:22:46 kernel: kernel BUG at include/linux/tracehook.h:368!
Jul 31 09:22:46 kernel: invalid opcode: 0000 [#2]
Jul 31 09:22:46 kernel: SMP
Jul 31 09:22:46 kernel: last sysfs file: /power/state
Jul 31 09:22:46 kernel: Modules linked in: netconsole autofs4 hidp rfcomm 
l2cap bluetooth sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables 
x_tables dm_multipath video sbs button dock battery ac ipv6 lp snd_intel8x0 
snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd floppy i2c_i801 
soundcore parport_pc i2c_core ide_cd snd_page_alloc parport rtc_cmos e1000 
iTCO_wdt serio_raw cdrom iTCO_vendor_support dm_snapshot dm_zero dm_mirror 
dm_mod usb_storage ata_piix libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd 
ohci_hcd uhci_hcd
Jul 31 09:22:46 kernel: CPU:    0
Jul 31 09:22:46 kernel: EIP:    0060:[<c042a89d>]    Not tainted VLI
Jul 31 09:22:46 kernel: EFLAGS: 00010287   (2.6.22.1-24.fc6 #1)
Jul 31 09:22:46 kernel: EIP is at release_task+0x2e/0x324
Jul 31 09:22:46 kernel: eax: df802578   ebx: d8f0e600   ecx: c14095e0   edx: 
d4ba6ea4
Jul 31 09:22:46 kernel: esi: d8f0e600   edi: 00000001   ebp: 00000000   esp: 
d0a85e68
Jul 31 09:22:46 kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Jul 31 09:22:46 kernel: Process EXCEL.EXE (pid: 27349, ti=d0a85000 
task=d8f0e000 task.ti=d0a85000)
Jul 31 09:22:46 kernel: Stack: 00000020 d8f0e000 00000000 c14a2000 c042bc95 
00000009 00000008 c7cf5580
Jul 31 09:22:47 kernel:        d8f0e4e4 7c77c480 d8f0e144 00000000 d6d05064 
c7fd5c40 00000000 d0a85fb8
Jul 31 09:22:47 kernel:        c042bd5c d6d05064 00000009 c06313d3 c0432312 
00000000 d0a85ee4 c1400244
Jul 31 09:22:47 kernel: Call Trace:
Jul 31 09:22:47 kernel:  [<c042bc95>] do_exit+0x67a/0x6d4
Jul 31 09:22:47 kernel:  [<c042bd5c>] sys_exit_group+0x0/0xd
Jul 31 09:22:47 kernel:  [<c06313d3>] do_page_fault+0x0/0x516
Jul 31 09:22:47 kernel:  [<c0432312>] get_signal_to_deliver+0x36d/0x391
Jul 31 09:22:47 kernel:  [<c062f183>] __wait_on_bit_lock+0x4b/0x52
Jul 31 09:22:47 kernel:  [<c06313d3>] do_page_fault+0x0/0x516
Jul 31 09:22:47 kernel:  [<c04045c2>] do_notify_resume+0x8c/0x6c7
Jul 31 09:22:47 kernel:  [<c0438596>] wake_bit_function+0x0/0x3c
Jul 31 09:22:47 kernel:  [<c0468f2b>] __handle_mm_fault+0x7e0/0x8b8
Jul 31 09:22:47 kernel:  [<c0631646>] do_page_fault+0x273/0x516
Jul 31 09:22:47 kernel:  [<c06313d3>] do_page_fault+0x0/0x516
Jul 31 09:22:47 kernel:  [<c040502a>] work_notifysig+0x13/0x19
Jul 31 09:22:47 kernel:  [<c0630033>] _write_unlock_irq+0x6/0x9
Jul 31 09:22:47 kernel:  =======================
Jul 31 09:22:47 kernel: Code: 89 c6 53 0f ae f0 89 f6 83 be 28 05 00 00 00 74 
07 89 f0 e8 49 f3 02 00 8b 86 10 02 00 00 90 ff 48 04 83 be 20 01 00 00 20 74 
04 <0f> 0b eb fe 83 be 28 05 00 00 00 74 27 8d 86 38 05 00 00 e8 01
Jul 31 09:22:47 kernel: EIP: [<c042a89d>] release_task+0x2e/0x324 SS:ESP 
0068:d0a85e68
Jul 31 09:22:47 kernel: Fixing recursive fault but reboot is needed!


Comment 6 Alexis Deruelle 2007-07-31 12:54:34 UTC
Changing severity to low, as it no longer hangs. Now trying
2.6.22.1-29.fc6...

Comment 7 Chuck Ebbert 2007-07-31 13:05:30 UTC
Should be fixed in -29 and up.
-31 will be pushed to testing soon to fix additional bugs found when the
2.6.22 kernel was released for Fedora 7.


Comment 8 Chuck Ebbert 2007-08-23 20:21:12 UTC
Closing as fixed.

Comment 9 Alexis Deruelle 2008-05-14 12:40:22 UTC
Reopening this bug after FC8 -> FC9 upgrade

Description of problem:

Hard freeze

Version-Release number of selected component (if applicable):

kernel-2.6.25-14.fc9.i686

How reproducible:

Installing/Running an application with wine

Caught the bug through netconsole:


kernel BUG at include/linux/tracehook.h:345!
invalid opcode: 0000 [#1] SMP 
Modules linked in: netconsole configfs bridge bnep rfcomm l2cap bluetooth
autofs4 fuse sunrpc nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables
nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT
ip6table_filter x_tables floppy snd_mixer_oss snd_pcm snd_timer snd pcspkr
soundcore snd_page_alloc e1000 i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support sg
button sr_mod cdrom dm_snapshot dm_zero dm_mirror dm_mod pata_acpi ata_generic
ata_piix libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
[last unloaded: microcode]

Pid: 3865, comm: msiexec.exe Not tainted (2.6.25-14.fc9.i686 #1)
EIP: 0060:[<c0429855>] EFLAGS: 00010087 CPU: 0
EIP is at release_task+0x4c/0x31b
EAX: c073ca00 EBX: d6a80000 ECX: 00000010 EDX: c073ca00
ESI: d6a80000 EDI: 00000000 EBP: df317e64 ESP: df317e50
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process msiexec.exe (pid: 3865, ti=df317000 task=d2624e90 task.ti=df317000)
Stack: c0792668 00000001 00000020 00000000 d2624e90 df317e90 c042aa7d 00000001 
       d2625074 d2625074 df828000 d2624e88 00000000 defd0c00 7eb4fa08 00000000 
       df317ea4 c042ab5e 00000009 df317f10 df317ed0 c043339a df317f10 
Call Trace:
 [<c042aa7d>] ? do_exit+0x4e1/0x554
 [<c042ab5e>] ? do_group_exit+0x6e/0x85
 [<c043339a>] ? get_signal_to_deliver+0x25d/0x28b
 [<c041c1ef>] ? __wake_up_common+0x35/0x5b
 [<c0404e2c>] ? do_notify_resume+0x9b/0x79c
 [<c043b6ed>] ? ktime_get+0x13/0x2f
 [<c041ced4>] ? hrtick_start_fair+0x111/0x15b
 [<c0424ba7>] ? hrtick_set+0x80/0xe5
 [<c062a22a>] ? schedule+0x6a9/0x6db
 [<c0483c48>] ? fput+0x17/0x19
 [<c040b1a4>] ? do_syscall_trace+0x69/0x16d
 [<c0405ccc>] ? work_notifysig+0x13/0x1b
 =======================
Code: 89 f0 e8 3d 0a 03 00 8b 86 f0 02 00 00 90 ff 48 04 89 f0 e8 0e b4 08 00 b8
00 ca 73 c0 e8 64 21 20 00 83 be c0 01 00 00 20 74 04 <0f> 0b eb fe 83 be 24 06
00 00 00 74 22 8d 9e 34 06 00 00 89 d8 
EIP: [<c0429855>] release_task+0x4c/0x31b SS:ESP 0068:df317e50


Comment 10 Chuck Ebbert 2008-05-27 23:33:26 UTC
*** Bug 445049 has been marked as a duplicate of this bug. ***

Comment 11 Chuck Ebbert 2008-05-27 23:36:24 UTC
include/linux/tracehook.h:345:
        BUG_ON(p->exit_state != EXIT_DEAD);


Comment 12 Chuck Ebbert 2008-05-27 23:39:32 UTC
*** Bug 448529 has been marked as a duplicate of this bug. ***

Comment 13 Jean Visagie 2008-06-03 13:47:55 UTC
I am a bit concerned about the low priority, because this bug makes my servers
completely unusable. If Apache can trigger it, as in my case, then running a web
server on Fedora 9 is impossible.

Comment 14 Chuck Ebbert 2008-06-06 00:57:39 UTC
*** Bug 449874 has been marked as a duplicate of this bug. ***

Comment 15 Eric Harney 2008-06-06 03:16:51 UTC
Note: bug 449874 has this crash and trace on x86_64 too.

Comment 16 Alexis Deruelle 2008-06-12 10:33:40 UTC
Seems to be fixed in 2.6.25.6-55.fc9.i686

Comment 17 Jean Visagie 2008-06-13 20:39:15 UTC
I have not seen it with 2.6.25.6-55.fc9.i686.