Description of problem:
Attached testcase causes Kernel BUG crash. It SIGKILLs a process doing execve() in a loop.

Version-Release number of selected component (if applicable):
RHEL-4.7 kernel-smp-2.6.9-78.EL.x86_64

Heuristically tested as non-crashing:
RHEL-5.2 kernel-2.6.18-92.el5.x86_64
F-9 kernel-2.6.25.9-76.fc9.x86_64
F-9 kernel-vanilla-2.6.25.6-55.vanilla.fc9.x86_64
(but no-one knows if the race isn't just less reproducible there)

How reproducible:
At most several seconds.

Steps to Reproduce:
1. gcc -o exitcrash exitcrash.c -Wall -ggdb2 -pthread -D_GNU_SOURCE
2. ./exitcrash

Actual results:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at signal:377
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket pcmcia_core cpufreq_powersave loop button battery ac uhci_hcd ehci_hcd hw_random snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc tg3 floppy sr_mod dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ahci libata sd_mod scsi_mod
Pid: 31269, comm: exe Not tainted 2.6.9-78.ELsmp
RIP: 0010:[<ffffffff80141f0a>] <ffffffff80141f0a>{__exit_signal+29}
RSP: 0018:0000010023895c58 EFLAGS: 00010046
RAX: 000001003d2d20d0 RBX: 0000000000000000 RCX: 0000000000000054
RDX: 000001000000c000 RSI: ffffffff8050e600 RDI: 000001003d2d2030
RBP: 000001003d2d2030 R08: 0000000000000000 R09: 00000001801ae824
R10: 0000000000000000 R11: ffffffff801ae824 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 000001002383f700
FS: 0000000000000000(0000) GS:ffffffff8050d280(005b) knlGS:00000000f7fdeba0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7fdd388 CR3: 0000000000101000 CR4: 00000000000006e0
Process exe (pid: 31269, threadinfo 0000010023894000, task 00000100246457f0)
Stack: 000001003d2d2030 000001003d2d2030 000001003d2d2030 0000000000000000
       0000000000000000 ffffffff80139c21 000001000000c000 0000000000000010
       000001003d2d2030 000001003eb4dac0
Call Trace:
<ffffffff80139c21>{release_task+126}
<ffffffff80185c9f>{flush_old_exec+1696}
<ffffffff8017bbf1>{vfs_read+248}
<ffffffff80130807>{load_elf32_binary+1673}
<ffffffff801a6c26>{load_elf_binary+5452}
<ffffffff8015e3aa>{generic_file_aio_read+48}
<ffffffff8017bacd>{do_sync_read+178}
<ffffffff8013017e>{load_elf32_binary+0}
<ffffffff80186789>{search_binary_handler+209}
<ffffffff801a3487>{compat_do_execve+398}
<ffffffff80128757>{sys32_execve+53}
<ffffffff801269cd>{ia32_ptregs_common+37}

Code: 0f 0b 8a 25 33 80 ff ff ff ff 79 01 8b 03 85 c0 75 0c 0f 0b
RIP <ffffffff80141f0a>{__exit_signal+29} RSP <0000010023895c58>
<0>Kernel panic - not syncing: Oops

Expected results:
No crash.

Additional info:
The extra thread there may be redundant, it is derived from a ptrace-testsuite testcase late-ptrace-may-attach-check.c.
Created attachment 311664 [details] Testcase.
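For readers without access to the attachment, the pattern described above (a threaded process gets SIGKILLed while it is in execve()) can be sketched roughly as follows. This is NOT the attached exitcrash.c. The names kill_during_exec and idle_thread, the /bin/sleep exec target, and the delay parameter are all assumptions made for illustration, and the sketch forks a fresh child per attempt instead of having one process re-exec itself in a loop.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Idle thread: exists only so that the exec'ing process is multi-threaded. */
static void *idle_thread(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * Hypothetical helper, not taken from the attachment.
 * Forks a child that starts one extra thread and then calls execve().
 * The parent waits delay_us microseconds, sends SIGKILL and reaps the child.
 * Returns 1 if the child was terminated by SIGKILL, 0 if it ended some
 * other way, and -1 on error.
 */
int kill_during_exec(unsigned int delay_us)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t thread;
        char *const argv[] = { "sleep", "60", NULL };  /* assumed exec target */
        char *const envp[] = { NULL };

        if (pthread_create(&thread, NULL, idle_thread, NULL) == 0)
            execve("/bin/sleep", argv, envp);
        /* Thread creation or exec failed: stay alive until the parent kills us. */
        for (;;)
            pause();
    }

    if (delay_us > 0)
        usleep(delay_us);
    if (kill(pid, SIGKILL) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

Calling kill_during_exec() repeatedly with small, varying delays is one way to try to land the SIGKILL inside the exec window. Whether this sketch triggers the same BUG on an affected kernel has not been verified here. Build it with the same -pthread flag as in the steps to reproduce.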
Threading appears to be required to crash it, Bug 311931 may need more fixes.

Kernel 2.6.9-78.ELsmp on an x86_64
RHTS Job 25225 - intel-s5000phb-01.rhts.bos.redhat.com

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at signal:377
invalid operand: 0000 [1] SMP
CPU 5
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket pcmcia_core cpufreq_powersave loop button battery ac uhci_hcd ehci_hcd i5000_edac edac_mc hw_random e1000 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 1, comm: init Not tainted 2.6.9-78.ELsmp
RIP: 0010:[<ffffffff80141f0a>] <ffffffff80141f0a>{__exit_signal+29}
RSP: 0018:000001003fb61e68 EFLAGS: 00010046
RAX: 000001003ba47890 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000007fbfffd501 RSI: 0000000000000000 RDI: 000001003ba477f0
RBP: 000001003ba477f0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 000001003ba47918 R15: 0000007fbfffd584
FS: 0000002a95562360(0000) GS:ffffffff8050d500(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000409fe028 CR3: 0000000037e12000 CR4: 00000000000006e0
Process init (pid: 1, threadinfo 000001003fb60000, task 000001000153f7f0)
Stack: 000001003ba477f0 000001003ba477f0 00000000000064fa 0000000000000000
       0000000000000000 ffffffff80139c21 0000007fbfffd501 000001003ba477f0
       00000000000064fa 0000000000000000
Call Trace:
<ffffffff80139c21>{release_task+126}
<ffffffff8013c3f2>{do_wait+2758}
<ffffffff80134709>{default_wake_function+0}
<ffffffff80134709>{default_wake_function+0}
<ffffffff8011037f>{sysret_signal+28}
<ffffffff801102f6>{system_call+126}

Code: 0f 0b 8a 25 33 80 ff ff ff ff 79 01 8b 03 85 c0 75 0c 0f 0b
RIP <ffffffff80141f0a>{__exit_signal+29} RSP <000001003fb61e68>
<0>Kernel panic - not syncing: Oops
Updating PM score.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I didn't reproduce the bug as easily as stated above. I had to adjust the timeout to a few minutes to reproduce it on x86_64, but it's still systematic. I haven't reproduced it so far on another arch, but I keep trying. I don't think it's x86_64 specific.
I still don't know much about why the crash happens, but at least I reproduced it on i686. The reproducibility of this bug depends a lot on the machine it runs on.
This is a duplicate of 452706. It's already fixed in recent kernels. *** This bug has been marked as a duplicate of bug 452706 ***
Denys, it turns out this testcase and bug were never included in the ptrace testsuite, nor in the RHEL bugs list in tests/kernel/syscalls/ptrace/BUGS.