Bug 313111

Summary: utrace: crash - utrace_get_signal
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Roland McGrath <roland>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: medium
Version: 6
CC: jonstanley
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-02-08 04:25:43 UTC
Type: ---
Bug Depends On: 312951    
Bug Blocks: 427887    
Attachments:
Testcase, a modified one from adobriyan-at-sw.ru. (Flags: none)

Description Jan Kratochvil 2007-09-30 14:55:11 UTC
+++ This bug was initially created as a clone of Bug #312951 +++

The rawhide problem is also present in F6.

Description of problem:
-- Additional comment from adobriyan on 2007-05-29 03:55 EST --
We saw the following oops in the RHEL5 utrace code:

BUG: unable to handle kernel paging request at virtual address 7ca1c291
EIP is at utrace_get_signal+0x46/0x477
          get_signal_to_deliver+0xdf/0x3b1
          do_notify_resume+0xa9/0x6a5
          audit_syscall_exit+0x285/0x2a1
          work_notifysig+0x13/0x19
          copy_to_user_policy+0x73/0x7f

The failing IP corresponds to code in utrace_get_signal():

int
utrace_get_signal(struct task_struct *tsk, struct pt_regs *regs,
                  siginfo_t *info, struct k_sigaction *return_ka)
{
        struct utrace *utrace = tsk->utrace;
                ...
        if (utrace->u.live.signal != NULL) {
                signal.signr = utrace->u.live.signal->signr;
                copy_siginfo(info, utrace->u.live.signal->info);
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^

A bogus pointer was supplied here.

How should the ->utrace assignment be handled correctly?
---------------------------------------------------------------------
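
For anyone comparing the two traces: each frame in the oops above is written as symbol+offset/size, so utrace_get_signal+0x46/0x477 means the fault is 0x46 bytes into a function that is 0x477 bytes long. A minimal sketch of splitting such an entry, assuming that notation; `struct oops_loc` and `parse_oops_loc` are hypothetical names used only for illustration and are not part of the kernel or the attached testcase:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one "symbol+offset/size" trace entry. */
struct oops_loc {
    char symbol[64];        /* function name, e.g. "utrace_get_signal" */
    unsigned long offset;   /* byte offset of the address inside the function */
    unsigned long size;     /* total size of the function in bytes */
};

/*
 * Parse text such as "utrace_get_signal+0x46/0x477".
 * Returns 0 on success, -1 if the text is not in symbol+offset/size form.
 */
int parse_oops_loc(const char *text, struct oops_loc *out)
{
    assert(text != NULL && out != NULL);

    /* %63[^+] reads the symbol up to '+'; %lx accepts the 0x prefix. */
    if (sscanf(text, "%63[^+]+%lx/%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

The parsed offset is what one would look up in a disassembly of the function to find the faulting instruction.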

On (2 CPUs; qemu-kvm) kernel-2.6.22.7-57.fc6.x86_64 seen:

------------[ cut here ]------------
kernel BUG at kernel/utrace.c:328!
invalid opcode: 0000 [1] SMP 
last sysfs file: /class/sound/sequencer2/dev
CPU 0 
Modules linked in: snd_hda_intel snd_usb_audio snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd soundcore
nfs nfsd exportfs lockd nfs_acl sunrpc ipv6 dm_mirror dm_mod video sbs button
dock battery ac floppy 8139too parport_pc 8139cp ide_cd mii cdrom parport
ata_piix ahci libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 407, comm: clone-get-signa Not tainted 2.6.22.7-57.fc6 #1
RIP: 0010:[<ffffffff81068116>]  [<ffffffff81068116>] check_dead_utrace+0xf2/0x158
RSP: 0018:ffff810008ee3ca8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81000919e800 RCX: 0000000000000000
RDX: 0000000000000008 RSI: ffff810009b90280 RDI: ffff81000919e800
RBP: 0000000000000000 R08: ffff81000d681030 R09: ffff810009b47c98
R10: 0000000000000001 R11: ffff810008ee3f58 R12: ffff810009b90280
R13: 0000000000000000 R14: 0000000000000000 R15: ffff81000919e7d8
FS:  00002aaaaaac2240(0000) GS:ffffffff813ed000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000304b550904 CR3: 0000000008f4b000 CR4: 00000000000006e0
Process clone-get-signa (pid: 407, threadinfo ffff810008ee2000, task
ffff81000919e000)
Stack:  ffffffff81067eec ffff81000919e800 0000000000000000 000000000007e569
 000000000007e569 ffffffff8106887d ffff810009b90280 ffff81000919e800
 ffff810009b90c80 ffffffff810689da 0000000000000044 ffff81000d681030
Call Trace:
 [<ffffffff81067eec>] remove_engine+0x76/0x95
 [<ffffffff8106887d>] wake_quiescent+0x4f/0x10d
 [<ffffffff810689da>] utrace_detach+0x9f/0xb2
 [<ffffffff8106b71f>] ptrace_exit+0x63/0xf2
 [<ffffffff8107642e>] zone_statistics+0x3f/0x60
 [<ffffffff81037545>] do_exit+0x144/0x7db
 [<ffffffff81037c5b>] sys_exit_group+0x0/0xe
 [<ffffffff8103eb26>] get_signal_to_deliver+0x3a5/0x3d3
 [<ffffffff81009053>] do_notify_resume+0x9c/0x726
 [<ffffffff8102ab11>] enqueue_task+0x3c/0x4f
 [<ffffffff8102aedd>] update_curr_load+0x6c/0x82
 [<ffffffff8102c09a>] __check_preempt_curr_fair+0x5c/0x7d
 [<ffffffff8126c783>] thread_return+0x0/0xd8
 [<ffffffff81062222>] audit_syscall_exit+0x33d/0x35c
 [<ffffffff81009d6f>] int_signal+0x12/0x17


Code: 0f 0b eb fe 4c 89 e7 e8 e9 fd ff ff 49 83 fd 10 75 35 8b b3 
RIP  [<ffffffff81068116>] check_dead_utrace+0xf2/0x158
 RSP <ffff810008ee3ca8>
Fixing recursive fault but reboot is needed!

---------------------------------------------------------------------


Version-Release number of selected component (if applicable):
kernel-2.6.22.7-57.fc6.x86_64

How reproducible:
Crashed after 42 runs of the testcase; each run has 2000 internal loops,
so the crash came at approximately the 84000th cycle.

Steps to Reproduce:
1. gcc -o ./clone-get-signal ./clone-get-signal.c -Wall -ggdb2
2. while ./clone-get-signal ;do echo -n .;done
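
To get the run count without counting dots, a small driver along these lines could stand in for the shell loop in step 2. This is only a sketch: `run_until_failure` and its parameters are hypothetical and not part of the attached testcase, and the loops-per-run value is simply passed in by the caller (2000 for this testcase, per the report). It prints the running count to stderr after every successful run, so the last value is still on the console if the kernel goes down.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] repeatedly until it exits non-zero,
 * is killed by a signal, or max_runs is reached.
 * Returns the number of successful runs, or -1 on a fork/wait error.
 */
long run_until_failure(char *const argv[], long max_runs, long loops_per_run)
{
    long runs = 0;

    assert(argv != NULL && argv[0] != NULL);

    while (runs < max_runs) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);             /* only reached if exec failed */
        }

        /* Wait for the child, retrying if interrupted by a signal. */
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }

        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            break;                  /* testcase failed: stop counting */

        runs++;
        /* stderr is unbuffered, so the count reaches the console at once. */
        fprintf(stderr, "\rruns: %ld (about %ld cycles)",
                runs, runs * loops_per_run);
    }
    return runs;
}
```

A caller would pass something like `char *cmd[] = { "./clone-get-signal", NULL };` together with a large `max_runs` and `loops_per_run` set to 2000.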

Actual results:
Kernel crash.

Expected results:
No kernel crash; the loop should keep printing dots indefinitely.

Comment 1 Jan Kratochvil 2007-09-30 14:55:11 UTC
Created attachment 211781
Testcase, a modified one from adobriyan-at-sw.ru.

Comment 2 Jon Stanley 2008-01-08 01:48:08 UTC
(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug, however this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close the bug or I'll do so in a
few days if there is no further information lodged.

Thanks for using Fedora!

Comment 3 Jon Stanley 2008-02-08 04:25:43 UTC
Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA,
since no information has been lodged for over 30 days.

Please re-open this bug or file a new one if you can provide the requested data,
and thanks for filing the original report!