Bug 470161 - disassociate_ctty GPF
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: Alan Cox
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-11-05 22:18 EST by Frank Ch. Eigler
Modified: 2008-12-01 12:38 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-12-01 12:38:28 EST
Description Frank Ch. Eigler 2008-11-05 22:18:32 EST
Description of problem:
disassociate_ctty hits a general protection fault while using what appears to be corrupted memory.

Version-Release number of selected component (if applicable):; also seen in other builds

How reproducible:

Steps to Reproduce:
1. build gcc
2. run the gcc test suite, or another dejagnu/expect/pty/fork-intensive job
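
For anyone without a gcc build at hand, the following is a minimal sketch of the kind of pty/fork churn described in step 2. It is an assumption that this pattern exercises the same exit path (a session leader exiting with a pty as its controlling terminal); it has not been confirmed to trigger this bug, and the function name `pty_fork_churn` is made up for illustration.

```python
import os
import pty


def pty_fork_churn(iterations):
    """Fork `iterations` children, each on a fresh pty, and reap them all.

    Illustrative stress loop only; not a confirmed reproducer for this bug.
    """
    reaped = 0
    for _ in range(iterations):
        # pty.fork() makes the child a session leader whose controlling
        # terminal is the slave side of a newly allocated pty.
        pid, master_fd = pty.fork()
        if pid == 0:
            # Child: exit immediately, giving up its controlling tty.
            os._exit(0)
        # Parent: close the master side while the child may still be exiting.
        os.close(master_fd)
        os.waitpid(pid, 0)
        reaped += 1
    return reaped


if __name__ == "__main__":
    print("children reaped:", pty_fork_churn(1000))
```

Running several copies in parallel on a multi-core machine would be closer to the test-suite load, though that too is a guess.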
Actual results:
A kernel GPF message and a hung process.

Expected results:
Free pony and hat.

Additional info:
general protection fault: 0000 [2] SMP 
CPU 3 
Modules linked in: uprobes loop tun nf_conntrack_sip nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ipmi_devintf ipmi_si ipmi_msghandler dell_rbu nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs xt_length xt_DSCP xt_state xt_tcpudp ipt_REJECT ipt_LOG xt_limit iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables ppp_deflate zlib_deflate ppp_synctty ppp_async crc_ccitt ppp_generic slhc bridge dm_mirror dm_log dm_multipath dm_mod tcp_westwood kvm_intel kvm dcdbas iTCO_wdt serio_raw iTCO_vendor_support pcspkr sr_mod cdrom ses enclosure bnx2 sg i5000_edac joydev edac_core usb_storage pata_acpi ata_generic ata_piix libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: stap_1d96356f11e05416fc98f5229b1491b7_1390]
Pid: 9855, comm: xgcc Tainted: G      D #1
RIP: 0010:[<ffffffff811ae939>]  [<ffffffff811ae939>] disassociate_ctty+0x55/0x269
RSP: 0018:ffff81030fd1fe98  EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: ffff81031244a600 RCX: 00000000dfc51268
RDX: 0000000000006101 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffff81030fd1feb8 R08: ffffffff8142ab80 R09: 0000000000000001
R10: ffffffff81a9d160 R11: ffff810023265100 R12: ffff8101ffd2a968
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff81032fc04e10(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000039c7276027 CR3: 0000000000201000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process xgcc (pid: 9855, threadinfo ffff81030fd1e000, task ffff8103171a0000)
Stack:  ffff81030fd1feb8 0000000000001510 ffff8103171a0000 00007fff0850ac00
 ffff81030fd1ff38 ffffffff8103b533 000000010fd1ff28 ffffffff8107baab
 ffffffff8107b77c 0000000000000000 0000000000000001 0000000000000100
Call Trace:
 [<ffffffff8103b533>] do_exit+0x382/0x8f9
 [<ffffffff8107baab>] ? audit_syscall_entry+0x126/0x15a
 [<ffffffff8107b77c>] ? audit_syscall_exit+0x331/0x353
 [<ffffffff8103bb23>] do_group_exit+0x79/0xa9
 [<ffffffff8103bb65>] sys_exit_group+0x12/0x14
 [<ffffffff8100c2f2>] tracesys+0xd0/0xd5

Code: 54 48 8b 98 c0 01 00 00 48 85 db 74 03 f0 ff 03 48 c7 c7 40 ab 42 81 e8 37 e8 10 00 e8 ac 03 11 00 45 85 ed 74 1c 49 8b 44 24 08 <66> 83 b8 a4 00 00 00 04 74 0d 49 8d bc 24 80 03 00 00 e8 58 f5 
RIP  [<ffffffff811ae939>] disassociate_ctty+0x55/0x269
 RSP <ffff81030fd1fe98>
---[ end trace 3e54b6274b7dbeba ]---
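
For readers mapping the oops onto a disassembly: the RIP line has the form symbol+offset/size, so subtracting the offset from the faulting address gives the start address of the function. The sketch below only illustrates that arithmetic; the helper name `parse_rip` is invented here and is not part of any kernel tool.

```python
import re

# Hypothetical helper (not part of any kernel tooling): matches the
# "[<addr>] symbol+0xOFF/0xSIZE" portion of an oops RIP line.
RIP_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_rip(line):
    """Return (address, symbol, offset, size, function_start), or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    addr = int(m.group(1), 16)
    symbol = m.group(2)
    offset = int(m.group(3), 16)
    size = int(m.group(4), 16)
    # The function begins `offset` bytes before the faulting instruction.
    return addr, symbol, offset, size, addr - offset


if __name__ == "__main__":
    line = "RIP  [<ffffffff811ae939>] disassociate_ctty+0x55/0x269"
    addr, symbol, offset, size, start = parse_rip(line)
    print(f"{symbol} starts at {start:#x}; fault at {addr:#x} "
          f"(+{offset:#x} of {size:#x} bytes)")
```

The computed start address can then be used to find the faulting instruction in `objdump` output for the same vmlinux.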

According to objdump -drS vmlinux, the erroneous access occurs
in drivers/char/tty_io.c:

                /* XXX: here we race, there is nothing protecting tty */
                if (on_exit && tty->driver->type != TTY_DRIVER_TYPE_PTY)
ffffffff811ae92f:       45 85 ed                test   %r13d,%r13d
ffffffff811ae932:       74 1c                   je     ffffffff811ae950 <disassociate_ctty+0x6c>
ffffffff811ae934:       49 8b 44 24 08          mov    0x8(%r12),%rax
ffffffff811ae939:       66 83 b8 a4 00 00 00    cmpw   $0x4,0xa4(%rax)
ffffffff811ae940:       04 
ffffffff811ae941:       74 0d                   je     ffffffff811ae950 <disassociate_ctty+0x6c>
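
One reading of the listing, offered as an interpretation rather than something confirmed in this report: the `mov 0x8(%r12),%rax` loads `tty->driver` into RAX, and the register dump shows RAX = 6b6b6b6b6b6b6b6b. The byte 0x6b is the kernel's slab poison value for freed objects (`POISON_FREE` in include/linux/poison.h, written when slab debugging is enabled), which would suggest the tty structure had already been freed when it was read. A small sketch for spotting such values in a register dump follows; the function name `classify_poison` is illustrative.

```python
# Slab poison bytes as defined in the Linux kernel's include/linux/poison.h.
POISON_BYTES = {
    0x5A: "POISON_INUSE (allocated, not yet initialised)",
    0x6B: "POISON_FREE (object has been freed)",
}


def classify_poison(value, width=8):
    """Describe `value` if all of its `width` bytes are one slab poison byte."""
    raw = value.to_bytes(width, "little")
    if len(set(raw)) == 1 and raw[0] in POISON_BYTES:
        return POISON_BYTES[raw[0]]
    return None


if __name__ == "__main__":
    # Register values taken from the oops above.
    for name, val in [("RAX", 0x6B6B6B6B6B6B6B6B), ("RBX", 0xFFFF81031244A600)]:
        print(name, hex(val), classify_poison(val) or "no poison pattern")
```

A poisoned pointer like this is not a canonical x86-64 address, which is consistent with the access faulting as a GPF rather than as a page fault.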

See also 
Comment 1 Roland McGrath 2008-11-11 20:32:08 EST
Believed to be an upstream bug; someone might verify anew that kernel-vanilla sees it too.
Comment 2 Frank Ch. Eigler 2008-11-13 11:13:04 EST
I confirmed that this bug exists with the fedora kernel, with utrace entirely
compiled out by commenting out the few %patch chunks in the spec file.

It seems easy to reproduce now - both systemtap and plain gcc
"make check" manage to trigger it nearly every time on my
2*2-core Xeon 5150 machine, after some minutes.

I raised the priority of this bug because it results in hung userspace
processes that can't be kill -9'd.
Comment 3 Alan Cox 2008-11-13 11:48:58 EST
This was basically fixed in 2.6.27, or should have been. At least the other test case was. The full refcounting fix will be in 2.6.28.
Comment 4 Frank Ch. Eigler 2008-11-26 13:27:16 EST
On the kernel, I have so far been unable to trigger this.
