Bug 239604 - [RHEL5] console: kobject_add failed
Summary: [RHEL5] console: kobject_add failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Aristeu Rozanski
QA Contact: desktop-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 425461
Reported: 2007-05-09 20:35 UTC by Alan Cox
Modified: 2009-01-20 20:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 20:07:32 UTC
Target Upstream Version:
Embargoed:


Attachments
test case (2.43 KB, text/x-csrc)
2007-08-31 20:00 UTC, Aristeu Rozanski


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2009:0225 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update 2009-01-20 16:06:24 UTC

Description Alan Cox 2007-05-09 20:35:37 UTC
Switched from graphical to text console early in rhgb and got this

kobject_add failed for vcs1 with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8013be70>] kobject_add+0x16e/0x199
 [<ffffffff801a4302>] class_device_add+0xaf/0x44b
 [<ffffffff801a4788>] class_device_create+0xd8/0x107
 [<ffffffff8002dd9b>] __wake_up+0x38/0x4f
 [<ffffffff800468f7>] init_dev+0x3d4/0x51d
 [<ffffffff801894be>] vcs_make_devfs+0x2d/0x5b
 [<ffffffff8018e898>] con_open+0x7c/0x8b
 [<ffffffff80186bce>] tty_open+0x1d9/0x3af
 [<ffffffff8004721b>] chrdev_open+0x14d/0x183
 [<ffffffff800470ce>] chrdev_open+0x0/0x183
 [<ffffffff8001df1e>] __dentry_open+0xd9/0x1dc
 [<ffffffff80026f3b>] do_filp_open+0x2a/0x38
 [<ffffffff80019373>] do_sys_open+0x44/0xbe
 [<ffffffff8005b14e>] system_call+0x7e/0x83

kobject_add failed for vcsa1 with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8013be70>] kobject_add+0x16e/0x199
 [<ffffffff801a4302>] class_device_add+0xaf/0x44b
 [<ffffffff801a4788>] class_device_create+0xd8/0x107
 [<ffffffff8002dd9b>] __wake_up+0x38/0x4f
 [<ffffffff800468f7>] init_dev+0x3d4/0x51d
 [<ffffffff8018e898>] con_open+0x7c/0x8b
 [<ffffffff80186bce>] tty_open+0x1d9/0x3af
 [<ffffffff8004721b>] chrdev_open+0x14d/0x183
 [<ffffffff800470ce>] chrdev_open+0x0/0x183
 [<ffffffff8001df1e>] __dentry_open+0xd9/0x1dc
 [<ffffffff80026f3b>] do_filp_open+0x2a/0x38
 [<ffffffff80019373>] do_sys_open+0x44/0xbe
 [<ffffffff8005b14e>] system_call+0x7e/0x83

Comment 1 Don Zickus 2007-08-23 18:54:31 UTC
Same thing noticed recently in some kernel testing.

http://rhts.lab.boston.redhat.com/cgi-bin/rhts/test_log.cgi?id=565497
(at the very bottom)

setting the flag for some attention

Comment 2 Aristeu Rozanski 2007-08-31 20:00:02 UTC
Created attachment 183881
test case

it's a good idea to run this testcase using ssh, making sure X is running on
tty7 and killing it before starting the test:
killall Xorg; ./vcs_stress

Comment 3 Aristeu Rozanski 2007-08-31 20:01:05 UTC
ok, finally found what's going on.
In release_dev(), the driver's close() function is called without holding any
locks and with the current tty->count. con_close() (the console driver) checks
tty->count to decide whether it needs to call vcs_remove_devfs(). If there are
two simultaneous closes of the same tty, the last two processes can call
con_close() at almost the same time, so both see tty->count == 2 and
vcs_remove_devfs() is never called. The next time the terminal is opened,
vcs_make_devfs() is called and tries to add an entry that already exists.

I'm still looking for a correct way to fix this problem. The entire tty->count
thing looks horribly broken.


Comment 4 Alan Cox 2007-08-31 20:49:01 UTC
tty->count should only be touched under lock.


Comment 5 Aristeu Rozanski 2007-08-31 21:18:21 UTC
yes. and con_close() grabs the lock before using it. the problem is that there's
no atomicity between driver->close() and the actual decrement of tty->count in
release_dev().

release_dev():
   console_driver->close()
         grab console mutex
         grab tty_mutex
         check tty->count
         release tty_mutex
         release console mutex
   grab tty_mutex
   (later on)
   tty->count--

thread 1                           thread 2
grabs BKL
release_dev()
console_driver->con_close()
sleeps waiting console mutex,
   releases BKL
                                   grabs BKL
                                   release_dev()
                                   console_driver->con_close()
                                   sleeps waiting console mutex
grabs console mutex, tty_mutex
tty->count == 2
returns to release_dev
sleeps waiting for tty_mutex
                                   grabs console mutex
                                   gets tty_mutex first for some reason
                                   tty->count still 2
                                   releases both mutexes
grabs tty_mutex
decrements the tty->count
...

The tty release finishes releasing the resources, but vcs_remove_devfs() never
gets called.
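
The interleaving above can be replayed in a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct toy_tty` and every `toy_*` function are hypothetical stand-ins for tty->count, vcs_make_devfs(), con_close() and the later decrement in release_dev(), and the two "threads" are replayed as a fixed sequence of calls rather than run concurrently. The second scenario keeps the check and the decrement together to show why the leak disappears when they cannot be separated; it is not meant to describe the fix that was eventually shipped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_tty {
    int count;         /* open handles, standing in for tty->count */
    bool vcs_present;  /* true while the "vcs1" entry is registered */
};

/* Stand-in for vcs_make_devfs(): a duplicate name fails with -EEXIST. */
int toy_vcs_make(struct toy_tty *t)
{
    if (t->vcs_present)
        return -EEXIST;
    t->vcs_present = true;
    return 0;
}

/* Stand-in for open: the first opener registers the entry. */
int toy_open(struct toy_tty *t)
{
    t->count++;
    return t->count == 1 ? toy_vcs_make(t) : 0;
}

/* Step 1 of close, as in con_close(): only the last holder removes the entry. */
void toy_close_check(struct toy_tty *t)
{
    if (t->count == 1)
        t->vcs_present = false;
}

/* Step 2 of close, as in release_dev(): the count drops later. */
void toy_close_decrement(struct toy_tty *t)
{
    assert(t->count > 0);
    t->count--;
}

/* Interleaving from the timeline: both closers check before either decrements. */
int toy_racy_scenario(void)
{
    struct toy_tty t = { 0, false };

    toy_open(&t);
    toy_open(&t);             /* count == 2 */
    toy_close_check(&t);      /* thread 1 sees 2, keeps the entry */
    toy_close_check(&t);      /* thread 2 still sees 2, keeps the entry */
    toy_close_decrement(&t);  /* thread 1 */
    toy_close_decrement(&t);  /* thread 2: count == 0, entry leaked */
    return toy_open(&t);      /* re-registering the stale name fails */
}

/* Check and decrement kept together, as if both ran under one lock. */
int toy_serialized_scenario(void)
{
    struct toy_tty t = { 0, false };

    toy_open(&t);
    toy_open(&t);
    toy_close_check(&t);
    toy_close_decrement(&t);  /* first closer: saw 2, now 1 */
    toy_close_check(&t);      /* second closer sees 1, removes the entry */
    toy_close_decrement(&t);
    return toy_open(&t);      /* entry is gone, so this succeeds */
}
```

In this model, toy_racy_scenario() returns -EEXIST on the reopen, matching the kobject_add message in the description, while toy_serialized_scenario() returns 0.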


Comment 6 RHEL Program Management 2008-02-01 22:41:49 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 8 Mike Gahagan 2008-03-13 23:40:45 UTC
I'm seeing some similar messages on a ppc system in RHTS running the -85 kernel
on a 5.1 distro:


Mar 13 15:06:32 ibm-js21-01 kernel: TCP: Hash tables configured (established 131072 bind 65536)
Mar 13 15:06:32 ibm-js21-01 hcid[1909]: Bluetooth HCI daemon
Mar 13 15:06:32 ibm-js21-01 kernel: TCP reno registered
Mar 13 15:06:32 ibm-js21-01 hcid[1909]: Register path:/org/bluez fallback:1
Mar 13 15:06:32 ibm-js21-01 sdpd[1913]: Bluetooth SDP daemon
Mar 13 15:06:32 ibm-js21-01 pcscd: pcscdaemon.c:464:main() pcsc-lite 1.3.1 daemon ready.
Mar 13 15:06:32 ibm-js21-01 kernel: kobject_add failed for 4003 with -EEXIST, don't try to register things with the same name in the same directory.
Mar 13 15:06:32 ibm-js21-01 kernel: Call Trace:
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FBA0] [C000000000010378] .show_stack+0x68/0x1b0 (unreliable)
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FC40] [C0000000001B6CF8] .kobject_add+0x1a4/0x1fc
Mar 13 15:06:32 ibm-js21-01 hidd[1992]: Bluetooth HID daemon
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FCE0] [C0000000002815FC] .device_add+0x88/0x3b0
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FD90] [C000000000021FCC] .vio_register_device_node+0x1dc/0x238
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FE40] [C00000000046A110] .vio_bus_init+0x98/0xc4
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FEC0] [C000000000460518] .init+0x250/0x388
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FF90] [C000000000026FA8] .kernel_thread+0x4c/0x68
Mar 13 15:06:32 ibm-js21-01 kernel: vio_register_device_node: failed to register device 4003
Mar 13 15:06:32 ibm-js21-01 kernel: IBM eBus Device Driver
Mar 13 15:06:32 ibm-js21-01 kernel: scan-log-dump not implemented on this system
Mar 13 15:06:33 ibm-js21-01 kernel: audit: initializing netlink socket (disabled)
Mar 13 15:06:33 ibm-js21-01 kernel: audit(1205435168.164:1): initialized
Mar 13 15:06:33 ibm-js21-01 kernel: Total HugeTLB memory allocated, 0
Mar 13 15:06:33 ibm-js21-01 kernel: VFS: Disk quotas dquot_6.5.1



Comment 10 RHEL Program Management 2008-07-09 20:01:58 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 14 Mike Gahagan 2008-08-14 21:35:20 UTC
Still seeing this with the -104 kernel:

Aug 14 08:06:02 ibm-js21-01 sdpd[1979]: Bluetooth SDP daemon
Aug 14 08:06:02 ibm-js21-01 kernel: kobject_add failed for 4003 with -EEXIST, don't try to register things with the same name in the same directory.
Aug 14 08:06:02 ibm-js21-01 kernel: Call Trace:
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FBA0] [C0000000000103E0] .show_stack+0x68/0x1b0 (unreliable)
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FC40] [C0000000001BA540] .kobject_add+0x1a4/0x1fc
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FCE0] [C000000000284FE4] .device_add+0x88/0x3b0
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FD90] [C0000000000221E0] .vio_register_device_node+0x1dc/0x238
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FE40] [C00000000046A110] .vio_bus_init+0x98/0xc4
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FEC0] [C000000000460518] .init+0x250/0x388
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FF90] [C0000000000271BC] .kernel_thread+0x4c/0x68
Aug 14 08:06:02 ibm-js21-01 kernel: vio_register_device_node: failed to register device 4003
Aug 14 08:06:02 ibm-js21-01 kernel: IBM eBus Device Driver
Aug 14 08:06:02 ibm-js21-01 pcscd: pcscdaemon.c:507:main() pcsc-lite 1.4.4 daemon ready.
Aug 14 08:06:02 ibm-js21-01 kernel: scan-log-dump not implemented on this system
Aug 14 08:06:02 ibm-js21-01 hidd[2058]: Bluetooth HID daemon

This appears to be the same system I reported the problem on back in March.

Comment 15 Aristeu Rozanski 2008-08-29 13:48:26 UTC
(In reply to comment #14)
> Still seeing this with the -104 kernel:
> This appears to be the same system I reported the problem on back in March.
No. It's the same message but from a different source. This message means
someone tried to register a sysfs entry that already exists. The culprit in
this bug is the vt code. The problem you reported appears to be in the vio code.

Comment 17 Don Zickus 2008-09-10 20:12:46 UTC
in kernel-2.6.18-110.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 19 Ryan Lerch 2008-11-06 02:16:50 UTC
This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3
Release Notes.

To aid in the development of relevant and accurate release notes, please fill
out the "Release Notes" field above with the following 4 pieces of information:


Cause:   What actions or circumstances cause this bug to present.

Consequence:  What happens when the bug presents.

Fix:   What was done to fix the bug.

Result:  What now happens when the actions or circumstances above occur. (NB:
this is not the same as 'the bug doesn't present anymore')

Comment 22 errata-xmlrpc 2009-01-20 20:07:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

