Bug 239604 - [RHEL5] console: kobject_add failed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Aristeu Rozanski
desktop-bugs@redhat.com
Blocks: 425461
Reported: 2007-05-09 16:35 EDT by Alan Cox
Modified: 2009-01-20 15:07 EST
Doc Type: Bug Fix
Last Closed: 2009-01-20 15:07:32 EST

Attachments:
test case (2.43 KB, text/x-csrc), 2007-08-31 16:00 EDT, Aristeu Rozanski

Description Alan Cox 2007-05-09 16:35:37 EDT
Switched from graphical to text console early in rhgb and got this

kobject_add failed for vcs1 with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8013be70>] kobject_add+0x16e/0x199
 [<ffffffff801a4302>] class_device_add+0xaf/0x44b
 [<ffffffff801a4788>] class_device_create+0xd8/0x107
 [<ffffffff8002dd9b>] __wake_up+0x38/0x4f
 [<ffffffff800468f7>] init_dev+0x3d4/0x51d
 [<ffffffff801894be>] vcs_make_devfs+0x2d/0x5b
 [<ffffffff8018e898>] con_open+0x7c/0x8b
 [<ffffffff80186bce>] tty_open+0x1d9/0x3af
 [<ffffffff8004721b>] chrdev_open+0x14d/0x183
 [<ffffffff800470ce>] chrdev_open+0x0/0x183
 [<ffffffff8001df1e>] __dentry_open+0xd9/0x1dc
 [<ffffffff80026f3b>] do_filp_open+0x2a/0x38
 [<ffffffff80019373>] do_sys_open+0x44/0xbe
 [<ffffffff8005b14e>] system_call+0x7e/0x83

kobject_add failed for vcsa1 with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8013be70>] kobject_add+0x16e/0x199
 [<ffffffff801a4302>] class_device_add+0xaf/0x44b
 [<ffffffff801a4788>] class_device_create+0xd8/0x107
 [<ffffffff8002dd9b>] __wake_up+0x38/0x4f
 [<ffffffff800468f7>] init_dev+0x3d4/0x51d
 [<ffffffff8018e898>] con_open+0x7c/0x8b
 [<ffffffff80186bce>] tty_open+0x1d9/0x3af
 [<ffffffff8004721b>] chrdev_open+0x14d/0x183
 [<ffffffff800470ce>] chrdev_open+0x0/0x183
 [<ffffffff8001df1e>] __dentry_open+0xd9/0x1dc
 [<ffffffff80026f3b>] do_filp_open+0x2a/0x38
 [<ffffffff80019373>] do_sys_open+0x44/0xbe
 [<ffffffff8005b14e>] system_call+0x7e/0x83
Comment 1 Don Zickus 2007-08-23 14:54:31 EDT
Same thing noticed recently in some kernel testing.

http://rhts.lab.boston.redhat.com/cgi-bin/rhts/test_log.cgi?id=565497
(at the very bottom)

setting the flag for some attention
Comment 2 Aristeu Rozanski 2007-08-31 16:00:02 EDT
Created attachment 183881 [details]
test case

it's a good idea to run this testcase using ssh, making sure X is running on
tty7 and killing it before starting the test:
killall Xorg; ./vcs_stress
Comment 3 Aristeu Rozanski 2007-08-31 16:01:05 EDT
OK, finally found what's going on.
In release_dev(), the driver's close() function is called without holding any
locks and with the current tty->count. con_close() (console driver) checks
tty->count to decide whether it needs to call vcs_remove_devfs().
If there are two simultaneous closes for the same tty, the last two processes
can call con_close() at almost the same time, so both see tty->count == 2
and vcs_remove_devfs() is never called. The next time the terminal is opened,
vcs_make_devfs() is called and tries to add an entry that already exists.

I'm still looking for a correct way to fix this problem. The entire tty->count
thing looks horribly broken.
Comment 4 Alan Cox 2007-08-31 16:49:01 EDT
tty->count should only be touched under lock.
Comment 5 Aristeu Rozanski 2007-08-31 17:18:21 EDT
yes. and con_close() grabs the lock before using it. the problem is that there's
no atomicity between driver->close() and the actual decrement of tty->count in
release_dev().

release_dev():
   console_driver->close()
         grab console mutex
         grab tty_mutex
         check tty->count
         release tty_mutex
         release console mutex
   grab tty_mutex
   (later on)
   tty->count--

thread 1                           thread 2
grabs BKL
release_dev()
console_driver->con_close()
sleeps waiting console mutex,
   releases BKL
                                   grabs BKL
                                   release_dev()
                                   console_driver->con_close()
                                   sleeps waiting console mutex
grabs console mutex, tty_mutex
tty->count == 2
returns to release_dev
sleeps waiting for tty_mutex
                                   grabs console mutex
                                   gets tty_mutex first for some reason
                                   tty->count still 2
                                   releases both mutexes
grabs tty_mutex
decrements the tty->count
...

the tty release finished releasing the resources but vcs_remove_devfs never
gets called.
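
The interleaving above can be replayed in a small user-space model. This is only a sketch: struct model_tty and all model_* names are made up for illustration and are not kernel code, and the assumption that con_close() removes the vcs entries only when it sees tty->count == 1 is taken from the description in comment 3. The second function shows what happens when the check and the decrement are one indivisible step; that is one possible direction, not necessarily the fix that was shipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the tty state discussed above. */
struct model_tty {
	int count;            /* models tty->count */
	bool vcs_registered;  /* true while the vcs/vcsa entries exist */
};

/* Models con_close(): drop the vcs entries only for the last user. */
static void model_con_close(struct model_tty *tty)
{
	if (tty->count == 1)
		tty->vcs_registered = false;
}

/* Models the later "tty->count--" step in release_dev(). */
static void model_count_dec(struct model_tty *tty)
{
	tty->count--;
}

/* Timeline from the comment: both closers check before either decrements. */
static bool racy_close_leaves_entry(void)
{
	struct model_tty tty = { .count = 2, .vcs_registered = true };

	model_con_close(&tty);  /* thread 1 sees count == 2 */
	model_con_close(&tty);  /* thread 2 sees count == 2 */
	model_count_dec(&tty);  /* thread 1 */
	model_count_dec(&tty);  /* thread 2 */
	return tty.vcs_registered;
}

/* Same two closers, but check + decrement happen as one step per closer. */
static bool atomic_close_leaves_entry(void)
{
	struct model_tty tty = { .count = 2, .vcs_registered = true };

	for (int closer = 0; closer < 2; closer++) {
		model_con_close(&tty);
		model_count_dec(&tty);
	}
	return tty.vcs_registered;
}

void race_model_selfcheck(void)
{
	/* Racy order: the entry survives, so the next open would hit -EEXIST. */
	assert(racy_close_leaves_entry());
	/* Atomic order: the last closer removes the entry. */
	assert(!atomic_close_leaves_entry());
}
```

Calling race_model_selfcheck() from a main() exercises both orderings. The model is sequential on purpose, so the bad interleaving is reproduced every time instead of depending on scheduler timing.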
Comment 6 RHEL Product and Program Management 2008-02-01 17:41:49 EST
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 8 Mike Gahagan 2008-03-13 19:40:45 EDT
I'm seeing some similar messages on a ppc system in RHTS running the -85 kernel
on a 5.1 distro:


Mar 13 15:06:32 ibm-js21-01 kernel: TCP: Hash tables configured (established 131072 bind 65536)
Mar 13 15:06:32 ibm-js21-01 hcid[1909]: Bluetooth HCI daemon
Mar 13 15:06:32 ibm-js21-01 kernel: TCP reno registered
Mar 13 15:06:32 ibm-js21-01 hcid[1909]: Register path:/org/bluez fallback:1
Mar 13 15:06:32 ibm-js21-01 sdpd[1913]: Bluetooth SDP daemon
Mar 13 15:06:32 ibm-js21-01 pcscd: pcscdaemon.c:464:main() pcsc-lite 1.3.1 daemon ready.
Mar 13 15:06:32 ibm-js21-01 kernel: kobject_add failed for 4003 with -EEXIST, don't try to register things with the same name in the same directory.
Mar 13 15:06:32 ibm-js21-01 kernel: Call Trace:
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FBA0] [C000000000010378] .show_stack+0x68/0x1b0 (unreliable)
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FC40] [C0000000001B6CF8] .kobject_add+0x1a4/0x1fc
Mar 13 15:06:32 ibm-js21-01 hidd[1992]: Bluetooth HID daemon
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FCE0] [C0000000002815FC] .device_add+0x88/0x3b0
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FD90] [C000000000021FCC] .vio_register_device_node+0x1dc/0x238
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FE40] [C00000000046A110] .vio_bus_init+0x98/0xc4
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FEC0] [C000000000460518] .init+0x250/0x388
Mar 13 15:06:32 ibm-js21-01 kernel: [C00000000FF0FF90] [C000000000026FA8] .kernel_thread+0x4c/0x68
Mar 13 15:06:32 ibm-js21-01 kernel: vio_register_device_node: failed to register device 4003
Mar 13 15:06:32 ibm-js21-01 kernel: IBM eBus Device Driver
Mar 13 15:06:32 ibm-js21-01 kernel: scan-log-dump not implemented on this system
Mar 13 15:06:33 ibm-js21-01 kernel: audit: initializing netlink socket (disabled)
Mar 13 15:06:33 ibm-js21-01 kernel: audit(1205435168.164:1): initialized
Mar 13 15:06:33 ibm-js21-01 kernel: Total HugeTLB memory allocated, 0
Mar 13 15:06:33 ibm-js21-01 kernel: VFS: Disk quotas dquot_6.5.1

Comment 10 RHEL Product and Program Management 2008-07-09 16:01:58 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 14 Mike Gahagan 2008-08-14 17:35:20 EDT
Still seeing this with the -104 kernel:

Aug 14 08:06:02 ibm-js21-01 sdpd[1979]: Bluetooth SDP daemon
Aug 14 08:06:02 ibm-js21-01 kernel: kobject_add failed for 4003 with -EEXIST, don't try to register things with the same name in the same directory.
Aug 14 08:06:02 ibm-js21-01 kernel: Call Trace:
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FBA0] [C0000000000103E0] .show_stack+0x68/0x1b0 (unreliable)
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FC40] [C0000000001BA540] .kobject_add+0x1a4/0x1fc
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FCE0] [C000000000284FE4] .device_add+0x88/0x3b0
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FD90] [C0000000000221E0] .vio_register_device_node+0x1dc/0x238
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FE40] [C00000000046A110] .vio_bus_init+0x98/0xc4
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FEC0] [C000000000460518] .init+0x250/0x388
Aug 14 08:06:02 ibm-js21-01 kernel: [C00000000FF1FF90] [C0000000000271BC] .kernel_thread+0x4c/0x68
Aug 14 08:06:02 ibm-js21-01 kernel: vio_register_device_node: failed to register device 4003
Aug 14 08:06:02 ibm-js21-01 kernel: IBM eBus Device Driver
Aug 14 08:06:02 ibm-js21-01 pcscd: pcscdaemon.c:507:main() pcsc-lite 1.4.4 daemon ready.
Aug 14 08:06:02 ibm-js21-01 kernel: scan-log-dump not implemented on this system
Aug 14 08:06:02 ibm-js21-01 hidd[2058]: Bluetooth HID daemon

This appears to be the same system I reported the problem on back in March.
Comment 15 Aristeu Rozanski 2008-08-29 09:48:26 EDT
(In reply to comment #14)
> Still seeing this with the -104 kernel:
> This appears to be the same system I reported the problem on back in March.
No. It's the same message, but from a different source. This message means someone
tried to register a sysfs entry that already exists. The culprit in this bug is the
vt code. The problem you reported appears to be in the vio code.
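
To illustrate what "register a sysfs entry that already exists" amounts to, here is a toy directory that refuses duplicate names. It is only a sketch under stated assumptions: struct toy_dir and the toy_dir_* functions are hypothetical and do not reflect how kobject_add() or sysfs are implemented. Only the -EEXIST result for a duplicate name is taken from the messages in this bug.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8
#define TOY_NAME_LEN 16

/* Hypothetical flat directory: a fixed table of unique names. */
struct toy_dir {
	char names[TOY_MAX_ENTRIES][TOY_NAME_LEN];
	int used;
};

/* Returns the slot holding name, or -1 if it is not registered. */
static int toy_dir_find(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->used; i++)
		if (strcmp(dir->names[i], name) == 0)
			return i;
	return -1;
}

/* Registers name; a second registration of the same name is refused. */
static int toy_dir_add(struct toy_dir *dir, const char *name)
{
	if (toy_dir_find(dir, name) >= 0)
		return -EEXIST;
	if (dir->used == TOY_MAX_ENTRIES || strlen(name) >= TOY_NAME_LEN)
		return -ENOSPC;
	strcpy(dir->names[dir->used++], name);
	return 0;
}

/* Unregisters name by moving the last entry into the freed slot. */
static int toy_dir_remove(struct toy_dir *dir, const char *name)
{
	int slot = toy_dir_find(dir, name);

	if (slot < 0)
		return -ENOENT;
	if (slot != dir->used - 1)
		strcpy(dir->names[slot], dir->names[dir->used - 1]);
	dir->used--;
	return 0;
}

void toy_dir_selfcheck(void)
{
	struct toy_dir dir = { .used = 0 };

	assert(toy_dir_add(&dir, "vcs1") == 0);
	/* A skipped removal followed by a new add is the failing case. */
	assert(toy_dir_add(&dir, "vcs1") == -EEXIST);
	/* With the removal in place, re-adding the name succeeds. */
	assert(toy_dir_remove(&dir, "vcs1") == 0);
	assert(toy_dir_add(&dir, "vcs1") == 0);
}
```

In this picture, the vt bug corresponds to a missed toy_dir_remove() before the next toy_dir_add(), while the vio report would be a different caller adding the same name twice.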
Comment 17 Don Zickus 2008-09-10 16:12:46 EDT
in kernel-2.6.18-110.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 19 Ryan Lerch 2008-11-05 21:16:50 EST
This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3
Release Notes.

To aid in the development of relevant and accurate release notes, please fill
out the "Release Notes" field above with the following 4 pieces of information:


Cause:   What actions or circumstances cause this bug to present.

Consequence:  What happens when the bug presents.

Fix:   What was done to fix the bug.

Result:  What now happens when the actions or circumstances above occur. (NB:
this is not the same as 'the bug doesn't present anymore')
Comment 22 errata-xmlrpc 2009-01-20 15:07:32 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
