Bug 505284 - kobject_add failed for i5000_edac with -EEXIST, don't try to register things with the same name in the same directory. [NEEDINFO]
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Aristeu Rozanski
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 533192
Reported: 2009-06-11 06:48 EDT by yeylon@redhat.com
Modified: 2016-04-18 02:20 EDT (History)
Doc Type: Bug Fix
Last Closed: 2014-06-02 09:20:15 EDT
pm-rhel: needinfo? (yeylon)


Attachments
more information of the host (19.83 KB, application/gzip)
2009-06-11 06:48 EDT, yeylon@redhat.com
Sosreport generated on one of affected hosts (14.16 MB, application/x-bzip2)
2009-12-28 05:30 EST, Bozidar Kenig
Increment mc instance counter (1.24 KB, patch)
2010-11-30 06:51 EST, Mauro Carvalho Chehab


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Legacy) 45188 None None None Never

Description yeylon@redhat.com 2009-06-11 06:48:07 EDT
Created attachment 347372 [details]
more information of the host

Description of problem:

I'm running an oVirt node based on RHEL 5.4 and got the following error in dmesg during boot (see below).

host info:

Red Hat Enterprise Virtualization Hypervisor release 5.4-1.beta4.el5rhev

[root@green-vdsb ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2664.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5319.99
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 1998.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5320.00
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:



Additional info:

EDAC MC: Ver: 2.0.1 Jun  3 2009
i5000_edac: waiting for edac_mc to get alive
EDAC MC0: Giving out device to i5000_edac.c I5000: DEV 0000:00:10.0
kobject_add failed for i5000_edac with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8014f522>] kobject_add+0x170/0x19b
 [<ffffffff8014ea52>] cmp_ex+0x0/0x10
 [<ffffffff8014f656>] kobject_register+0x20/0x39
 [<ffffffff80041618>] load_module+0x16b9/0x1a19
 [<ffffffff800a05aa>] autoremove_wake_function+0x0/0x2e
 [<ffffffff801565f9>] pci_bus_read_config_byte+0x0/0x72
 [<ffffffff801293fb>] task_has_capability+0x54/0x60
 [<ffffffff800a05aa>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800a68ac>] sys_init_module+0x4d/0x1e8
 [<ffffffff8005e116>] system_call+0x7e/0x83
Comment 1 yeylon@redhat.com 2009-06-11 09:35:39 EDT
Not sure whether it has anything to do with the device, but...

The guests running on this host are much slower than guests on other hosts.

Installation of a RHEL 5.3 client took more than 2 hours (single guest on the host).

Using iSCSI storage.
Comment 2 Perry Myers 2009-06-11 10:15:48 EDT
Yaniv, how frequently do you see this dmesg trace come up?  Does it happen on every boot?  Or just some percentage of boots?  Does it ever pop up again during system operation or only during the boot process?
Comment 3 Perry Myers 2009-06-11 10:27:08 EDT
Also, please provide sosreport when filing bugs like this.  Thanks!
Comment 5 yeylon@redhat.com 2009-06-11 17:42:49 EDT
(In reply to comment #2)
> Yaniv, what is the frequency that you see this dmesg trace come up?  Does it
> happen on every boot?  Or just some percentage of boots?  Does it ever pop up
> again during system operation or only on the boot process?  

I've seen it once, after a reboot.
As written in comment #4, it probably will not occur on every boot.

One thing to note: I was playing with this host before prarit@redhat.com started, and I removed the module using "rmmod i5000_edac".
Comment 8 Rainer Traut 2009-08-17 05:38:58 EDT
I have seen this problem on a bunch of Dell Servers: PE2950 and PE1950. It started to happen with Kernel 2.6.18-128.4.1.

So this is _not_ a 5.4 Beta problem, it is a 5.3 problem.

And yes - it does not occur on every boot.


PE1950 for example:

# uname -a
Linux xxx 2.6.18-128.4.1.el5 #1 SMP Tue Aug 4 20:19:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux 

dmesg part:
EDAC MC: Ver: 2.0.1 Aug  4 2009
i5000_edac: waiting for edac_mc to get alive
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 1:0:0:0: Attached scsi CD-ROM sr0
EDAC MC0: Giving out device to i5000_edac.c I5000: DEV 0000:00:10.0
kobject_add failed for i5000_edac with -EEXIST, don't try to register things with the same name in the same directory.
 
Call Trace:
 [<ffffffff80149776>] kobject_add+0x170/0x19b
 [<ffffffff80148ca6>] cmp_ex+0x0/0x10
 [<ffffffff801498aa>] kobject_register+0x20/0x39
 [<ffffffff80040d07>] load_module+0x16b9/0x1a19
 [<ffffffff80150877>] pci_bus_read_config_byte+0x0/0x72
 [<ffffffff8000de6e>] do_mmap_pgoff+0x66c/0x7d7
 [<ffffffff8009dbae>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800a3ead>] sys_init_module+0x4d/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83
Comment 9 Rainer Traut 2009-08-17 05:50:18 EDT
And here part of dmesg on a Dell PE2950:

# uname -a
Linux xxx 2.6.18-128.4.1.el5 #1 SMP Thu Jul 23 19:59:19 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux 

EDAC MC: Ver: 2.0.1 Aug  4 2009
i5000_edac: waiting for edac_mc to get alive
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 1:0:0:0: Attached scsi CD-ROM sr0
EDAC MC0: Giving out device to i5000_edac.c I5000: DEV 0000:00:10.0
kobject_add failed for i5000_edac with -EEXIST, don't try to register things with the same name in the same directory.
 
Call Trace:
 [<ffffffff80149776>] kobject_add+0x170/0x19b
 [<ffffffff80148ca6>] cmp_ex+0x0/0x10
 [<ffffffff801498aa>] kobject_register+0x20/0x39
 [<ffffffff80040d07>] load_module+0x16b9/0x1a19
 [<ffffffff80150877>] pci_bus_read_config_byte+0x0/0x72
 [<ffffffff8000de6e>] do_mmap_pgoff+0x66c/0x7d7
 [<ffffffff8009dbae>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800a3ead>] sys_init_module+0x4d/0x1e8
 [<ffffffff8005d116>] system_call+0x7e/0x83
Comment 10 Prarit Bhargava 2009-08-17 09:49:36 EDT
Rainer, can you give me an estimate of how often this happens with 128.4.1 ?

I have a dell-pe1950 series system here in my office that I can do a reboot test with but I'd like to know how often the bug occurs.

Also, can you run sosreport and attach the output to this bugzilla?

Thanks,

P.
Comment 11 Rainer Traut 2009-08-18 04:19:20 EDT
Prarit, sadly no, I do not have numbers how often it occurs.

I have two identical PE2950 - identical HW, Bios + Firmwares, identical Kernel, clean ESM log:

- when I rebooted both with 128.4.1 one showed the trace, one did not.
- I then rebooted the problem server twice over the weekend, both times it was ok, so neither PE2950 shows the error anymore
- the reboot was done in accordance with Dell support, whom I talked to

And I have two identical PE1950 - same story:

- but one of them is still running with the error message, I saw the kernel message too late to reboot it over the weekend

My spare servers to test are older generations and I guess they don't have the Intel 5000 chipset.

But I guess you should be able to reproduce it by rebooting your PE1950.
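
The "how often" question above could be answered from a saved log rather than from memory. The following is a minimal sketch, assuming each boot in the log starts with the kernel's "Linux version" banner; the function name `boots_with_failure` and the abbreviated sample lines are hypothetical and only illustrate the counting.

```python
def boots_with_failure(log_lines, needle="kobject_add failed for i5000_edac"):
    """Return (boots seen, boots whose log contains the needle)."""
    boots = 0
    affected = 0
    hit_in_current_boot = False
    for line in log_lines:
        if "Linux version" in line:
            # Assumed boot marker: the kernel banner printed at startup.
            boots += 1
            hit_in_current_boot = False
        elif needle in line and boots and not hit_in_current_boot:
            # Count each affected boot once, even if the message repeats.
            affected += 1
            hit_in_current_boot = True
    return boots, affected


# Abbreviated, made-up sample: two boots, only the first shows the failure.
sample = [
    "kernel: Linux version 2.6.18-128.4.1.el5",
    "kernel: kobject_add failed for i5000_edac with -EEXIST",
    "kernel: Linux version 2.6.18-128.4.1.el5",
    "kernel: EDAC MC0: Giving out device to i5000_edac.c I5000: DEV 0000:00:10.0",
]
result = boots_with_failure(sample)
print(result)
```

Fed the lines of a real log file, the same function would give a rough affected-boots ratio for one host.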
Comment 12 Mauro Carvalho Chehab 2009-08-18 15:38:16 EDT
Judging by where it happens, this looks more like a sysfs-related bug than an i5000_edac one. Anyway, I've investigated the changes to load_module and to the i5000_edac driver. The last ones occurred in 2.6.18-106.el5, and they don't seem to be related to this issue.

To proceed with the analysis of this bug, we need the sosreport to better understand what happened inside the kernel.
Comment 13 Bozidar Kenig 2009-12-28 05:30:18 EST
Created attachment 380613 [details]
Sosreport generated on one of affected hosts

Attached is a sosreport generated on a host that produced the same bug last week:

kobject_add failed for i5000_edac with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff8014ebaf>] kobject_add+0x170/0x19b
 [<ffffffff8014e0e6>] cmp_ex+0x0/0x10
 [<ffffffff8014ece3>] kobject_register+0x20/0x39
 [<ffffffff80041306>] load_module+0x1692/0x19f0
 [<ffffffff80155cbf>] pci_bus_read_config_byte+0x0/0x72
 [<ffffffff8000e139>] do_mmap_pgoff+0x66c/0x7d7
 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800a6183>] sys_init_module+0x4d/0x1f2
 [<ffffffff8005d116>] system_call+0x7e/0x83

The md5sum of generated sosreport is: d603817ceeea97b11ed0dd66da9a1b20

Best regards.
Comment 15 Mauro Carvalho Chehab 2010-11-30 06:51:58 EST
Created attachment 463702 [details]
Increment mc instance counter

Does this machine have more than one memory controller? If so, the enclosed patch should fix the issue.
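
To illustrate the question above, here is a minimal sketch of why a second controller could collide with the first and how a per-instance counter would avoid it. It is not the attached patch: the `Directory` class, the two register functions, and the names `mc` / `mcN` are hypothetical stand-ins for the rule quoted in the kernel message (one name per directory).

```python
import errno


class Directory:
    """Toy stand-in for a sysfs directory: names in it must be unique."""

    def __init__(self):
        self.entries = set()

    def add(self, name):
        # Mirrors the rule in the kernel message: a second object with
        # the same name in the same directory is rejected with -EEXIST.
        if name in self.entries:
            return -errno.EEXIST
        self.entries.add(name)
        return 0


def register_fixed_name(directory, controllers):
    # Hypothetical failing case: every controller uses the same name.
    return [directory.add("mc") for _ in range(controllers)]


def register_with_counter(directory, controllers):
    # Hypothetical fix: each controller gets its own instance index.
    return [directory.add(f"mc{i}") for i in range(controllers)]


fixed = register_fixed_name(Directory(), 2)
counted = register_with_counter(Directory(), 2)
print(fixed)    # first registration succeeds, second returns -EEXIST
print(counted)  # both registrations succeed
```

On a machine with a single memory controller this path would never collide, which is why the answer to the question matters.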
Comment 26 yeylon@redhat.com 2011-06-23 02:13:08 EDT
Just came across this issue on a RHEL 5.7 host.

I'm using an IBM blade server that was upgraded from 5.6, and this issue popped up again.

I'm using Linux green-vdsd 2.6.18-268.el5 #1 SMP Tue Jun 14 18:24:50 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux


kobject_add failed for usbdev2.3_ep81 with -EEXIST, don't try to register things with the same name in the same directory.

Call Trace:
 [<ffffffff801552b1>] kobject_add+0x166/0x191
 [<ffffffff801cb15b>] device_add+0x85/0x372
 [<ffffffff80200f1a>] usb_create_ep_files+0x137/0x19a
 [<ffffffff802009b6>] usb_create_sysfs_intf_files+0x80/0x93
 [<ffffffff801fe288>] usb_set_configuration+0x3aa/0x3d9
 [<ffffffff801f9f9b>] usb_new_device+0x253/0x2c4
 [<ffffffff801fb0d3>] hub_thread+0x742/0xb01
 [<ffffffff800a2e52>] autoremove_wake_function+0x0/0x2e
 [<ffffffff801fa991>] hub_thread+0x0/0xb01
 [<ffffffff800a2c3a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032722>] kthread+0xfe/0x132
 [<ffffffff8009f864>] request_module+0x0/0x14d
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff800a2c3a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032624>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
Comment 27 Niels de Vos 2011-06-23 04:53:41 EDT
Hi Yaniv,

The stack trace you posted does not seem to be the same as the i5000_edac one. You are now likely hitting Bug 703084.
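
One quick way to tell the two cases apart is to look at the object name in the failure line rather than at the shared error text. This is a minimal sketch, assuming the message format quoted in this report; the helper names and the grouping into "edac" and "usb" are hypothetical.

```python
import re

# Matches the kernel message quoted throughout this report.
KOBJECT_FAIL = re.compile(r"kobject_add failed for (\S+) with -EEXIST")


def failed_kobjects(log_text):
    """Return the object names from every kobject_add -EEXIST line."""
    return KOBJECT_FAIL.findall(log_text)


def classify(name):
    # Hypothetical grouping: the EDAC module from this bug versus
    # USB endpoint names such as usbdev2.3_ep81.
    if name.endswith("_edac"):
        return "edac"
    if name.startswith("usbdev"):
        return "usb"
    return "other"


sample = (
    "kobject_add failed for i5000_edac with -EEXIST, don't try to register "
    "things with the same name in the same directory.\n"
    "kobject_add failed for usbdev2.3_ep81 with -EEXIST, don't try to register "
    "things with the same name in the same directory.\n"
)
names = failed_kobjects(sample)
kinds = [classify(name) for name in names]
for name, kind in zip(names, kinds):
    print(name, kind)
```

The call traces differ as well (load_module versus usb_create_ep_files), but the name alone is usually enough for a first sort.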
Comment 28 Kai Schaetzl 2011-06-27 12:30:12 EDT
Just hit this problem on CentOS 5.6 when updating to the latest xen x64 kernel (2.6.18-238.12.1.el5xen). It appeared on the reboot of both Dell R200s we have. They are identical hardware, only different CPUs. I've never seen this error before. USB controller is identified as Intel Corporation 82801I (ICH9 Family), three USB and one USB2.

Jun 26 22:59:07 c4 kernel: kobject_add failed for usbdev1.2_ep81 with -EEXIST, don't try to register things with the same name in the same directory.
Jun 26 22:59:07 c4 kernel:
Jun 26 22:59:07 c4 kernel: Call Trace:
Jun 26 22:59:07 c4 kernel:  [<ffffffff803457a0>] kobject_add+0x166/0x191
Jun 26 22:59:07 c4 kernel:  [<ffffffff803af893>] device_add+0x85/0x372
Jun 26 22:59:07 c4 kernel:  [<ffffffff803f03f6>] usb_create_ep_files+0x137/0x19a
Jun 26 22:59:07 c4 kernel:  [<ffffffff80478406>] klist_add_tail+0x35/0x42
Jun 26 22:59:07 c4 kernel:  [<ffffffff803efe92>] usb_create_sysfs_intf_files+0x80/0x93
Jun 26 22:59:07 c4 kernel:  [<ffffffff803ed710>] usb_set_configuration+0x3aa/0x3d9
Jun 26 22:59:07 c4 kernel:  [<ffffffff803e9408>] usb_new_device+0x253/0x2c4
Jun 26 22:59:07 c4 kernel:  [<ffffffff803ea554>] hub_thread+0x74e/0xb11
Jun 26 22:59:07 c4 kernel:  [<ffffffff8029e04c>] autoremove_wake_function+0x0/0x2e
Jun 26 22:59:07 c4 kernel:  [<ffffffff803e9e06>] hub_thread+0x0/0xb11
Jun 26 22:59:07 c4 kernel:  [<ffffffff8029de34>] keventd_create_kthread+0x0/0xc4
Jun 26 22:59:07 c4 kernel:  [<ffffffff80233dee>] kthread+0xfe/0x132
Jun 26 22:59:07 c4 kernel:  [<ffffffff80260b2c>] child_rip+0xa/0x12
Jun 26 22:59:07 c4 kernel:  [<ffffffff8029de34>] keventd_create_kthread+0x0/0xc4
Jun 26 22:59:07 c4 kernel:  [<ffffffff80233cf0>] kthread+0x0/0x132
Jun 26 22:59:07 c4 kernel:  [<ffffffff80260b22>] child_rip+0x0/0x12

I can't view bug 703084, access denied.
Comment 30 Patti Clark 2011-07-25 08:53:30 EDT
We are an all-Dell shop and have been running into this error off and on since 5.6.  I currently have one of three identically configured SC1435 systems running 2.6.18-238.19.1.el5 #1 SMP Sun Jul 10 08:43:41 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux that generates the error consistently on boot and sporadically at other times, as seen in the following:

Jul 24 15:05:16 app1 kernel: usb 1-3: USB disconnect, address 6
Jul 24 15:05:16 app1 kernel: usb 1-3: new high speed USB device using ehci_hcd and address 7
Jul 24 15:05:16 app1 kernel: usb 1-3: configuration #1 chosen from 1 choice
Jul 24 15:05:16 app1 kernel: hub 1-3:1.0: USB hub found
Jul 24 15:05:16 app1 kernel: hub 1-3:1.0: 2 ports detected
Jul 24 15:05:16 app1 kernel: kobject_add failed for usbdev1.7_ep81 with -EEXIST, don't try to register things with the same name in the same directory.
Jul 24 15:05:16 app1 kernel: 
Jul 24 15:05:16 app1 kernel: Call Trace:
Jul 24 15:05:16 app1 kernel:  [<ffffffff8015440c>] kobject_add+0x166/0x191
Jul 24 15:05:16 app1 kernel:  [<ffffffff801ca179>] device_add+0x85/0x372
Jul 24 15:05:16 app1 kernel:  [<ffffffff801ffeb6>] usb_create_ep_files+0x137/0x19a
Jul 24 15:05:16 app1 kernel:  [<ffffffff801ff952>] usb_create_sysfs_intf_files+0x80/0x93
Jul 24 15:05:16 app1 kernel:  [<ffffffff801fd224>] usb_set_configuration+0x3aa/0x3d9
Jul 24 15:05:16 app1 kernel:  [<ffffffff801f8f3c>] usb_new_device+0x253/0x2c4
Jul 24 15:05:16 app1 kernel:  [<ffffffff801fa074>] hub_thread+0x742/0xb01
Jul 24 15:05:16 app1 kernel:  [<ffffffff800a28fb>] autoremove_wake_function+0x0/0x2e
Jul 24 15:05:16 app1 kernel:  [<ffffffff801f9932>] hub_thread+0x0/0xb01
Jul 24 15:05:16 app1 kernel:  [<ffffffff800a26e3>] keventd_create_kthread+0x0/0xc4
Jul 24 15:05:16 app1 kernel:  [<ffffffff80032b26>] kthread+0xfe/0x132
Jul 24 15:05:16 app1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jul 24 15:05:16 app1 kernel:  [<ffffffff800a26e3>] keventd_create_kthread+0x0/0xc4
Jul 24 15:05:16 app1 kernel:  [<ffffffff801caf0f>] klist_drivers_get+0x0/0xc
Jul 24 15:05:16 app1 kernel:  [<ffffffff80032a28>] kthread+0x0/0x132
Jul 24 15:05:16 app1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Comment 34 RHEL Product and Program Management 2014-03-07 07:16:18 EST
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in the last planned RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX. To request that Red Hat re-consider this request, please re-open the bugzilla via appropriate support channels and provide additional business and/or technical details about its importance to you.
Comment 35 RHEL Product and Program Management 2014-06-02 09:20:15 EDT
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in the RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).
