Bug 713519 - [RHEL5.7] ses: kobject_add failed for ArrayDevice03 with -EEXIST
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.7
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: James Takahashi (IBM)
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 697486 57KnownIssue
Reported: 2011-06-15 16:22 UTC by Martin Wilck
Modified: 2011-11-21 10:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
On servers with SAS expanders/enclosures (e.g., Fujitsu PRIMERGY TX300S6), firmware may incorrectly report duplicate element names which results in the following warnings being logged: kobject_add failed for ArrayDevice03 with -EEXIST, don't try to register things with the same name in the same directory. Please see your firmware vendor for a fix for this issue. Otherwise, you may blacklist the 'ses' module to prevent it from loading.
Clone Of:
Environment:
Last Closed: 2011-06-17 23:44:01 UTC
Target Upstream Version:
Embargoed:


Attachments
log file with some information about the enclosure. (20.20 KB, text/plain)
2011-06-15 16:25 UTC, Martin Wilck

Description Martin Wilck 2011-06-15 16:22:37 UTC
Description of problem:

Multiple kernel messages when loading ses.ko:

kobject_add failed for ArrayDevice03 with -EEXIST, don't try to register
things with the same name in the same directory.

Call Trace:
 [<ffffffff80155274>] kobject_add+0x166/0x191
 [<ffffffff801cd822>] class_device_add+0xa6/0x422
 [<ffffffff80057a83>] kobject_get+0x12/0x17
 [<ffffffff88775727>] :enclosure:enclosure_component_register+0xa6/0xdb
 [<ffffffff88781a9b>] :ses:ses_intf_add+0x49f/0x680
 [<ffffffff801ce005>] class_interface_register+0x76/0xb5
 [<ffffffff8822100d>] :ses:ses_init+0xd/0x35
 [<ffffffff800a93a7>] sys_init_module+0xbd/0x206
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

It is highly likely (though not verified yet) that pulling a disk in the backplane will cause a kernel panic.

Version-Release number of selected component (if applicable):
2.6.18-266.el5

How reproducible:
always

Steps to Reproduce:
1. install on PRIMERGY TX300S6 with SAS expander (enclosure) backplane

Actual results:
see above

Expected results:
no error messages, no panic

Additional info:
See bug #619422 for the same problem on RHEL6. Under RHEL5.6 the problem didn't exist because ses support was lacking. A possible workaround is to disable the ses module.
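
One common way to disable a module is a `blacklist` line in the modprobe configuration. The following is a minimal sketch, not part of the original report, of how one might check whether such a line is present. The helper name `is_blacklisted` is hypothetical, and the configuration text is passed in as a string because the exact file location on a given system is an assumption left to the reader.

```python
def is_blacklisted(config_text, module="ses"):
    """Return True if config_text has an active 'blacklist <module>' line.

    Hypothetical helper for illustration; it only understands the plain
    'blacklist NAME' directive and '#' comments.
    """
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) == 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False


sample = """
# local overrides
blacklist ses
"""
print(is_blacklisted(sample))
```

A commented-out directive (`# blacklist ses`) is deliberately not counted, since modprobe would ignore it as well.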

Comment 1 Martin Wilck 2011-06-15 16:25:53 UTC
Created attachment 504903
log file with some information about the enclosure.

The interesting part is sg_ses -p 7 /dev/sg0:

Element descriptor In diagnostic page:
  generation code: 0x0
    Element type: Array device, subenclosure id: 0
    Overall descriptor: ArrayDevicesInSubEnclsr0
      Element 1 descriptor: ArrayDevice00
      Element 2 descriptor: ArrayDevice01
      Element 3 descriptor: ArrayDevice02
      Element 4 descriptor: ArrayDevice03
      Element 5 descriptor: ArrayDevice03
      Element 6 descriptor: ArrayDevice03
      Element 7 descriptor: ArrayDevice03
      Element 8 descriptor: ArrayDevice03
      Element 9 descriptor: ArrayDevice03
      Element 10 descriptor: ArrayDevice03
      Element 11 descriptor: ArrayDevice03
      Element 12 descriptor: ArrayDevice03

The firmware uses the same name for Elements 4 through 12. This could be a firmware bug, but the kernel should be able to handle it gracefully.
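
To check other enclosures for the same condition, the descriptor listing can be scanned for repeated names. This is a rough sketch that assumes the text layout shown in the excerpt above (`Element N descriptor: NAME`); the function name `find_duplicate_descriptors` is hypothetical, and the sketch treats the whole page as one namespace.

```python
import re
from collections import defaultdict

# Matches lines such as "      Element 4 descriptor: ArrayDevice03".
# "Overall descriptor:" lines do not match and are skipped.
DESCRIPTOR_RE = re.compile(r"^\s*Element (\d+) descriptor:\s*(.*\S)\s*$")


def find_duplicate_descriptors(page_text):
    """Map each repeated descriptor name to the element numbers using it."""
    seen = defaultdict(list)
    for line in page_text.splitlines():
        match = DESCRIPTOR_RE.match(line)
        if match:
            seen[match.group(2)].append(int(match.group(1)))
    return {name: nums for name, nums in seen.items() if len(nums) > 1}


excerpt = """\
    Overall descriptor: ArrayDevicesInSubEnclsr0
      Element 1 descriptor: ArrayDevice00
      Element 2 descriptor: ArrayDevice01
      Element 3 descriptor: ArrayDevice02
      Element 4 descriptor: ArrayDevice03
      Element 5 descriptor: ArrayDevice03
      Element 6 descriptor: ArrayDevice03
"""
print(find_duplicate_descriptors(excerpt))
```

For the shortened excerpt used here, the result maps `ArrayDevice03` to elements 4, 5 and 6; an empty dictionary would mean that every descriptor name is unique.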

Comment 2 Tom Coughlan 2011-06-15 21:11:59 UTC
From:

https://bugzilla.redhat.com/show_bug.cgi?id=619422#c39

"It's the fact that ses.ko can't handle non-unique element names."

Considering the short time remaining, we may need to find a way to disable ses by default in 5.7, or remove it and ship it as a DUP, or find some other alternative.
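
For illustration only, one way a consumer of these names could tolerate duplicates is to append a numeric suffix to any name that is already taken. The sketch below is not the ses.ko or enclosure code and makes no claim about how the kernel resolves this; the function name `uniquify` and the `-N` suffix scheme are assumptions chosen for the example.

```python
def uniquify(names):
    """Return names in order, suffixing repeats so every entry is unique.

    Hypothetical scheme: the first occurrence keeps its name, later
    occurrences become NAME-1, NAME-2, and so on.
    """
    used = set()
    result = []
    for name in names:
        candidate = name
        suffix = 1
        while candidate in used:  # keep trying until the name is free
            candidate = f"{name}-{suffix}"
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result


print(uniquify(["ArrayDevice02", "ArrayDevice03", "ArrayDevice03", "ArrayDevice03"]))
```

With names made unique this way, registering one entry per element in a single directory would no longer collide, at the cost of names that differ from what the firmware reported.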

Comment 3 James Takahashi (IBM) 2011-06-16 00:16:08 UTC
(In reply to comment #2)
> From:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=619422#c39
> 
> "It's the fact that ses.ko can't handle non-unique element names."

Could someone please enable IBM access to bz619422 so we can read the entire thread?  Thanks in advance.

Comment 4 Qian Cai 2011-06-16 02:38:35 UTC
Could this problem be similar to:
https://bugzilla.redhat.com/show_bug.cgi?id=703084

Comment 5 Martin Wilck 2011-06-16 09:28:25 UTC
The panic situation when a disk is pulled from the expander has *not* been reproduced on RHEL5.7.

Thus, as there is a workaround available (disable ses), and apparently no panic, we can accept this as a limitation.

Comment 6 Tom Coughlan 2011-06-16 19:17:16 UTC
Okay, so as I understand it, no code change for 5.7 is needed, but we will need a Technical Note to warn about the problem. Martin, would you be willing to draft that? 

James, if this is correct, then you can set the technical_note flag, and close this BZ.

Comment 7 James Takahashi (IBM) 2011-06-17 23:44:01 UTC
(In reply to comment #6)
> Okay, so as I understand it, no code change for 5.7 is needed, but we will need
> a Technical Note to warn about the problem. Martin, would you be willing to
> draft that? 
> 
> James, if this is correct, then you can set the technical_note flag, and close
> this BZ.

I added a tech note, and am closing per Tom's and peterm's recommendation.

Comment 8 James Takahashi (IBM) 2011-06-17 23:44:01 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
On Fujitsu PRIMERGY TX300S6 servers with SAS expander, firmware may incorrectly report duplicate element names which result in the following warnings being logged:
kobject_add failed for ArrayDevice03 with -EEXIST, don't try to register things with the same name in the same directory.
Please see your firmware vendor for a fix for this issue.  Otherwise, you may blacklist the 'ses' module to prevent it from loading.

Comment 9 Martin Wilck 2011-06-20 09:08:21 UTC
The problem is not limited to PRIMERGY TX300S6. It will occur on all PRIMERGYs with a SAS expander. Actually, it will occur on all SAS expanders with a certain LSI firmware, and there may be such expanders from vendors other than Fujitsu. I'd appreciate a modified release note like this:

On some servers with SAS expanders (enclosures), firmware may incorrectly report duplicate element names which result in the following warnings being logged:
kobject_add failed for ArrayDevice03 with -EEXIST, don't try to register things with the same name in the same directory.
This happens e.g. on Fujitsu PRIMERGY TX300S6.
Please see your firmware vendor for a fix for this issue.  Otherwise, you may blacklist the 'ses' module to prevent it from loading.

Comment 10 James Takahashi (IBM) 2011-06-21 00:44:56 UTC
Tweaking tech note per Martin's comment #9.

Comment 11 James Takahashi (IBM) 2011-06-21 00:44:56 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,3 +1,3 @@
-On Fujitsu PRIMERGY TX300S6 servers with SAS expander, firmware may incorrectly report duplicate element names which result in the following warnings being logged:
+On servers with SAS expanders/enclosures (e.g., Fujitsu PRIMERGY TX300S6), firmware may incorrectly report duplicate element names which results in the following warnings being logged:
 kobject_add failed for ArrayDevice03 with -EEXIST, don't try to register things with the same name in the same directory.
 Please see your firmware vendor for a fix for this issue.  Otherwise, you may blacklist the 'ses' module to prevent it from loading.

