RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 596496 - [RHEL6] mpt2sas returns ENXIO for enclosure_identifier
Summary: [RHEL6] mpt2sas returns ENXIO for enclosure_identifier
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: udev
Version: 6.0
Hardware: All
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Harald Hoyer
QA Contact: Karel Volný
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-05-26 20:05 UTC by Issue Tracker
Modified: 2018-11-14 19:10 UTC (History)
8 users (show)

Fixed In Version: udev-147-2.21.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-15 14:50:43 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
path_id patch to handle sas (5.79 KB, patch)
2010-07-08 12:46 UTC, Harald Hoyer
no flags Details | Diff

Description Issue Tracker 2010-05-26 20:05:55 UTC
Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2010-05-26 20:05:56 UTC
Event posted on 05-26-2010 04:02pm EDT by kbaxley

Identical to an issue reported by Fujitsu with RHEL6:

https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=885313

From: Jim Garlick <garlick>
To: bwoodard
Cc: behlendorf1, morrone2
Subject: Re: mpt2sas update
Date: 05/25/2010 06:29:04 PM


With the following patch in udev's path_id.c, I can get /dev/disk/by-path
populated (with "unknown" in the enclosure part of the path):

--- path_id.c.bak       2010-03-10 11:30:01.000000000 -0800
+++ path_id.c   2010-05-25 16:25:47.000000000 -0700
@@ -148,17 +148,13 @@
               return NULL;
       }
       enc = udev_device_get_sysattr_value(sasdev, "enclosure_identifier");
-       if (enc == NULL) {
-               parent = NULL;
-               goto out;
-       }
       bay = udev_device_get_sysattr_value(sasdev, "bay_identifier");
       if (bay == NULL) {
               parent = NULL;
               goto out;
       }

-       path_prepend(path, "sas-%s-%s", enc, bay);
+       path_prepend(path, "sas-%s-%s", enc ? enc : "unknown", bay);
out:
       udev_device_unref(sasdev);
       return parent;

The problem seems to be that udev gets ENXIO when it tries to read the
enclosure_identifier path from sys.  An example path is:

/sys/devices/pci0000:00/0000:00:0f.0/0000:07:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/sas_device//end_device-0:0:1/enclosure_identifier

The relevant code in the mpt2sas driver is in mpt2sas_transport.c:

static int
_transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
{
       struct MPT2SAS_ADAPTER *ioc = rphy_to_ioc(rphy);
       struct _sas_node *sas_expander;
       unsigned long flags;

       spin_lock_irqsave(&ioc->sas_node_lock, flags);
       sas_expander = mpt2sas_scsih_expander_find_by_sas_address(ioc,
           rphy->identify.sas_address);
       spin_unlock_irqrestore(&ioc->sas_node_lock, flags);

       if (!sas_expander)
               return -ENXIO;

       *identifier = sas_expander->enclosure_logical_id;
       return 0;
}

So that needs fixing...

Jim
This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 943543

Comment 2 Issue Tracker 2010-05-26 20:05:57 UTC
Event posted on 05-26-2010 04:04pm EDT by kbaxley

The fact that udev has this problem seems to point to a deeper problem in
the kernel. At first I thought that the bug might be addressed by moving
to the 5.xx. mpt2sas driver. see: IT#886383 & BZ#591971 but looking at the
code it simply looks like a simple search through all of the enclosure IDs
and I don't think that code is wrong. However, when looking around at the
patches for the mpt2sas driver posted to the LKML I happened to notice this
the following patch:
http://kerneltrap.org/mailarchive/linux-scsi/2010/3/12/6850653/thread the
latest 2.6.32-29 kernel doesn't have that patch:


  284          cdev->groups = enclosure_groups;
  285  
  286          err = device_register(cdev);
  287          if (err)
  288                  ERR_PTR(err);
  289  
  290          return ecomp;
  291  }


However, the more that I look at it the more that I become convinced that
the patch that I referenced is not going to help this problem.


This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 943543

Comment 3 Issue Tracker 2010-05-26 20:06:21 UTC
Event posted on 05-26-2010 04:06pm EDT by kbaxley


150 hard drives behind mptsas fabric (10 ports, 2 per card, each with 15
drives).


This event sent from IssueTracker by kbaxley 
 issue 943543

Comment 4 Issue Tracker 2010-05-26 20:10:47 UTC
Event posted on 05-26-2010 04:10pm EDT by kbaxley

udev-147-2.15.el6.x86_64 is what we're testing with here.


This event sent from IssueTracker by kbaxley 
 issue 943543

Comment 5 RHEL Program Management 2010-05-26 20:26:22 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 6 Issue Tracker 2010-06-02 15:01:46 UTC
Event posted on 2010-06-02 08:01 PDT by woodard

mpt2sas driver version 05.100.00.02 doesn't seem to help.

So can we get some assistance on the enclosure handling so that it
doesn't pass ENXIO up to udev.





This event sent from IssueTracker by woodard 
 issue 943543

Comment 7 Phil Knirsch 2010-06-04 14:59:27 UTC
Isn't this one a duplicate of #591133?

Thanks & regards, Phil

Comment 8 Ben Woodard 2010-06-04 15:20:09 UTC
They are supposed to be distinct.

This one is not supposed to be associated with udev. One is supposed to be associated with getting the kernel, in particular the mpt2sas driver, fixed so that it doesn't pass ENXIO up to udev which doesn't know how to handle it. One is supposed to be to teach udev how to handle the corner cases when some driver doesn't pass the enclosure information up properly. Belt & suspenders. Somehow, the cases got muddied. Maybe it is clearer if you go back to the ITs.

Comment 15 Harald Hoyer 2010-07-08 12:41:00 UTC
Btw, upstream has now a better sas patch for path_id.
They also rejected  enc ? enc : "unknown", because nobody should rely on a link, which disappears, if the kernel driver is fixed.

Comment 16 Harald Hoyer 2010-07-08 12:46:13 UTC
Created attachment 430334 [details]
path_id patch to handle sas

Comment 18 Harald Hoyer 2010-07-09 10:40:28 UTC
Here are test packages:
http://people.redhat.com/harald/downloads/udev/udev-147-2.20.0.sastest.1.el6/

Comment 20 Issue Tracker 2010-07-09 18:16:54 UTC
Event posted on 07-09-2010 02:16pm EDT by tgummels

+       end_dev_wwn = udev_device_get_sysattr_value(sas_dev,
"sas_address");
+
+       path_prepend(path, "sas-%s", end_dev_wwn);

Just having the wwn in the name is too little information for our
purposes.  We need the name to contain components that uniquely identify
an enclosure connection (channel) and a drive bay.  This is so we can
easily and consistently map the by-path names to drive locations on all
of our server nodes.

For example, see our latest patch submitted against issue 364360
(attached).  The naming convention it implements lets us create the
desired mapping using patterns like:

a1      pci-0000:83:00.0-sas-phy-3:0-0x[a-f0-9]{16}-0
a2      pci-0000:83:00.0-sas-phy-3:0-0x[a-f0-9]{16}-1
..
b1      pci-0000:83:00.0-sas-phy-3:4-0x[a-f0-9]{16}-0
b2      pci-0000:83:00.0-sas-phy-3:4-0x[a-f0-9]{16}-1

Here the phy-x:y component identifies a channel and the trailing number
identifies a drive bay within that channel.

Thanks,
Ned


This event sent from IssueTracker by tgummels 
 issue 943543

Comment 39 Karel Volný 2010-10-19 11:39:45 UTC
I have to say I'm pretty confused about the patches related to this bug ...

in recent udev-147-2.29.el6, there are the patches:

udev-147-sas.patch
udev-147-patch_id-sas.patch
udev-147-patch_id-sas2.patch
udev-147-patch_id-sas3.patch
udev-147-virtio-blk-patch_id.patch

after applying everything within the %prep phase, I get the following differences against the original sources:

--- path_id.c~  2009-08-08 15:42:05.000000000 +0200
+++ path_id.c   2010-10-19 13:01:09.425046791 +0200
@@ -121,7 +121,74 @@
 
 static struct udev_device *handle_scsi_sas(struct udev_device *parent, char **path)
 {
-       return NULL;
+       struct udev *udev  = udev_device_get_udev(parent);
+       struct udev_device *end_dev;
+       struct udev_device *sasdev;
+       struct udev_device *phydev;
+       struct udev_device *parent_dev;
+       struct udev_device *child_dev;
+       char syspath[UTIL_PATH_SIZE], *base;
+       char phy_path[UTIL_PATH_SIZE];
+       const char *end_name, *name, *enc, *bay;
+       const char *port_path;
+       int scsi_host;
+       int num_phys = 8;
+       int i;
+
+       parent_dev = parent;
+       while (1) {
+               child_dev = parent_dev;
+               parent_dev = udev_device_get_parent(child_dev);
+               if (parent_dev == NULL)
+                       return NULL;
+               name = udev_device_get_sysname(parent_dev);
+               if (strstr(name, "end_device-")) {
+                       end_dev = parent_dev;
+                       end_name = name;
+               } else if (sscanf(name, "host%d", &scsi_host)) {
+                       port_path = udev_device_get_syspath(child_dev);
+                       break;
+               }
+       }
+
+       base = strdup(udev_device_get_syspath(end_dev));
+       if (!base)
+               return NULL;
+       snprintf(syspath, sizeof(syspath), "%s/sas_device/%s", base, end_name);
+       free(base);
+
+       for (i = 0 ; i < num_phys ; i++) {
+               snprintf(phy_path, sizeof(phy_path), "%s/phy-%d:%d",
+                               port_path, scsi_host, i);
+               if (phydev = udev_device_new_from_syspath(udev, phy_path)) {
+                       udev_device_unref(phydev);
+                       break;
+               }
+       }
+       if (i == num_phys)
+               return NULL;
+
+       sasdev = udev_device_new_from_syspath(udev, syspath);
+       if (sasdev == NULL) {
+               fprintf(stderr, "unable to access '%s'\n", syspath);
+               return NULL;
+       }
+       enc = udev_device_get_sysattr_value(sasdev, "enclosure_identifier");
+        if (enc == NULL) {
+                parent = NULL;
+                goto out;
+        }
+
+       bay = udev_device_get_sysattr_value(sasdev, "bay_identifier");
+       if (bay == NULL) {
+               parent = NULL;
+               goto out;
+       }
+
+       path_prepend(path, "sas-%s-%s-%s", strrchr(phy_path, '/') + 1, enc, bay);
+out:
+       udev_device_unref(sasdev);
+       return parent;
 }
 
 static struct udev_device *handle_scsi_iscsi(struct udev_device *parent, char **path)
@@ -448,6 +515,9 @@
                } else if (strcmp(subsys, "xen") == 0) {
                        path_prepend(&path, "xen-%s", udev_device_get_sysname(parent));
                        parent = skip_subsystem(parent, "xen");
+               } else if (strcmp(subsys, "virtio") == 0) {
+                       path_prepend(&path, "virtio-pci-%s", udev_device_get_sysname(parent));
+                       parent = skip_subsystem(parent, "virtio");
                }
 
                parent = udev_device_get_parent(parent);


but looking at the comments, I see a few suspicious places:

1) comment #1:

it removes the code:

-       if (enc == NULL) {
-               parent = NULL;
-               goto out;
-       }

which stays present in the latest version

2) comment #16/attachment #430334 [details]:

this patch doesn't seem to be applied at all, none of its hunks is present in the actual code

3) comment #20:

no mention of "wwn" in the sources/patches

4) comment #22:

this seems to correspond to the changes that actually got applied, except for the part adding:

+               } else if (strcmp(subsys, "virtio") == 0) {
+                       path_prepend(&path, "virtio-pci-%s", udev_device_get_sysname(parent));
+                       parent = skip_subsystem(parent, "virtio");

however, if I get it right, this is NOT the patch applied to the testing package used in comment #25

also, guessing from comment #26, the current patch is not compatible with upstream, which is undesirable and may lead to regressions in the future

*****

in addition, the Changelog for udev does not mention fixing this bug

please clarify the points above - which code is actually supposed to be included?

Comment 40 Harald Hoyer 2010-10-19 12:29:40 UTC
(In reply to comment #39)
> but looking at the comments, I see a few suspicious places:
> 
> 1) comment #1:
> 
> it removes the code:
> 
> -       if (enc == NULL) {
> -               parent = NULL;
> -               goto out;
> -       }
> 
> which stays present in the latest version

comment #15

> 
> 2) comment #16/attachment #430334 [details]:
> 
> this patch doesn't seem to be applied at all, none of its hunks is present in
> the actual code

this was the upstream fix, which did not satisfy the customers needs.

> 
> 3) comment #20:
> 
> no mention of "wwn" in the sources/patches

this was the upstream fix, which did not satisfy the customers needs.

> 
> 4) comment #22:
> 
> this seems to correspond to the changes that actually got applied, except for
> the part adding:
> 
> +               } else if (strcmp(subsys, "virtio") == 0) {
> +                       path_prepend(&path, "virtio-pci-%s",
> udev_device_get_sysname(parent));
> +                       parent = skip_subsystem(parent, "virtio");
> 
> however, if I get it right, this is NOT the patch applied to the testing
> package used in comment #25

This was added for virtio devices (another bug)

> 
> also, guessing from comment #26, the current patch is not compatible with
> upstream, which is undesirable and may lead to regressions in the future

yes

> 
> *****
> 
> in addition, the Changelog for udev does not mention fixing this bug
> 
> please clarify the points above - which code is actually supposed to be
> included?

https://bugzilla.redhat.com/show_bug.cgi?id=537185 was the main path_id for sas bug, together with this one. I might have forgotten to mention this one.

Comment 41 Karel Volný 2010-10-19 15:25:15 UTC
okay, thanks for the information

so, in the end, the code looks like what is desired, so I can confirm with SanityOnly

(although, in my personal opinion, I do not like those obfuscating patches over patches, and the fact we diverge from upstream ...)

Comment 42 releng-rhel@redhat.com 2010-11-15 14:50:43 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.


Note You need to log in before you can comment on or make changes to this bug.