Bug 1394089
Summary: [LLNL 7.4 Bug] 7.3 regression: the kernel does not create the /sys/block/<sd device>/devices/enclosure_device symlinks

Product: Red Hat Enterprise Linux 7
Component: kernel
Kernel sub component: Storage Drivers
Reporter: Ben Woodard <woodard>
Assignee: Maurizio Lombardi <mlombard>
QA Contact: guazhang <guazhang>
Docs Contact: Jana Heves <jsvarova>
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
CC: akkornel, bdonahue, behlendorf1, bubrown, bugproxy, christopher.voltz, dhoward, emilne, foraker1, guazhang, hannsj_uhl, hutter2, jeff.johnson, jjarvis, jkachuck, joseph.szczypek, jsvarova, karen.skweres, linda.knippers, mkolaja, mlombard, myamazak, salmy, tgummels, tom.vaden, trinh.dao, troy.ablan, woodard, yizhang
Version: 7.3
Keywords: Patch, Regression, ZStream
Target Milestone: rc
Target Release: 7.4
Hardware: All
OS: Linux
Fixed In Version: kernel-3.10.0-650.el7
Doc Text: Due to a regression, the kernel previously failed to create the /sys/block/<sd device>/devices/enclosure_device symlinks. The provided patch corrects the call to the scsi_is_sas_rphy() function, which is now made on the SAS end device instead of the SCSI device.
Clones: 1425678 1427426 1427815 1460204 (view as bug list)
Last Closed: 2017-08-02 04:28:27 UTC
Type: Bug
Bug Blocks: 1299988, 1354610, 1381646, 1425678, 1427426, 1427815, 1446211, 1455358, 1460204, 1473286
Description
Ben Woodard
2016-11-11 01:16:13 UTC
Additional information pending.

7.2 kernel:

$ ls -l /sys/block/sda/device/enclosure*
lrwxrwxrwx 1 root root 0 Nov 10 17:28 /sys/block/sda/device/enclosure_device:SLOT 21 27 -> ../../../../port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:0/enclosure/0:0:1:0/SLOT 21 27

7.3 kernel:

$ ls -l /sys/block/sda/device/enclosure*
ls: cannot access /sys/block/sda/device/enclosure*: No such file or directory

Not that it is likely to matter, but the affected hardware is of two types:

http://www.raidinc.com/products/jbod/2u-24-bay-ssdsas-ability-ebod
http://www.raidinc.com/products/jbod/4u-84-bay-jbod

Interestingly, on both the old and the new kernels the /sys/class/enclosure/* entries that enclosure_device would point to are present; only the symlink is missing on the newer kernel.

Right now we have custom code which relies on this symlink in sysfs to map the block device to the enclosure slot. If there's a different officially supported way to do this mapping, we're open to changing our current code to use it. At the moment, however, we perceive this as an interface change and therefore a regression.

See bug 1370231; the patches in question were added to fix an install failure with certain hardware. It appears that your problem is that with these patches, scsi_is_sas_rphy() is returning false where the previous code, which used is_sas_attached(), would return true.

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index fe8e241..5c721ac 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -587,7 +587,7 @@ static void ses_match_to_enclosure(struct enclosure_device *edev,
 	ses_enclosure_data_process(edev, to_scsi_device(edev->edev.parent), 0);
 
-	if (is_sas_attached(sdev))
+	if (scsi_is_sas_rphy(&sdev->sdev_gendev))
 		efd.addr = sas_get_address(sdev);
 
 	if (efd.addr) {

As a result efd.addr remains 0, and we consequently do not send the udev change event; I think this is how the symlinks get generated.
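The mapping LLNL's custom code performs (block device to enclosure slot via the sysfs symlink) can be sketched in user space. This is an illustrative sketch, not the actual LLNL code; the function name enclosure_slot_for and the sysfs_root parameter are invented for the example:

```python
import glob
import os

def enclosure_slot_for(block_dev, sysfs_root="/sys"):
    """Return the enclosure slot name for a block device (e.g. "sda"),
    or None if the kernel did not create the enclosure_device symlink.

    Relies on the /sys/block/<dev>/device/enclosure_device:<slot> link
    that this bug reports missing on 7.3 kernels."""
    pattern = os.path.join(sysfs_root, "block", block_dev,
                           "device", "enclosure_device:*")
    links = glob.glob(pattern)
    if not links:
        return None  # the 7.3 regression: no symlink at all
    # The slot name is everything after the "enclosure_device:" prefix.
    return os.path.basename(links[0]).split(":", 1)[1]
```

On a 7.2 kernel this would return the slot name (e.g. "SLOT 21 27"); on an affected 7.3 kernel the glob matches nothing and the function returns None, which is exactly the interface break being reported.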
Are you able to provide boot log information and a dump of the sysfs hierarchy, or attach an sosreport? This will help us understand your configuration.

Created attachment 1233604 [details]
boot log
Due to the maze of symlinks that is sysfs, a find isn't practical, but I can tell you exactly which entry is missing. The "enclosure*" link under /sys/block/sda/device/ has disappeared. As I said earlier:
I can give you the boot log but I believe that I have provided enough information to recognize the regression where we changed the sysfs interface to the devices:
7.2 kernel:
$ ls -l /sys/block/sda/device/enclosure*
lrwxrwxrwx 1 root root 0 Nov 10 17:28 /sys/block/sda/device/enclosure_device:SLOT 21 27 -> ../../../../port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:0/enclosure/0:0:1:0/SLOT 21 27
7.3 kernel:
$ ls -l /sys/block/sda/device/enclosure*
ls: cannot access /sys/block/sda/device/enclosure*: No such file or directory
What is more clear than that?
Yes, we also identified that same patch, and we reverted it locally so that this regression wouldn't impact us. We were not affected by the boot problem that necessitated that patch.
I have attached the boot log. I don't know what additional information you will be able to glean from it. Rather than trying to dig through a huge, highly repetitive boot log, isn't there a more targeted question you can ask?
You know the patch that caused the regression.
You know the symlink that is missing.
You know the hardware.
It seems to me that what needs to happen is you need to figure out a new way to fix the original boot problem that doesn't change the sysfs interface for unaffected hardware.
Thank you for providing the boot log. It is helpful because it provides most of the information we need (SCSI device mappings, SAS addresses, ses enclosure probe) all at once, so we don't have to keep asking.

> I can give you the boot log but I believe that I have provided enough information
> to recognize the regression where we changed the sysfs interface to the devices:
> What is more clear than that?

The problem is clear, and you have identified the patch that caused the regression.

> You know the patch that caused the regression.
> You know the symlink that is missing.
> You know the hardware.

Yes, but we can't revert the patch; there are a large number of systems that won't boot without it. As you said, we have to figure out why this didn't work on your hardware. Unfortunately, we do not have your hardware, and we did not see this problem on the hardware we used for testing here. This is why we asked for more information.

What I was curious about with the sysfs information was whether all the sysfs symlinks were missing, or just some of them. You mentioned sda only, but I assume it is all of them.

> 7.2 kernel:
> $ ls -l /sys/block/sda/device/enclosure*
> lrwxrwxrwx 1 root root 0 Nov 10 17:28 /sys/block/sda/device/enclosure_device:SLOT 21 27 -> ../../../../port-0:0:1/end_device-0:0:1/target0:0:1/0:0:1:0/enclosure/0:0:1:0/SLOT 21 27

It would also help to understand the topology. In the boot log provided, 0:0:1:0 is sdb. What we are trying to understand is why the upstream fix to use scsi_is_sas_rphy() did not work in your configuration; it is presumably broken in those kernels as well, and we will have to fix it there too.

We have found that we can't revert the patch as safely as we thought. There are certain instances where we can trigger a crash by unexpectedly pulling drives without this patch. We think it is correlated with a management script accessing sysfs when the hotplug event occurs.
Without the patch from bug 1372041 the system crashes, but with the patch we have yet to be able to provoke the problem. Therefore we agree that the patch is required, but the side effect of removing the symlinks is unwanted.

All the sysfs symlinks which map the devices back to their enclosures are missing. We are surprised that there is some hardware which creates the symlinks, because we don't see them on any of our hardware. It sounds like you are saying that for all the hardware in the RH test lab the symlinks are being created, and it is just broken on all of our hardware. Our hardware is not entirely homogeneous, so we find this surprising.

I have found an external SAS JBOD enclosure that shows the same behavior when accessed. I have verified that it is broken in the upstream 4.10 scsi-fixes tree as well and am working on a solution. I am not sure why this was not discovered in RHEL QE testing; perhaps it works for simple enclosures where it is not necessary to iterate. In any event it is clearly broken and is a regression in 7.3; we just need to fix it in a way that does not cause a crash like bug 1370231 or like the one you are seeing.

We have confirmed this problem on three different HW configurations so far:

http://www.raidinc.com/products/jbod/2u-24-bay-ssdsas-ability-ebod
http://www.raidinc.com/products/jbod/4u-84-bay-jbod
LSI CORP SAS2X36 0717

The problem appears to be related to the SAS topology. With the JBOD I am using, it looks like:

/sys/devices/pci0000:00/0000:00:07.0/0000:1a:00.0/host5/port-5:0/expander-5:0/port-5:0:0/end_device-5:0:0/target5:0:0/5:0:0:0/block/sdb

And the SAS transport code called by the changes to the enclosure code only recognizes the end device and the expander as having a PHY.
Dec 20 19:23:48 storageqe-07 kernel: sd 5:0:0:0: ses_match_to_enclosure
Dec 20 19:23:48 storageqe-07 kernel: is_sas_attached 1 scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: dev_name "5:0:0:0"
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "target5:0:0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "end_device-5:0:0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 1
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "port-5:0:0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "expander-5:0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 1
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "port-5:0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "host5"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "0000:1a:00.0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "0000:00:07.0"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: parent dev_name "pci0000:00"
Dec 20 19:23:48 storageqe-07 kernel: parent scsi_is_sas_rphy 0
Dec 20 19:23:48 storageqe-07 kernel: efd.addr 0
Dec 20 19:23:48 storageqe-07 kernel: sas_get_address 5764824128737142441 (this is decimal)

I doubt this is what Johannes intended; I will discuss it with him. I have also asked Maurizio to assist with this, as he has been working on the enclosure code.

Per LLNL's request, I am making this BZ public to facilitate collaboration while working toward resolution. I have reviewed the case and there is no information in it which needs to remain private.

I have a feeling that ses_match_to_enclosure() is getting passed the sysfs gendev device rather than the expander/endpoint device.
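The parent walk in the debug output above can be modeled in user space to show why the 7.3 check fails: the SCSI device's gendev ("5:0:0:0") is not itself a SAS rphy, and the first rphy on the parent chain is the end_device. This is a toy model, not kernel code; the real scsi_is_sas_rphy() checks the device's release function, so the name-prefix test here is only a stand-in:

```python
# Toy model of the device-tree walk shown in the debug printks above.
class Dev:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

def is_sas_rphy(dev):
    # Stand-in for scsi_is_sas_rphy(): in the log, only end_device-*
    # and expander-* nodes report 1.
    return dev.name.startswith(("end_device-", "expander-"))

def first_rphy_ancestor(dev):
    """Walk up the parent chain (starting at the device itself),
    mirroring the order of the debug printks."""
    node = dev
    while node is not None:
        if is_sas_rphy(node):
            return node
        node = node.parent
    return None

# The topology from the JBOD in this report, root first:
chain = ["pci0000:00", "0000:00:07.0", "0000:1a:00.0", "host5",
         "port-5:0", "expander-5:0", "port-5:0:0", "end_device-5:0:0",
         "target5:0:0", "5:0:0:0"]
parent = None
for name in chain:
    parent = Dev(name, parent)
sdev = parent  # "5:0:0:0", the SCSI device gendev

# The 7.3 check tests the SCSI device itself -> false, so efd.addr stays 0:
assert not is_sas_rphy(sdev)
# The first rphy going up is the end_device, two levels above the sdev:
assert first_rphy_ancestor(sdev).name == "end_device-5:0:0"
```

This matches the log: scsi_is_sas_rphy is 0 for "5:0:0:0" and "target5:0:0", and first becomes 1 at "end_device-5:0:0".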
If you print dev_name(&sdev->sdev_gendev) in ses_match_to_enclosure(), you'll see the dev name in the form "w:x:y:z", as it gets set in scsi_sysfs_device_initialize():

scsi_sysfs_device_initialize()
...
	dev_set_name(&sdev->sdev_gendev, "%d:%d:%d:%d",
		     sdev->host->host_no, sdev->channel, sdev->id, sdev->lun);
	device_initialize(&sdev->sdev_dev);
...

...rather than the "expander-..." or "end_device-..." form from sas_expander_alloc() and sas_end_device_alloc() (where dev->release is being set):

sas_expander_alloc()
...
	rdev->rphy.dev.release = sas_expander_release;
...
	dev_set_name(&rdev->rphy.dev, "expander-%d:%d",
		     shost->host_no, rdev->rphy.scsi_target_id);

sas_end_device_alloc()
...
	rdev->rphy.dev.release = sas_end_device_release;
...
	dev_set_name(&rdev->rphy.dev, "end_device-%d:%d",
		     shost->host_no, parent->port_identifier);

FYI...

This is what I am testing with now; it seems to fix the problem for me, but I am unsure whether it will work on all configurations. I am still awaiting a response from the author of the earlier upstream patch.

[PATCH RFC] ses: Fix SAS device detection in enclosure

The call to scsi_is_sas_rphy() needs to be made on the SAS end_device, not on the SCSI device.

Signed-off-by: Ewan D. Milne <emilne>
---
 drivers/scsi/ses.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index 8c9a35c..50adabb 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -587,7 +587,7 @@ static void ses_match_to_enclosure(struct enclosure_device *edev,
 	ses_enclosure_data_process(edev, to_scsi_device(edev->edev.parent), 0);
 
-	if (scsi_is_sas_rphy(&sdev->sdev_gendev))
+	if (scsi_is_sas_rphy(sdev->sdev_target->dev.parent))
 		efd.addr = sas_get_address(sdev);
 
 	if (efd.addr) {
--
1.8.3.1

With this change, I get the sysfs links created, e.g.:

# ls -l /sys/block/sdb/device/enclosure*
lrwxrwxrwx. 1 root root 0 Dec 22 16:03 /sys/block/sdb/device/enclosure_device:SLOT 1 -> ../../../../port-1:0:19/end_device-1:0:19/target1:0:19/1:0:19:0/enclosure/1:0:19:0/SLOT 1

What I am not sure about yet is whether this will break other topologies.

Wanted to comment that the patch from comment #15 restores enclosure device symlinks on an LSI/Avago (mpt2sas) setup with SuperMicro enclosure backplanes using CentOS 7.3. Thanks for the sleuthing and hard work!

The comment #15 patch does create the enclosure_device links on our 84-bay RAID Inc. JBOD, but I hit a general protection fault after rmmod/insmodding ses.ko. Here's the little I could capture:

2017-01-04 18:15:59 [ 3266.580855] ses 0:0:1:0: Attached Enclosure device
2017-01-04 18:15:59 [ 3266.587820] ses 0:0:4:0: Attached Enclosure device
2017-01-04 18:15:59 [ 3266.594001] ses 11:0:3:0: Attached Enclosure device
2017-01-04 18:15:59 [ 3266.600118] ses 11:0:5:0: Attached Enclosure device
2017-01-04 18:17:06 [ 3333.499627] general protection fault: 0000 [#1] SMP
2017-01-04 18:17:06 [ 3333.506745] Modules linked in: ses(E+) nfsv3 ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iTCO_wdt iTCO_vendor_support mlx5_ib intel_powerclamp coretemp intel_rapl iosf_mbi kvm ib_core irqbypass pcspkr mlx5_core sb_edac edac_core mei_me lpc_ich mei i2c_i801 ioatdma enclosure ipmi_devintf zfs(POE) zunicode(POE) zavl(POE) icp(POE) sg shpchp ipmi_si ipmi_msghandler acpi_power_meter acpi_cpufreq binfmt_misc zcommon(POE) znvpair(POE) spl(OE) zlib_deflate nfsd nfs_acl ip_tables rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache dm_round_robin sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel scsi_transport_iscsi ghash_clmulni_intel 8021q garp stp llc mrp mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm aesni_intel libahci lrw gf128mul ixgbe mxm_wmi dm_multipath dca glue_helper drm mpt3sas ablk_helper ptp libata i2c_core cryptd
raid_class pps_core scsi_transport_sas mdio fjes wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ses]
2017-01-04 18:17:06 [ 3333.619809] CPU: 15 PID: 150658 Comm: insmod Tainted: P OE ------------ 3.10.0-514.0.0.1chaos.ch6.x86_64 #1
<reboot>

I've hit the GPF after one to two rmmod/insmod cycles. If I run without the patch, I'm able to rmmod/insmod ses.ko without issue.

(In reply to Tony Hutter from comment #17)
> The comment #15 patch does create the enclosure_device links on our 84-bay
> RAID Inc. JBOD, but I hit a general protection fault after rmmod/insmodding
> ses.ko. Here's the little I could capture:
>
> ...
>
> I've hit the GPF after one to two rmmod/insmodding cycles. If I run without
> the patch, I'm able to rmmod/insmod ses.ko without issue.

OK, thanks, I will look into this some more on the machine I have.

I'm able to reproduce the issue at will with:

rmmod ses                    # remove stock ses module
insmod drivers/scsi/ses.ko   # load ses.ko with the comment #15 patch
rmmod ses
insmod drivers/scsi/ses.ko
< crash >

I managed to capture a full call trace below. The root FS on the node is NFS mounted, so seeing the backtrace in there isn't unexpected if ses.ko is corrupting something.

general protection fault: 0000 [#1] SMP
Modules linked in: ses(E+) dm_round_robin enclosure sd_mod crc_t10dif crct10dif_generic sg nfsv3 ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp intel_rapl iosf_mbi ipmi_devintf iTCO_wdt iTCO_vendor_support kvm irqbypass mlx5_core pcspkr zfs(POE) zunicode(POE) zavl(POE) icp(POE) mei_me sb_edac edac_core mei ioatdma i2c_i801 lpc_ich shpchp ipmi_si ipmi_msghandler acpi_power_meter acpi_cpufreq binfmt_misc zcommon(POE) znvpair(POE) spl(OE) zlib_deflate nfsd nfs_acl ip_tables rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache 8021q garp stp llc mrp scsi_transport_iscsi mgag200 i2c_algo_bit drm_kms_helper crct10dif_pclmul syscopyarea crct10dif_common sysfillrect crc32_pclmul sysimgblt fb_sys_fops crc32c_intel ttm ghash_clmulni_intel ixgbe drm ahci aesni_intel dm_multipath lrw mxm_wmi libahci gf128mul dca mpt3sas glue_helper ptp ablk_helper i2c_core libata cryptd raid_class pps_core scsi_transport_sas mdio fjes wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ses]
CPU: 4 PID: 101008 Comm: awk Tainted: P OE ------------ 3.10.0-514.0.0.1chaos.ch6.x86_64 #1
Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
task: ffff880fb8f28000 ti: ffff880fad024000 task.ti: ffff880fad024000
RIP: 0010:[<ffffffff811e5d8b>] [<ffffffff811e5d8b>] kmem_cache_alloc_trace+0xab/0x250
RSP: 0018:ffff880fad0279e0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff882031c3c860 RCX: 0000000000009cc2
RDX: 0000000000009cc1 RSI: 00000000000003ad RDI: ffff880fad027fd8
RBP: ffff880fad027a18 R08: 0000000000019a60 R09: ffff88018fc07c00
R10: ffffffffa0411ac4 R11: ffff882035f40c00 R12: 7275736f6c636e65
R13: 00000000000000d0 R14: 0000000000000020 R15: ffff88018fc07c00
FS: 0000000000000000(0000) GS:ffff88103e700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffff7ff39ad CR3: 0000000fb92ad000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88018fc07c00 ffffffffa0411ac4 ffff882031c3c860 ffff882031c3c800
 ffff880fb740ac00 0000000000000000 ffff880fad027bf8 ffff880fad027a30
 ffffffffa0411ac4 ffff8820345eb000 ffff880fad027a88 ffffffffa03fb3c6
Call Trace:
 [<ffffffffa0411ac4>] ? nfs_alloc_seqid+0x24/0x60 [nfsv4]
 [<ffffffffa0411ac4>] nfs_alloc_seqid+0x24/0x60 [nfsv4]
 [<ffffffffa03fb3c6>] nfs4_opendata_alloc+0xc6/0x4d0 [nfsv4]
 [<ffffffffa03fe745>] nfs4_do_open+0x185/0x650 [nfsv4]
 [<ffffffffa03fed07>] nfs4_atomic_open+0xf7/0x110 [nfsv4]
 [<ffffffffa0413920>] nfs4_file_open+0x110/0x2b0 [nfsv4]
 [<ffffffff81204bd7>] do_dentry_open+0x1a7/0x2e0
 [<ffffffff812b4a1c>] ? security_inode_permission+0x1c/0x30
 [<ffffffffa0413810>] ? nfs4_file_flush+0x90/0x90 [nfsv4]
 [<ffffffff81204daf>] vfs_open+0x5f/0xe0
 [<ffffffff81212678>] ? may_open+0x68/0x110
 [<ffffffff81215e2d>] do_last+0x1ed/0x12a0
 [<ffffffff81185fee>] ? __find_get_page+0x2e/0xc0
 [<ffffffff81217226>] path_openat+0x346/0x4d0
 [<ffffffff811865fb>] ? unlock_page+0x2b/0x30
 [<ffffffff8121900b>] do_filp_open+0x4b/0xb0
 [<ffffffff81226367>] ? __alloc_fd+0xa7/0x130
 [<ffffffff81206113>] do_sys_open+0xf3/0x1f0
 [<ffffffff816a89e5>] ? do_page_fault+0x35/0x90
 [<ffffffff8120622e>] SyS_open+0x1e/0x20
 [<ffffffff816ad249>] system_call_fastpath+0x16/0x1b
Code: 8b 50 08 83 68 1c 01 4d 8b 20 49 8b 40 10 4d 85 e4 0f 84 49 01 00 00 48 85 c0 0f 84 40 01 00 00 49 63 47 20 48 8d 4a 01 4d 8b 07 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 aa 49 63
RIP [<ffffffff811e5d8b>] kmem_cache_alloc_trace+0xab/0x250

Thanks Tony, I am able to reproduce a crash when executing rmmod/modprobe with the rhel7 kernel (XFS as root filesystem), but the upstream kernel seems unaffected.
It looks like memory corruption. There is a commit in the upstream kernel (not backported to rhel7) called "ses: Fix racy cleanup of /sys in remove_dev()" that fixes a race condition when unregistering a device; in some cases the kernel may use invalid memory pointers. I suspect it will fix this issue, and I am going to test it.

[  193.138024] ses 5:0:39:0: Attached Enclosure device
[  660.600447] general protection fault: 0000 [#1] SMP
[  660.605447] Modules linked in: ses(+) enclosure coretemp kvm_intel cdc_ether kvm mpt2sas usbnet mii irqbypass raid_class iTCO_wdt ipmi_ssif iTCO_vendor_support i2c_i801 scsi_transport_sas ipmi_devintf sg pcspkr ipmi_si lpc_ich ipmi_msghandler ioatdma shpchp i7core_edac dca edac_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_generic drm pata_acpi ata_piix libata crc32c_intel megaraid_sas serio_raw i2c_core bnx2 fjes dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ses]
[  660.660427] CPU: 1 PID: 10528 Comm: systemd-udevd Not tainted 3.10.0_ses_fix+ #1
[  660.667816] Hardware name: IBM System x3550 M3 -[7944AC1]-/69Y4438, BIOS -[D6E162AUS-1.20]- 05/07/2014
[  660.677112] task: ffff880078586dd0 ti: ffff880177b48000 task.ti: ffff880177b48000
[  660.684585] RIP: 0010:[<ffffffff811d8f65>] [<ffffffff811d8f65>] __kmalloc+0x95/0x240
[  660.692426] RSP: 0018:ffff880177b4bbf8 EFLAGS: 00010282
[  660.697735] RAX: 0000000000000000 RBX: 000000000000001e RCX: 000000000000e460
[  660.704861] RDX: 000000000000e45f RSI: 0000000000000000 RDI: 0000000000000003
[  660.711988] RBP: ffff880177b4bc28 R08: 0000000000019ae0 R09: ffffffff812babf1
[  660.719114] R10: ffff88017ac03c00 R11: 0000000000000004 R12: 00000000000000d0
[  660.726240] R13: 7974697275636573 R14: 000000000000001f R15: ffff88017ac03c00
[  660.733366] FS: 00007f631b1b78c0(0000) GS:ffff88017b040000(0000) knlGS:0000000000000000
[  660.741448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  660.747187] CR2: 00007f6318e1954c CR3: 000000017798e000 CR4: 00000000000007e0
[  660.754313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  660.761440] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  660.768566] Stack:
[  660.770580] ffffffff812babf1 000000000000001e ffff88007ec39aa0 ffff880177b4bcec
[  660.778044] ffff88017261d320 00000000000000d0 ffff880177b4bcc8 ffffffff812babf1
[  660.785507] ffff88017ac03b00 00000000811dbd15 0000000000000001 ffffffff81220add
[  660.792970] Call Trace:
[  660.795418] [<ffffffff812babf1>] ? security_context_to_sid_core+0x61/0x260
[  660.802373] [<ffffffff812babf1>] security_context_to_sid_core+0x61/0x260
[  660.809156] [<ffffffff81220add>] ? __simple_xattr_set+0x4d/0x190
[  660.815243] [<ffffffff81220b78>] ? __simple_xattr_set+0xe8/0x190
[  660.821330] [<ffffffff811d8a73>] ? kfree+0x103/0x140
[  660.826378] [<ffffffff812bc85c>] security_context_to_sid_force+0x1c/0x20
[  660.833159] [<ffffffff812acba2>] selinux_inode_post_setxattr+0x72/0x120
[  660.839857] [<ffffffff812a3b83>] security_inode_post_setxattr+0x33/0x50
[  660.846552] [<ffffffff8121fbf0>] __vfs_setxattr_noperm+0x180/0x1b0
[  660.852813] [<ffffffff8121fcd5>] vfs_setxattr+0xb5/0xc0
[  660.858122] [<ffffffff8121fe0e>] setxattr+0x12e/0x1c0
[  660.863258] [<ffffffff8120a8fd>] ? putname+0x3d/0x60
[  660.868305] [<ffffffff8120baa2>] ? user_path_at_empty+0x72/0xc0
[  660.874308] [<ffffffff811fc998>] ? __sb_start_write+0x58/0x110
[  660.880224] [<ffffffff8168cc71>] ? __do_page_fault+0x171/0x450
[  660.886137] [<ffffffff812201ef>] SyS_lsetxattr+0xaf/0xf0
[  660.891531] [<ffffffff81691789>] system_call_fastpath+0x16/0x1b
[  660.897532] Code: d0 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 0f 84 30 01 00 00 48 85 c0 0f 84 27 01 00 00 49 63 42 20 48 8d 4a 01 4d 8b 02 <49> 8b 5c 05 00 4c 89 e8 65 49 0f c7 08 0f 94 c0 84 c0 74 b8 49
[  660.917662] RIP [<ffffffff811d8f65>] __kmalloc+0x95/0x240
[  660.923162] RSP <ffff880177b4bbf8>

Unfortunately the "ses: Fix racy cleanup of /sys in remove_dev()" patch doesn't fix the issue; I will continue to look into it.

I tested the comment #15 patch on our large filesystem and saw that it was not creating all the sysfs entries all of the time. In fact, I was trying it out because we had noticed that an older version of the SES driver was also not creating all the links all the time. With both the old driver and the new patch, I'm seeing ~50% of the nodes create all their symlinks, while the others only created some of them. For example, here's one of the nodes that didn't have all its links (below), sorted by device mapper (dm-*) device number:

# for i in `ls /sys/block/ | grep dm-` ; do echo -n "$i enclosure: " && if ls /sys/block/$i/slaves/*/device/enclosure*/fault &> /dev/null ; then echo exists ; else echo missing ; fi ; done | sed 's/dm-//g' | sort -n
0 enclosure: exists
1 enclosure: exists
2 enclosure: exists
3 enclosure: exists
4 enclosure: exists
5 enclosure: exists
6 enclosure: exists
7 enclosure: exists
8 enclosure: exists
9 enclosure: exists
10 enclosure: exists
11 enclosure: exists
12 enclosure: exists
13 enclosure: exists
14 enclosure: exists
15 enclosure: exists
16 enclosure: exists
17 enclosure: exists
18 enclosure: exists
19 enclosure: missing
20 enclosure: missing
...
77 enclosure: missing
78 enclosure: missing
79 enclosure: exists
80 enclosure: missing
81 enclosure: missing
...
157 enclosure: missing
158 enclosure: missing
159 enclosure: exists

This is with two 84-bay enclosures with 80 drives populated in each. You can see it creates links for the first 18 drives, and also for the last drive in each enclosure (oddly enough).

Some debugging later, I saw that in the cases where it was failing, enclosure_add_device() was returning -2:

static int ses_enclosure_find_by_addr(struct enclosure_device *edev,
				      void *data)
{
	struct efd *efd = data;
	int i;
	struct ses_component *scomp;
	int rc;

	if (!edev->component[0].scratch) {
		return 0;
	}

	for (i = 0; i < edev->components; i++) {
		scomp = edev->component[i].scratch;
		if (scomp->addr != efd->addr) {
			continue;
		}

		rc = enclosure_add_device(edev, i, efd->dev);
		if (rc == 0) {
			kobject_uevent(&efd->dev->kobj, KOBJ_CHANGE);
		}
		printk("debug: rc=%d\n", rc);
		return 1;
	}
	return 0;
}

One question would be whether the problem with not all the enclosures creating sysfs entries is a 7.3 regression. When you said "with the old driver" (ses), did you mean an earlier RHEL7 ses driver? Are all the enclosures the same, or are some of them different hardware? There have been some problems in the past with probing certain enclosures. What I mean is: is it just the sysfs links that are missing, or is it the enclosure devices themselves that do not show up? This is what I was referring to earlier when I mentioned that I was unsure whether it would work on all configurations. Are the missing enclosures always the same ones? Do they have a different connection topology? You could see this in the sysfs device path (i.e. not the link).

(In reply to Tony Hutter from comment #19)
> I'm able to reproduce the issue at will with:
>
> rmmod ses # remove stock ses modules
> insmod drivers/scsi/ses.ko # load ses.ko with comment #15 patch
> rmmod ses
> insmod drivers/scsi/ses.ko
> < crash >

This is a bug in another layer of the kernel; Ewan's patch is triggering it. I am going to open a BZ to track it.
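The control flow of ses_enclosure_find_by_addr() above can be restated in user space to show where the -2 (-ENOENT) from enclosure_add_device() disappears: the function returns 1 on an address match whether or not the link was actually created, so the failure is silent apart from the debug printk. This is a sketch; find_by_addr and the add_device callable are invented stand-ins, not kernel APIs:

```python
def find_by_addr(component_addrs, addr, add_device):
    """User-space mirror of ses_enclosure_find_by_addr(): return 1 if a
    component matched the SAS address (regardless of whether the sysfs
    link was created), 0 if no component has this address.

    add_device(i) stands in for enclosure_add_device(edev, i, dev) and
    returns 0 on success or a negative errno (e.g. -2 for -ENOENT)."""
    for i, comp_addr in enumerate(component_addrs):
        if comp_addr != addr:
            continue
        rc = add_device(i)
        if rc == 0:
            print("uevent: KOBJ_CHANGE")  # real code: kobject_uevent(...)
        print("debug: rc=%d" % rc)
        return 1  # matched, even when the link creation failed
    return 0
```

Whether add_device returns 0 or -2, the caller sees the same result (1), which is why the missing links in comment 23 only show up via the debug printk.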
> This is a bug in another layer of the kernel, Ewan's patch is triggering it.
> I am going to open BZ to track it.
introduced in kernel version -464
Hi Tony, please answer comment 25, thanks.

We were running the latest RH 7.3 kernel, but with these patches reverted so that we could get the enclosure_device sysfs links:

Revert "[scsi] sas: provide stub implementation for scsi_is_sas_rphy"
    This reverts commit 6721acd6a5ec8791e9afc3b3a3b62332fe038307.

Revert "[scsi] ses: use scsi_is_sas_rphy instead of is_sas_attached"
    This reverts commit 6de748fe3354b4ef4bf0988af27ba5d72af1cc9c.

Revert "[scsi] sas: remove is_sas_attached()"
    This reverts commit 6cf8131f01bdbce599928a10a0f9a929127677b1.

So here's the RH kernel commit we're running:

commit 4274bf5bb87f37e0899f20574f70114ce71d4285
Author: Paolo Abeni <pabeni>
Date:   Wed Oct 12 16:30:30 2016 +0200

    IB/ipoib: move back IB LL address into the hard header

And the ses.c we're running (aka the "old SES driver"):

commit b4265f8439389c7e77336fc9a0d443f58f10c6e5
Author: Maurizio Lombardi <mlombard>
Date:   Thu Mar 24 14:17:12 2016 -0400

    [scsi] ses: fix discovery of SATA devices in SAS enclosures

Our nodes are connected to two 84-bay enclosures in a multipath configuration, so each node sees "four" enclosure devices. 80 of the 84 slots are populated with drives. The enclosures are the same hardware (84-bay RAID Inc JBODs).
# ls -l /sys/class/enclosure/
0:0:0:0 -> ../../devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:0/enclosure/0:0:0:0
0:0:81:0 -> ../../devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:1/expander-0:3/port-0:3:0/end_device-0:3:0/target0:0:81/0:0:81:0/enclosure/0:0:81:0
11:0:1:0 -> ../../devices/pci0000:00/0000:00:03.2/0000:06:00.0/host11/port-11:0/expander-11:0/port-11:0:2/end_device-11:0:2/target11:0:1/11:0:1:0/enclosure/11:0:1:0
11:0:81:0 -> ../../devices/pci0000:00/0000:00:03.2/0000:06:00.0/host11/port-11:1/expander-11:3/port-11:3:2/end_device-11:3:2/target11:0:81/11:0:81:0/enclosure/11:0:81:0

We do have cases where an enclosure doesn't show up, but that doesn't account for the majority of the missing links. In most cases the node can see all four enclosures but isn't creating links for all drives. For example, the node in comment 23 saw all four enclosures but didn't create all the links. To give you a better idea, here's the number of drives with enclosure_device* present, taken from a sampling of nodes (each node should see 160 drives):

Node   drives with enclosure_device* sysfs link
----   ----------------------------------------
1      159
2      159
3      22
4      159
5      77
6      160
7      91
8      160
9      160
10     160
11     21
12     160
13     160
14     160
15     41
16     30
17     157
18     35
19     0
...
I dunno if it's related, but I saw another case where we have two enclosures with two multipath links, but only see three /sys/class/enclosure devices:

# ls /sys/class/enclosure
0:0:0:0  0:0:81:0  11:0:81:0

# lsscsi -g | grep EBOD
[0:0:0:0]   enclosu RAIDINC 84BAY EBOD 0204 -  /dev/sg0
[0:0:81:0]  enclosu RAIDINC 84BAY EBOD 0204 -  /dev/sg81
[11:0:1:0]  enclosu RAIDINC 84BAY EBOD 0204 -  /dev/sg163
[11:0:81:0] enclosu RAIDINC 84BAY EBOD 0204 -  /dev/sg243

(In reply to Tony Hutter from comment #30)
> I dunno if it's related, but I saw another case where we have two enclosures
> with two multipath links, but only see three /sys/class/enclosure devices:

Ignore comment 30, the device was actually down:

sg_ses --verbose --page=ed /dev/sg163
    inquiry cdb: 12 00 00 00 24 00
  RAIDINC   84BAY EBOD   0204
  enclosure services device
    Receive diagnostic results cmd: 1c 01 07 ff ff 00
receive diagnostic results: Fixed format, current; Sense key: Not Ready
 Additional sense: Enclosure services unavailable
Attempt to fetch Element Descriptor (SES) diagnostic page failed
device no ready

(In reply to Tony Hutter from comment #29)
> We were running the latest RH 7.3 kernel, but with these patches reverted so
> that we could get the enclosure_device sysfs links:
>
> Revert "[scsi] sas: provide stub implementation for scsi_is_sas_rphy"
> This reverts commit 6721acd6a5ec8791e9afc3b3a3b62332fe038307.
>
> Revert "[scsi] ses: use scsi_is_sas_rphy instead of is_sas_attached"
> This reverts commit 6de748fe3354b4ef4bf0988af27ba5d72af1cc9c.
>
> Revert "[scsi] sas: remove is_sas_attached()"
> This reverts commit 6cf8131f01bdbce599928a10a0f9a929127677b1.

Thanks. Unfortunately this is not sufficient to determine whether or not it's a RHEL 7.3 regression; is it possible for you to test the RHEL 7.2 kernel (version 3.10.0-327) on your machine?
Unfortunately for us, installing an older kernel into our netboot image can't be done easily, since we have a bunch of packages in our image that are kernel-version specific (like zfs, nvidia, hyperv, etc.). Also, the system is in high demand, so getting time on it is difficult.

Just to summarize where we're at:

1. RH7.3 ses.ko

   No symlinks at all, but is stable (can rmmod/insmod).

2. RH7.3 ses.ko + comment #15 patch

   Symlinks created some of the time as described in comment 29, but can't rmmod without GPFs.

3. RH7.3 ses.ko + revert the three patches from comment #29

   Symlinks created some of the time, but get GPFs when removing/reinserting a number of disks (haven't tried rmmod/insmod yet).

I did more digging and think I could be on to something with the symlink issue. I tested with RH7.3 ses.ko + the comment #15 patch, added some crude printks (patch attached), and noticed that the disks were still being detected while SES started to create the enclosure_device symlinks. The symlinks started failing with -ENOENT (No such file or directory) on the first disk that hadn't been discovered yet. For example, notice in the boot log that 0:0:26:0 is the highest-numbered disk we've detected when SES's symlink creation starts to fail at 0:0:27:0:
...
[ 56.703415] sd 0:0:18:0: [sdr] Attached SCSI disk
[ 56.703507] sd 0:0:12:0: [sdl] Attached SCSI disk
[ 56.703770] sd 0:0:23:0: [sdw] Attached SCSI disk
[ 56.709272] sd 0:0:17:0: [sdq] Attached SCSI disk
[ 56.710065] sd 0:0:8:0: [sdh] Attached SCSI disk
[ 56.711357] sd 0:0:10:0: [sdj] Attached SCSI disk
[ 56.712004] sd 0:0:19:0: [sds] Attached SCSI disk
[ 56.712912] sd 0:0:14:0: [sdn] Attached SCSI disk
[ 56.718549] sd 0:0:16:0: [sdp] Attached SCSI disk
[ 56.719839] sd 0:0:5:0: [sde] Attached SCSI disk
[ 57.685196] sd 0:0:12:0: Attached scsi generic sg12 type 0
[ 57.691961] sd 0:0:13:0: Attached scsi generic sg13 type 0
[ 57.698555] sd 0:0:14:0: Attached scsi generic sg14 type 0
[ 57.705129] sd 0:0:15:0: Attached scsi generic sg15 type 0
[ 57.711686] sd 0:0:16:0: Attached scsi generic sg16 type 0
[ 57.718216] sd 0:0:17:0: Attached scsi generic sg17 type 0
[ 57.724780] sd 0:0:18:0: Attached scsi generic sg18 type 0
[ 57.731304] sd 0:0:19:0: Attached scsi generic sg19 type 0
[ 57.737858] sd 0:0:20:0: Attached scsi generic sg20 type 0
[ 57.744369] sd 0:0:21:0: Attached scsi generic sg21 type 0
[ 57.750884] sd 0:0:22:0: Attached scsi generic sg22 type 0
[ 57.757418] sd 0:0:23:0: Attached scsi generic sg23 type 0
[ 57.764694] sd 0:0:24:0: Attached scsi generic sg24 type 0
[ 57.769491] sd 0:0:24:0: [sdx] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
[ 57.769493] sd 0:0:24:0: [sdx] 4096-byte physical blocks
[ 57.771676] sd 0:0:24:0: [sdx] Write Protect is off
[ 57.771678] sd 0:0:24:0: [sdx] Mode Sense: db 00 10 08
[ 57.772994] sd 0:0:24:0: [sdx] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 57.786098] sd 0:0:24:0: [sdx] Attached SCSI disk
[ 57.815417] sd 0:0:25:0: Attached scsi generic sg25 type 0
[ 57.822119] sd 0:0:25:0: [sdy] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
[ 57.831434] sd 0:0:25:0: [sdy] 4096-byte physical blocks
[ 57.831451] sd 0:0:26:0: [sdz] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
[ 57.831452] sd 0:0:26:0: [sdz] 4096-byte physical blocks
[ 57.833630] sd 0:0:26:0: [sdz] Write Protect is off
[ 57.833631] sd 0:0:26:0: [sdz] Mode Sense: db 00 10 08
[ 57.834931] sd 0:0:26:0: [sdz] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 57.847084] sd 0:0:26:0: [sdz] Attached SCSI disk
[ 57.882587] sd 0:0:25:0: [sdy] Write Protect is off
[ 57.888340] sd 0:0:25:0: [sdy] Mode Sense: db 00 10 08
[ 57.895738] sd 0:0:25:0: [sdy] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 57.938246] device-mapper: multipath round-robin: version 1.1.0 loaded
[ 57.938494] sd 0:0:25:0: [sdy] Attached SCSI disk
[ 57.980800] ses_debug: 0:0:1:0, 5764824129064842522
[ 57.986801] ses_debug: rc=0
[ 58.058734] ses_debug: 0:0:2:0, 5764824129064880282
[ 58.064743] ses_debug: rc=0
[ 58.140634] ses_debug: 0:0:3:0, 5764824129064829498
[ 58.146583] ses_debug: rc=0
[ 58.221247] ses_debug: 0:0:4:0, 5764824129064882026
[ 58.227163] ses_debug: rc=0
[ 58.299347] ses_debug: 0:0:5:0, 5764824129064824566
[ 58.299366] ses_debug: rc=0
[ 58.375040] ses_debug: 0:0:6:0, 5764824129061792470
[ 58.380975] ses_debug: rc=0
[ 58.454510] ses_debug: 0:0:7:0, 5764824129049451830
[ 58.460616] ses_debug: rc=0
[ 58.536676] ses_debug: 0:0:8:0, 5764824129064879846
[ 58.542600] ses_debug: rc=0
[ 58.617497] ses_debug: 0:0:9:0, 5764824129064884190
[ 58.623429] ses_debug: rc=0
[ 58.699507] ses_debug: 0:0:10:0, 5764824129064262202
[ 58.699532] ses_debug: rc=0
[ 58.771484] ses_debug: 0:0:11:0, 5764824129064828746
[ 58.777576] ses_debug: rc=0
[ 58.850101] ses_debug: 0:0:12:0, 5764824129064878078
[ 58.856309] ses_debug: rc=0
[ 58.932471] ses_debug: 0:0:13:0, 5764824129064883994
[ 58.938896] ses_debug: rc=0
[ 59.013410] ses_debug: 0:0:14:0, 5764824129064884042
[ 59.019619] ses_debug: rc=0
[ 59.093899] ses_debug: 0:0:15:0, 5764824129064855742
[ 59.100222] ses_debug: rc=0
[ 59.174590] ses_debug: 0:0:16:0, 5764824129064844334
[ 59.180636] ses_debug: rc=0
[ 59.256396] ses_debug: 0:0:17:0, 5764824129064839942
[ 59.262561] ses_debug: rc=0
[ 59.340327] ses_debug: 0:0:18:0, 5764824129064883454
[ 59.346288] ses_debug: rc=0
[ 59.422159] ses_debug: 0:0:19:0, 5764824129064830418
[ 59.428123] ses_debug: rc=0
[ 59.504551] ses_debug: 0:0:20:0, 5764824129064841454
[ 59.504562] ses_debug: rc=0
[ 59.578479] ses_debug: 0:0:21:0, 5764824129064844882
[ 59.584427] ses_debug: rc=0
[ 59.660366] ses_debug: 0:0:22:0, 5764824129064852242
[ 59.666303] ses_debug: rc=0
[ 59.743882] ses_debug: 0:0:23:0, 5764824129064243850
[ 59.749811] ses_debug: rc=0
[ 59.826284] ses_debug: 0:0:24:0, 5764824129064876354
[ 59.832183] ses_debug: rc=0
[ 59.911356] ses_debug: 0:0:25:0, 5764824129064883990
[ 59.911366] ses_debug: rc=0
[ 59.984543] ses_debug: 0:0:26:0, 5764824129064841914
[ 59.990460] ses_debug: rc=0
[ 60.066524] ses_debug: 0:0:27:0, 5764824129064824278
[ 60.072440] ses_debug: createlink1 -2
[ 60.076869] ses_debug: cdev ffff88101ff8a318
[ 60.081992] ses_debug: cdev devname SLOT 58 52
[ 60.087546] ses_debug: cdev device SLOT 58 52
[ 60.092955] ses_debug: cdev dev 0:0:27:0
[ 60.097697] ses_debug: type 23
[ 60.101474] ses_debug: number 57
[ 60.105426] ses_debug: fault 0
[ 60.109155] ses_debug: active 0
[ 60.112989] ses_debug: status 0
[ 60.116832] ses_debug: rc=-2
[ 60.193019] ses_debug: 0:0:28:0, 5764824129064821270
[ 60.198872] ses_debug: createlink1 -2
[ 60.203347] ses_debug: cdev ffff88101ff8a5e8
[ 60.208463] ses_debug: cdev devname SLOT 59 53
[ 60.213955] ses_debug: cdev device SLOT 59 53
[ 60.219335] ses_debug: cdev dev 0:0:28:0
[ 60.224037] ses_debug: type 23
[ 60.227765] ses_debug: number 58
[ 60.231680] ses_debug: fault 0
[ 60.235400] ses_debug: active 0
[ 60.239208] ses_debug: status 0
[ 60.243033] ses_debug: rc=-2
[ 60.321518] ses_debug: 0:0:29:0, 5764824129064825026
[ 60.327370] ses_debug: createlink1 -2
[ 60.331750] ses_debug: cdev ffff88101ff8a8b8
[ 60.336797] ses_debug: cdev devname SLOT 60 54
[ 60.342257] ses_debug: cdev device SLOT 60 54
[ 60.347608] ses_debug: cdev dev 0:0:29:0
...
Created attachment 1242640 [details]
debug printks
(In reply to Tony Hutter from comment #34)
> Unfortunately for us, installing an older kernel into our netboot image
> can't be done easily, since we have a bunch of packages in our image that
> are kernel version specific (like zfs, nvidia, hyperv, etc). Also, the
> system is in high demand, so getting time on it is difficult.
>
> Just to summarize where we're at:
>
> 1. RH7.3 ses.ko
>
> No symlinks at all, but is stable (can rmmod/insmod).
>
> 2. RH7.3 ses.ko + comment #15 patch
>
> symlinks created some of the time as described in comment 29, but can't
> rmmod without GPFs.
>
> 3. RH7.3 ses.ko + revert the three patches from comment #29
>
> symlinks created some of the time, but get GPFs when removing/reinserting a
> number of disks (haven't tried rmmod/insmod yet)

We are working on fixing the GPF right now.

> I did more digging and think I could be on to something with the symlink
> issue. I tested with RH7.3 ses.ko + comment #15 patch, and added some crude
> printks (patch attached) and noticed that the disks were still being
> detected while SES started to create the enclosure_device symlinks.

Ah! This helps a lot, I understand the problem.

(In reply to Tony Hutter from comment #34)
> 1. RH7.3 ses.ko
>
> No symlinks at all, but is stable (can rmmod/insmod).
>
> 2. RH7.3 ses.ko + comment #15 patch
>
> symlinks created some of the time as described in comment 29, but can't
> rmmod without GPFs.

The GPF bug is going to be fixed in RHEL7.4, the following patch fixes it:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index cb4115569ad8..16442c5e7157 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1341,6 +1341,7 @@ void device_del(struct device *dev)
 	kobject_del(&dev->kobj);
 	/* This free's the allocation done in device_add() */
 	kfree(dev->device_rh);
+	dev->device_rh = NULL;
 	put_device(parent);
 }

Tony, about comment 34, can you post the entire dmesg please?

Created attachment 1245968 [details]
full dmesg log with debug printks
Tony, can you please try this patch? This should fix the sysfs links creation bug you described in comment 23 when using multipath. You have to apply it on top of the comment 15 and comment 37 patches.

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 3d4ae2f..91c35d3 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -381,8 +381,10 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
 	cdev = &edev->component[component];
 
-	if (cdev->dev == dev)
+	if (cdev->dev == dev) {
+		enclosure_add_links(cdev);
 		return -EEXIST;
+	}
 
 	if (cdev->dev)
 		enclosure_remove_links(cdev);

Tony, wait please, I made a mistake in this patch, I will send you a new one soon.

Tony, I fixed the patch in comment 42, please let me know if it solves the problem with the symlinks.

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 3d4ae2f..585a7ee 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -375,21 +375,33 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
 			 struct device *dev)
 {
 	struct enclosure_component *cdev;
+	int r;
 
 	if (!edev || component >= edev->components)
 		return -EINVAL;
 
 	cdev = &edev->component[component];
 
-	if (cdev->dev == dev)
+	if (cdev->dev == dev) {
+		if (!cdev->links_created) {
+			r = enclosure_add_links(cdev);
+			if (!r)
+				cdev->links_created = 1;
+		}
 		return -EEXIST;
+	}
 
 	if (cdev->dev)
 		enclosure_remove_links(cdev);
 	put_device(cdev->dev);
 	cdev->dev = get_device(dev);
-	return enclosure_add_links(cdev);
+	r = enclosure_add_links(cdev);
+	if (!r)
+		cdev->links_created = 1;
+	else
+		cdev->links_created = 0;
+	return r;
 }
 EXPORT_SYMBOL_GPL(enclosure_add_device);

diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h
index a4cf57c..9826a81 100644
--- a/include/linux/enclosure.h
+++ b/include/linux/enclosure.h
@@ -102,6 +102,7 @@ struct enclosure_component {
 	int active;
 	int locate;
 	int slot;
+	int links_created;
 	enum enclosure_status status;
 	int power_status;
 };

Thanks, I'll give it a test.

Nice, that seems to fix it. Great work Maurizio!

Thanks Tony. This requires a fix in the upstream Linux kernel, I am going to prepare a patch and submit it.

I tested the patches in comments 15, 37 and 44 on 3.10.0-514.2.2 in a system with 120 multipath SAS disks and it works.

# ls /sys/block/dm-8/slaves/sdei/device/enclosure_device\:DISK\ 18/
active  device  fault  locate  power  power_status  slot  status  type  uevent

Created attachment 1256925 [details]
Consolidated patches from comments 15, 37, and 44

Just to make it simpler for other testers and to make sure we are all testing the same thing, I consolidated the patches from comments 15, 37, and 44 into a single patch.

I tested the consolidated patch on a Hewlett Packard Enterprise Apollo 4520 with 46 multi-path SAS drives. The Apollo 4520 enclosure contains 2 server nodes, each with an H240 Smart HBA connected to 2 boxes of 23 drives each (such that each node can see both boxes and thus can access all 46 drives). With the patched kernel, all symlinks were created properly and no GPF was generated when the ses module was removed or inserted.

Thanks for testing it. The patch in comment 15 has been accepted into the upstream Linux kernel and is ready to be backported to RHEL. The patch in comment 37 has been merged into RHEL. The patch in comment 44 is still under review and not yet accepted by the upstream enclosure driver maintainer.

Hi Tony,

I am trying to have the latest patch (comment 44) accepted by the upstream Linux kernel community; they have however suggested an alternative patch which is a bit different from mine.

I prepared a test kernel, is it possible for you to test it? This test kernel also prints a few debug messages, can you please send me the dmesg output?

Thanks.

http://people.redhat.com/~mlombard/.bz1394089/kernel-3.10.0-620.el7_bz1394089_v2.x86_64.rpm

Does the test kernel include the other patches (comments 15 and 37) as well? If so, I can try and test it.
yes, it includes all the necessary patches

I tested the new kernel with multipath enabled on:

* HPE Apollo 4520 Gen 9 with H244br, H240, and H241 HBAs and a D6020 JBOD attached (2 + 46 + 69 disks)
* HPE DL360 Gen 9 with H240ar and H241 HBAs and a D3700 JBOD attached (2 + 25 disks)

The patched kernel worked:

* /sys/block/*/device/enclosure* symlinks created for all drives
* /sys/class/enclosure symlinks created for all enclosures
* ses could be removed and inserted repeatedly without error
* drives could be removed and reinserted without error

Thanks Christopher for testing it. Can you also post the output of the dmesg command after booting with the test kernel?

Created attachment 1266047 [details]
dmesg output of kernel-3.10.0-620.el7_bz1394089_v2.x86_64 on DL360+D3700
Created attachment 1266048 [details]
dmesg output of kernel-3.10.0-620.el7_bz1394089_v2.x86_64 on 4520+D6020
As requested, I have attached the full dmesg logs for both test configurations.

Christopher, thanks for testing it. The dmesg however is not really useful; is it possible to test the kernel on the same machine used for comment 23?

(In reply to Maurizio Lombardi from comment #59)
> Hi Tony,
>
> I am trying to have the latest patch (comment 44) accepted by the upstream
> Linux kernel community, they however suggested an alternative patch which is
> a bit different from mine.
>
> I prepared a test kernel, is it possible for you to test it?
> This test kernel also prints a few debug messages, can you please send me
> the dmesg output?
>
> Thanks.
>
> http://people.redhat.com/~mlombard/.bz1394089/kernel-3.10.0-620.el7_bz1394089_v2.x86_64.rpm

Maurizio, can you point me to a patch with the changes you want me to test? For us the patch is a lot easier to test than an RPM.
symlinks debug patch
Hi Tony,
You can find the patch attached to this message.
Please test it, thanks!
(In reply to Maurizio Lombardi from comment #67)
> Christopher,
>
> thanks for testing it.
> The dmesg however is not really useful, is it possible to test the kernel on
> the same machine used for comment 23?

Sorry, but I don't have access to the machine used in comment 23 (different company and different geographical location).

Thanks for the patch Maurizio. I'll see if I can get some time on our machines to test it.

We are running out of time for RHEL7.4. The patch in comment 15 is the most important one because it fixes a regression, so I will proceed to merge it in RHEL ASAP. Comment 37's patch has been merged in RHEL already. The patch in comment 44 has not been approved upstream yet, so I will defer it and open a separate BZ to track it.

Maurizio, I finally got around to testing the new patch (https://bugzilla.redhat.com/attachment.cgi?id=1268352&action=diff) and it worked fine. Symlinks were all created correctly.

Tony, do you have the dmesg output?
dmesg from newest patch
Created attachment 1270892 [details]
dmesg from newest patch - one bad drive example
Here's a (partial) dmesg output from an enclosure with a drive acting up (the drive is physically present, but missing from /sys/class/block/*). The bad drive is unrelated to your patch; it's probably a problem with the enclosure slot. I just thought it might be interesting output since it exercises some of your printks.
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Patch(es) available on kernel-3.10.0-650.el7

Hello, the bug has been fixed in kernel-3.10.0-650.el7, so I will move it to verified.

3.10.0-555.el7.x86_64
[root@storageqe-07 ~]# modprobe ses
[root@storageqe-07 ~]# ls -l /sys/block/sd*/device/en*
ls: cannot access /sys/block/sd*/device/en*: No such file or directory

3.10.0-650.el7.x86_64
[root@storageqe-07 ~]# ls -l /sys/block/sd*/device/en*
lrwxrwxrwx. 1 root root 0 Apr 17 22:09 /sys/block/sdaa/device/enclosure_device:SLOT 7 -> ../../../../port-1:1:19/end_device-1:1:19/target1:0:39/1:0:39:0/enclosure/1:0:39:0/SLOT 7
lrwxrwxrwx. 1 root root 0 Apr 17 22:09 /sys/block/sdab/device/enclosure_device:SLOT 8 -> ../../../../port-1:1:19/end_device-1:1:19/target1:0:39/1:0:39:0/enclosure/1:0:39:0/SLOT 8
lrwxrwxrwx. 1 root root 0 Apr 17 22:09 /sys/block/sdac/device/enclosure_dev

Requesting a z-stream fix for this issue. Customer impact (from https://bugzilla.redhat.com/show_bug.cgi?id=1455358#c3):

"Without this patch, the symlink in sysfs which binds a SAS device to an enclosure slot does not get created. This makes disk hotplug near impossible on large JBOD disk drawers."

*** Bug 1455358 has been marked as a duplicate of this bug.
***

------- Comment From cdeadmin.com 2017-05-26 00:41 EDT-------
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/dmesg_newest_patch.txt on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/r19-monitor3-dmesg on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/r19-osd2-dmesg on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/debug_printks.patch on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/dmesg_newest_patch_one_drive_missing.txt on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/console.jet21.gz on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/0001-fix-symlinks.patch on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/fix-enclosure.patch on 2017-05-25 23:32:56
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS06838293/dmesg-with-debug.txt on 2017-05-25 23:32:56

Created attachment 1282478 [details]
sosreport
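The before/after ls check used for verification in this thread can also be expressed as a pass/fail helper for regression testing. This is a sketch under the assumption that the link layout matches the ls output shown (on a live system the root is /sys/block); the function name and the root argument are mine.

```shell
#!/bin/sh
# all_links_present: exit 0 only if every sd device under the given root
# has at least one enclosure_device* symlink - the condition that held on
# the fixed kernel (3.10.0-650.el7) and failed on the broken one.
all_links_present() {
    root=$1
    seen=0
    for dev in "$root"/sd*; do
        [ -d "$dev" ] || continue
        seen=1
        ls "$dev"/device/enclosure_device* >/dev/null 2>&1 || return 1
    done
    # fail if there were no sd devices at all
    [ "$seen" -eq 1 ]
}

# On a live system, after 'modprobe ses':
#   all_links_present /sys/block && echo "verified"
```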
(In reply to John Jarvis from comment #83)
> Requesting a z-stream fix for this issue. Customer impact: (from
> https://bugzilla.redhat.com/show_bug.cgi?id=1455358#c3)
>
> "Without this patch, the symlink in sysfs which binds a SAS device to an
> enclosure slot does not get created. This makes disk hotplug near impossible
> on large JBOD disk drawers."

Is it accurate to say that this issue is not fixed for customers running multipath, or did I put the pieces together incorrectly here? Is this sufficient for LLNL, without multipath support?

The patch is working for us at LLNL. The bug itself is unrelated to multipath. I think the comment about making "hotplug near impossible" has to do with the fact that you need the sysfs links to tell you the mappings between disks and slot numbers. Without the links, if, say, disk /dev/sdac failed, you wouldn't know which slot number to yank the drive from. The sysfs links provide that:
> $ readlink /sys/class/block/sdac/device/enclosure_device*
> ../../../../../../port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:0/enclosure/0:0:0:0/SLOT 60 54
> $
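That disk-to-slot lookup can be generalized into a small helper that prints the mapping for every disk at once. This is a sketch: the function name and its root argument are mine (the argument lets the logic be tested against a mock tree; on a live system you would pass /sys/class/block), and the path layout is assumed to match the readlink output above.

```shell
#!/bin/sh
# map_slots: for every sd device under the given root, print
# "<device> <slot name>", where the slot name is the last component of the
# enclosure_device:<name> symlink target (e.g. "SLOT 60 54" above).
map_slots() {
    root=$1
    for link in "$root"/sd*/device/enclosure_device*; do
        [ -L "$link" ] || continue
        dev=${link%/device/*}     # strip "/device/enclosure_device:..."
        dev=${dev##*/}            # keep just the sdX name
        slot=$(readlink "$link")
        slot=${slot##*/}          # slot name is the link target's basename
        printf '%s %s\n' "$dev" "$slot"
    done
}

# On a live system: map_slots /sys/class/block
```

With the mapping in hand, a failed /dev/sdac can be matched to its physical slot without guessing.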
------- Comment From dougmill.com 2017-06-08 08:06 EDT-------
FYI, I have been running a 3.10.0-663.el7.ppc64le kernel (7.4 Beta) and not seen any issues - all symlinks for locate LEDs are always present from the /sys/block/*/device/enclosure*/locate path.

Hello,

This bug has been copied as 7.3 z-stream (EUS) bug #1460204.

Thank You
Joe Kachuck

------- Comment From cdeadmin.com 2017-06-28 06:45 EDT-------
This CMVC defect is being cancelled by the CDE Bridge because the corresponding CQ Defect [SW388722] was transferred out of the bridge domain. Here are the additional details:

New Subsystem = ppc_triage
New Release = unspecified
New Component = redhat_linux
New OwnerInfo = Chavez, Luciano (chavez.com)

To continue tracking this issue, please follow CQ defect [SW388722].

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842