Created attachment 333149
Patch to append the new entry to rdac_dev_list.

Description of problem:
The RDAC device handler kernel module scsi_dh_rdac does not include the DELL MD3000i in its rdac_dev_list structure. Users of the Dell MD3000i therefore cannot benefit from this module and are stuck with the vendor's DKMS mppVhba module.

Version-Release number of selected component (if applicable):
Tested against kernel-2.6.18-128.1.1.el5

Actual results:
Here are the iSCSI logon messages with the current kernel module:
-------------------BEGIN LOG---------------
Feb 24 11:44:51 rhel5r300 kernel: scsi13 : iSCSI Initiator over TCP/IP Feb 24 11:44:51 rhel5r300 kernel: scsi14 : iSCSI Initiator over TCP/IP Feb 24 11:44:51 rhel5r300 kernel: scsi15 : iSCSI Initiator over TCP/IP Feb 24 11:44:51 rhel5r300 kernel: scsi16 : iSCSI Initiator over TCP/IP Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: scsi 14:0:0:0: Attached scsi generic sg4 type 0 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: scsi 16:0:0:0: Attached scsi generic sg5 type 0 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: scsi 15:0:0:0: Attached scsi generic sg6 type 0 Feb 24 11:44:51 rhel5r300 kernel: scsi 13:0:0:0: Attached scsi generic sg7 type 0 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD<5> Vendor: DELL Model: 3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: MD3000i 
Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 24 11:44:51 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdb: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: sdb: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdc: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: sdc: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdb: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdd: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdc: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: sdd: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdd: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdc: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdb: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: sdc: Write Protect is off Feb 24 11:44:51 rhel5r300 multipathd: sdc: add path (uevent) Feb 24 11:44:51 rhel5r300 kernel: sdb: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdd: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdc: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: sdc:<5>SCSI device sdb: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: sdb:<5>sdd: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sdd: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: sdd: sdc1 Feb 24 11:44:51 rhel5r300 kernel: sd 14:0:0:61: Attached scsi disk sdc Feb 24 
11:44:51 rhel5r300 kernel: sd 14:0:0:61: Attached scsi generic sg8 type 0 Feb 24 11:44:51 rhel5r300 kernel: sdb1 Feb 24 11:44:51 rhel5r300 kernel: sd 16:0:0:61: Attached scsi disk sdb Feb 24 11:44:51 rhel5r300 kernel: sd 16:0:0:61: Attached scsi generic sg9 type 0 Feb 24 11:44:51 rhel5r300 kernel: SCSI device sde: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: sde: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sde: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 kernel: SCSI device sde: 52428800 512-byte hdwr sectors (26844 MB) Feb 24 11:44:51 rhel5r300 kernel: sde: Write Protect is off Feb 24 11:44:51 rhel5r300 kernel: SCSI device sde: drive cache: write back w/ FUA Feb 24 11:44:51 rhel5r300 iscsid: transport class version 2.0-724. iscsid version 2.0-868 Feb 24 11:44:51 rhel5r300 iscsid: iSCSI daemon with pid=5007 started! Feb 24 11:44:51 rhel5r300 iscsid: received iferror -38 Feb 24 11:44:51 rhel5r300 last message repeated 2 times Feb 24 11:44:51 rhel5r300 iscsid: connection2:0 is operational now Feb 24 11:44:51 rhel5r300 iscsid: received iferror -38 Feb 24 11:44:51 rhel5r300 last message repeated 2 times Feb 24 11:44:51 rhel5r300 iscsid: connection1:0 is operational now Feb 24 11:44:51 rhel5r300 iscsid: received iferror -38 Feb 24 11:44:51 rhel5r300 last message repeated 2 times Feb 24 11:44:51 rhel5r300 iscsid: connection4:0 is operational now Feb 24 11:44:51 rhel5r300 iscsid: received iferror -38 Feb 24 11:44:51 rhel5r300 last message repeated 2 times Feb 24 11:44:51 rhel5r300 iscsid: connection3:0 is operational now Feb 24 11:44:51 rhel5r300 kernel: sde:<6>rdac: device handler registered Feb 24 11:44:51 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. 
Feb 24 11:44:51 rhel5r300 kernel: sd 14:0:0:61: rdac: LUN 61 (owned) Feb 24 11:44:51 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: load table [0 52428800 multipath 0 1 rdac 1 1 round-robin 0 1 1 8:32 1000] Feb 24 11:44:51 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:51 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:44:51 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: event checker started Feb 24 11:44:51 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:51 rhel5r300 kernel: Buffer I/O error on device sde, logical block 0 Feb 24 11:44:51 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. Feb 24 11:44:51 rhel5r300 multipathd: sdb: add path (uevent) Feb 24 11:44:51 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: load table [0 52428800 multipath 0 1 rdac 1 1 round-robin 0 2 1 8:32 1000 8:16 1000] Feb 24 11:44:51 rhel5r300 multipathd: dm-2: add map (uevent) Feb 24 11:44:51 rhel5r300 multipathd: dm-2: devmap already registered Feb 24 11:44:51 rhel5r300 multipathd: dm-2: add map (uevent) Feb 24 11:44:51 rhel5r300 multipathd: dm-2: devmap already registered Feb 24 11:44:51 rhel5r300 kernel: sd 16:0:0:61: rdac: LUN 61 (owned) Feb 24 11:44:51 rhel5r300 kernel: sd 14:0:0:61: rdac Dettached Feb 24 11:44:52 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:52 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:44:52 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:52 rhel5r300 kernel: Buffer I/O error on device sde, logical block 0 Feb 24 11:44:52 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:52 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:44:52 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:52 rhel5r300 kernel: Buffer I/O error on device sde, 
logical block 0 Feb 24 11:44:52 rhel5r300 kernel: device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. Feb 24 11:44:52 rhel5r300 kernel: device-mapper: multipath: Failing path 8:32. Feb 24 11:44:52 rhel5r300 kernel: device-mapper: multipath: Could not failover device. Error 15. Feb 24 11:44:52 rhel5r300 multipathd: 8:32: mark as failed Feb 24 11:44:52 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: remaining active paths: 1 Feb 24 11:44:52 rhel5r300 multipathd: dm-2: add map (uevent) Feb 24 11:44:52 rhel5r300 multipathd: dm-2: devmap already registered Feb 24 11:44:52 rhel5r300 multipathd: dm-3: add map (uevent) Feb 24 11:44:53 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:53 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:44:53 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:53 rhel5r300 kernel: Buffer I/O error on device sde, logical block 0 Feb 24 11:44:53 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:53 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:44:53 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:53 rhel5r300 kernel: Buffer I/O error on device sde, logical block 0 Feb 24 11:44:54 rhel5r300 udevd-event[5082]: wait_for_sysfs: waiting for '/sys/devices/platform/host13/session1/target13:0:0/13:0:0:61/ioerr_cnt' failed Feb 24 11:44:54 rhel5r300 udevd-event[5083]: wait_for_sysfs: waiting for '/sys/devices/platform/host15/session3/target15:0:0/15:0:0:61/ioerr_cnt' failed Feb 24 11:44:54 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:54 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:54 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:54 rhel5r300 kernel: Dev sdd: unable to read RDB block 0 Feb 24 11:44:54 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:54 rhel5r300 
kernel: Dev sde: unable to read RDB block 0 Feb 24 11:44:55 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:55 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:56 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:44:56 rhel5r300 kernel: unable to read partition table Feb 24 11:44:56 rhel5r300 kernel: sd 13:0:0:61: Attached scsi disk sdd Feb 24 11:44:56 rhel5r300 kernel: sd 13:0:0:61: Attached scsi generic sg10 type 0 Feb 24 11:44:56 rhel5r300 multipathd: sdd: add path (uevent) Feb 24 11:44:56 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:44:56 rhel5r300 kernel: unable to read partition table Feb 24 11:44:56 rhel5r300 kernel: sd 15:0:0:61: Attached scsi disk sde Feb 24 11:44:56 rhel5r300 kernel: sd 15:0:0:61: Attached scsi generic sg11 type 0 Feb 24 11:44:56 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428672 Feb 24 11:44:56 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428672 Feb 24 11:44:57 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428672 Feb 24 11:44:57 rhel5r300 kernel: printk: 10 messages suppressed. 
Feb 24 11:44:57 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 6553584 Feb 24 11:44:57 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428672 Feb 24 11:44:57 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:44:57 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:44:58 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:44:58 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:44:58 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:44:58 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:44:59 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:44:59 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:44:59 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:44:59 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:45:00 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:45:00 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:45:00 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428736 Feb 24 11:45:00 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428736 Feb 24 11:45:01 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428784 Feb 24 11:45:01 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428784 Feb 24 11:45:01 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:45:01 rhel5r300 kernel: printk: 17 messages suppressed. 
Feb 24 11:45:01 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 6553599 Feb 24 11:45:01 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:45:02 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 52428792 Feb 24 11:45:02 rhel5r300 kernel: end_request: I/O error, dev sde, sector 52428792 Feb 24 11:45:03 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:03 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:03 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:03 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:04 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:04 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:04 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:04 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:05 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:05 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:05 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:05 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:06 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:06 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:06 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:06 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:07 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:07 rhel5r300 kernel: printk: 19 messages suppressed. 
Feb 24 11:45:07 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:45:07 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:07 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:07 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:08 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:08 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:08 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:08 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:09 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:09 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:09 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:09 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:10 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:10 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:11 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:11 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:11 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:11 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:12 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:12 rhel5r300 kernel: printk: 17 messages suppressed. 
Feb 24 11:45:12 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:45:12 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:12 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:12 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:13 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:13 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:13 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:13 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:14 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:14 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:14 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:14 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:15 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:15 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:15 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:15 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:16 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:16 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:16 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:16 rhel5r300 kernel: printk: 17 messages suppressed. Feb 24 11:45:16 rhel5r300 kernel: Buffer I/O error on device sdd, logical block 0 Feb 24 11:45:16 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:17 rhel5r300 kernel: end_request: I/O error, dev sdd, sector 0 Feb 24 11:45:17 rhel5r300 kernel: end_request: I/O error, dev sde, sector 0 Feb 24 11:45:17 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. 
Feb 24 11:45:17 rhel5r300 kernel: sd 14:0:0:61: rdac: LUN 61 (owned) Feb 24 11:45:17 rhel5r300 kernel: sd 13:0:0:61: rdac: LUN 61 (unowned) Feb 24 11:45:17 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. Feb 24 11:45:17 rhel5r300 kernel: sd 15:0:0:61: rdac: LUN 61 (unowned) Feb 24 11:45:17 rhel5r300 kernel: sd 14:0:0:61: rdac Dettached Feb 24 11:45:17 rhel5r300 kernel: sd 16:0:0:61: rdac Dettached
-------------------END LOG---------------

Expected results:
Here is the same iSCSI logon with the patched kernel module:
-------------------BEGIN LOG---------------
Feb 25 11:29:33 rhel5r300 multipathd: path checkers start up Feb 25 11:29:44 rhel5r300 kernel: scsi5 : iSCSI Initiator over TCP/IP Feb 25 11:29:44 rhel5r300 kernel: scsi6 : iSCSI Initiator over TCP/IP Feb 25 11:29:44 rhel5r300 kernel: scsi7 : iSCSI Initiator over TCP/IP Feb 25 11:29:44 rhel5r300 kernel: scsi8 : iSCSI Initiator over TCP/IP Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: Vendor: DEL<5> Vendor: DELL Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: L Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: scsi 6:0:0:0: rdac: LUN 0 (unowned) Feb 25 11:29:44 rhel5r300 kernel: scsi 6:0:0:0: Attached scsi generic sg4 type 0 Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL Model: MD3000i <5>scsi 7:0:0:0: rdac: LUN 0 (unowned) Feb 25 11:29:44 rhel5r300 kernel: scsi 7:0:0:0: Attached scsi generic sg5 type 0 Feb 25 11:29:44 rhel5r300 kernel: Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: 
Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: scsi 6:0:0:61: rdac: LUN 61 (unowned) Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdb: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: scsi 8:0:0:0: rdac: LUN 0 (unowned) Feb 25 11:29:44 rhel5r300 kernel: scsi 8:0:0:0: Attached scsi generic sg6 type 0 Feb 25 11:29:44 rhel5r300 kernel: sdb: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdb: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: scsi 5:0:0:0: rdac: LUN 0 (unowned) Feb 25 11:29:44 rhel5r300 kernel: scsi 5:0:0:0: Attached scsi generic sg7 type 0 Feb 25 11:29:44 rhel5r300 kernel: scsi 7:0:0:61: rdac: LUN 61 (owned) Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdb: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL <5>sdb: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: Model: MD300<5>SCSI device sdb: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: sdb:<3>Buffer I/O error on device sdb, logical block 0 Feb 25 11:29:44 rhel5r300 kernel: Buffer I/O error on device sdb, logical block 0 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 multipathd: sdb: add path (uevent) Feb 25 11:29:44 rhel5r300 kernel: Buffer I/O error on device sdb, logical block 0 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 kernel: Dev sdb: unable to read RDB block 0 Feb 25 11:29:44 rhel5r300 kernel: Buffer I/O error on device sdb, logical block 0 Feb 25 11:29:44 rhel5r300 kernel: Buffer I/O error on device sdb, logical block 0 Feb 25 11:29:44 rhel5r300 kernel: unable to read partition table Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdc: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sdc: Write 
Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdc: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: Vendor: DELL Model: MD3000i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: 0<5>sd 6:0:0:61: Attached scsi disk sdb Feb 25 11:29:44 rhel5r300 kernel: sd 6:0:0:61: Attached scsi generic sg8 type 0 Feb 25 11:29:44 rhel5r300 kernel: scsi 5:0:0:61: rdac: LUN 61 (owned) Feb 25 11:29:44 rhel5r300 kernel: i Rev: 0670 Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdc: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sdc: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdd: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sdd: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdc: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: sdc: sdc1 Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdd: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: Type: Direct-Access <5>sd 7:0:0:61: Attached scsi disk sdc Feb 25 11:29:44 rhel5r300 kernel: sd 7:0:0:61: Attached scsi generic sg9 type 0 Feb 25 11:29:44 rhel5r300 kernel: ANSI SCSI revision: 05 Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdd: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sdd: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: scsi 8:0:0:61: rdac: LUN 61 (unowned) Feb 25 11:29:44 rhel5r300 kernel: SCSI device sdd: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: sdd: sdd1 Feb 25 11:29:44 rhel5r300 kernel: sd 5:0:0:61: Attached scsi disk sdd Feb 25 11:29:44 rhel5r300 kernel: sd 5:0:0:61: Attached scsi generic sg10 type 0 Feb 25 11:29:44 rhel5r300 kernel: SCSI device sde: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sde: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sde: drive cache: write back w/ FUA 
Feb 25 11:29:44 rhel5r300 kernel: SCSI device sde: 52428800 512-byte hdwr sectors (26844 MB) Feb 25 11:29:44 rhel5r300 kernel: sde: Write Protect is off Feb 25 11:29:44 rhel5r300 kernel: SCSI device sde: drive cache: write back w/ FUA Feb 25 11:29:44 rhel5r300 kernel: sde:<3>Buffer I/O error on device sde, logical block 0 Feb 25 11:29:44 rhel5r300 kernel: Dev sde: unable to read RDB block 0 Feb 25 11:29:44 rhel5r300 kernel: unable to read partition table Feb 25 11:29:44 rhel5r300 kernel: sd 8:0:0:61: Attached scsi disk sde Feb 25 11:29:44 rhel5r300 kernel: sd 8:0:0:61: Attached scsi generic sg11 type 0 Feb 25 11:29:44 rhel5r300 iscsid: received iferror -38 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 iscsid: connection2:0 is operational now Feb 25 11:29:44 rhel5r300 iscsid: received iferror -38 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 iscsid: connection1:0 is operational now Feb 25 11:29:44 rhel5r300 iscsid: received iferror -38 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 iscsid: connection3:0 is operational now Feb 25 11:29:44 rhel5r300 iscsid: received iferror -38 Feb 25 11:29:44 rhel5r300 last message repeated 2 times Feb 25 11:29:44 rhel5r300 iscsid: connection4:0 is operational now Feb 25 11:29:45 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. Feb 25 11:29:45 rhel5r300 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management. 
Feb 25 11:29:45 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: failed in domap for addition of new path sdb Feb 25 11:29:45 rhel5r300 multipathd: uevent trigger error Feb 25 11:29:45 rhel5r300 multipathd: sdc: add path (uevent) Feb 25 11:29:45 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: failed in domap for addition of new path sdc Feb 25 11:29:45 rhel5r300 multipathd: uevent trigger error Feb 25 11:29:45 rhel5r300 multipathd: sdd: add path (uevent) Feb 25 11:29:45 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: failed in domap for addition of new path sdd Feb 25 11:29:45 rhel5r300 multipathd: uevent trigger error Feb 25 11:29:45 rhel5r300 multipathd: sde: add path (uevent) Feb 25 11:29:45 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: failed in domap for addition of new path sde Feb 25 11:29:45 rhel5r300 multipathd: uevent trigger error Feb 25 11:29:45 rhel5r300 multipathd: dm-2: add map (uevent) Feb 25 11:29:45 rhel5r300 multipathd: 360022190009252d80000149a498a71ff: event checker started Feb 25 11:29:45 rhel5r300 kernel: sd 6:0:0:61: queueing MODE_SELECT command. Feb 25 11:29:47 rhel5r300 multipathd: dm-3: add map (uevent) Feb 25 11:29:55 rhel5r300 multipathd: 8:16: reinstated Feb 25 11:29:55 rhel5r300 multipathd: 8:32: reinstated Feb 25 11:29:55 rhel5r300 multipathd: 8:48: reinstated Feb 25 11:29:55 rhel5r300 multipathd: 8:64: reinstated
-------------------END LOG---------------

Additional info:
The attached patch (scsi-dh-rdac-add-dell-md3000i.patch) appends the new entry to rdac_dev_list.
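For readers who have not looked at the handler source, the following is a minimal user-space sketch of what "appending an entry to rdac_dev_list" amounts to. It is not the attached patch. The struct name `rdac_dev_entry`, the function `rdac_dev_supported`, the IBM entry, and the prefix-style comparison are assumptions made for illustration; only the name `rdac_dev_list` and the DELL/MD3000i vendor and model strings come from this report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the handler's device table. */
struct rdac_dev_entry {
    const char *vendor;
    const char *model;
};

/* Illustrative subset only; the real table is assumed to hold more entries. */
static const struct rdac_dev_entry rdac_dev_list[] = {
    {"IBM", "1722"},      /* assumed example of a pre-existing entry */
    {"DELL", "MD3000i"},  /* the entry this report asks for */
    {NULL, NULL},         /* terminator */
};

/*
 * Return 1 if the vendor/model pair begins with the strings of some
 * table entry, 0 otherwise. A prefix comparison is used here because
 * SCSI INQUIRY vendor and model fields are padded with spaces.
 */
static int rdac_dev_supported(const char *vendor, const char *model)
{
    size_t i;

    assert(vendor != NULL && model != NULL);

    for (i = 0; rdac_dev_list[i].vendor != NULL; i++) {
        const struct rdac_dev_entry *e = &rdac_dev_list[i];

        if (strncmp(vendor, e->vendor, strlen(e->vendor)) == 0 &&
            strncmp(model, e->model, strlen(e->model)) == 0)
            return 1;
    }
    return 0;
}
```

In this sketch, a device reporting vendor "DELL" and model "MD3000i" is accepted once the new row is present, while a device reporting plain "MD3000" does not match the MD3000i row and would need a row of its own.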
I have the same issue here. On the 5.3 kernel I get major I/O errors and am pretty much unable to use the iSCSI devices:

[root@uxlabi ~]# uname -a
Linux uxlabi 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@uxlabi ~]# dmesg
device-mapper: multipath: Could not failover device. Error 15.
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:208.
device-mapper: multipath: Could not failover device. Error 15.
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:192.
device-mapper: multipath: Could not failover device. Error 15.
[root@uxlabi ~]# lsmod | grep dh_rdac
scsi_dh_rdac 40897 0
scsi_dh 41665 2 scsi_dh_rdac,dm_multipath
scsi_mod 196569 10 mptctl,scsi_dh_rdac,sg,ib_iser,iscsi_tcp,libiscsi,scsi_transport_iscsi,scsi_dh,cciss,sd_mod
[root@uxlabi ~]# pvscan
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:48.
device-mapper: multipath: Could not failover device. Error 15.
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:0.
device-mapper: multipath: Could not failover device. Error 15.
/dev/mapper/mpath4p1: read failedevice-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. d after 0 of 512 at 107372675072device-mapper: multipath: Failing path 8:112. : Input/output error /dev/mapdevice-mapper: multipath: Could not failover device. Error 15. per/mpath4p1: read failed after device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. 0 of 512 at 107372761088: Input/device-mapper: multipath: Failing path 8:96. output error /dev/mapper/mpatdevice-mapper: multipath: Could not failover device. Error 15. 
h4p1: read failed after 0 of 512device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. at 0: Input/output error /dedevice-mapper: multipath: Failing path 8:176. v/mapper/mpath4p1: read failed adevice-mapper: multipath: Could not failover device. Error 15. fter 0 of 512 at 4096: Input/outdevice-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. put error /dev/mapper/mpath4pdevice-mapper: multipath: Failing path 8:160. 1: read failed after 0 of 2048 adevice-mapper: multipath: Could not failover device. Error 15. t 0: Input/output error /dev/device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. mapper/mpath4: read failed afterdevice-mapper: multipath: Failing path 8:208. 0 of 4096 at 107374116864: Inpudevice-mapper: multipath: Could not failover device. Error 15. t/output error /dev/mapper/mpdevice-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded. ath4: read failed after 0 of 409device-mapper: multipath: Failing path 8:192. 6 at 107374174208: Input/output device-mapper: multipath: Could not failover device. Error 15. 
error /dev/mapper/mpath4: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath4: read failed after 0 of 4096 at 4096: Input/output error /dev/mapper/mpath4: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath2: read failed after 0 of 4096 at 107374116864: Input/output error /dev/mapper/mpath2: read failed after 0 of 4096 at 107374174208: Input/output error /dev/mapper/mpath2: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath2: read failed after 0 of 4096 at 4096: Input/output error /dev/mapper/mpath2: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath5: read failed after 0 of 4096 at 107374116864: Input/output error /dev/mapper/mpath5: read failed after 0 of 4096 at 107374174208: Input/output error /dev/mapper/mpath5: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath5: read failed after 0 of 4096 at 4096: Input/output error /dev/mapper/mpath5: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath3: read failed after 0 of 4096 at 107373592576: Input/output error /dev/mapper/mpath3: read failed after 0 of 4096 at 107373649920: Input/output error /dev/mapper/mpath3: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath3: read failed after 0 of 4096 at 4096: Input/output error /dev/mapper/mpath3: read failed after 0 of 4096 at 0: Input/output error /dev/mapper/mpath2p1: read failed after 0 of 512 at 107372675072: Input/output error /dev/mapper/mpath2p1: read failed after 0 of 512 at 107372761088: Input/output error /dev/mapper/mpath2p1: read failed after 0 of 512 at 0: Input/output error /dev/mapper/mpath2p1: read failed after 0 of 512 at 4096: Input/output error /dev/mapper/mpath2p1: read failed after 0 of 2048 at 0: Input/output error /dev/mapper/mpath5p1: read failed after 0 of 512 at 107372675072: Input/output error /dev/mapper/mpath5p1: read failed after 0 of 512 at 107372761088: Input/output error /dev/mapper/mpath5p1: read failed 
after 0 of 512 at 0: Input/output error /dev/mapper/mpath5p1: read failed after 0 of 512 at 4096: Input/output error /dev/mapper/mpath5p1: read failed after 0 of 2048 at 0: Input/output error /dev/mapper/mpath3p1: read failed after 0 of 512 at 107372675072: Input/output error /dev/mapper/mpath3p1: read failed after 0 of 512 at 107372761088: Input/output error /dev/mapper/mpath3p1: read failed after 0 of 512 at 0: Input/output error /dev/mapper/mpath3p1: read failed after 0 of 512 at 4096: Input/output error /dev/mapper/mpath3p1: read failed after 0 of 2048 at 0: Input/output error PV /dev/cciss/c0d0p3 VG vg1 lvm2 [66.25 GB / 38.97 GB free] Total: 1 [66.25 GB] / in use: 1 [66.25 GB] / in no VG: 0 [0 ] If I reboot back to the old 5.2 kernel everything works fine: [root@uxlabi ~]# uname -a Linux uxlabi 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux [root@uxlabi ~]# pvscan PV /dev/mapper/mpath4p1 VG vg2 lvm2 [100.00 GB / 0 free] PV /dev/mapper/mpath2p1 VG vg2 lvm2 [100.00 GB / 1016.00 MB free] PV /dev/mapper/mpath5p1 VG vg2 lvm2 [100.00 GB / 100.00 GB free] PV /dev/mapper/mpath3p1 VG vg2 lvm2 [100.00 GB / 100.00 GB free] PV /dev/cciss/c0d0p3 VG vg1 lvm2 [66.25 GB / 38.97 GB free] Total: 5 [466.23 GB] / in use: 5 [466.23 GB] / in no VG: 0 [0 ]
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 341005 [details] patch adding Dell MD3000 and MD3000i to scsi_dh_rdac I tested this patch, though I added the MD3000 in as well. This worked with kernel-2.6.18-140.el5 source.
in kernel-2.6.18-144.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
Multipath works for me with this test kernel (-144).
Actually, with further testing, failover doesn't seem to be working. I see messages similar to those listed above:

device-mapper: multipath: Failing path 8:112.
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:32.
device-mapper: multipath: Could not failover device. Error 15.
device-mapper: multipath: Cannot failover device because scsi_dh_rdac was not loaded.
device-mapper: multipath: Failing path 8:48.
device-mapper: multipath: Could not failover device. Error 15.

scsi_dh_rdac 40897 4
scsi_dh 41537 2 scsi_dh_rdac,dm_multipath
scsi_mod 196569 16 ib_iser,iscsi_tcp,libiscsi2,scsi_transport_iscsi2,st,sr_mod,scsi_dh_rdac,dm_rdac,scsi_dh,libata,mptsas,mptscsih,scsi_transport_sas,megaraid_sas,sg,sd_mod
I have been trying to configure RHEL5 to use MD3000i. I was able to eliminate some errors in the output of `multipath -v2` by using the configurations specified in the following two messages. Are these configurations correct? http://www.linux-archive.org/device-mapper-development/224799-multipathd-sdc-readsector0-checker-reports-path-down.html http://www.linux-archive.org/device-mapper-development/224799-multipathd-sdc-readsector0-checker-reports-path-down.html Without this patch, I am still getting errors in dmesg when I log in and log out. How safe is it to use the MD3000i without this patch?
(In reply to comment #9)
> I have been trying to configure RHEL5 to use MD3000i. I was able to eliminate
> some errors in the output of `multipath -v2` by using the configurations
> specified in the following two messages. Are these configurations correct?
>

I think you want to contact Dell or LSI or IBM or whoever you bought the box from to be 100% sure what they recommend, but for the path checker you want rdac instead of readsector0, and for the prio callout you want rdac too. If you do not use the rdac path checker mentioned in those threads, you will get a lot of errors about readsector0 failing paths.

> http://www.linux-archive.org/device-mapper-development/224799-multipathd-sdc-readsector0-checker-reports-path-down.html
>
> http://www.linux-archive.org/device-mapper-development/224799-multipathd-sdc-readsector0-checker-reports-path-down.html
>
> Without this patch, I am still getting errors in dmesg when I log in and log
> out. How safe is it to use the MD3000i without this patch?

Do you mean you are not using this patch but you are using the updated settings from those threads? Are you seeing "Buffer I/O error" and "end_request: I/O error" when you log in? Those are expected from non-active paths. What errors are you seeing on logout? Are you seeing errors during failovers?
Created attachment 346455 [details] text file containing errors from dmesg when logging in and logging out with iscsiadm Thanks very much for your response. I am not using this patch but I am using the updated settings. I do see a lot of "Buffer I/O error" messages and "end_request: I/O error" messages when logging in. The other error messages that I see on login and logout are in the attached file. I have not tested failover because my application does not require high availability.
I would like to point out that I am aware of at least one other set of changes that went into scsi_dh_rdac.c in addition to the patch for this bz. This other bz (bz489582) included several upstream patches to scsi_dh_rdac.c. I would ask that kernel-2.6.18-150.el5 (or newer) be downloaded and tested because RHEL5.4 will be including patches for both bugzillas.
Running kernel-2.6.18-151.el5 now, which seems more robust. Have done some brief failover tests which have worked but will need to do some more. Will report my progress in this bz.
Mounted an LV created from one of 3 virtual disks and started copying files in there. Removed one of the iSCSI connections. Failover seemed to work as expected:

kernel: igb: eth3 NIC Link is Down
Jun 4 16:18:59 kernel: connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4311571523, last ping 4311576523, now 4311581523
Jun 4 16:18:59 kernel: connection4:0: detected conn error (1011)
Jun 4 16:19:00 multipathd: sdd: rdac checker reports path is down
Jun 4 16:19:00 multipathd: checker failed path 8:48 in map mpath4
Jun 4 16:19:00 multipathd: mpath4: remaining active paths: 3
Jun 4 16:19:00 kernel: device-mapper: multipath: Failing path 8:48.
Jun 4 16:19:00 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
Jun 4 16:19:39 iscsid: connect failed (113)
Jun 4 16:20:14 last message repeated 6 times
Jun 4 16:21:20 last message repeated 11 times
Jun 4 16:21:24 kernel: session4: session recovery timed out after 144 secs
Jun 4 16:21:24 kernel: sd 7:0:0:2: SCSI error: return code = 0x000f0000
Jun 4 16:21:24 kernel: end_request: I/O error, dev sdq, sector 105996296
Jun 4 16:21:24 multipathd: sdm: rdac checker reports path is down
Jun 4 16:21:24 multipathd: checker failed path 8:192 in map mpath5
Jun 4 16:21:24 kernel: device-mapper: multipath: Failing path 8:192.
Jun 4 16:21:24 kernel: device-mapper: multipath: Failing path 65:0.
Jun 4 16:21:24 multipathd: mpath5: remaining active paths: 3

Then, when I reconnected the cable:

Jun 4 16:25:13 kernel: igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Jun 4 16:25:14 iscsid: connection4:0 is operational after recovery (59 attempts)
Jun 4 16:25:15 multipathd: sdm: rdac checker reports path is down
Jun 4 16:25:15 multipathd: sdq: rdac checker reports path is down
Jun 4 16:25:15 multipathd: sdd: rdac checker reports path is down
Jun 4 16:25:15 multipathd: 8:48: reinstated
Jun 4 16:25:15 multipathd: mpath4: remaining active paths: 4
Jun 4 16:25:15 multipathd: sdm: rdac checker reports path is down
Jun 4 16:25:15 multipathd: 8:192: reinstated
Jun 4 16:25:15 multipathd: mpath5: remaining active paths: 4
Jun 4 16:25:15 multipathd: sdq: rdac checker reports path is down
Jun 4 16:25:15 multipathd: 65:0: reinstated
Jun 4 16:25:15 multipathd: mpath6: remaining active paths: 4
Jun 4 16:25:15 multipathd: dm-0: add map (uevent)
Jun 4 16:25:15 multipathd: dm-0: devmap already registered
Jun 4 16:25:15 multipathd: dm-1: add map (uevent)
Jun 4 16:25:15 multipathd: dm-1: devmap already registered
Jun 4 16:25:15 multipathd: dm-2: add map (uevent)
Jun 4 16:25:15 multipathd: dm-2: devmap already registered

There were lots of other SCSI error messages before the link came back up - not sure whether that's expected.
(In reply to comment #14)
> There were lots of other SCSI error messages before the link came back up - not
> sure whether that's expected.

You should see errors there. The iscsi layer fails the IO that was running on those paths up to the scsi layer, and the scsi layer then fails the IO to the dm-multipath layer, so dm-multipath can retry on a new path while the iscsi layer tries to fix the connection.

It depends on the kernel version and when the error occurs, but you should see

SCSI error: return code = 0x

and/or

end_request: I/O error
Yes, I thought something like that was going on. Those were the errors I saw and they stopped once dm-multipath noticed the connection had gone.
I discovered this problem last night on new Xen domain 0 systems I was trying to deploy. Just wanted to report that today I'm running kernel-xen-2.6.18-155.el5.x86_64 (from http://people.redhat.com/dzickus/el5/155.el5/ ) on a couple of my dom0 hosts and it appears to be working well. At least I'm not seeing the failed failovers :).
Thought it worth noting here that I've had some problems when testing the addition and removal of devices on the MD3000i, though it's not quite clear to me whether this is fully supported by multipath. The details are in https://bugzilla.redhat.com/show_bug.cgi?id=509396.
I would like to add that the patch is working fine for me. However, please note the following: I had to put scsi_dh_rdac in modprobe.conf. It wouldn't load automatically. I also had to use a slightly different multipath configuration. Most MD3000i configuration examples say you have to use path_grouping_policy "group_by_prio". Both "ready" paths will have the same priority assigned. However, just 1 of the 2 "ready" paths is actually accessible. The other "ready" path will fail instantly when you try to use it. In my case, multipath kept trying to use the inaccessible path, which caused multipath to continuously fail and recover that path. Use path_grouping_policy "failover" instead.
You are using "rdac" as a path checker and rdac (or tpc) as a prio_callout, right? In that case, "ready" means that the path is connected to the active controller and IO should succeed without any issues. Changing the path_grouping_policy to "failover" is not a good option. "failover" would create one path group for each path, which is not what you want (it would reduce throughput), since you have 2 paths leading to the active controller.
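For reference, the settings discussed in the last few comments would sit together in /etc/multipath.conf roughly as follows. This is a sketch using RHEL 5 era syntax, not a vendor-validated configuration: the callout path /sbin/mpath_prio_rdac, the hardware_handler line and the failback value are assumptions that should be checked against the installed device-mapper-multipath package and the array vendor's recommendations.

```
devices {
	device {
		vendor			"DELL"
		product			"MD3000i"
		# group paths by priority so both paths to the active
		# controller share one path group
		path_grouping_policy	group_by_prio
		# rdac priority callout (path is an assumption)
		prio_callout		"/sbin/mpath_prio_rdac /dev/%n"
		# rdac checker instead of readsector0
		path_checker		rdac
		# ask dm-multipath to use the scsi_dh_rdac handler (assumed)
		hardware_handler	"1 rdac"
		# assumed; pick the failback policy the vendor recommends
		failback		immediate
	}
}
```

With group_by_prio, the two paths to the active controller should land in one path group and share IO, while the paths to the passive controller form a second group that is only used after a failover.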
You seem to be absolutely correct. However, I did experience the behaviour I described above, where a path that was ready would keep failing once accessed and recover immediately. I incorrectly assumed the cause and the fix I mentioned. I'll try to recreate the issue so I can do some further testing. This will be hard, though, as everything is in production at the moment, but I think I can free up some hardware and a virtual disk to test with. Thanks for pointing out my mistake; it would be a waste not to load balance those ethernet interfaces.
Respond here with what you find. I assume you are using RHEL 5.4 (if not, please use RHEL 5.4, which has a few bugs fixed).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1243.html