Description of problem:
Testing on Intel Modular Server with dual redundant storage controllers shows intermittent kernel panics during storage controller failover/failback testing.

How reproducible:
This reliably happens about 1 in 20 failover/failback cycles on RHEL5.3 RC2.

Steps to Reproduce:
1. Install RHEL5.3 RC2 x64 on Intel Modular Server with dual redundant storage controllers and configure multipath storage (see attached patches and BKM)
2. Perform storage controller failover/failback by alternately resetting each storage controller
3. Watch for kernel panic

Actual results:
Kernel panic experienced about 1 in 20 failover/failback cycles.

Expected results:
Kernel panic should never be caused by failover/failback.

More details on kernel panic will be posted momentarily.
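As a rough illustration of what "about 1 in 20" means for verifying a fix, the sketch below computes how many consecutive panic-free cycles are needed before a clean run is meaningful. It assumes each failover/failback cycle panics independently with probability 1/20; that independence is an assumption, not something measured in this report, and the function names are made up for the example.

```python
import math


def p_at_least_one_panic(cycles: int, p_panic: float = 1 / 20) -> float:
    """Probability of seeing at least one panic in `cycles` independent cycles."""
    return 1.0 - (1.0 - p_panic) ** cycles


def cycles_for_confidence(confidence: float, p_panic: float = 1 / 20) -> int:
    """Smallest number of cycles in which a bug with per-cycle rate `p_panic`
    would show up at least once with probability >= `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_panic))


if __name__ == "__main__":
    for n in (20, 59, 100):
        print(n, "cycles ->", round(p_at_least_one_panic(n), 3))
    print("cycles for 95% confidence:", cycles_for_confidence(0.95))
```

Under that assumption, 20 clean cycles would still miss the bug roughly a third of the time, so a fix should be exercised for considerably longer than one nominal failure interval.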
Note this bug BLOCKS support of RHEL5.3 on Intel Modular Server product line.
Is this the box using ALUA? Is it using the scsi_dh_alua module in this setup?
Procedure for Configuring MPIO on RHEL 5.3

1. Install RHEL 5.3, with 'linux mpath' typed immediately when CD is chosen for installation. Make sure the 'Virtualization' is not selected when prompted for choosing the packages to load.
2. Install the following rpm's:
   kernel-2.6.18-120_INTEL_ALUA_MPIO.el5.i686.rpm using rpm -ivh --force <rpm name as given in the link above> for RHEL 32 bit, and
   kernel-2.6.18-120_INTEL_ALUA_MPIO_SUPPORT.el5.x86_64.rpm using rpm -ivh --force <rpm name as given in the link above> for RHEL 64 bit
3. Reboot, and make sure the new kernel (installed above as .rpm) is selected for boot.
4. Run 'uname -a' and check to make sure the kernel name is same as the RPM that was installed.
5. Then install the mpath_prio_intel-1.0.0140-4.i386.rpm for 32 bit and mpath_prio_intel-1.0.0140-4.x86_64.rpm for 64 bit.
6. Create multipath.conf under /etc/ and edit to have entries as mentioned below.
7. Reboot and make sure the machine boots up fine.
8. Run 'multipath -ll' to see the devices configured as multipath devices.

Description of /etc/multipath.conf

defaults {
        user_friendly_names yes
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^(hd|xvd)[a-z][[0-9]*]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
devices {
        device {
                vendor                  "Intel"
                product                 "Multi-Flex"
                path_grouping_policy    "group_by_prio"
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout            "/sbin/mpath_prio_intel /dev/%n"
                path_checker            tur
                path_selector           "round-robin 0"
                hardware_handler        "1 alua"
                failback                immediate
                rr_weight               uniform
                rr_min_io               100
                no_path_retry           queue
                features                "1 queue_if_no_path"
        }
}
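To see what the first blacklist entry in that multipath.conf does, here is a small sketch that applies the pattern to a few device node names that show up in this setup. It uses Python's `re` module as a stand-in for the POSIX regex matching done by multipath-tools; for this particular pattern the two should agree, but that equivalence is an assumption of the example, and `is_blacklisted` is a hypothetical helper name.

```python
import re

# First devnode pattern from the blacklist section of the multipath.conf above.
BLACKLIST = re.compile(r"^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*")


def is_blacklisted(devnode: str) -> bool:
    """Return True if the device node name matches the blacklist pattern."""
    return BLACKLIST.search(devnode) is not None


# Device node names of the kind seen on the test machine.
names = ["dm-0", "md0", "ram10", "sr0", "sda", "sdd"]
result = {name: is_blacklisted(name) for name in names}

if __name__ == "__main__":
    for name, hit in result.items():
        print(name, "blacklisted" if hit else "candidate path")
```

The sd* devices fall through the blacklist and are treated as candidate paths, while the dm-*, md*, ram* and sr* nodes are skipped.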
Created attachment 330026 [details] kernel-2.6.18-120_INTEL_ALUA_MPIO_SUPPORT.el5.x86_64.rpm
Created attachment 330027 [details] mpath_prio_intel-1.0.0140-3.x86_64.rpm
Hi Mike, in response to your question, yes, this is with scsi_dh_alua. Please also note the kernel panic failures are new to us in RHEL5.3 RC2. Earlier testing (such as against beta and snapshot2) did not show the kernel panic problems.
Created attachment 330042 [details] zip file containing console logs of kernel panics
(In reply to comment #6)
> Hi Mike, in response to your question, yes, this is with scsi_dh_alua.
>
> Please also note the kernel panic failures are new to us in RHEL5.3 RC2.
> Earlier testing (such as against beta and snapshot2) did not show the kernel
> panic problems.

There were several bugs in scsi_dh_alua so we ended up dropping it from the final release. In some email or one of the other BZs about this or the call, we told you that we were shooting for RHEL 5.4 with this. We were only going to tech preview this for 5.3, but due to its instability we could not do that.

What is up with this BZ vs the others that you guys made? Didn't you guys make a bugzilla or feature request shooting for 5.4 support for scsi_dh_alua?
(In reply to comment #8)
> (In reply to comment #6)
> > Hi Mike, in response to your question, yes, this is with scsi_dh_alua.
> >
> > Please also note the kernel panic failures are new to us in RHEL5.3 RC2.
> > Earlier testing (such as against beta and snapshot2) did not show the kernel
> > panic problems.
>
> There were several bugs in scsi_dh_alua so we ended up dropping it from the
> final release. In some email or one of the other BZs about this or the call, we
> told you that we were shooting for RHEL 5.4 with this. We were only going to
> tech preview this for 5.3, but due to its instability we could not do that.
>
> What is up with this BZ vs the others that you guys made? Didn't you guys make
> a bugzilla or feature request shooting for 5.4 support for scsi_dh_alua?

I am a little lost on what is going on, because it seemed like you are using RHEL 5.3 GA in some other comments. Are you guys distributing the scsi_dh_alua module yourselves, because we dropped it, and so this bugzilla is just asking for help with the version you guys are distributing? If so, could you attach it here? Also, did you guys ever send any of these boxes to Red Hat?
Did you guys want to do a real quick call so we can sync up on all this?
(In reply to comment #8)
> (In reply to comment #6)
> > Hi Mike, in response to your question, yes, this is with scsi_dh_alua.
> >
> > Please also note the kernel panic failures are new to us in RHEL5.3 RC2.
> > Earlier testing (such as against beta and snapshot2) did not show the kernel
> > panic problems.
>
> There were several bugs in scsi_dh_alua so we ended up dropping it from the
> final release. In some email or one of the other BZs about this or the call, we
> told you that we were shooting for RHEL 5.4 with this. We were only going to
> tech preview this for 5.3, but due to its instability we could not do that.

Oh yeah, one clarification so people's heads do not explode :) The only thing we did not ship was our own scsi_dh_alua module. We did ship the underlying scsi_dh infrastructure, so you can load in your own scsi_dh_alua module. For 5.3 you guys had to ship your own scsi_dh_alua module (you had to do this because we never got patches/request/bz/whatever-you-call-it to add your boxes to the default device table that attached the module to the device), and this will hook into our scsi_dh code fine.
Dear Mike, we only used the patch for scsi_dh_alua which shipped with your kernel source rpm. So we need your latest scsi_dh_alua code to make a patch. I checked your RHEL 5.3 GA kernel source rpm. If there is no change in scsi_dh_alua, we can use scsi_dh_alua by disabling Linux-2.6-scsi-remove-scsi_dh_alua.path (line number 3302 (patch23461)) in the kernel.spec file. Can you confirm that we can get the latest scsi_dh_alua code with the above procedure? Thanks.
(In reply to comment #12)
> Dear, Mike.
>
> We only used patch for scsi_dh_alua which shiped with kernel source rpm of
> yours.
>
> So, we need your latest scsi_dh_alua code to make a patch.

I am still working on this. I said Mon or Tues, so give me a couple extra hours :) I was also waiting to hear back from upstream about the patches I sent. I will just send you what I have, assuming it will be accepted.

For some of the other issues of using the new code:

I am still trying to find info on how to load it for the installer. Are you guys familiar with making driver disks? Is this what you guys were doing before?

And for booting from the device, it is crazy. It was one of the issues that prevented it from being a solid solution (it should be fixed for 5.4). To do this now you have to make an initramfs manually. Are you familiar with this operation, or do you need instructions for that too?
Dear Mike, previously we modified the initramfs ourselves, but I would like to get the information from you. It will be more helpful to me. Thanks.
Here is some info about distributing your own driver:
http://dup.et.redhat.com/
You do not have to use it. You can build it however you want.

If you are going to use the disks attached to the box for partitions used during install, you will want to build a driver disk:
http://dup.et.redhat.com/ddiskit/
(see the README and INSTALL in the tarball for instructions). When you boot the OS install disk, you pass it the driver disk argument. I think you do
linux dd
when the command prompt comes up initially (do a help if that does not work). The installer will ask you for a driver disk during the install.

If you have root on a disk accessed through your box, or you are going to do multipath root with paths on the box, then you will have to make an initramfs/initrd for the boot. The problem is that mkinitrd did not get changed to handle this for the scsi_dh modules in 5.3. I would basically take the mkinitrd script from 5.3 and hack in some code to just stick in your scsi_dh module, and then distribute the modified mkinitrd with the other stuff. I will attach an example in the next comment.
Created attachment 330779 [details] hacky patch to always throw in alua Here is a really hacky patch to always do multipath and always throw in scsi_dh_alua. I have no scripting skills. You guys can modify this better than me, but this gives you an idea of what you need to do.
Dear, Mike. Thanks for your information.
Created attachment 330784 [details] clear request before using Could you try this patch with the scsi_dh_alua module from here: http://people.redhat.com/mchristi/scsi_dh/rhel5.4/testing/0001-Add-scsi_dh_alua.patch This patch is just the code that was reverted in that patch you referenced with your fixes integrated.
Oh yeah, could you also send the oops?
Created attachment 331447 [details] new build with scsi_dh_alua's dmesg This is the dmesg for a RHEL 5.3 GA system with scsi_dh_alua. It shows that it fails to configure multipath for the system device. Mike, please check. Thanks.
Hi, Mike. I built a kernel with the new patch and tried to configure multipath. There is a problem configuring the boot device as a multipath device. Can you help me?

These are the build steps:
1. Patch mkinitrd with the patch (always-throw-in-alua.patch) which you gave me.
2. Download the kernel source rpm.
3. Copy the patches (0001-Add-scsi_dh_alua.patch, clear-everything.patch) to the source directory.
4. Change the kernel.spec file to add these patches.
5. Add CONFIG_SCSI_DH_ALUA to the .config file in the source directory.
6. Build with rpmbuild.

On the test machine, I allocated 3 logical drives: one for the system (OS install) and two for data. After configuring multipath.conf, the multipath tools show only the two data logical drives configured as multipath devices.

<multipath -ll result>
multipath -ll
mpath2 (222c6000155b1e629) dm-3 Intel,Multi-Flex
[size=5.0G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=50][active]
 \_ 0:0:0:2 sdc 8:32 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 0:0:1:2 sdf 8:80 [active][ready]
mpath1 (2221a000155fa44d4) dm-2 Intel,Multi-Flex
[size=5.0G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=50][active]
 \_ 0:0:1:1 sde 8:64 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 0:0:0:1 sdb 8:16 [active][ready]

<multipath -v4 result>
multipath -v4 dm-0: blacklisted dm-1: blacklisted dm-2: blacklisted dm-3: blacklisted md0: blacklisted ram0: blacklisted ram10: blacklisted ram11: blacklisted ram12: blacklisted ram13: blacklisted ram14: blacklisted ram15: blacklisted ram1: blacklisted ram2: blacklisted ram3: blacklisted ram4: blacklisted ram5: blacklisted ram6: blacklisted ram7: blacklisted ram8: blacklisted ram9: blacklisted sda: not found in pathvec sda: mask = 0x1f sda: bus = 1 sda: dev_t = 8:0 sda: size = 283115520 sda: vendor = Intel sda: product = Multi-Flex sda: rev = 0302 sda: h:b:t:l = 0:0:0:0 sda: serial = 4C202020000000000000000025E811AA84D6FFEB sda: path checker = tur (controller setting) sda: state = 2 sda: getprio =
/sbin/mpath_prio_intel /dev/%n (controller setting) sda: prio = 50 sda: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sda: uid = 2222b0001552b7635 (callout) sdb: not found in pathvec sdb: mask = 0x1f sdb: bus = 1 sdb: dev_t = 8:16 sdb: size = 10485760 sdb: vendor = Intel sdb: product = Multi-Flex sdb: rev = 0302 sdb: h:b:t:l = 0:0:0:1 sdb: serial = 4C202020000000000000000073564A1C82E53AA8 sdb: path checker = tur (controller setting) sdb: state = 2 sdb: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sdb: prio = 1 sdb: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sdb: uid = 2221a000155fa44d4 (callout) sdc: not found in pathvec sdc: mask = 0x1f sdc: bus = 1 sdc: dev_t = 8:32 sdc: size = 10485760 sdc: vendor = Intel sdc: product = Multi-Flex sdc: rev = 0302 sdc: h:b:t:l = 0:0:0:2 sdc: serial = 4C2020200000000000000000537C97820BE19F77 sdc: path checker = tur (controller setting) sdc: state = 2 sdc: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sdc: prio = 50 sdc: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sdc: uid = 222c6000155b1e629 (callout) sdd: not found in pathvec sdd: mask = 0x1f sdd: bus = 1 sdd: dev_t = 8:48 sdd: size = 283115520 sdd: vendor = Intel sdd: product = Multi-Flex sdd: rev = 0302 sdd: h:b:t:l = 0:0:1:0 sdd: serial = 4C202020000000000000000025E811AA84D6FFEB sdd: path checker = tur (controller setting) sdd: state = 2 sdd: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sdd: prio = 1 sdd: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sdd: uid = 2222b0001552b7635 (callout) sde: not found in pathvec sde: mask = 0x1f sde: bus = 1 sde: dev_t = 8:64 sde: size = 10485760 sde: vendor = Intel sde: product = Multi-Flex sde: rev = 0302 sde: h:b:t:l = 0:0:1:1 sde: serial = 4C202020000000000000000073564A1C82E53AA8 sde: path checker = tur (controller setting) sde: state = 2 sde: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sde: prio = 
50 sde: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sde: uid = 2221a000155fa44d4 (callout) sdf: not found in pathvec sdf: mask = 0x1f sdf: bus = 1 sdf: dev_t = 8:80 sdf: size = 10485760 sdf: vendor = Intel sdf: product = Multi-Flex sdf: rev = 0302 sdf: h:b:t:l = 0:0:1:2 sdf: serial = 4C2020200000000000000000537C97820BE19F77 sdf: path checker = tur (controller setting) sdf: state = 2 sdf: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sdf: prio = 1 sdf: getuid = /sbin/scsi_id -g -u -s /block/%n (controller setting) sdf: uid = 222c6000155b1e629 (callout) sr0: blacklisted ===== paths list ===== uuid hcil dev dev_t pri dm_st chk_st vend/prod/rev 2222b0001552b7635 0:0:0:0 sda 8:0 50 [undef][ready] Intel,Multi-Flex 2221a000155fa44d4 0:0:0:1 sdb 8:16 1 [undef][ready] Intel,Multi-Flex 222c6000155b1e629 0:0:0:2 sdc 8:32 50 [undef][ready] Intel,Multi-Flex 2222b0001552b7635 0:0:1:0 sdd 8:48 1 [undef][ready] Intel,Multi-Flex 2221a000155fa44d4 0:0:1:1 sde 8:64 50 [undef][ready] Intel,Multi-Flex 222c6000155b1e629 0:0:1:2 sdf 8:80 1 [undef][ready] Intel,Multi-Flex params = 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:32 100 round-robin 0 1 1 8:80 100 status = 2 0 1 0 2 1 A 0 1 0 8:32 A 0 E 0 1 0 8:80 A 0 *word = 1, len = 1 *word = queue_if_no_path, len = 16 *word = 1, len = 1 *word = alua, len = 4 *word = 2, len = 1 *word = 1, len = 1 *word = round-robin, len = 11 *word = 0, len = 1 *word = 1, len = 1 *word = 1, len = 1 *word = 8:32, len = 4 *word = 100, len = 3 *word = 1, len = 1 *word = 1, len = 1 *word = 8:80, len = 4 *word = 100, len = 3 *word = 2, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = 2, len = 1 *word = A, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = A, len = 1 *word = 0, len = 1 *word = E, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = A, len = 1 *word = 0, len = 1 params = 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:64 100 round-robin 0 1 1 8:16 100 status = 2 0 1 0 2 1 A 0 1 0 8:64 A 0 E 0 1 0 
8:16 A 0 *word = 1, len = 1 *word = queue_if_no_path, len = 16 *word = 1, len = 1 *word = alua, len = 4 *word = 2, len = 1 *word = 1, len = 1 *word = round-robin, len = 11 *word = 0, len = 1 *word = 1, len = 1 *word = 1, len = 1 *word = 8:64, len = 4 *word = 100, len = 3 *word = 1, len = 1 *word = 1, len = 1 *word = 8:16, len = 4 *word = 100, len = 3 *word = 2, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = 2, len = 1 *word = A, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = A, len = 1 *word = 0, len = 1 *word = E, len = 1 *word = 1, len = 1 *word = 0, len = 1 *word = A, len = 1 *word = 0, len = 1 Found matching wwid [2222b0001552b7635] in bindings file. Setting alias to mpath0 sda: ownership set to mpath0 sda: not found in pathvec sda: mask = 0xc sda: state = 2 sda: prio = 50 sdd: ownership set to mpath0 sdd: not found in pathvec sdd: mask = 0xc sdd: state = 2 sdd: prio = 1 mpath0: pgfailback = -2 (controller setting) mpath0: pgpolicy = group_by_prio (controller setting) mpath0: selector = round-robin 0 (controller setting) mpath0: features = 1 queue_if_no_path (controller setting) mpath0: hwhandler = 1 alua (controller setting) mpath0: rr_weight = 1 (internal default) mpath0: minio = 100 (controller setting) mpath0: no_path_retry = -2 (controller setting) pg_timeout = NONE (internal default) mpath0: set ACT_CREATE (map does not exist) libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument mpath0: domap (0) failure for create/reload map mpath0: remove multipath map sda: orphaned sdd: orphaned Found matching wwid [2221a000155fa44d4] in bindings file. 
Setting alias to mpath1 sdb: ownership set to mpath1 sdb: not found in pathvec sdb: mask = 0xc sdb: state = 2 sdb: prio = 1 sde: ownership set to mpath1 sde: not found in pathvec sde: mask = 0xc sde: state = 2 sde: prio = 50 mpath1: pgfailback = -2 (controller setting) mpath1: pgpolicy = group_by_prio (controller setting) mpath1: selector = round-robin 0 (controller setting) mpath1: features = 1 queue_if_no_path (controller setting) mpath1: hwhandler = 1 alua (controller setting) mpath1: rr_weight = 1 (internal default) mpath1: minio = 100 (controller setting) mpath1: no_path_retry = -2 (controller setting) pg_timeout = NONE (internal default) mpath1: set ACT_NOTHING (map unchanged) Found matching wwid [222c6000155b1e629] in bindings file. Setting alias to mpath2 sdc: ownership set to mpath2 sdc: not found in pathvec sdc: mask = 0xc sdc: state = 2 sdc: prio = 50 sdf: ownership set to mpath2 sdf: not found in pathvec sdf: mask = 0xc sdf: state = 2 sdf: prio = 1 mpath2: pgfailback = -2 (controller setting) mpath2: pgpolicy = group_by_prio (controller setting) mpath2: selector = round-robin 0 (controller setting) mpath2: features = 1 queue_if_no_path (controller setting) mpath2: hwhandler = 1 alua (controller setting) mpath2: rr_weight = 1 (internal default) mpath2: minio = 100 (controller setting) mpath2: no_path_retry = -2 (controller setting) pg_timeout = NONE (internal default) mpath2: set ACT_NOTHING (map unchanged) Found matching wwid [2222b0001552b7635] in bindings file. 
Setting alias to mpath0 sda: ownership set to mpath0 sda: not found in pathvec sda: mask = 0xc sda: path checker = tur (controller setting) sda: state = 2 sda: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sda: prio = 50 sdd: ownership set to mpath0 sdd: not found in pathvec sdd: mask = 0xc sdd: path checker = tur (controller setting) sdd: state = 2 sdd: getprio = /sbin/mpath_prio_intel /dev/%n (controller setting) sdd: prio = 1 mpath0: pgfailback = -2 (controller setting) mpath0: pgpolicy = group_by_prio (controller setting) mpath0: selector = round-robin 0 (controller setting) mpath0: features = 1 queue_if_no_path (controller setting) mpath0: hwhandler = 1 alua (controller setting) mpath0: rr_weight = 1 (internal default) mpath0: minio = 100 (controller setting) mpath0: no_path_retry = -2 (controller setting) pg_timeout = NONE (internal default) mpath0: set ACT_CREATE (map does not exist) libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument mpath0: domap (0) failure for create/reload map mpath0: remove multipath map sda: orphaned sdd: orphaned <dmsetup table result> dmsetup table mpath2: 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:32 100 round-robin 0 1 1 8:80 100 mpath1: 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:64 100 round-robin 0 1 1 8:16 100 VolGroup00-LogVol01: 0 8257536 linear 8:2 274596224 VolGroup00-LogVol00: 0 274595840 linear 8:2 384 <dmsetup -v table result> Name: mpath2 State: ACTIVE Read Ahead: 256 Tables present: LIVE Open count: 0 Event number: 1 Major, minor: 253, 3 Number of targets: 1 UUID: mpath-222c6000155b1e629 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:32 100 round-robin 0 1 1 8:80 100 Name: mpath1 State: ACTIVE Read Ahead: 256 Tables present: LIVE Open count: 0 Event number: 1 Major, minor: 253, 2 Number of 
targets: 1 UUID: mpath-2221a000155fa44d4 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:64 100 round-robin 0 1 1 8:16 100 Name: VolGroup00-LogVol01 State: ACTIVE Read Ahead: 256 Tables present: LIVE Open count: 1 Event number: 0 Major, minor: 253, 1 Number of targets: 1 UUID: LVM-Z0fLniuSVFhdqMPj243e85zZw585dT3959TFjr7VnGjLfVoISYPRH5b117Kyy7kf 0 8257536 linear 8:2 274596224 Name: VolGroup00-LogVol00 State: ACTIVE Read Ahead: 256 Tables present: LIVE Open count: 1 Event number: 0 Major, minor: 253, 0 Number of targets: 1 UUID: LVM-Z0fLniuSVFhdqMPj243e85zZw585dT399OY2xjUgaioyvlCTG92NampWSZcRimaK 0 274595840 linear 8:2 384

I also attached the dmesg log earlier; in it you can see that the system logical drive failed to be configured for multipath. Thanks.
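To make the "paths list" section of the multipath -v4 output above easier to read, the following sketch groups the six paths by WWID and orders each group by priority. The rows are copied from that output; the parsing itself is only an illustration that assumes whitespace-separated columns in the order shown in the header (uuid, hcil, dev, dev_t, pri, ...), and `group_paths` is a hypothetical helper name.

```python
from collections import defaultdict

# Rows copied from the "===== paths list =====" section of the -v4 output.
PATHS_LIST = """\
2222b0001552b7635 0:0:0:0 sda 8:0 50 [undef][ready] Intel,Multi-Flex
2221a000155fa44d4 0:0:0:1 sdb 8:16 1 [undef][ready] Intel,Multi-Flex
222c6000155b1e629 0:0:0:2 sdc 8:32 50 [undef][ready] Intel,Multi-Flex
2222b0001552b7635 0:0:1:0 sdd 8:48 1 [undef][ready] Intel,Multi-Flex
2221a000155fa44d4 0:0:1:1 sde 8:64 50 [undef][ready] Intel,Multi-Flex
222c6000155b1e629 0:0:1:2 sdf 8:80 1 [undef][ready] Intel,Multi-Flex
"""


def group_paths(paths_list: str) -> dict:
    """Map each WWID to its (device, priority) pairs, highest priority first."""
    groups = defaultdict(list)
    for line in paths_list.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed rows
        uuid, _hcil, dev, _dev_t, pri = fields[:5]
        groups[uuid].append((dev, int(pri)))
    return {u: sorted(p, key=lambda x: -x[1]) for u, p in groups.items()}


groups = group_paths(PATHS_LIST)

if __name__ == "__main__":
    for uuid, paths in groups.items():
        print(uuid, paths)
```

Each of the three logical drives has one prio=50 path and one prio=1 path, including 2222b0001552b7635 (sda/sdd), the system drive for which the mpath0 map could not be created.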
Is mpath0 the problem here? I saw this:

mpath0: set ACT_CREATE (map does not exist)
libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument
libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument
mpath0: domap (0) failure for create/reload map
mpath0: remove multipath map

I am not familiar with multipath tools, so I am not sure if that is the cause. Where was this log taken? Was it for a normal boot up? Was it during the initramfs part of the boot up? Was it from anaconda startup? Did the simple case work, where you boot from a local drive, then start up multipath when the system is booted up?
Hi, Mike. The message below is from the output of "multipath -v4".

>>
mpath0: set ACT_CREATE (map does not exist)
libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument
libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed: Invalid argument
mpath0: domap (0) failure for create/reload map
mpath0: remove multipath map

We did the following steps:
1. Built a new kernel which includes the scsi_dh_alua module (please refer to comment #21 above).
2. Booted from the new image.
3. Configured multipath.conf (please refer to comment #3 above).
4. Installed mpath_prio_intel, which is attached above.
5. Ran multipath -v4.

I already attached the boot messages in comment #20. We used the new initramfs (created in step 1) when booting.
Note: This BZ corresponds to IT 267309.
Any response from RH to comment #23? No update in the past week.
Hi, Joe. So far we have not received any response from RH. Pradeep is working on the setup for remote access.
(In reply to comment #23)
> Hi, Mike
> Below message is result of "multipath -v4".
> >>
> mpath0: set ACT_CREATE (map does not exist)
> libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed:
> Invalid argument
> libdevmapper: ioctl/libdm-iface.c(1634): device-mapper: reload ioctl failed:
> Invalid argument
> mpath0: domap (0) failure for create/reload map
> mpath0: remove multipath map

I was asking if you had traced this to being the reason for the device not being added. Ben, do you know? Maybe promise/intel guys, you could dig a little deeper and give the multipath-tools guys more info to help them out.
You need the corresponding log from dmesg. That should show why the mpath constructor is returning -EINVAL.
I think it is garbled up in one of the logs attached then, but there is this:

device-mapper: multipath: Using scsi_dh module scsi_dh_alua for failover/failback and device management.
device-mapper: table: 253:4: multipath: error getting device
device-mapper: ioctl: error adding target to table

I do not think this is useful enough, right? Some debug printing is needed from intel/promise?
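One thing that can be checked from output already posted in this report is whether another device-mapper map is already sitting on one of the mpath0 paths. The sketch below scans the dmsetup table output from the earlier comment for linear targets on partitions of sda or sdd. It relies on the standard sd numbering (block major 8, 16 minors per disk). Treating this as an explanation for "error getting device" is only a hypothesis for illustration; nothing in this thread confirms the cause, and `linear_users` is a made-up helper name.

```python
# sd disks use block major 8 with 16 minors per disk (whole disk + partitions).
SD_MAJOR = 8
MINORS_PER_DISK = 16

# Copied from the `dmsetup table` output posted earlier in this report.
DMSETUP_TABLE = """\
mpath2: 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:32 100 round-robin 0 1 1 8:80 100
mpath1: 0 10485760 multipath 1 queue_if_no_path 1 alua 2 1 round-robin 0 1 1 8:64 100 round-robin 0 1 1 8:16 100
VolGroup00-LogVol01: 0 8257536 linear 8:2 274596224
VolGroup00-LogVol00: 0 274595840 linear 8:2 384
"""

# The two paths that multipath tried to place into mpath0.
MPATH0_PATHS = {(8, 0): "sda", (8, 48): "sdd"}


def whole_disk(major: int, minor: int) -> tuple:
    """Return the (major, minor) of the whole disk a partition belongs to."""
    return (major, minor - minor % MINORS_PER_DISK)


def linear_users(table: str, disks: dict) -> dict:
    """Map disk name -> sorted names of dm maps with a linear target on it."""
    users = {}
    for line in table.splitlines():
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        fields = rest.split()
        # Table line layout: <start> <length> <target type> <target args...>
        if len(fields) < 4 or fields[2] != "linear":
            continue
        major, minor = (int(x) for x in fields[3].split(":"))
        if major != SD_MAJOR:
            continue
        disk = disks.get(whole_disk(major, minor))
        if disk is not None:
            users.setdefault(disk, []).append(name)
    return {disk: sorted(names) for disk, names in users.items()}


users = linear_users(DMSETUP_TABLE, MPATH0_PATHS)

if __name__ == "__main__":
    print(users)
```

On the posted table, both VolGroup00 logical volumes map directly onto 8:2, which is a partition of sda (8:0), so the root volume group is using sda without going through a multipath map.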
Oh yeah, promise/intel guys, is this the setup where you are trying to do root on the dm device using the scsi dh alua handler? Are these logs from an install startup, or are these logs from a normal boot up? Did you install to an ALUA device, and are now trying to boot from a dm device using alua?
Oh yeah, promise/intel guys, you should modify the initramfs so that it loads the scsi modules like sd_mod and scsi_mod, then loads scsi_dh_alua, so it is there when the scsi devices are getting scanned and set up. It looked like it was getting loaded when dm multipath starts up. I do not think it will fix this. It should just remove some of the error messages during startup.
Sorry, didn't realise we already had the matching logs. No, there's not much to go on in that set of output. It'd be useful to see the table that multipath is trying to load but iirc we only log the table as a string if the reload succeeds..
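For reference, the table strings for the maps that did load (mpath1 and mpath2 in the dmsetup output earlier in this report) can be decoded as in the sketch below. The field order follows the dm-multipath table layout: feature count and features, hardware handler count and arguments, number of path groups and initial group, then per group a selector, its argument count, the path count, the per-path argument count, and the paths. The parser is illustrative only and `parse_multipath_params` is a hypothetical name; the table for mpath0 itself is not available in this thread.

```python
def parse_multipath_params(params: str) -> dict:
    """Decode the parameter part of a dm-multipath table line."""
    words = iter(params.split())

    def take(count: int) -> list:
        # Consume the next `count` words from the parameter string.
        return [next(words) for _ in range(count)]

    features = take(int(next(words)))
    hw_handler = take(int(next(words)))
    num_groups = int(next(words))
    initial_group = int(next(words))

    groups = []
    for _ in range(num_groups):
        selector = next(words)
        selector_args = take(int(next(words)))
        num_paths = int(next(words))
        num_path_args = int(next(words))
        paths = []
        for _ in range(num_paths):
            dev = next(words)
            paths.append({"dev": dev, "args": take(num_path_args)})
        groups.append({"selector": selector,
                       "selector_args": selector_args,
                       "paths": paths})

    return {"features": features, "hw_handler": hw_handler,
            "initial_group": initial_group, "groups": groups}


# Parameter string of mpath2, copied from the dmsetup table output.
MPATH2 = ("1 queue_if_no_path 1 alua 2 1 "
          "round-robin 0 1 1 8:32 100 round-robin 0 1 1 8:80 100")
parsed = parse_multipath_params(MPATH2)

if __name__ == "__main__":
    print(parsed)
```

For mpath2 this yields two path groups with one path each (8:32 and 8:80), each with a single path argument of 100, which matches rr_min_io 100 in the multipath.conf posted above.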
Sorry for the late response. At the Promise site, the system setup is complete, but the IT engineer needs more time to set up the network for remote access. We asked, but we cannot estimate the time. I hope it will be done in the next two days. Once it is finished, I will let you know. Thanks.
I am going to mark that last comment private because anyone can view this bz currently.
In the latest testing we are able to set up multipath and run failover/failback testing, and the only problem we experience is bug 455678. We will do more testing to see if we can mark this bug as resolved.
(In reply to comment #36)
> In latest testing we are able setup multipath and run failover/failback testing
> and the only problem we experience is bug 455678. We will do more testing to
> see if we can mark this bug as resolved.

What is the status of your testing?
We still see the failure described in bug 455678 but this bug (kernel panic) has not been observed in quite some time. We can close this bug.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-1377.html