Description of problem: We've got a rig of four machines, connected via dual cables to a SAN through a QLogic Corp. QLA2312 adapter. We recently disconnected one of the switches (thus failing one of the paths) to move it physically in preparation of moving the servers themselves. After failing the path, one server has gotten some problems with its multipath (the three others simply marked it as failed and continued as normal). The problematic server continues to use the SAN devices as it should, so the service is not affected (we didn't fail the active path, after all), but after about one day "multipath -l" no longer works (it just hangs). Version-Release number of selected component (if applicable): device-mapper-multipath-0.4.5-12.0.RHEL4 How reproducible: Not very reproducible. This happened only to one of the four machines, using the same hardware, connected to the same SAN, with the same fibre channel adapter, drivers, and kernel. Steps to Reproduce: 1.Set up dual connections to a SAN, and set up multipath to handle them. 2.Disconnect one of the paths (the non-active one in this case) 3.Watch "multipath -l" hang after 1-2 days. Actual results: multipath -l hangs; Not sure if multipathd will still be able to handle a reconnect of the old path or if it is hanging as well. Expected results: The failed path being marked as failed, and "multipath -l" continuing to work. Additional info: We failed the path on 2007-01-24. We killed and restarted multipathd on 2007-01-29 in a (failed) attempt to make the problem go away. The following is cut from a ps list with processes that might or might not be relevant: root 8565 0.0 0.0 2380 200 ? S Jan26 0:00 /sbin/mpath_prio_emc /dev/sde root 9275 0.0 0.0 5896 684 ? S Jan26 0:00 /sbin/multipath -l root 9292 0.0 0.0 5896 684 ? S Jan26 0:00 /sbin/multipath -l root 9315 0.0 0.0 5896 684 ? S Jan26 0:00 /sbin/multipath -l root 10077 0.0 0.0 5896 684 ? S Jan26 0:00 /sbin/multipath -l root 19746 0.0 0.0 5900 688 ? S Jan26 0:00 multipath -v 3 -l root 17640 0.0 0.0 9744 2596 ? SLl Jan29 0:00 /sbin/multipathd root 17647 0.0 0.0 3644 544 ? S Jan29 0:00 \_ /sbin/scsi_id -g -u -s /block/sdd All the "multipath -l" from Jan26" was made by our monitoring system before we spotted the problem and disabled the monitoring. The monitoring system polls multipath every 5 minutes, and managed to work (i.e. not get hanging processes) for more than a day before the problems started. From /var/log/messages when we failed the path: Jan 24 13:24:00 pund kernel: end_request: I/O error, dev sdb, sector 20324440 Jan 24 13:24:01 pund kernel: device-mapper: dm-multipath: Failing path 8:16. Jan 24 13:24:01 pund kernel: SCSI error : <1 0 0 0> return code = 0x20000 Jan 24 13:24:01 pund kernel: end_request: I/O error, dev sdb, sector 78476736 Jan 24 13:24:01 pund multipathd: 8:16: mark as failed Jan 24 13:24:01 pund multipathd: nfs: remaining active paths: 0 Jan 24 13:24:05 pund multipathd: 8:16: emc_clariion_checker: Path healthy Jan 24 13:24:05 pund multipathd: 8:16: reinstated Jan 24 13:24:05 pund multipathd: nfs: remaining active paths: 1 Jan 24 13:24:05 pund kernel: device-mapper: dm-emc: emc_pg_init: sending switch-over command Jan 24 13:24:05 pund kernel: SCSI error : <2 0 0 0> return code = 0x10000 Jan 24 13:24:05 pund multipathd: nfs: switch to path group #2 Jan 24 13:24:05 pund multipathd: nfs: switch to path group #2 Jan 24 13:24:05 pund kernel: SCSI error : <2 0 0 0> return code = 0x10000 Jan 24 13:24:05 pund kernel: SCSI error : <2 0 0 0> return code = 0x10000 Jan 24 13:24:05 pund kernel: device-mapper: dm-emc: emc_pg_init: sending switch-over command Jan 24 13:24:05 pund multipathd: nfs: switch to path group #2 Jan 24 13:24:05 pund kernel: SCSI error : <2 0 0 1> return code = 0x10000 Jan 24 13:24:05 pund kernel: SCSI error : <2 0 0 1> return code = 0x10000 Jan 24 13:24:06 pund kernel: device-mapper: dm-emc: emc_pg_init: sending switch-over command there is no further logging from multipathd after this. Running a strace on multipathd shows that it's hanging on: read(8, <unfinished ...> Running strace on the scsi_id child of multipathd shows that it's hanging on: write(2, "2:0:0:0: sg_io failed status 0x0"..., 45 <unfinished ...> Running strace on any of the multipath -l processes show that they're hanging on: connect(4, {sa_family=AF_FILE, path="/var/run/multipathd.sock"}, 110 <unfinished> running strace on the mpath_prio_emc process shows that it's hanging on: write(2, "mmand indicates error", 21 <unfinished ...> Please let me know if there is any further information that might help in debugging this issue.
...and just as I had fired off a bug report about it, more information pops up. :-) We found the problem. FD 2 on the processes that were hanging in write(), was /dev/console. We connected to the remote management interface (the ILO - this is Sun machines) and opened the remote console, everything stopped hanging. /dev/console seems to be able to "fill up" on Sun X4100 with ILO (integrated lights-out) management interfaces. I don't know whether this is something you want to fix in multipath (i.e. make it not log to /dev/console) or hand the bug off to someone taking care of /dev/console, so I'll leave it to you at RH to decide which component the bug should be assigned to. :)
multipathing still worked right, just the "multipath -l" command didn't? I'm not sure what exactly we could do here. "multipath -l" by default goes to the console, display of output being the purpose. The blocking could be viewed as a feature, since it prompted you to find the problem right? What would be our alternative? Stop writing output if it's blocked? Write to some file? Neither option seems reasonable. Closing as wont_fix. Fee free to re-open if you disagree or I've misinterpreted the problem.
"multipath -l" output does _not_ go to the console by default, it goes to stdout. The reason multipath starts hanging is because multipathd is hanging, read on... That /dev/console might block could be considered a feature - I'm guessing flow control on the serial port, but if I understand Documentation/kernel-parameters.txt correctly the kernel should default to _not_ using RTS but maybe that's incorrect or that Red Hat patches change it. Doesn't matter. If it's a feature, multipathd needs to be able to cope with it, but right now it doesn't. That's the bug. The thing is that when /dev/console starts blocking, multipathd also start blocking, because in daemon mode it sends all output to /dev/console. From multipathd/main.c: out_fd = open("/dev/console", O_WRONLY); if (out_fd < 0){ fprintf(stderr, "cannot open /dev/console for output : %s\n", strerror(errno)); _exit(0); } [...] close(STDOUT_FILENO); dup(out_fd); close(STDERR_FILENO); dup(out_fd); [...] close(out_fd); So if /dev/console blocks you're in big trouble if you're writing to it without O_NONBLOCK. multipathd itself might not do this, but its helper programs inherits its file descriptors, and when there's an error at least scsi_id and mpath_prio_emc wants to print stuff as shown by the strace. Those calls never return, and multipathd gets stuck waiting for them. Which means that it won't be checking and reinstating paths either, which is really bad as this is the exact moment where that functionality is needed. That multipath also ends up hanging is because it attempts to contact multipathd through /var/run/multipathd.sock. This takes forever because multipathd is hanging and never gets around to accept()ing multipath's connect(). Not really critical, but that's the way we noticed it because we run multipath periodically from our monitoring system to look for failed paths. The fix is to stop multipathd from dup()ing stdout and stderr to /dev/console (/dev/null should be used instead) - output from its child processes should just be discarded (or even better, captured and logged). I'd like to reopen the bug but Bugzilla won't let me. And Jimmy's on sick leave right now. Can you reopen it, Dave? Regards Tore
Messages should be written via syslog so the sysadmin can configure where they go with syslog.conf. /dev/console should not be used directly!
"Those calls never return, and multipathd gets stuck waiting for them." That's also a bug - it should time out.
Hi Alasdair, long time no see. :-) I'm afraid I don't really remember exactly what FD 8 was in the "read(8, <unfinished ...>" strace, but I _think_ it was a pipe to a child process. If I read callout.c correctly it grabs stdout from the child process here: while (1) { count = read(fds[0], value + i, len - i-1); if (count <= 0) break; i += count; if (i >= len-1) { retval = -1; break; } } ...and I see no alarms set up around it either so I guess that's where it got stuck. The child process got stuck on writing to stderr a.k.a. /dev/console because no pipe is set up for stderr and thus the /dev/console fd was inherited. At least so I think.. My C skills are not only rusty, they're almost gone. :-( Another thing though, since I see it's you. :-) I see that you've got a dm-mpath-rdac.patch in your kernel.org people subdir. I'm drooling... Been trying to set up several clusters against Engenio/LSI gear and it's hopeless because only AVT mode is working. Every time one cluster members boots or reloads the HBA module, causing the kernel to scan for partition tables, or mount goes looking for filesystem labels, or LVM go looking for PVs, and so on, volumes bounce around and paths are failing like crazy on all the other cluster members. It's just pain, pain, pain. Is there a chance of such a RDAC hardware handler making its way into the RHEL and Linus' kernel any time soon? Regards Tore
Just tried to get multipathing up and running on a StorageTek 6140 and it breaks the same way. 6140 LSI-based indeed. Multipathd cannot call out to the path, the diskarray fails the controller and all sorts of disasters as soon as something tries to access the array. Googling comes up with device-mapper and kernel-patches, I guess we will be unsupported as far as RedHat is concerned - if we decide to go that route. We are running RHEL 4.5 x86_64 and this is a major showstopper right now. Regards Magnus
Any hope of this bug being fixed? It is quite nasty as it prevents multipathd from working correctly when /dev/console blocks, which seems to be the case with SunFire X4x00 machines console outputted to the serial port of the ILO card. We've had problems with this bug causing I/O interruptions several times now. Magnus - I suspect you're having other issues. This bug is not specific to the 6140, it will hit no matter what array you have. There's a thread about the 6140 at the dm-devel list you might want to take a look at. https://www.redhat.com/archives/dm-devel/2007-September/msg00064.html Tore
> Any hope of this bug being fixed? Not fixed as of RHEL 5.2.
Is anyone still hitting this issue on RHEL4. There's not much chance of getting this fixed there. Fixing it on RHEL5 and RHEL6 are definitely doable, since multipathd is still opening dev/console on both of them. Looking through the current code, it doesn't look like there are many multipathd messages that don't go through multipath's condlog interface, which should handle printing correctly, and most of the ones that there are obviously happen during the initial startup, before multipathd is completely daemonized. But a couple look wrong, and a review of multipthd's output is probably overdue. Does anyone have an objection if I change this from a RHEL4 bug to a RHEL5 bug? If people are still running into this problem on RHEL4, I'll clone it, and leave this bug open for RHEL4.
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please See https://access.redhat.com/support/policy/updates/errata/ If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.