Red Hat Bugzilla – Bug 624985
multipathd fails to run "getuid_callout" if that is a script
Last modified: 2012-01-18 13:13:20 EST
multipathd fails to run the "getuid_callout" when it is a script. For example, we have defined:
getuid_callout "/opt/ddn/bin/ddn_multipath_alias %n"
# Use the following directive to utilize DDN S2A getuid callout script
# which uses a predefined lookup table to generate uid name based
# on S2A LUN name.
# getuid_callout "/sbin/scsi_id -g -p 0x80 -s /block/%n"
prio_callout "/sbin/mpath_prio_alua %d"
# Review DDN Firmware Release notes and match polling_interval to
# host timeout value
/opt/ddn/bin/ddn_multipath_alias is a shell script that returns the serial number or an alias of the device. However, multipathd fails to run it; in syslog we get messages like this:
Aug 18 11:57:05 localhost multipathd: cannot get the the wwid for sdx
Aug 18 11:57:05 localhost multipathd: /opt/ddn/bin/ddn_multipath_alias exitted with 255
Now, if I run it manually, or if it is run by "multipath" (note the missing 'd'), it works fine:
root@rhel5-nfs@phys-mds0:~# /opt/ddn/bin/ddn_multipath_alias sdx
I have straced it and looked through the code, and noticed that multipathd goes into a chroot. It seems multipathd bind mounts all the binaries and libs it needs into that chroot. But that does not work for shell scripts. As our alias script got rather complex, we might switch to Python, but I don't expect multipathd to handle that any better.
While it is certainly good for a daemon to go into a chroot, it does not really seem to be suitable for multipathd to do that.
Multipathd has its own namespace on a ramfs, and it copies any callouts into that ramfs so that it can execute them even if it loses access to the underlying device. However, it isn't smart enough to copy the script interpreter for scripts. To make it do that, you need a dummy device configuration like this in the devices section of /etc/multipath.conf
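A minimal sketch of such a dummy entry, assuming the RHEL 5 multipath.conf syntax used elsewhere in this report. The vendor and product strings are made-up placeholders, and naming the interpreter as the getuid_callout is an assumption about how multipathd is prompted to copy it:

```
devices {
        device {
                # Placeholder strings (assumed); must not match any real device.
                vendor          "dummy"
                product         "dummy"
                # Names the interpreter so that multipathd copies it into its ramfs.
                getuid_callout  "/bin/bash"
        }
}
```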
You can replace bash with whatever script interpreter you need for multipathd to copy into its ramfs. Does this fix the problem for you?
The vendor and product fields don't matter, as long as they are something that no real device is going to use.
Thanks for your help, Ben. So all this trouble is only for the rare case that the root filesystem is located on a multipath device itself? Shouldn't that be an option, then? The only use case I can see for the root filesystem on a multipath device would still work better by using DRBD...
Unfortunately, it did not help to simply add /bin/bash. I also tried to add several other binaries called by our script, but it still failed. I didn't try to add '[' and ']', though.
So we have the choice to
a) Rewrite multipathd. I have looked into the sources and it won't be a fast solution.
b) Rewrite our script in C or C++. That then leaves the problem of other callouts made by that program. For example, our script itself calls scsi_id again.
c) Use the bindings file, but see bug 624987.
We just ran into the same problem. It turns out that multipathd doesn't chroot; it just has its own namespace for /sbin, /bin, and /tmp. One of our scripts is written in Perl and works just fine, since the interpreter is /usr/bin/perl. The other is in bash, and execve fails with ENOENT because /bin/bash is missing from multipathd's private namespace.
The ticket is a little old and I imagine that you've already addressed this, but I suspect your idea about rewriting in Python will work just fine. Assuming, of course, that it doesn't reference any files, executable or otherwise, in /sbin, /bin, and /tmp that multipathd doesn't pull in by default. Luckily, /sbin/scsi_id is one of those that it does pull in.
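The namespace issue described above can be checked before deploying a callout script. The following is only a sketch for a POSIX shell; shebang_interpreter and in_private_namespace are hypothetical helper names, not part of multipath-tools, and the directory list is taken from the description above rather than from the multipathd sources:

```shell
#!/bin/sh
# Print the interpreter path named on a script's "#!" line, if there is one.
shebang_interpreter() {
    # Look at the first line only; drop "#!", optional spaces, and any arguments.
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Succeed if the given path lies under one of the directories that,
# according to the comment above, multipathd replaces with a private copy.
in_private_namespace() {
    case "$1" in
        /sbin/*|/bin/*|/tmp/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `in_private_namespace "$(shebang_interpreter /opt/ddn/bin/ddn_multipath_alias)"` succeeding would suggest that the interpreter may not be visible to multipathd. The same check would have to be repeated for every binary the script itself calls.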
> a) Rewrite multipathd. I have looked into the sources and won't be a fast
If you're really considering that, why not just switch to the upstream multipathd? The offending code has disappeared.
This BZ has been around for a while with no updates. Closing NOTABUG. If this is still an issue please open a case with Red Hat Support via the Customer Portal.