Bug 156280

Summary: multipath-tools tests active paths but never uses the status to fail them
Product: Fedora
Component: device-mapper-multipath
Reporter: Lars Marowsky-Bree <lmb>
Assignee: Alasdair Kergon <agk>
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Version: rawhide
CC: agk, christophe.varoqui, dmo, egoggin, lmb, tranlan
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-09-04 23:05:59 UTC

Description Lars Marowsky-Bree 2005-04-28 16:27:23 UTC
multipath-tools does check all paths, but when it finds that a previously
active path has failed, it never tells the kernel about it.

(Reported by Edward.)
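
For context, "telling the kernel" here means sending a "fail_path" message
to the device-mapper multipath target, which is roughly what multipath-tools'
dm_fail_path() helper wraps. A minimal sketch using libdevmapper (link with
-ldevmapper; the map and path names are illustrative, not the actual
multipath-tools source):

/* A minimal sketch, assuming libdevmapper is available: failing a path
 * amounts to a "fail_path" target message, roughly what multipath-tools'
 * dm_fail_path() helper wraps. Map and path names are illustrative. */
#include <stdio.h>
#include <libdevmapper.h>

static int fail_path_in_map(const char *mapname, const char *dev)
{
	struct dm_task *dmt;
	char msg[64];
	int ret = 0;

	snprintf(msg, sizeof(msg), "fail_path %s", dev);

	if (!(dmt = dm_task_create(DM_DEVICE_TARGET_MSG)))
		return 0;
	if (dm_task_set_name(dmt, mapname) &&
	    dm_task_set_sector(dmt, 0) &&
	    dm_task_set_message(dmt, msg))
		ret = dm_task_run(dmt);
	dm_task_destroy(dmt);
	return ret;
}

int main(void)
{
	/* fail path 8:16 (e.g. sdb) in the given map */
	return !fail_path_in_map("1IBM_2105_739FCA30", "8:16");
}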

Comment 1 Christophe Varoqui 2005-05-02 21:08:57 UTC
Candidate fix in 0.4.5-pre2
Please confirm the new behaviour is what is expected.

Comment 2 Lan Tran 2005-05-02 22:33:49 UTC
Hm, I just tried 0.4.5-pre2, but it doesn't appear fixed...

After doing a switch port disable, the multipathd path checker detects that
the 2 paths are down, but the internal dm path state is still 'active'. I
would expect it to go to 'failed'.

1IBM     2105            739FCA30
[size=953 MB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 1:0:0:1 sdbn 68:16   [ready ][active]
  \_ 1:0:1:1 sdbv 68:144  [ready ][active]
  \_ 0:0:0:1 sdb  8:16    [faulty][active]
  \_ 0:0:1:1 sdj  8:144   [faulty][active]
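
For reference, once the checker result is propagated to the kernel, the
expected listing for those two paths would be (as later confirmed in
comment 8):

  \_ 0:0:0:1 sdb  8:16    [faulty][failed]
  \_ 0:0:1:1 sdj  8:144   [faulty][failed]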

Comment 3 Christophe Varoqui 2005-05-02 22:46:58 UTC
The framework is in place; it certainly needs debugging now:

multipathd/main.c:checkerloop() calls fail_path(), which calls
dm_fail_path(), when a path goes down.

I just verified the log received the "checker failed path %s in map %s" message
when removing a path through sysfs.
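
A self-contained sketch of that flow (simplified stand-in types and stub
functions, not the actual multipathd source):

/* Simplified sketch of the checkerloop()/fail_path() flow described
 * above; types and the checker are stand-ins, not the real multipathd
 * code. The point: a PATH_UP -> PATH_DOWN transition must end in
 * dm_fail_path() so the kernel's path state follows the checker's. */
#include <stdio.h>
#include <unistd.h>

enum path_state { PATH_UP, PATH_DOWN };

struct path {
	const char *dev;        /* e.g. "sdb" */
	const char *map;        /* owning multipath map */
	enum path_state state;  /* last state seen by the checker */
};

static enum path_state run_checker(struct path *pp)
{
	(void)pp;               /* a real checker would probe the device */
	return PATH_DOWN;       /* simulate a failed path */
}

/* stand-in for libmultipath's dm_fail_path(); logs like comment 3 */
static void dm_fail_path(struct path *pp)
{
	printf("checker failed path %s in map %s\n", pp->dev, pp->map);
}

static void checkerloop(struct path *paths, int npaths)
{
	for (int i = 0; i < npaths; i++) {
		enum path_state new_state = run_checker(&paths[i]);

		if (new_state == PATH_DOWN && paths[i].state == PATH_UP)
			dm_fail_path(&paths[i]); /* push failure to kernel */

		paths[i].state = new_state;
	}
}

int main(void)
{
	struct path p = { "sdb", "1IBM_2105_739FCA30", PATH_UP };

	for (;;) {              /* daemon-style loop */
		checkerloop(&p, 1);
		sleep(5);       /* checker interval */
	}
}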

If you don't beat me to it, I'll see what I can do tomorrow.

Comment 4 Lan Tran 2005-05-03 13:08:40 UTC
Hi Christophe, 

It turns out that dm_fail_path() is never called in my setup because the
!pp->mpp check always trips (the paths are never associated with a map).
I'm not sure why you removed the initial multipath reconfiguration from
multipathd (see the patch below), because without it the multipath maps are
never created. And since you removed the signal handling, moving to uevents
I believe, I'm not quite sure how the multipath daemon's allpaths gets
updated when multipath is run. It looks like a uevent is triggered only
when removing/adding the underlying sd devices?
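
A minimal illustration of that failure mode (a hypothetical reduction, not
the actual source): if no maps were ever created, pp->mpp stays NULL for
every path and the checker result is simply dropped:

/* Hypothetical reduction of the check described above: with no maps
 * configured, pp->mpp stays NULL, so dm_fail_path() is unreachable. */
#include <stdio.h>
#include <stddef.h>

struct multipath { const char *alias; };
struct path { const char *dev; struct multipath *mpp; };

int main(void)
{
	/* a path whose map pointer was never filled in, as happens when
	 * the initial reconfigure is skipped and no maps exist yet */
	struct path pp = { .dev = "sdb", .mpp = NULL };

	if (!pp.mpp) {
		printf("%s: not linked to any map, dropping checker result\n",
		       pp.dev);
		return 0;   /* dm_fail_path() is never reached */
	}
	printf("would fail %s in map %s\n", pp.dev, pp.mpp->alias);
	return 0;
}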

I also get this odd behavior that the multipathd process will just die on
me if I try to restart it when there are already maps configured. Not sure
why, as I see no debug messages.

(BTW, I was running 0.4.5-pre2 on the RHEL4 U1 beta1 kernel.) 

--- multipath-tools-0.4.5-pre2/multipathd/main.c        2005-04-28 16:52:56.000000000 -0700
+++ multipath-tools-0.4.5-pre2-patched/multipathd/main.c        2005-05-03 05:50:47.203453424 -0700
@@ -468,7 +471,7 @@
        }

        log_safe(LOG_NOTICE, "initial reconfigure multipath maps");
-//     execute_program(conf->multipath, buff, 1);
+       execute_program(conf->multipath, buff, 1);

        while (1) {


Comment 5 Christophe Varoqui 2005-05-03 16:18:21 UTC
> And as you removed the signal handling, moving to uevents I believe,
> I'm not quite sure how the multipath daemon's allpaths gets updated when
> multipath is run. It looks like uevent is triggered only when
> removing/adding underlying sd devices?

multipath is run from hotplug/udev, and only for "add" events.
For each hotplug "add" event, the daemon will receive an "add" uevent.
The signal handling was safe to kill.
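
For illustration, the daemon's event source here is the kernel's netlink
uevent socket; a minimal, self-contained sketch of listening for "add"
events (not multipathd's actual listener):

/* Minimal sketch (not multipathd's actual code): receiving kernel
 * "add" uevents over the netlink uevent socket, the mechanism the
 * daemon relies on once the signal-based trigger is gone. Run as root. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

int main(void)
{
	struct sockaddr_nl addr = {
		.nl_family = AF_NETLINK,
		.nl_pid    = getpid(),
		.nl_groups = 1,         /* kernel uevent multicast group */
	};
	int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);

	if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;

	for (;;) {
		char buf[2048];
		ssize_t len = recv(fd, buf, sizeof(buf) - 1, 0);

		if (len <= 0)
			continue;
		buf[len] = '\0';
		/* messages look like "add@/block/sdb"; only "add" matters */
		if (!strncmp(buf, "add@", 4))
			printf("add uevent for %s\n", buf + 4);
	}
}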

As for the initial multipath run, if your hotplug/udev setup is right, you
don't need it: the maps should already be configured by the time the daemon
starts.

Even if your setup has multipath.dev disabled, you'd better put the
multipath run either in the initrd or in the multipathd startup script.

Now, multipathd dying on you needs to be fixed. But as far as I can see,
the design direction is right.

Comment 7 Alasdair Kergon 2005-05-05 21:46:44 UTC
MODIFIED means it should be fixed but we're awaiting confirmation of that.
If there are no comments in a week or so, then we assume it is fixed and close
the bug.  If subsequently it's found not to have been fixed, then we simply
reopen it.

Comment 8 Lan Tran 2005-05-26 15:29:47 UTC
Using the multipath-tools git snapshot from May 16, 2005. 
Without any I/O running, I disabled a port (bringing down half the paths for
each multipath device), and the state of the disabled paths were correctly put
into '[faulty][failed]' state.

However, I also just tried this on the May 26 git snapshot, and it doesn't work
anymore. It seems that multipathd keeps dying whenever I try to start it  up
using RH4, U1's '/etc/init.d/multipathd start' script. Not sure what's going on.

(I'm using RH4 U1 beta 2.6.9-9.ELsmp kernel.)


   


Comment 9 Lan Tran 2005-05-27 01:07:55 UTC
> Instinctively I would say I messed up the case where no "failback"
> keyword is provided in the config file, meaning the culprit is the
> last commit.

Just checked out the latest from the git repository and tried failing paths
with and without I/O running. The paths are failed and recovered as
expected under both scenarios. Thanks, Christophe.


Comment 10 Rahul Sundaram 2005-09-04 23:05:59 UTC

Closing as per previous comment.