User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121622 Fedora/3.0.5-1.fc10 Firefox/3.0.5

After reviewing the code and doing some testing, I noticed that polling_interval did not work as I expected. The description of the option in multipath.conf conflicted with the results I got when testing device-mapper-multipath on RHEL4 and RHEL5.

$ cat /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.annotated
# # name : polling_interval
# # scope : multipathd
# # desc : interval between two path checks in seconds
# # default : 5
# #
# polling_interval 10

---------

The behaviour that I had expected, based on the option's description above:

check path 1
wait polling_interval
check path 2
wait polling_interval
check path 1
wait polling_interval
check path 2
wait polling_interval

However, the results I got in testing (with multipathd -v4) were:

check path 1
check path 2
wait polling_interval
check path 1
check path 2
wait polling_interval

---------

After reviewing the code and talking to a couple of engineers, I found that the behaviour I saw in RHEL4 and RHEL5 is working as designed. The problem, it seems, is how I was reading the description of the option. Most users read the word "path" as a single path to a multipath device, and not as all possible paths to all possible mpaths.

From my test results and from talking with some engineers, the "polling_interval" option actually means:

"The interval between checking all possible paths for all multipath paths"

I believe the man page and sample config files need to be updated to give a more accurate and simpler description.

Reproducible: Always

Steps to Reproduce: None
Actual Results: None
Expected Results: None
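To make the two readings of the description concrete, here is a small Python sketch that prints both schedules. It only models the order of events as described in this report; the function names are made up for illustration and nothing here is taken from the multipathd source.

```python
def per_path_wait_schedule(paths, rounds):
    """Reading suggested by the old description: wait after every single path check."""
    events = []
    for _ in range(rounds):
        for path in paths:
            events.append(f"check {path}")
            events.append("wait polling_interval")
    return events


def per_pass_wait_schedule(paths, rounds):
    """Order seen in the report's test: check every path, then wait once."""
    events = []
    for _ in range(rounds):
        for path in paths:
            events.append(f"check {path}")
        events.append("wait polling_interval")
    return events


if __name__ == "__main__":
    paths = ["path 1", "path 2"]
    print("expected from the description:")
    for event in per_path_wait_schedule(paths, rounds=2):
        print("  " + event)
    print("seen in testing:")
    for event in per_pass_wait_schedule(paths, rounds=2):
        print("  " + event)
```

With two paths, the first reading waits twice per pass and the second waits only once, which is the whole difference between what the description suggested and what the test showed.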
Thread with patch attached: https://www.redhat.com/archives/dm-devel/2009-January/msg00197.html
Created attachment 330376: patch to update docs on polling_interval
I can see how the original wording could confuse a person; however, the wording in your attachment is not correct. multipathd doesn't always check all the paths every polling interval. A path is checked every polling_interval seconds after it was added. If paths were added at different times, they may not be checked at the same time.

Also, if a path is usable, the time between path checks will gradually increase to (4 * polling_interval). This is because it is much more important to recover a failed path than it is to preemptively fail an active path. If any IO attempts to use a path that is broken but marked active, the kernel will automatically switch the path to a failed state. However, the kernel is not able to try a failed path until multipathd has marked it as active again (well, that is not completely true, but it's close enough). This change in the actual time between checks is yet another reason why different paths won't always be checked at the same time.

Here is the change I made. Let me know if you think it is still problematic.

Index: multipath-tools-rhel5_4/multipath.conf.annotated
===================================================================
--- multipath-tools-rhel5_4.orig/multipath.conf.annotated
+++ multipath-tools-rhel5_4/multipath.conf.annotated
@@ -28,7 +28,9 @@
 # #
 # # name : polling_interval
 # # scope : multipathd
-# # desc : interval between two path checks in seconds
+# # desc : How often a path's state is checked, in seconds. For
+# # paths that are usable, the time between checks will
+# # gradually increase to (4 * polling_interval).
 # # default : 5
 # #
 # polling_interval 10
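As a rough illustration of the timing described in this comment, the Python sketch below computes when a single path would be checked. It is a model, not the multipathd code: the function name is hypothetical, and two details are assumptions that the comment does not state, namely that the interval doubles after each check of a usable path, and that a failed path drops back to polling_interval.

```python
def check_times(polling_interval, added_at, states, growth=2):
    """Return the times (in seconds) at which one path is checked.

    states: result of each successive check, True = usable, False = failed.
    Assumptions (not confirmed by this thread): a usable path's interval is
    multiplied by `growth` after each check, capped at 4 * polling_interval,
    and a failed path's interval resets to polling_interval.
    """
    max_interval = 4 * polling_interval
    interval = polling_interval
    now = added_at
    times = []
    for usable in states:
        now += interval          # wait out the current interval
        times.append(now)        # the path is checked at this moment
        if usable:
            interval = min(interval * growth, max_interval)
        else:
            interval = polling_interval
    return times


if __name__ == "__main__":
    # Two usable paths added 3 seconds apart, polling_interval 5.
    print(check_times(5, added_at=0, states=[True] * 5))
    print(check_times(5, added_at=3, states=[True] * 5))
    # A path that fails on its third and fourth checks, then recovers.
    print(check_times(5, added_at=0, states=[True, True, False, False, True]))
```

Under these assumptions, the two paths added at different times never line up on the same second, and the failing path is rechecked every 5 seconds until it recovers, which matches the reasoning above about why paths are not always checked together.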
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-1377.html