Bug 247579
Summary: | LUN (configured with multipath) removal on Clariion storage causes lv commands to freeze for 10 minutes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Jian Wang <jiawang> |
Component: | lvm2 | Assignee: | Ben Marzinski <bmarzins> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 5.0 | CC: | agk, bmarzins, bmr, coughlan, dwysocha, heinzm, mbroz, prockai |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Doc Type: | Bug Fix | ||
Last Closed: | 2010-09-14 14:49:11 UTC | ||
Description
Jian Wang
2007-07-10 02:53:40 UTC
After 10 minutes, when the command resumes and shows its result, /var/log/messages also contains the following entry:

    Jun 4 14:31:47 xeon3 multipathd: mpath2: Disable queueing

It seems that queueing is what makes the display wait for such a long time. So I changed multipath.conf:

    no_path_retry 300 ==> no_path_retry fail

This had no effect on the waiting time. Instead, I had to change the source file /multipath-tools-0.4.7.rhel5.2/libmultipath/hwtable.c, line 176, from 60 to NO_PATH_RETRY_UNDEF, to make it fail immediately when there are no other paths to retry. That solves the problem temporarily.

It looks like a bug that the configuration file does not take effect for the no_path_retry entry, and it seems to me that multipath.conf entries are overwritten by "hwtable" in config.c (function load_config) in this case. I guess loading "hwtable" first and then loading the user configuration would solve this problem too, but I am not sure whether it would have other side effects.

I changed multipath.conf from

    # no_path_retry 300

to

    no_path_retry fail

and then ran

    /etc/init.d/multipathd restart

The change never affects the result of the following commands:

    [root@xeon3 ~]# dmsetup table mpath11
    0 2179072 multipath 1 queue_if_no_path 1 emc 1 1 round-robin 0 1 1 71:704 1000
    [root@xeon3 ~]# multipath -v3 |grep no_path_retry
    mpath77: no_path_retry = 60 (controller setting)
    mpath79: no_path_retry = 60 (controller setting)
    mpath524: no_path_retry = 60 (controller setting)
    ....
    mpath89: no_path_retry = 60 (controller setting)
    mpath529: no_path_retry = 60 (controller setting)
    mpath91: no_path_retry = 60 (controller setting)
    mpath530: no_path_retry = 60 (controller setting)

By analysing the multipath-tools code in libmultipath/propsel.c:

    int
    select_no_path_retry(struct multipath *mp)
    {
            /* 1. per-multipath setting */
            if (mp->mpe && mp->mpe->no_path_retry != NO_PATH_RETRY_UNDEF) {
                    mp->no_path_retry = mp->mpe->no_path_retry;
                    condlog(3, "%s: no_path_retry = %i (multipath setting)",
                            mp->alias, mp->no_path_retry);
                    return 0;
            }
            /* 2. controller (hardware entry) setting */
            if (mp->hwe && mp->hwe->no_path_retry != NO_PATH_RETRY_UNDEF) {
                    mp->no_path_retry = mp->hwe->no_path_retry;
                    condlog(3, "%s: no_path_retry = %i (controller setting)",
                            mp->alias, mp->no_path_retry);
                    return 0;
            }
            /* 3. config file default */
            if (conf->no_path_retry != NO_PATH_RETRY_UNDEF) {
                    mp->no_path_retry = conf->no_path_retry;
                    condlog(3, "%s: no_path_retry = %i (config file default)",
                            mp->alias, mp->no_path_retry);
                    return 0;
            }
            /* 4. internal default */
            mp->no_path_retry = NO_PATH_RETRY_UNDEF;
            condlog(3, "%s: no_path_retry = NONE (internal default)",
                    mp->alias);
            return 0;
    }

we found that multipath first selects the controller default setting from hwtable.c and only then the configuration file's default value. Multipath does this for all the configurable parameters, such as get_prio, get_uid, pgpolicy, etc. So my question is: is this loading sequence reasonable?

By the way, all controllers' default values for no_path_retry are set to NO_PATH_RETRY_UNDEF, except for COMPAQ/HP HSV and EMC Clariion. So HP HSV might have the same problem as well, since its default no_path_retry value is 60.

I have a few questions about this. Is there only one path to the device? The no_path_retry parameter should only affect operation when the last path is removed. If there are still active paths, and you are queueing IO until the no_path_retry limit is reached, then that is a problem all by itself. However, the configuration loading order is not a problem.
First, multipath tries to load from the multipath-specific parameters, then the controller-specific parameters, then the parameters specified in the defaults section of the config file. If none of these sets a value for the parameter, then it uses a sensible compiled-in default.

hwtable.c sets some compiled-in controller-specific defaults. These are checked along with the user-defined controller-specific parameters. However, user-supplied parameters are given priority. So user-supplied parameters in the devices section of the config file overwrite the compiled-in ones for that controller. I am assuming that you set no_path_retry in the defaults section of the config file.

No reply for over 2 years, and no other reports like this. Closing.
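For illustration, the override described in the reply (a user-supplied controller entry in the devices section taking priority over the hwtable.c default) might look roughly like the fragment below. This is a sketch only and was not tested against this bug: the vendor and product strings are assumptions for the EMC Clariion entry and should be checked against the built-in entry shipped with the installed multipath-tools version.

```
# /etc/multipath.conf (fragment, illustrative only)
devices {
        device {
                # Assumed vendor/product strings for EMC Clariion; verify
                # them against the built-in entry for your version.
                vendor          "DGC"
                product         "*"
                # Fail I/O once the last path is gone instead of
                # queueing for the compiled-in retry count.
                no_path_retry   fail
        }
}
```

After restarting multipathd, the `multipath -v3 | grep no_path_retry` command from the description can be used to check which value is selected.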