Bug 500580 - [NetApp 4.9 bug]Target controller faults are resulting in underlying FC paths getting offlined/onlined on the host
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: device-mapper-multipath
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 4.9
Assigned To: Ben Marzinski
QA Contact: Cluster QE
Keywords: OtherQA
Depends On:
Blocks: 626414
Reported: 2009-05-13 07:21 EDT by Naveen Reddy
Modified: 2011-02-16 09:24 EST
See Also:
Fixed In Version: device-mapper-multipath-0.4.5-40.el4
Doc Type: Bug Fix
Last Closed: 2011-02-16 09:24:20 EST

Attachments
Attaching the /etc/multipath.conf and /var/log/messages (1.09 MB, application/x-tar)
2009-05-13 07:28 EDT, Naveen Reddy

Description Naveen Reddy 2009-05-13 07:21:14 EDT
Description of Problem:
On a RHEL4.8 host, target controller faults cause the underlying FC paths to be offlined and then onlined again on the host.

Version-Release number of selected component (if applicable):
OS : RHEL4.8

Steps to Reproduce:
1. Map a few LUNs to the RHEL4.8 host, configure the LUNs, and start I/O.
2. Trigger a target controller fault (for example, "takeover" followed by "giveback") on the NetApp storage controller.
3. All paths are reinstated within 2 minutes.
4. Wait for around 30 to 35 minutes.
5. A few paths to the LUNs are dropped and then reinstated.

Actual Results:
Paths drop 30 to 35 minutes after the target controller faults.

Expected Results:
Paths should not drop.

Additional Info:
This issue is not seen when "path_checker" is set to "readsector0" in /etc/multipath.conf.
Attaching the /etc/multipath.conf and /var/log/messages.
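
For reference, the workaround mentioned above might look like the fragment below. This is only a sketch: it assumes the option is set in the defaults section, whereas the attached /etc/multipath.conf (not reproduced here) may set it in a device-specific section instead.

```
# /etc/multipath.conf (fragment, illustrative only)
defaults {
        # Workaround from this report: the path drops were not seen with
        # readsector0. The affected setup used the directio checker.
        path_checker    readsector0
}
```

multipathd needs to reread its configuration (for example, by restarting the service) before a changed checker takes effect.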
Comment 1 Naveen Reddy 2009-05-13 07:28:50 EDT
Created attachment 343746 [details]
Attaching the /etc/multipath.conf and /var/log/messages
Comment 2 Andrius Benokraitis 2009-05-13 08:40:56 EDT
Setting target for RHEL 4.9, since 4.8 is pretty well baked already.
Comment 4 Ben Marzinski 2010-05-04 17:34:29 EDT
Is this not a problem with directio in RHEL5?  Also, would it be possible to get
the results of starting multipathd with

# multipathd -v3

and reproducing the issue, instead of using the initscripts?
Comment 7 Andrius Benokraitis 2010-10-25 09:24:24 EDT
NetApp: Red Hat is hoping to have this completed this week.
Comment 8 Ben Marzinski 2010-10-28 00:01:16 EDT
This issue appears to be caused by multipathd not checking the path state asynchronously with the directio checker in RHEL4. Because the check stalls multipathd while it runs, multipathd can't wait very long for the path to respond. I backported the code that makes multipathd wait asynchronously for the I/O to complete with the directio checker, which allows it to wait up to 30 seconds for either a success or a failure response. This fixed a similar problem on RHEL5.
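
The difference between a blocking check and an asynchronous one can be illustrated with a small Python sketch. This is not the multipathd code (which is written in C and uses kernel asynchronous I/O); the names AsyncPathChecker, probe, and the PATH_* states are hypothetical, and a thread pool stands in for the asynchronous read. Only the 30-second wait is taken from the comment above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PATH_UP, PATH_DOWN, PATH_PENDING = "up", "down", "pending"


class AsyncPathChecker:
    """Hypothetical checker: start one probe, then poll it on later passes."""

    def __init__(self, probe, timeout=30.0, clock=time.monotonic):
        self._probe = probe        # callable: True if the test read succeeded
        self._timeout = timeout    # seconds to wait before declaring failure
        self._clock = clock        # injectable so the timeout can be tested
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._future = None
        self._started = 0.0

    def check(self):
        """Return immediately; never block the caller on the probe itself."""
        now = self._clock()
        if self._future is None:
            # No probe in flight: submit one and remember when it started.
            self._future = self._pool.submit(self._probe)
            self._started = now
        if self._future.done():
            finished, self._future = self._future, None
            try:
                return PATH_UP if finished.result() else PATH_DOWN
            except OSError:
                return PATH_DOWN   # the read itself reported an error
        if now - self._started >= self._timeout:
            self._future = None    # abandon the stuck probe (simplification)
            return PATH_DOWN
        return PATH_PENDING        # still in flight; ask again next pass
```

A checker loop would call check() once per pass for each path and treat PATH_PENDING as "no change yet", so a slow path delays only its own verdict instead of stalling the checks of every other path.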
Comment 10 errata-xmlrpc 2011-02-16 09:24:20 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

