Bug 999915 - RHEL 6.4 ioctl() hangs on multipath devices with no paths after volume unmap
Summary: RHEL 6.4 ioctl() hangs on multipath devices with no paths after volume unmap
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.4
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Lin Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-08-22 11:18 UTC by eliranz
Modified: 2016-08-11 19:23 UTC (History)
CC: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-11 19:23:52 UTC
Target Upstream Version:
Embargoed:


Attachments
Comment (590.19 KB, text/plain)
2013-09-01 14:51 UTC, eliranz

Description eliranz 2013-08-22 11:18:09 UTC
Description of problem:

### Research:

## Actions: I mapped -> rescanned -> unmapped -> rescanned 13 volumes on a host running RHEL 6.3 and on a host running RHEL 6.4. On both platforms, dead multipath devices remain after the unmap + rescan.


When executing the natively implemented sg_inq on the device, on RHEL 6.4 the command hangs until the process is killed from another session:

## RHEL 6.4 - Executing sg_inq on the same device ##
root@royr-rhel64-x64:/ $ sg_inq /dev/mapper/mpathr
^C
root@royr-rhel64-x64:/ $

## RHEL 6.4 - strace on sg_inq ##
root@royr-rhel64-x64:/ $ strace sg_inq /dev/mapper/mpathr
execve("/usr/bin/sg_inq", ["sg_inq", "/dev/mapper/mpathr"], [/* 43 vars */]) = 0
brk(0)  = 0x2240000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06730ce000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=82595, ...}) = 0
mmap(NULL, 82595, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f06730b9000
close(3)= 0
open("/usr/lib64/libsgutils2.so.2", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \223\240x3\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=159288, ...}) = 0
mmap(0x3378a00000, 2252096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3378a00000
mprotect(0x3378a22000, 2093056, PROT_NONE) = 0
mmap(0x3378c21000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x21000) = 0x3378c21000
close(3)= 0
open("/lib64/libc.so.6", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\355\341x3\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1922152, ...}) = 0
mmap(0x3378e00000, 3745960, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3378e00000
mprotect(0x3378f8a000, 2093056, PROT_NONE) = 0
mmap(0x3379189000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x189000) = 0x3379189000
mmap(0x337918e000, 18600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x337918e000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06730b8000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06730b7000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06730b6000
arch_prctl(ARCH_SET_FS, 0x7f06730b7700) = 0
mprotect(0x3379189000, 16384, PROT_READ) = 0
mprotect(0x337881f000, 4096, PROT_READ) = 0
munmap(0x7f06730b9000, 82595)   = 0
brk(0)  = 0x2240000
brk(0x2261000)  = 0x2261000
open("/proc/devices", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06730cd000
read(3, "Character devices:\n  1 mem\n  4 /"..., 1024) = 500
close(3)= 0
munmap(0x7f06730cd000, 4096)= 0
open("/dev/mapper/mpathr", O_RDONLY|O_NONBLOCK) = 3
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 6), ...}) = 0
ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=32, iovec_count=0, dxfer_len=36, timeout=60000, flags=0

On RHEL 6.3, on the other hand, running the exact same scenario returns an error immediately:

## RHEL 6.3 - Executing sg_inq on the same device ##
root@royr-rhel63-x64:/ $ sg_inq /dev/mapper/mpathe
Both SCSI INQUIRY and fetching ATA information failed on /dev/mapper/mpathe
root@royr-rhel63-x64:/ $

## RHEL 6.3 - strace on sg_inq ##
execve("/usr/bin/sg_inq", ["sg_inq", "/dev/mapper/mpathe"], [/* 43 vars */]) = 0
brk(0)  = 0x20bd000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbbab5f8000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=36222, ...}) = 0
mmap(NULL, 36222, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fbbab5ef000
close(3)= 0
open("/usr/lib64/libsgutils2.so.2", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \223\340\2467\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=159288, ...}) = 0
mmap(0x37a6e00000, 2252096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x37a6e00000
mprotect(0x37a6e22000, 2093056, PROT_NONE) = 0
mmap(0x37a7021000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x21000) = 0x37a7021000
close(3)= 0
open("/lib64/libc.so.6", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\355!\2477\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1922152, ...}) = 0
mmap(0x37a7200000, 3745960, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x37a7200000
mprotect(0x37a738a000, 2093056, PROT_NONE) = 0
mmap(0x37a7589000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x189000) = 0x37a7589000
mmap(0x37a758e000, 18600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x37a758e000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbbab5ee000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbbab5ed000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbbab5ec000
arch_prctl(ARCH_SET_FS, 0x7fbbab5ed700) = 0
mprotect(0x37a7589000, 16384, PROT_READ) = 0
mprotect(0x37a6c1f000, 4096, PROT_READ) = 0
munmap(0x7fbbab5ef000, 36222)   = 0
brk(0)  = 0x20bd000
brk(0x20de000)  = 0x20de000
open("/proc/devices", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbbab5f7000
read(3, "Character devices:\n  1 mem\n  4 /"..., 1024) = 484
close(3)= 0
munmap(0x7fbbab5f7000, 4096)= 0
open("/dev/mapper/mpathe", O_RDONLY|O_NONBLOCK) = 3
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 6), ...}) = 0
ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=32, iovec_count=0, dxfer_len=36, timeout=60000, flags=0}) = -1 EAGAIN (Resource temporarily unavailable)
ioctl(3, 0x30d, 0x7fffa25d2ff0) = -1 EAGAIN (Resource temporarily unavailable)
write(2, "Both SCSI INQUIRY and fetching A"..., 76) = 76
close(3)= 0
exit_group(99)  = ?


## Conclusions:
Dead unmapped volume devices in RHEL 6.4 do not respond to ioctl() with an error; instead, the calling process hangs (both Python and native).


Version-Release number of selected component (if applicable):
RHEL 6.4 with device-mapper-multipath 0.4.9-64

How reproducible:
Consistent

Steps to Reproduce:
1. Map some volumes to the host
2. Run a rescan
3. Unmap the volumes from the host
4. Run a rescan
5. Run sg_inq

Actual results:
ioctl hangs

Expected results:
sg_inq should return an error/warning message

Comment 2 Ben Marzinski 2013-08-22 20:45:12 UTC
Can you give me the

# multipath -ll

output from both RHEL-6.3 and RHEL-6.4, as well as the versions of the device-mapper-multipath and kernel packages you used on both (I see you're using device-mapper-multipath 0.4.9-64)?

I'm not currently able to recreate your issue.

Comment 3 eliranz 2013-09-01 14:51:21 UTC
Created attachment 915760 [details]
Comment

(This comment was longer than 65,535 characters and has been moved to an attachment by Red Hat Bugzilla).

Comment 4 eliranz 2013-09-29 06:54:32 UTC
Any updates???

Comment 5 Ben Marzinski 2013-10-01 21:56:34 UTC
This issue appears to have been introduced by a change in the 2.6.32-319.el6 kernel.  I'm looking into it now.

Comment 6 Ben Marzinski 2013-10-02 20:34:24 UTC
The multipath kernel code was intentionally changed to queue ioctls when queue_if_no_path is set, instead of returning an error. If you change your device configuration so that it does not queue indefinitely, then once queueing is disabled, sg_inq will return like it did in RHEL-6.3.
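Whether a given map is currently in that queueing mode can be read from its table line. A sketch under the assumption that queue_if_no_path, when active, appears in the target's feature list (the sample table line is illustrative, not taken from this bug):

```shell
# Inspect a dm table line (as printed by `dmsetup table <device>`) for the
# queue_if_no_path feature.  Sample line is illustrative only.
table='0 41943040 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:16 1'
case "$table" in
    *queue_if_no_path*) echo "queueing enabled: ioctls may block with no paths" ;;
    *)                  echo "queueing disabled: ioctls fail fast" ;;
esac
```

On a live system the literal `table=` assignment would be replaced by `table=$(dmsetup table mpathr)`.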

Comment 7 eliranz 2013-10-10 14:24:53 UTC
Hi Ben,

Can you please explain the reasons for changing the driver's behavior to queue requests instead of returning an immediate error when queue_if_no_path is set?

This change is a real problem for our applications. Instead of clearly returning an error and displaying the situation to the user, the application will just get stuck. 

There was a long process where the default multipath settings were set for XIV arrays and this change breaks this process.

Comment 9 Mike Snitzer 2013-10-11 20:25:19 UTC
(In reply to eliranz from comment #7)
> Hi Ben,
> 
> Can you please explain what are the reasons to change the behavior of the
> driver to queue requests instead of return an immediate error when
> queue_if_no_paths is set?
> 
> This change is a real problem for our applications. Instead of clearly
> returning an error and displaying the situation to the user, the application
> will just get stuck. 
> 
> There was a long process where the default multipath settings were set for
> XIV arrays and this change breaks this process.

The upstream patch in question is here:
http://git.kernel.org/linus/7ba10aa6fba

It is a patch that just fell out of testing a different mpath change
proposed by David Jeffery.

The related dm-devel thread starts here (this is midpoint in response to
David's initial patch proposal):
http://www.redhat.com/archives/dm-devel/2012-September/msg00020.html

And ultimately I posted the patch in question to dm-devel here:
http://www.redhat.com/archives/dm-devel/2012-September/msg00205.html

So anyway, taking a step back. This bug is all about the ioctl
hanging now if queue_if_no_path is set.  I'm surprised RHEL6.3 didn't
respond that way before my change.

The patch makes it so that if queue_if_no_path is _not_ set the ioctl
will fail immediately.  If queue_if_no_path is set, it'll queue the
ioctl, as is evidenced by the header of commit 7ba10aa6fba.

Comment 11 Mike Snitzer 2013-10-12 14:36:41 UTC
(In reply to Mike Snitzer from comment #9)
> 
> So anyway, taking a step back. This bug is all about the ioctl
> hanging now if queue_if_no_path is set.  I'm surprised RHEL6.3 didn't
> respond that way before my change.

After looking closer, I'm not surprised.  IBM's case doesn't have m->queue_io set, whereas the scenario where an mpath device doesn't have any paths at mpath table load does have m->queue_io set.

> The patch makes it so that if queue_if_no_path is _not_ set the ioctl
> will fail immediately.  If queue_if_no_path is set, it'll queue the
> ioctl.  As is evidenced from header from commit 7ba10aa6fba

My broader point is that an mpath ioctl, which is destined for an underlying path, should queue_if_no_path just like the IO path does.  That is why this fix was sent upstream and tagged for inclusion in all upstream stable Linux kernel trees.

IBM's application should _not_ be sending ioctls to an mpath device that doesn't have any paths.  I can appreciate that the change in question creates problems for them because they never had to worry about the ioctl hanging.  But the previous queue_if_no_path inconsistency (the IO path queuing but the ioctl path _not_ queuing when there are no paths) was never something an application should rely on.

One possibility for a workaround for IBM is to add a new RHEL6-only dm-mpath configuration option that allows them to set 'fail_ioctl_if_no_path'.  I'd _really_ rather avoid doing that but if IBM cannot see a way forward we can consider it.  That said, RHEL7 also queues the ioctl if no paths and queue_if_no_path is set so unless IBM changes their application to only issue ioctls when there are paths available it'll just fail for them in RHEL7.

Comment 12 eliranz 2013-10-13 14:49:40 UTC
The way we test that an mpath device is valid (and has valid paths) is by sending a request. What best practice does Red Hat recommend for identifying whether a device has a valid path?

Comment 13 Mike Snitzer 2013-10-14 00:59:38 UTC
(In reply to eliranz from comment #12)
> The way we test that an mpath is valid (and has valid paths) is by sending a
> request. What best practice does redhat recommend for identifying a device
> has a valid path?

What do you mean by "sending a request"?  You configure queue_if_no_path so the request will never complete if there aren't any valid paths available.

You can evaluate the mpath device's state with 'multipath -ll'.  Alternatively, you can issue dmsetup commands to see if the mpath device has configured paths, e.g.: dmsetup table <mpath device name>

Or, to get finer-grained path status info via dmsetup, use: dmsetup status <mpath device name>

Paths in a path group that are active are denoted with 'A', paths that are failed with 'F'.
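That state can be checked programmatically before issuing an inquiry. A minimal sketch, assuming the dm-multipath status line format in which each path group and path carries such a single-letter flag (the helper name and sample line are illustrative):

```shell
# has_active_path: succeed if a dm-multipath status line contains at least
# one 'A' (active) flag.  The sample line format is assumed from the
# dm-multipath target, as printed by `dmsetup status <mpath device name>`.
has_active_path() {
    case " $1 " in
        *" A "*) return 0 ;;   # an active group or path was reported
        *)       return 1 ;;   # no 'A' flags: nothing usable
    esac
}

sample='0 41943040 multipath 2 0 0 0 1 1 A 0 1 2 8:16 A 0'
if has_active_path "$sample"; then
    echo "active path present; safe to issue the inquiry"
fi
```

On a live system the sample string would come from `dmsetup status <device>`; the point is only to decide cheaply whether the ioctl can be expected to complete.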

Comment 14 eliranz 2013-10-21 13:37:22 UTC
We can see that dmsetup status <device> identifies whether there is a valid path. However, it takes a few seconds to discover that no path is available, and if we fall into that window and issue our SCSI inquiry, we remain stuck forever (or until a path comes back). This reduces the chances of being stuck, but does not fully solve the problem.

We want to emphasize that this problem is not just with our application. You can see exactly the same behavior with the sg3_utils package provided by Red Hat.

From what we understand, if a device loses all its paths forever, any inquiry will just remain stuck forever. Did we get it wrong? Are we missing something? Maybe the solution should be a timeout that causes the inquiry to return with an error.
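A userspace stopgap along the lines of the timeout eliranz suggests can be approximated without kernel changes by bounding the inquiry from the caller's side. A hypothetical wrapper (the function name and 10-second limit are illustrative), assuming coreutils' timeout(1), which kills the child and returns status 124 when it does not finish in time:

```shell
# bounded_inq: run sg_inq under a deadline so a queued ioctl cannot wedge
# the caller forever.  Returns nonzero on timeout or inquiry failure.
bounded_inq() {
    dev="$1"
    timeout 10 sg_inq "$dev" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "inquiry to $dev timed out; treating device as pathless"
        return 1
    fi
    return "$rc"
}
```

Note that this only protects the calling process: the queued request inside the kernel is still governed by queue_if_no_path, so the map itself stays wedged until a path returns or queueing is disabled.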

Comment 15 Mike Snitzer 2013-10-21 14:22:30 UTC
(In reply to eliranz from comment #14)
> We can see that the dmsetup status <device> identifies whether there is a
> valid path. However, it takes a few seconds to discover that there is no
> path available, and in case we fall under this period and call our scsi
> inquiry we will remain stuck for ever (or until the path is back). This does
> reduce the chances of being stuck, but does not fully solve the problem.
> 
> We want to emphasize that this problem is not just with our application. You
> can see exactly the same behavior with sg_utils package provided by redhat.

Sure, but the point is the user asked for IO to be queue_if_no_path.  Path recovery is required to get that queued IO to complete.

The fix in question just applies the same policy to ioctls too -- which should've been the case from the start.

> From what we understand if a device lost all its paths forever, any inquiry
> will just remain stuck forever. Did we get it wrong? Are we missing
> something? Maybe the solution should be some timeout that will cause the
> inquiry to return with an error.

We are exploring the possibility of a timeout for queue_if_no_path as part of another thread on dm-devel, the latest RFC patch for this is here (and still requires testing):
https://patchwork.kernel.org/patch/3070391/

NOTE: this patch focuses on timing out outstanding IO requests, not on ioctls.  If/when an IO request times out, it'll disable m->queue_if_no_path, so the next retry of the ioctl will fail with -EIO.

Comment 16 Ben Marzinski 2013-10-21 16:30:03 UTC
If you set

no_path_retry <some_number>

then you won't queue your ioctls forever.  After the last path fails, multipathd will retry a set number of times, and then fail the ioctls along with the IO.

The only reason this wouldn't work is if you needed your IO to continue to be queued while your ioctls failed.

What would be more useful would probably be a stable library interface to get information from multipath about the devices.

Comment 17 eliranz 2013-10-28 13:40:16 UTC
We will investigate your recommendations.
Thanks a lot for your help.

Comment 19 Ben Marzinski 2014-03-26 20:36:11 UTC
Have you tried setting no_path_retry to see if it resolves your issue?

Comment 20 eliranz 2014-03-30 13:15:41 UTC
Hi Ben,

The proposed change requires changing the multipath.conf file. 
Our standard is to use the default OS settings, especially on Red Hat, whose defaults were adjusted for XIV not long ago. Therefore, this proposed solution does not fit us.

Currently, we're looking for a workaround to solve this issue.

- Eliran

Comment 21 Ben Marzinski 2014-04-04 16:58:13 UTC
I'm not sure what solution there will be in RHEL6, short of changing that configuration.  I understand your issue.  Our default configs are simply what the vendors have given us, but IBM is not very proactive about updating these configurations.  Would it be possible for you to contact IBM to see if this is an allowable change?  I know a large number of customers do change this.  While you are at it, you should ask them if you can change the path_selector to service-time, since this can result in higher performance on some setups.

I would recommend a configuration of

devices {
    device {
        vendor "IBM"
        product "2810XIV"
        path_selector "service-time 0"
        features "0"
        no_path_retry "12"
    }
}


Setting no_path_retry to 12 will do 12 retries (which, at 5 seconds apart, will take one minute) and then fail over the device.
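The arithmetic behind that one-minute figure, assuming the default 5-second polling_interval on RHEL 6 (values illustrative):

```shell
# no_path_retry counts path-checker intervals, so the queueing window after
# the last path drops is roughly no_path_retry * polling_interval.
no_path_retry=12
polling_interval=5      # assumed RHEL 6 default, in seconds
echo "IO and ioctls start failing after $((no_path_retry * polling_interval)) seconds"
```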

If your issue with changing the defaults is not due to vendor or OS support but to making installations more standard, then you may still want to talk to IBM. We simply use the configuration that IBM gives us, and if people complain to them, they will be more likely to update their configuration.

