584742 – IO is stuck due to active device that moved to suspend mode and never resumed

Bug 584742 - IO is stuck due to active device that moved to suspend mode and never resumed

Summary: IO is stuck due to active device that moved to suspend mode and never resumed

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	device-mapper-multipath
Sub Component:
Version:	5.5
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Ben Marzinski
QA Contact:	Gris Ge
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	607487 607869
TreeView+	depends on / blocked

Reported:	2010-04-22 10:26 UTC by Ayal Baron
Modified:	2013-03-06 17:12 UTC (History)
CC List:	23 users (show)
Fixed In Version:	device-mapper-multipath-0.4.7-36.el5
Doc Type:	Bug Fix
Doc Text:	Previously, to resize a multipath device, users had to run the command "multipathd -k'resize map <mapname>'", as running "multipath" instead caused the multipath to become stuck in the SUSPENDED state. In this state, all input/output to the device hung. Now users can resize the device with either command, and the multipath device will no longer get stuck in the SUSPENDED state.
Clone Of:
Clones:	607869 (view as bug list)
Environment:
Last Closed:	2011-01-13 23:03:39 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
systemtap script to figure out why multipath is getting stuck in suspend. (846 bytes, text/plain) 2010-06-02 03:36 UTC, Ben Marzinski	no flags	Details
output (33.41 KB, text/plain) 2010-06-03 05:05 UTC, Moran Goldboim	no flags	Details
Corrected-output - relevant /var/log/messages (125.70 KB, text/plain) 2010-06-03 05:21 UTC, Moran Goldboim	no flags	Details
Updated systemtap test with more information from dm_swap_table (1.10 KB, text/plain) 2010-06-03 21:25 UTC, Ben Marzinski	no flags	Details
script, var/log/messages output (29.68 KB, text/plain) 2010-06-06 11:46 UTC, Moran Goldboim	no flags	Details
Show Obsolete (3) View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2011:0074	0	normal	SHIPPED_LIVE	device-mapper-multipath bug fix and enhancement update	2011-01-12 17:22:03 UTC

Description Ayal Baron 2010-04-22 10:26:40 UTC

Description of problem:
Although all devices are active and ready (see multipath -ll output below), device moved to suspend mode and never resumed, causing all lvm operations on the machine to hang.
This is not the first time this has happened.

The machine can be accessed, contact me in irc for details (#tlv - abaron_)

multipath -ll:
14f504e46494c4500414d4e584c572d48724d612d68647965 dm-8 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:6  sdh 8:112 [active][ready]
14f504e46494c4500467161665a582d546170422d456b3473 dm-5 OPNFILER,VIRTUAL-DISK
[size=32M][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:3  sde 8:64  [active][ready]
14f504e46494c45005174675632782d4d61307a2d4f4c474b dm-11 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:9  sdk 8:160 [active][ready]
14f504e46494c4500616e415977372d746247672d7149614d dm-12 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:10 sdl 8:176 [active][ready]
14f504e46494c45003577326a66512d62594c432d58625a6c dm-10 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:8  sdj 8:144 [active][ready]
14f504e46494c450055554f4b4d502d74464d592d67743531 dm-7 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:5  sdg 8:96  [active][ready]
14f504e46494c450034653353396b2d755262472d73514e43 dm-6 OPNFILER,VIRTUAL-DISK
[size=13G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:4  sdf 8:80  [active][ready]
14f504e46494c45004f376e564d312d786c74692d58786b6b dm-2 OPNFILER,VIRTUAL-DISK
[size=13G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:0  sdb 8:16  [active][ready]
14f504e46494c4500466e674650712d7a4667442d726e4f65 dm-9 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:7  sdi 8:128 [active][ready]
14f504e46494c4500524f3034656d2d635056472d6b74646e dm-4 OPNFILER,VIRTUAL-DISK
[size=13G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:2  sdd 8:48  [active][ready]
14f504e46494c45005643555130572d574236462d5349764b dm-3 OPNFILER,VIRTUAL-DISK
[size=13G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:1  sdc 8:32  [active][ready]
14f504e46494c4500733134424d542d3744576f2d654a585a dm-13 OPNFILER,VIRTUAL-DISK
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 449:0:0:11 sdm 8:192 [active][ready]

dmsetup info:
Name:              14f504e46494c4500414d4e584c572d48724d612d68647965
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        6
Event number:      0
Major, minor:      253, 8
Number of targets: 1
UUID: mpath-14f504e46494c4500414d4e584c572d48724d612d68647965

Name:              14f504e46494c4500467161665a582d546170422d456b3473
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE
Open count:        3
Event number:      0
Major, minor:      253, 5
Number of targets: 1
UUID: mpath-14f504e46494c4500467161665a582d546170422d456b3473

Name:              9df1d62c--5dd9--4067--921e--8b48e396c723-leases
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 45
Number of targets: 1
UUID: LVM-1eob0fWq5x7HDxtapDvqFbyPvXqEqzUTY1bNIFhmcyfa6q4eUVbROCwYU6ncBsKS

Name:              e2a1a9ce--0744--415f--b514--3a27f83e49ee-master
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 31
Number of targets: 1
UUID: LVM-99pSN0rYvhkBbepK8z8vy5vQ60kw3BIqXZoJZnQNOXM44Zw9TzCxIBI2EfTtB7vn

Name:              14f504e46494c45005174675632782d4d61307a2d4f4c474b
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        6

Comment 1 Ben Marzinski 2010-04-26 17:19:34 UTC

Huh. Does this happen after any specific action, say after resizing a device or after a path fails?  Are there any log messages related to this.  I'd be happy to get on the machine and check it out.  You can either email me the details, or I can try to ping you when I get in tomorrow, which would be at 9AM CDT (UTC -5), or 5PM your time.  I'm nimh on irc.

Comment 2 Ben Marzinski 2010-05-12 17:16:05 UTC

Has this recreated?  Any lock finding out whan was happening when this occured?

Comment 3 Ayal Baron 2010-05-14 10:36:36 UTC

On the same machine this is now constantly happening (we resume the device and it is suspended soon after).
If you want the machine info, mail me.

Comment 4 Ben Marzinski 2010-05-25 23:39:19 UTC

I think the best course of action might be to make a patched device-mapper package, the will let us find out who is supending the device.

Comment 5 Ben Marzinski 2010-06-02 03:36:02 UTC

Created attachment 418890 [details]
systemtap script to figure out why multipath is getting stuck in suspend.

This is a systemtap script that will print out information when processes suspend and resume dm devices.  Hopefully, this will help us narrow down what is going wrong here.  To run this script, you need the systemtap package installed. You may also need some kernel packages. You run the script with

# stap dm_suspend.stp

When it prints

starting

you can run your tests.  If you see the hang while this script is running, please attach the output from this script along with /var/log/messages to this bugzilla.

Comment 6 Moran Goldboim 2010-06-02 13:55:14 UTC

after installing systemtap and debuginfo for device mapper i can't run the script getting the following output:

[root@east-04 ~]# stap dm_suspend.stp
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_resume")
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_suspend")
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_swap_table")
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dev_suspend")
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_resume").return
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_suspend").return
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dev_suspend").return
WARNING: cannot find module dm_mod debuginfo: No DWARF information found
semantic error: no match while resolving probe point module("dm_mod").function("dm_swap_table").return
Pass 2: analysis failed.  Try again with another '--vp 01' option.
[root@east-04 ~]# rpm -qa device*
device-mapper-1.02.39-1.el5_5.2
device-mapper-multipath-0.4.7-34.el5
device-mapper-1.02.39-1.el5_5.2
device-mapper-multipath-debuginfo-0.4.7-34.el5
device-mapper-event-1.02.39-1.el5_5.2
device-mapper-debuginfo-1.02.39-1.el5_5.2

Comment 7 Ben Marzinski 2010-06-02 14:47:39 UTC

You need to have the appropriate kernel-debuginfo, kernel-debuginfo-common, kernel-headers, and kernel-devel packages.

Comment 8 Moran Goldboim 2010-06-03 05:05:34 UTC

Created attachment 419244 [details]
output

Comment 9 Moran Goldboim 2010-06-03 05:07:08 UTC

script output and /var/log/messages are attached
Script output:
======================================================
1275540843: multipath[8277] entered dev_suspend for 36090a068a074eb5824c09403000040c3 with flags 3588
1275540843: multipath[8277] entered dm_suspend
1275540843: multipath[8277] exitted dm_suspend with 0
1275540843: multipath[8277] entered dm_swap_table
1275540843: multipath[8277] exitted dm_swap_table with -22
1275540843: multipath[8277] exitted dev_suspend with -22
1275540843: multipath[8277] entered dev_suspend for 36090a068a074fbed60c0c403000010c9 with flags 3588
1275540843: multipath[8277] entered dm_suspend
1275540843: multipath[8277] exitted dm_suspend with 0
1275540843: multipath[8277] entered dm_swap_table
1275540843: multipath[8277] exitted dm_swap_table with -22
1275540843: multipath[8277] exitted dev_suspend with -22
1275540843: multipath[8277] entered dev_suspend for 36090a068a0745b5624c0640300005077 with flags 3588
1275540843: multipath[8277] entered dm_suspend
1275540843: multipath[8277] exitted dm_suspend with 0
1275540843: multipath[8277] entered dm_swap_table
1275540843: multipath[8277] exitted dm_swap_table with -22
1275540843: multipath[8277] exitted dev_suspend with -22

Comment 10 Moran Goldboim 2010-06-03 05:21:30 UTC

Created attachment 419245 [details]
Corrected-output - relevant /var/log/messages

Comment 11 Ben Marzinski 2010-06-03 21:25:06 UTC

Created attachment 421040 [details]
Updated systemtap test with more information from dm_swap_table

Comment 13 Moran Goldboim 2010-06-06 11:46:14 UTC

Created attachment 421582 [details]
script, var/log/messages output

Attached is the script+var/log/messages output.

Comment 14 Ben Marzinski 2010-06-07 18:37:32 UTC

I can see that the size of your multipath device is changing when you try to reload the table.  However, looking at the flags, it doesn't look like it was called by with

multipathd -k"resize map <name>"

Are you trying to resize the map?  If so, what are you running?

-Ben

Comment 15 Moran Goldboim 2010-06-10 08:44:31 UTC

i haven't resized the map directly and either Eyal is saying that he isn't doing such a thing, the only thing i'm doing is creating a disconnection using iptable between server and storage.

Comment 16 Ayal Baron 2010-06-10 09:10:08 UTC

No, we do not meddle with the map at all.
Ben, the server you saw wrt the other issue is set up exactly the same as Moran's server if that makes any difference to you.
Network disconnect is simulated by running iptables reject to the storage machine IP

Comment 17 Ayal Baron 2010-06-14 13:46:01 UTC

We have another host Where this happened after lun extend.  Resuming the devices doesn,t help, to go back to being SUSPENDED (although multipath reports everything is fine)

Comment 18 Ben Marzinski 2010-06-14 18:42:46 UTC

It would be really helpful if you could run the debug. package that I built.  In case you didn't grab it before, it's now available at

http://porkchop.devel.redhat.com/brewroot/scratch/bmarzins/task_2515625/

And if you could add:

  verbosity  3

To the defaults section of /etc/multipath.conf.  With that package, the verbosity option set, and the systemtap script running. If you could reproduce this issue and collect the output from

multipath -ll

and

dmsetup table

from before you hit this issue, the systemtap and /var/log/messages output from when this issue occurs, and the output from

multipath -ll

and

dmsetup table

from after the error has occured, that would be really helpful

Comment 19 Ben Marzinski 2010-06-15 20:06:40 UTC

So it appears that comment #14 was correct. The underlying path device has changed size, and multipath wasn't told to resize the device with

multipathd -k"resize map <name>"

and when the multipath command got run after the resume (I assume by vdsm) It reloaded the device, but the size was wrong, and the kernel failed during the suspend.  By an odd chance, this bug is only visible with el5, the el5.1 package shouldn't see this.  However that is because the 584343 fix happened to interact with a long unnoticed multipath bug in odd ways.

The solution is to:
1. take care of the long unnoticed multipath bug
2. make multipath do a resize instead of a regular reload when it notices that the size has changed.

Now that I understand what going on, I am able to recreate this without vdsm. This one is probably also a candidate for a zstream fix.

Comment 20 Ben Marzinski 2010-06-16 03:37:54 UTC

I have a test package that should fix this issue:

You can download it from:http://porkchop.devel.redhat.com/brewroot/scratch/bmarzins/task_2519380/

Comment 22 Moran Goldboim 2010-06-21 06:53:20 UTC

Tested the fix with the problematic scenario and the devices are functioning well.

Comment 23 Ben Marzinski 2010-06-23 19:24:31 UTC

This fix is in now.

Comment 25 Jaromir Hradilek 2010-06-29 08:54:59 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, to resize a multipath device, users had to run the command "multipathd -k'resize map <mapname>'", as running "multipath" instead caused the multipath to become stuck in the SUSPENDED state. In this state, all input/output to the device hung. Now users can resize the device with either command, and the multipath device will no longer get stuck in the SUSPENDED state.

Comment 31 Gris Ge 2010-11-10 07:35:49 UTC

With device-mapper-multipath-0.4.7-39.el5

multipathd -k'resize map <mpath_name>' will not put mpath into SUSPEND mode.

Verified resize LUN when fs is not mounted.
Verified resize LUN with fs mounted.
Also verified resize LUN which is for / partition.
Multipath resize works well.

Also reproduced the SUSPEND issue with device-mapper-multipath-0.4.7-34.el5

Comment 33 errata-xmlrpc 2011-01-13 23:03:39 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0074.html

Note You need to log in before you can comment on or make changes to this bug.

agk
a.v.aleksandrov
bdonahue
benl
bmarzins
bmr
christophe.varoqui
cmarthal
cpelland
dwysocha
fge
heinzm
iheim
junichi.nomura
kueda
lmb
mbroz
mgoldboi
prajnoha
prockai
raud
tao
ykaul