Bug 212265 - lvm'd multipath'd iscsi luns don't shutdown properly
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: iscsi-initiator-utils
Version: 4.4
Hardware: All Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Mike Christie
QA Contact: Gris Ge
Duplicates: 212266
Reported: 2006-10-25 18:05 EDT by Josh Hildebrand
Modified: 2013-01-10 21:28 EST
CC List: 26 users

Doc Type: Bug Fix
Doc Text:
Previously, devices for Device-Mapper Multipath (dm-multipath) and the logical volume manager (LVM) were not cleanly shut down when they contained iSCSI devices. With this update, the iSCSI init script now shuts down devices for dm-multipath and LVM before the iSCSI service stops.
Last Closed: 2011-02-16 09:26:39 EST


Attachments
screen shot of reboot process 1/3 (82.61 KB, image/jpeg)
2006-10-25 18:05 EDT, Josh Hildebrand
screen shot of reboot process 2/3 (117.73 KB, image/jpeg)
2006-10-25 18:14 EDT, Josh Hildebrand
screen shot of reboot process 3/3 (111.66 KB, image/jpeg)
2006-10-25 18:15 EDT, Josh Hildebrand
linux_info tarball (100.27 KB, application/octet-stream)
2006-10-26 12:26 EDT, Josh Hildebrand
script to deactivate iscsi-based lv's (3.18 KB, patch)
2007-01-16 17:30 EST, Dave Wysochanski
Patch to iscsi init script to deactivate lvs before flushing multipath maps when iscsi is stopped (446 bytes, patch)
2007-01-16 17:32 EST, Dave Wysochanski
Fix obvious minor bugs and improve performance (3.27 KB, text/plain)
2007-01-22 16:33 EST, Dave Wysochanski
Patch against iscsi init script that incorporates improvements in multipath and lvm device flushing on stopping iscsi (5.00 KB, patch)
2007-01-22 20:03 EST, Dave Wysochanski
Latest script to remove all iscsi based dm device maps (5.19 KB, text/plain)
2007-01-24 15:34 EST, Dave Wysochanski
Simple patch to "lvchange -an" any lv that is based on an iscsi map, before flushing the map on shutdown (697 bytes, patch)
2010-05-03 20:26 EDT, Dave Wysochanski
Updated lvchange patch - only issue lvchange one time (783 bytes, patch)
2010-05-04 16:16 EDT, Dave Wysochanski

Description Josh Hildebrand 2006-10-25 18:05:54 EDT
Description of problem:
When a NetApp LUN is attached via iscsi, device-mapper-multipath maps it to
/dev/dm-#, and lvm2 then uses that device as a physical volume in a volume
group with a logical volume on top, the standard "shutdown -rf" reboot
process no longer runs cleanly.

I've taken a few screenshots of the console while my machine reboots in order to
catch the errors.  But I'll do my best to retype them below.

Luckily, it doesn't HANG, but I do wonder how safe this is for the
data/filesystem on this LUN.


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.5-16.1.RHEL4
iscsi-initiator-utils-4.0.3.0-4
lvm2-2.02.06-6.0.RHEL4
kernel-smp-2.6.9-42.0.3.EL

How reproducible:
Every time

Steps to Reproduce:
1. Attach a NetApp LUN over iSCSI using dm-multipath
2. Add it to an LVM volume group
3. shutdown -rf
  
Actual results:
Stopping multipathd daemon: device-mapper: waitevent ioctl failed: Interrupted system call
....
Searching for iscsi-based multipath maps
Found 1 maps
Flushing iscsi-based multipath map, mpath60
mpath60: map in use     (I imagine because of LVM)
Stopping iscsid:            [OK]
iscsi-sfnet:host1: Session dropped
iscsi-sfnet:host2: Session dropped
iscsi-sfnet:host3: Session dropped
iscsi-sfnet:host4: Session dropped
Removing iscsi driver: ERROR: Module iscsi_sfnet is in use   [FAILED]
.....
(i may have missed something important here)
....
scsi4 (0:9): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:32.
scsi3 (0:9): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:16.
Buffer I/O error on device dm-1, logical block 13098993
Buffer I/O error on device dm-1, logical block 13098994
Buffer I/O error on device dm-1, logical block 13098995
....etc,etc...
Restarting system.

Expected results:
A good clean shutdown, w/o all the warnings.

Additional info:
I haven't tested this scenario W/O lvm2 being involved.
Comment 1 Josh Hildebrand 2006-10-25 18:05:56 EDT
Created attachment 139412
screen shot of reboot process 1/3
Comment 2 Josh Hildebrand 2006-10-25 18:14:49 EDT
Created attachment 139417
screen shot of reboot process 2/3
Comment 3 Josh Hildebrand 2006-10-25 18:15:51 EDT
Created attachment 139418
screen shot of reboot process 3/3
Comment 4 Dave Wysochanski 2006-10-26 11:45:18 EDT
At first glance, this may be the key error:
"Stopping multipathd daemon: device-mapper: waitevent ioctl failed: Interrupted 
system call"
Comment 5 Dave Wysochanski 2006-10-26 11:55:44 EDT
Scratch that last comment - this looks like a dependency issue between LVM maps
and multipath maps.  The iscsi shutdown tries to flush the multipath maps, but
there are LVM maps that depend on them, so the flush fails.

Ideally if we are going to flush the multipath maps the LVM maps need to be
flushed first.

I am not sure about all the details of the shutdown yet.  In the past, I
remember seeing these messages:
....
scsi4 (0:9): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:32.
scsi3 (0:9): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:16.
...

when using filesystem labels.

Are you using labels, and is everything unmounted properly?  If you have
Netapp's "linux_info" script you can run that and attach the tarball which will
have the /etc/fstab and stuff in it.
Comment 6 Dave Wysochanski 2006-10-26 12:08:29 EDT
Just to be clear, I think
Comment 7 Josh Hildebrand 2006-10-26 12:21:34 EDT
I am not using filesystem labels.. other than whatever LVM puts on the 
drives/luns.

My fstab is below.. the filesystem in question is the quest_vg line.. it's the 
netapp iscsi LUN.

LABEL=/                 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
none                    /dev/pts                devpts  gid=5,mode=620  0 0
none                    /dev/shm                tmpfs   defaults        0 0
none                    /proc                   proc    defaults        0 0
none                    /sys                    sysfs   defaults        0 0
LABEL=SWAP-sda2         swap                    swap    defaults        0 0
techora1.corpdom1.com:/oracle/patches /oracle/patches  nfs noauto,rsize=8192,wsize=8192,timeo=14,intr,_netdev
/dev/quest_vg/quest_lv /oracle/quest ext3 _netdev
/dev/hda                /media/cdrom            auto    pamconsole,fscontext=system_u:object_r:removable_t,exec,noauto,managed 0 0
/dev/fd0                /media/floppy           auto    pamconsole,fscontext=system_u:object_r:removable_t,exec,noauto,managed 0 0

My belief is what you said in comment #5.. a dependency between lvm maps and
multipath maps.

I did not manually unmount the filesystem before issuing the shutdown -rf.  I 
should not have to.  I would imagine it unmounts itself just fine, but the lvm 
map holds it hostage still.

How else can I assist?
Comment 8 Josh Hildebrand 2006-10-26 12:26:22 EDT
Created attachment 139480
linux_info tarball

Here is the output of linux_info.
Comment 9 Dave Wysochanski 2006-10-26 12:30:28 EDT
From what I understand of this, provided you have unmounted filesystems and
such, I don't think this shutdown sequence problem will cause data problems on
the LUN.  It is probably annoying though.

Ah - you are using labels though.  I didn't mean "using labels on the iSCSI
LUNs", I meant "using labels at all".

There is another bz related to this one:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177355
In it is an explanation for the "I/O" error messages (related to reading all
block devices to search for labels on shutdown).

One workaround suggested in that bz is to turn off iscsi and network shutdown
scripts.

In the case of using LVM with dm-mp over iSCSI, this may be the only viable
option.  You may also need to check this script:
/etc/rc6.d/S00killall

and add a line for iscsi similar to the nfs/network line:
        # Networking could be needed for NFS root.
        [ $subsys = network ] && continue
Comment 10 Dave Wysochanski 2006-10-26 12:36:38 EDT
Thanks for sending the tarball.

Actually the tarball you sent, the /etc/fstab file doesn't have any labels, and
it's different from the above.

Comment 11 Josh Hildebrand 2006-10-26 14:04:27 EDT
I'm unable to view bug 177355 (not authorized).  Why are some bug IDs not
viewable?  There are some other iscsi related ones that I can't view, either.

I've added this part to S00killall, but I can't reboot the box yet to test
it.  I don't see how that would make the LVM map unload before the multipath
map unloads, though.  I guess you are just pointing out that I'm showing signs
of two different issues here.  Thanks.


        # iscsi could be needed for other stuff
        [ $subsys = iscsi ] && continue

Regarding comment #10, I don't see a difference in the fstab I posted versus 
the one in the tarball.  Are you sure?
Comment 12 Alasdair Kergon 2006-10-26 14:14:51 EDT
If you can't view a bug, it's usually because it contains customer- or
partner-specific information.  Such bugs can sometimes be reviewed and made
public with such information removed from view.
Comment 13 Dave Wysochanski 2006-10-26 14:32:10 EDT
Sorry Josh - try viewing that bug now.  I put you on the CC list so I think you
should be able to access it.

You're right about the fstab file - I must have been looking somewhere else, my
mistake.

Did you also turn off iscsi and network shutdown scripts?  The fix we're leaning
towards right now involves not shutting the network and iSCSI down, and not
flushing any maps.  With this approach, I think there may also be an issue with
/etc/init.d/halt (where it kills everything - need to make sure iscsid doesn't
get killed).

Comment 15 Josh Hildebrand 2006-10-26 15:20:13 EDT
I did not turn off the iscsi and network shutdown scripts..  Perhaps I need 
to.  Unfortunately, my test box has suddenly become unavailable to me.

Your approach sounds more like a workaround.  Wouldn't it be better to make
lvm deactivate a volume group if it's a _netdev before it tries to shut down
multipaths?

Is a "vgchange -a n <vg>" enough to make this work, or do we need to fully
export the vg first?
Comment 16 Dave Wysochanski 2006-10-26 15:46:47 EDT
I would imagine the vgchange would work in your case since that removes the LVM
maps from the kernel.
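
As a rough illustration of that ordering (not taken from any patch attached to this bug): the LVM layer is deactivated first, and only then is the multipath map flushed. The volume group and map names are the ones from this report; the run and deactivate_then_flush helpers and the DRY_RUN variable are made up for the sketch, and the filesystem on the LV is assumed to be unmounted already.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: tear down the LVM layer before the multipath layer.
#   $1 = volume group name, $2 = multipath map name
deactivate_then_flush() {
    vg=$1
    map=$2
    # Deactivating the VG removes the LVM maps stacked on the multipath map.
    run vgchange -a n "$vg" || return 1
    # With nothing left on top of it, the multipath map can be flushed.
    run multipath -f "$map"
}
```

Running "DRY_RUN=1; deactivate_then_flush quest_vg mpath60" prints the two commands in order without touching any device.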

You may be right about the approach.  The reason we were moving towards the
"don't shut things down" solution is because of iscsi root disks - in this case
you don't want to turn things off.  

Were you thinking about doing this in the iscsi script's shutdown path, similar
to where the multipath maps are flushed?
Comment 17 Josh Hildebrand 2006-10-26 16:03:12 EDT
I haven't really dissected the iscsi shutdown paths since RHEL3 had a very 
similar lvm/iscsi shutdown issue.  So, in other words, I haven't really 
thought much about where this should be put in the path.

I understand the need to head towards an iscsi root disk path.. it still seems
like everything that can be unmounted and shut down should be, before the final
reboot.  So, perhaps have some sort of _root flag in the fstab to specify that
devices that are both _netdev and _root need special handling.. otherwise unload
all the _netdevs.
Comment 18 Dave Wysochanski 2006-10-26 16:17:29 EDT
I agree with the ideal of shutting things down cleanly and in the proper order.
I will try to come up with something for RHEL4.5.

I need to be careful of more complex setups though - I'm wondering about someone
doing snapshots, etc, over iSCSI now.  Seems like we need to flush a whole dm
tree where there are iSCSI leaf nodes.

A simpleton approach would be something like this:
- look in /etc/fstab for _netdev devices
- do "ls -lL" on the device and grab major,minor #
- figure out if there's any iSCSI dependencies on this major,minor **
  - flush all maps involving iscsi devices in the right order

** might be tricky and end up dangerous to non-iscsi maps if there's a bug in
the logic finding dependencies (need to be careful here)

It seems like something like the above would be needed even w/out multipath on
iSCSI - just LVM over iSCSI will have the same problem.
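
The first step of that list (finding the _netdev devices in /etc/fstab) could be sketched roughly as below. This is only an illustration, not one of the attached scripts: the function name netdev_block_devices is made up, and it assumes that only entries whose device field starts with /dev/ are of interest, so NFS sources and LABEL= entries are skipped. The major,minor lookup and the dependency check from the later steps are not covered.

```shell
#!/bin/sh
# Hypothetical helper: print the device field of every fstab entry that is a
# /dev/ path and carries the _netdev mount option.
#   $1 = fstab file to read (defaults to /etc/fstab)
netdev_block_devices() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        NF < 4 || $1 !~ /^\/dev\// { next }      # need an options field and a /dev/ path
        {
            # The fourth field is a comma-separated option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "_netdev") { print $1; break }
        }
    ' "${1:-/etc/fstab}"
}
```

Against the fstab posted in comment 7 this would print only /dev/quest_vg/quest_lv.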
Comment 19 Alasdair Kergon 2006-11-01 13:37:20 EST
*** Bug 212266 has been marked as a duplicate of this bug. ***
Comment 20 Dave Wysochanski 2007-01-15 19:06:51 EST
Using "dmsetup deps", I am getting closer to a possible solution, but I'm not
sure about the safety of it and will have to think through some more setups.
Comment 21 Dave Wysochanski 2007-01-16 17:30:18 EST
Created attachment 145748
script to deactivate iscsi-based lv's

Script I've been testing that is close to what is needed for this bug (see
FIXME comments).  This can be called in the iscsi shutdown path before we flush
the multipath maps.  Still not clean though and more complex setups may fail. 
I'm testing striped iscsi with a snapshot and it seems to work ok.
Comment 22 Dave Wysochanski 2007-01-16 17:32:58 EST
Created attachment 145749
Patch to iscsi init script to deactivate lvs before flushing multipath maps when iscsi is stopped
Comment 24 Dave Wysochanski 2007-01-22 16:33:04 EST
Created attachment 146233
Fix obvious minor bugs and improve performance

Make the script less brain-dead.  ;-)
Comment 25 Dave Wysochanski 2007-01-22 20:03:03 EST
Created attachment 146264
Patch against iscsi init script that incorporates improvements in multipath and lvm device flushing on stopping iscsi

Tested against netapp sim with 256 iscsi luns that were multipathed (single
path though).
Details of single test machine:
- i386 running latest lvm2, device mapper, and multipath-tools, and kernel in
brew for rhel4 u5
- local LVM root and swap partition (cciss based)
- fcp luns (HP MSA1000)
- 256 iscsi luns hosted on netapp sim

Without this patch, shutdown takes ~9.5 minutes just to flush the multipath maps.
With this patch, we check for LVM maps as well as flush multipath maps and
total time drops fairly dramatically to a little over 1 minute.
Comment 26 Dave Wysochanski 2007-01-23 11:31:39 EST
I am seeing an issue on shutdown though when targets are unreachable (shutdown
hangs).  Looks like it's mainly related to 'lvs'.

I guess xen networking bugs can actually be a good thing...
Comment 27 Dave Wysochanski 2007-01-24 11:30:00 EST
Just a design note.  The original thought about deactivating LVs that were
marked with _netdev was good but is not a complete solution.  The reason is you
could have LVs that do not have filesystems on them, so they wouldn't be in
/etc/fstab yet you'd still need to deactivate them.

The solution I'm working towards now is to find all dm maps that contain iscsi
devices at leaf nodes and during iscsi shutdown, use dmsetup to remove the maps
in the correct order.  Removal may fail if a device in the tree isn't
flushed/clean at that point (someone forgets to put an entry in fstab, or there
are outstanding I/Os, etc), but this should only cause an error message, nothing
severe (BTW, I think root on iscsi would just be a special case of this - we'll
just fail to remove the map).  In the flushed case it should all work though.  
Comment 28 Dave Wysochanski 2007-01-24 15:29:04 EST
The main problem with the above approach is how do you script it?  I have code
that determines whether a given map is iscsi based or not - it's recursive and
starts with a list of dm maps and goes down to the root node.  But ideally you
need to start with the root nodes in the dm tree and "dmsetup remove" on the way
down, not the other way around.  The current code works the other way.  I think
I know how to fix it though.
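
One possible way to get a top-down order is to turn the dependency information into "holder held" pairs and let tsort order them. The sketch below is not the attached script. It assumes the text output of "dmsetup ls" and "dmsetup deps" has been saved to two files and looks like the sample lines in the comments (the exact format differs between dmsetup versions), the function name removal_order is made up, and it only prints an order: it neither decides which maps are iscsi based nor removes anything.

```shell
#!/bin/sh
# Hypothetical helper: print dm map names so that every map comes before
# the maps it is stacked on.
#   $1 = saved "dmsetup ls" output,   e.g.  mpath60   (253, 0)
#   $2 = saved "dmsetup deps" output, e.g.
#        quest_vg-quest_lv: 1 dependencies  : (253, 0)
removal_order() {
    awk '
        # First file: remember which major:minor belongs to which map name.
        NR == FNR {
            devno = $0
            sub(/^[^(]*\(/, "", devno)       # drop everything up to "("
            gsub(/[^0-9]+/, ":", devno)      # "253, 0)" becomes "253:0:"
            sub(/:$/, "", devno)
            by_devno[devno] = $1
            next
        }
        # Second file: one line per map, listing its direct dependencies.
        {
            holder = $1
            sub(/:$/, "", holder)
            print holder, holder             # identical pair: just registers the node
            rest = $0
            sub(/^[^(]*/, "", rest)          # keep only the "(maj, min) ..." part
            n = split(rest, deps, ")")
            for (i = 1; i <= n; i++) {
                d = deps[i]
                gsub(/[^0-9]+/, ":", d)
                gsub(/^:|:$/, "", d)
                if (d in by_devno)           # dependency is itself a dm map
                    print holder, by_devno[d]
            }
        }
    ' "$1" "$2" | tsort
}
```

With an LV map stacked on mpath60, the LV map is listed first and mpath60 second, which is the order in which "dmsetup remove" would have to be called.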
Comment 29 Dave Wysochanski 2007-01-24 15:34:26 EST
Created attachment 146457
Latest script to remove all iscsi based dm device maps

This is the latest code I'm working with.  I've tried to make it a bit more
generic and cleaned up some things.  Still needs work and I'm not yet removing
the maps.
Comment 30 Dave Wysochanski 2007-01-24 18:45:50 EST
I'm having second thoughts about this latest approach.  Deactivating the dm tree
outright seems wrong.  If the LVs are still active, they need to be shut down with
the LVM tools because there could be more complicated things in the environment
(such as clustering).  Same goes for other constructs that depend on device mapper.
In one sense we don't care but it seems like the wrong design.

I went down this route because a command I was using to get the list of LVs, 'lvs',
would hang in an uninterruptible sleep if an iscsi target was not reachable.
Looks like I should probably track that down rather than abandoning the approach.
Comment 31 Dave Wysochanski 2007-02-22 09:35:37 EST
I think the 'lvs' problem may be related to the cache and the fact that lvm2
doesn't know what 'vg0' is when it's invoked (has to read the pvs to get the
metadata, so it issues I/O to all the devices listed, if some of them have
unreachable targets I/O might hang, etc).
Comment 33 Dave Wysochanski 2010-01-04 21:22:01 EST
Maybe all we need here is something in the netfs script to deactivate any LVs that are iscsi based - maybe an lvchange -an on _netdev devices.
Comment 36 Dave Wysochanski 2010-05-03 20:26:55 EDT
Created attachment 411163
Simple patch to "lvchange -an" any lv that is based on an iscsi map, before flushing the map on shutdown

This is the best approach I could come up with to solve this problem.  We need to use the LVM tools to determine whether a given multipath map is part of an LV.  The best way I could determine to do this was by building a custom 'filter' line and displaying the lvname for a given multipath map.  Then, based on a non-empty output of 'lvs', we call 'lvchange -an' on the LV before flushing the multipath maps.
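
The general shape of that idea might look like the sketch below. It is not the attached patch: the function names are made up, the map is assumed to be visible as /dev/mapper/<name>, and the installed lvm2 is assumed to accept a --config override on the command line.

```shell
#!/bin/sh
# Hypothetical helper: build an LVM "devices" section whose filter accepts
# only the given multipath map and rejects every other device.
lvm_filter_for_map() {
    printf 'devices { filter = [ "a|^/dev/mapper/%s$|", "r|.*|" ] }' "$1"
}

# Hypothetical helper: list "vg lv" pairs LVM finds when only that map is scanned.
lvs_on_map() {
    lvs --noheadings -o vg_name,lv_name \
        --config "$(lvm_filter_for_map "$1")" 2>/dev/null
}

# Hypothetical helper: deactivate every LV found on the map.
# The map itself is not flushed here.
deactivate_lvs_on_map() {
    lvs_on_map "$1" | while read -r vg lv; do
        # Empty lvs output means no LV sits on this map.
        [ -n "$lv" ] || continue
        lvchange -an "$vg/$lv"
    done
}
```

A caller would run deactivate_lvs_on_map for each iscsi-based map before flushing it. If the LV is still mounted, lvchange is expected to refuse to deactivate it.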
Comment 37 Dave Wysochanski 2010-05-03 20:42:18 EDT
Patch needs at least one more iteration to check for lvm installation, etc.
Comment 38 Dave Wysochanski 2010-05-04 13:20:58 EDT
Another approach I considered was using "lvs -o name,devices" and making multiple passes through the list for complex LVs with hidden components (e.g. mirror) or stacked LVs.  That approach is more accurate but is more complex than using the filter line.  Filter line relies on metadata being stored on the iscsi/multipath PVs though, so is not really the right approach, but is simple and should catch the most common case.

A downside of using LVM tools over dmsetup is of course IO gets issued to the devices, which may result in hangs during shutdown.  Using the filter line limits the IO to one device at a time, but really does not answer the question of the LVs on the PVs.
Comment 39 Dave Wysochanski 2010-05-04 16:16:38 EDT
Created attachment 411399
Updated lvchange patch - only issue lvchange one time
Comment 40 Dave Wysochanski 2010-10-19 10:19:24 EDT
The patch in Comment #39 should be built into rhel4.9 iscsi to fix this bug.
Comment 41 Mike Christie 2010-10-20 01:10:43 EDT
Fixed in iscsi-initiator-utils-4.0.3.0-9. You can download from here:
http://people.redhat.com/mchristi/iscsi/rhel4.9/iscsi-initiator-utils/
Comment 48 Florian Nadge 2011-01-03 09:42:06 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, devices for Device-Mapper Multipath (dm-multipath) and the logical volume manager (LVM) were not cleanly shut down when they contained iSCSI devices. With this update, the iSCSI init script now shuts down devices for dm-multipath and LVM before the iSCSI service stops.
Comment 49 Gris Ge 2011-01-18 05:11:36 EST
Got similar error with iscsi-initiator-utils-4.0.3.0-10
======================================================
INIT: Switching to runlevel: 0
INIT: Sending processes the TERM signal
Stopping cups: [  OK  ]
Shutting down xfs: [  OK  ]
Shutting down console mouse services: [  OK  ]
Stopping sshd:[  OK  ]
Shutting down sm-client: [  OK  ]
Shutting down sendmail: [  OK  ]
Shutting down smartd: [FAILED]
Stopping crond: [  OK  ]
Stopping multipathd daemon: device-mapper: waitevent ioctl failed: Interrupted system call
[  OK  ]
Shutting down kernel logger: [  OK  ]
Shutting down system logger: [  OK  ]
0:0:0:1: expected length 24, got length 20
1:0:0:1: expected length 24, got length 20
2:0:0:1: expected length 24, got length 20
3:0:0:1: expected length 24, got length 20
Searching for iscsi-based multipath maps
Found 1 maps
Deactivating LVs vg_test/lv_test  for multipath maps mpath0
  /dev/cdrom: open failed: Read-only file system
  LV vg_test/lv_test in use: not deactivating
Flushing iscsi-based multipath map, mpath0
mpath0: map in use
Stopping iscsid: [  OK  ]
Synchronizing SCSI cache for disk sda: 
iscsi-sfnet:host0: Session dropped
Synchronizing SCSI cache for disk sdb: 
iscsi-sfnet:host1: Session dropped
Synchronizing SCSI cache for disk sdc: 
iscsi-sfnet:host2: Session dropped
Synchronizing SCSI cache for disk sdd: 
iscsi-sfnet:host3: Session dropped
Removing iscsi driver: ERROR: Module iscsi_sfnet is in use
[FAILED]
Shutting down interface eth0:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
Stopping pcmcia:  unloading Kernel Card Services
[  OK  ]
Starting killall:  [  OK  ]
Sending all processes the TERM signal... 
Sending all processes the KILL signal... 
Saving random seed:  
Syncing hardware clock to system time 
Turning off swap:  
Turning off quotas:  
Unmounting file systems:  scsi1 (0:1): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:16.
scsi0 (0:1): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:0.
scsi2 (0:1): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:32.
scsi3 (0:1): rejecting I/O to dead device
device-mapper: dm-multipath: Failing path 8:48.
Buffer I/O error on device dm-3, logical block 1
lost page write due to I/O error on dm-3

Halting system...
md: stopping all md devices.
md: md0 switched to read-only mode.
iscsi-sfnet: Driver shutdown completed
Shutdown: hda
System halted.
===================================================

Mike, 
lvchange -an didn't work as the path is still mounted.
mpath and lvchange are still there after iscsi stopped.
Comment 50 Mike Christie 2011-01-19 15:58:02 EST
Do you have the FS mounted on the lv in /etc/fstab? And is it marked with the _netdev option in there? That should unmount the FS before the iscsi script runs.
Comment 51 Gris Ge 2011-01-20 22:21:20 EST
My fault,
It was in runlevel 2, which doesn't kick off K87netfs when halting.

Switched to runlevel 3 and the bug is fixed in iscsi-initiator-utils-4.0.3.0-10.

Thanks for the great work.
Comment 52 errata-xmlrpc 2011-02-16 09:26:39 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0245.html
