Bug 268581

Summary: vgchange command fails to see volume group after installation of device-mapper-multipath-0.4.5-21.0.1.RHEL4.x86_64.rpm
Product: Red Hat Enterprise Linux 4
Reporter: bret goodfellow <bret.goodfellow>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Corey Marthaler <cmarthal>
Severity: high
Priority: medium
Version: 4.5
CC: agk, bmarzins, christophe.varoqui, dwysocha, egoggin, junichi.nomura, kueda, lmb, mbroz, prockai, tranlan
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-01-12 21:10:55 UTC
Attachments:
Description                               Flags
multipath.conf                            none
output from command: multipath -v6 -ll    none

Description bret goodfellow 2007-08-30 19:22:28 UTC
Description of problem:  
Volume Group fails to come Active


Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.5-21.0.1.RHEL4

How reproducible:
every time

Steps to Reproduce:
I have a Disaster Recovery system that has 2 identical groups of luns.  When the
D/R server is required for a test (or a real disaster), I need to bring it up
(pointing to one set of luns) for a particular volume group.  I will call them
lun-group A (16 25GB luns) and lun-group B (16 25GB luns).  Keep in mind that
they are absolutely identical, and therefore can not both be active on the
system at the same time. Below is the process I have followed to bring up either
lun-group A or lun-group B:

1) determine which group of luns should come up active on the D/R server.  In
this example I have decided to activate lun-group B (and therefore remove
lun-group A).
2) issue a "dd" command to wipe out the alternate lun-group (lun-group A) e.g.
  dd if=/dev/zero of=/dev/mpath/mpath6 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath7 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath8 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath9 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath10 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath11 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath12 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath13 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath14 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath15 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath16 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath17 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath18 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath19 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath20 bs=1000000000 count=1
  dd if=/dev/zero of=/dev/mpath/mpath21 bs=1000000000 count=1
3) run "pvscan"
4) run "vgscan"
5) run "lvscan"
6) run "vgchange" to activate volume group.  e.g.
  vgchange -ay /dev/nsc01
  THIS IS THE PROBLEM!!  I get the following error:
  volume group nsc01 not found
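
The sixteen dd commands in step 2 follow one pattern, so the list can be generated instead of typed by hand. A minimal sketch, assuming the mpath6 through mpath21 numbering shown above; the function name print_wipe_commands is hypothetical, and it only prints the commands for review, it does not run them:

```shell
#!/bin/sh
# Hypothetical helper: print (do NOT execute) the dd command for each
# multipath device numbered $1 through $2, so the list can be checked
# before anything is overwritten.
print_wipe_commands() {
    first=$1
    last=$2
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "dd if=/dev/zero of=/dev/mpath/mpath$i bs=1000000000 count=1"
        i=$((i + 1))
    done
}

# lun-group A in this report is mpath6 through mpath21 (16 devices).
print_wipe_commands 6 21
```

Printing first is deliberate: the mpathN names depend on the local multipath bindings, so the output should be compared against the devices that really belong to the lun-group being removed.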

  
Actual results:
After running the vgchange command (vgchange -ay /dev/nsc01) I got the following
message:

volume group nsc01 not found



Expected results:
the volume group should have come active


Additional info:
I forced the old device-mapper-multipath package on:

rpm -ivh device-mapper-multipath-0.4.5-21.RHEL4 --oldpackage --force

I then removed the latest device-mapper-multipath:

rpm -e device-mapper-multipath-0.4.5-21.0.1.RHEL4

THE vgchange NOW WORKS!  I believe that
device-mapper-multipath-0.4.5-21.0.1.RHEL4 doesn't allow me to look at my second
group of luns.  In effect, I have lost my second group of critical disk, and the
only way to see it is to remove the latest device mapper update and re-install
device-mapper-multipath-0.4.5-21.RHEL4.

Comment 1 bret goodfellow 2007-08-30 19:24:54 UTC
My target SAN (san attached to the D/R server) is a Hitachi 9570.

Comment 2 Ben Marzinski 2007-10-29 19:14:02 UTC
The only changes between device-mapper-multipath-0.4.5-21.RHEL4 and
device-mapper-multipath-0.4.5-21.0.1.RHEL4 were to the default configurations
for devices. There was a change to some HITACHI device default configurations,
which I assume is what you are seeing. The bugzilla that caused the change is bz
#240075. You can verify that this affects your device by looking at the files:

# cat /sys/block/<device_node>/device/vendor

and

# cat /sys/block/<device_node>/device/model

Where <device_node> is the device node name of one of the devices served up by
your Hitachi 9570 (e.g. sda). If the vendor is "HITACHI" and the model starts
with the letters "DF", then this will affect you.
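
The vendor and model check above can be scripted. A minimal sketch, assuming the sysfs layout given in the two cat commands; the function name is_hds_modular is hypothetical, and the match is done on prefixes because sysfs may pad these strings with trailing spaces:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device directory reports vendor
# "HITACHI" and a model that starts with "DF".
# $1 is a device directory such as /sys/block/sda/device.
is_hds_modular() {
    dev_dir=$1
    # Fail if either attribute file cannot be read.
    vendor=$(cat "$dev_dir/vendor" 2>/dev/null) || return 1
    model=$(cat "$dev_dir/model" 2>/dev/null) || return 1
    # Prefix match on the vendor tolerates trailing padding.
    case $vendor in
        HITACHI*) ;;
        *) return 1 ;;
    esac
    # The model only needs to begin with "DF" (for example DF600F).
    case $model in
        DF*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example, `is_hds_modular /sys/block/sda/device && echo "affected"` would print "affected" only for a device that matches both conditions.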

However the configuration change simply changes the priority callout program
from a typo to the correct program name. Previously, the priority callout
program was listed as /sbin/mpath/prio_hds_modular, which doesn't exist. It was
changed to /sbin/mpath_prio_hds_modular, which does exist.

You should check that /sbin/mpath_prio_hds_modular does actually exist on your
system.
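
One way to make that check, sketched with a hypothetical helper named check_callout. It tests both the old (mistyped) path and the corrected path mentioned above, and assumes only that the callout should be an executable file:

```shell
#!/bin/sh
# Hypothetical helper: report whether a priority callout program is
# present and executable. $1 is the path to check.
check_callout() {
    if [ -x "$1" ]; then
        echo "ok: $1 exists and is executable"
    else
        echo "missing: $1 not found or not executable"
        return 1
    fi
}

# Corrected path used by the newer package's default configuration.
check_callout /sbin/mpath_prio_hds_modular
# Old, mistyped path from the earlier default configuration.
check_callout /sbin/mpath/prio_hds_modular
```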

If you are still seeing this problem, could you please run the commands

# multipath
# multipath -ll

with each package installed, and post the results for both in this bugzilla.

Comment 3 bret goodfellow 2007-10-31 14:22:34 UTC
Hello there,

I checked out the files "vendor" and "model".  They have the following in them:

vendor:
HITACHI

model:
DF600F

The system doesn't have the patch on any more, since it corrupts our Disaster 
Recovery configuration.  Not sure where to go from here.  We have decided not 
to apply the multipath patch to any of our systems at this time until a 
resolution is provided. 



Comment 4 Ben Marzinski 2007-10-31 17:33:00 UTC
Can you check that /sbin/mpath_prio_hds_modular does exist on your system? Also,
can you please send me a copy of

/etc/multipath.conf

and the output of running

# multipath -v6 -ll

Comment 5 bret goodfellow 2007-10-31 17:45:29 UTC
Created attachment 244791
multipath.conf

Comment 6 bret goodfellow 2007-10-31 17:46:54 UTC
Created attachment 244801
output from command: multipath -v6 -ll

Comment 7 bret goodfellow 2007-10-31 17:47:58 UTC
Yes, the file "/sbin/mpath_prio_hds_modular" does exist.