Bug 156110

Summary: activating restored volumes after hardware failure can fail
Product: Red Hat Enterprise Linux 4
Component: lvm2
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Corey Marthaler <cmarthal>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: Cluster QE <mspqa-list>
CC: agk, ccaulfie, cfeist, dwysocha, jbrassow, mbroz
Doc Type: Bug Fix
Last Closed: 2010-05-14 22:28:11 UTC

Description Corey Marthaler 2005-04-27 16:46:07 UTC
Description of problem:
After my MSA1000 went down and was restored with a new drive from HP, I went
through the steps to get my data back.

I had 7 PVs on 7 LUNs, all combined into 1 VG, which was then sliced into 5 LVs.
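
For reference, a layout like this would typically be built with something along
these lines (the device names, the clustered flag, and the extent counts are
illustrative, taken from the backup file further down; this is not a transcript
of what was actually run):

  # label each LUN as a PV
  pvcreate /dev/sd[a-g]1
  # combine them into one clustered VG named gfs
  vgcreate -c y gfs /dev/sd[a-g]1
  # carve the VG into 5 LVs of ~48619 extents (4MB each, ~190GB apiece)
  for i in 0 1 2 3 4; do lvcreate -l 48619 -n gfs$i gfs; done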


Here is the initial LVM status after restoring the MSA and rebooting the nodes:

[root@tank-03 tmp]# vgscan
  Reading all physical volumes.  This may take a while...
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find all physical volumes for volume group gfs.
  Volume group "gfs" not found

[root@tank-03 tmp]# pvscan
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  PV /dev/sda1        VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdb1        VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdc1        VG gfs   lvm2 [135.66 GB / 0    free]
  PV unknown device   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sde1        VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdf1        VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdg1        VG gfs   lvm2 [135.66 GB / 0    free]
  Total: 7 [949.59 GB] / in use: 7 [949.59 GB] / in no VG: 0 [0   ]

First, I created a backup of the config I had:
[root@tank-03 tmp]# vgcfgbackup -P
  Partial mode. Incomplete volume groups will be activated read-only.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Volume group "gfs" successfully backed up.
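
For context, vgcfgbackup -P writes a text copy of the (partial) VG metadata to
the backup directory, by default /etc/lvm/backup/<vgname>; that is the file the
pvcreate --restorefile step below points at. A minimal check, assuming the
default backup_dir:

  # confirm the backup file that pvcreate will reference
  ls -l /etc/lvm/backup/gfs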

Second, I created a PV out of the newly restored LUN:
[root@tank-03 backup]# pvcreate --restorefile /etc/lvm/backup/gfs --uuid
Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf /dev/sdd
  Couldn't find device with uuid 'Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf'.
  Physical volume "/dev/sdd" successfully created
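
Note that the documented recovery sequence normally follows this pvcreate with
an explicit metadata restore before any activation is attempted, roughly (same
backup file as above; vgcfgrestore is the standard companion command):

  # write the backed-up VG metadata back onto the PVs
  vgcfgrestore -f /etc/lvm/backup/gfs gfs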

[root@tank-03 backup]# pvscan
  PV /dev/sda1   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdb1   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdc1   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdd    VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sde1   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdf1   VG gfs   lvm2 [135.66 GB / 0    free]
  PV /dev/sdg1   VG gfs   lvm2 [135.66 GB / 0    free]
  Total: 7 [949.59 GB] / in use: 7 [949.59 GB] / in no VG: 0 [0   ]

[root@tank-03 backup]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "gfs" using metadata type lvm2

When I tried to activate the VG on all nodes, it failed:

[root@tank-03 backup]# vgchange -ay gfs
  Error locking on node tank-01.lab.msp.redhat.com: Internal lvm error, check syslog
  Error locking on node tank-04.lab.msp.redhat.com: Internal lvm error, check syslog
  Error locking on node tank-02.lab.msp.redhat.com: Internal lvm error, check syslog
  Error locking on node tank-05.lab.msp.redhat.com: Internal lvm error, check syslog
[...]

In the syslog, it complains about the missing UUID:

Apr 27 09:33:45 tank-03 lvm[3098]: Volume group gfs metadata is inconsistent
Apr 27 09:33:45 tank-03 lvm[3098]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSXnBb816tMA0LbdcFSOrxAuNPeRlAY9v5
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group gfs metadata is inconsistent
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSFoj7HlabxJFo8mmJeTGkV4f56mU4Phxq
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group gfs metadata is inconsistent
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSELaSVIqMEmYTVatsdQFyELqx6a3414gt
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group gfs metadata is inconsistent
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSlSU0gNZyl77aX33ECBFioGys3WpfphNL
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group gfs metadata is inconsistent
Apr 27 09:33:48 tank-03 lvm[3098]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSXnBb816tMA0LbdcFSOrxAuNPeRlAY9v5
Apr 27 09:36:48 tank-03 clvmd: Cluster LVM daemon started - connected to CMAN
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group gfs metadata is inconsistent
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSqnl7fuVgYhOxL0915LafYpzfxRtZTr8P
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group gfs metadata is inconsistent
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSFoj7HlabxJFo8mmJeTGkV4f56mU4Phxq
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group gfs metadata is inconsistent
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSELaSVIqMEmYTVatsdQFyELqx6a3414gt
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group gfs metadata is inconsistent
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSlSU0gNZyl77aX33ECBFioGys3WpfphNL
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group gfs metadata is inconsistent
Apr 27 09:37:59 tank-03 lvm[3253]: Volume group for uuid not found:
0ytQwKGjIB01ACCwicAQon4AB3tB1lMSXnBb816tMA0LbdcFSOrxAuNPeRlAY9v5

I then restarted clvmd, and after that it still failed.
I then changed the backup file a bit (max_pv = 0 -> 255), but according to agk,
the changes I made would not have affected anything.
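
(For reference, restarting clvmd on a RHEL 4 cluster typically means running
the init script on every node, e.g. on each of tank-01 through tank-05; this is
the usual procedure, not a transcript from this run:)

  service clvmd restart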

After this I tried the vgchange again, and somehow it worked:
[root@tank-02 archive]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "gfs" using metadata type lvm2
[root@tank-02 archive]# vgchange -ay gfs
  4 logical volume(s) in volume group "gfs" now active
[root@tank-02 archive]# lvscan
  inactive          '/dev/gfs/gfs0' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs1' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs2' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs3' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs4' [189.92 GB] inherit
[root@tank-02 archive]# lvscan
  inactive          '/dev/gfs/gfs0' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs1' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs2' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs3' [189.92 GB] inherit
  ACTIVE            '/dev/gfs/gfs4' [189.92 GB] inherit
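
Note that only 4 of the 5 LVs came up; gfs0 stayed inactive on this node. If a
second vgchange -ay pass does not bring it up, the usual next step would be to
activate it explicitly (hypothetical here, not something from the original run):

  lvchange -ay gfs/gfs0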

Here is the backup file as it stands now:

# Generated by LVM2: Wed Apr 27 09:29:31 2005

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing 'vgcfgbackup -P'"

creation_host = "tank-03.lab.msp.redhat.com"    # Linux
tank-03.lab.msp.redhat.com 2.6.9-prep #1 SMP Fri Apr 22 14:25:10 EDT 2005 i686
creation_time = 1114608571      # Wed Apr 27 09:29:31 2005

gfs {
        id = "0ytQwK-GjIB-01AC-Cwic-AQon-4AB3-tB1lMS"
        seqno = 6
        status = ["RESIZEABLE", "READ", "WRITE", "CLUSTERED"]
        extent_size = 8192              # 4 Megabytes
        max_lv = 255
        max_pv = 255

        physical_volumes {

                pv0 {
                        id = "2cTk9n-nXJ8-MEk7-wsuX-vWvL-r4CF-N34PGK"
                        device = "/dev/sda1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv1 {
                        id = "cDskCY-TDfD-iFJI-cXdW-g1xx-Dv0D-IXvbp0"
                        device = "/dev/sdb1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv2 {
                        id = "ctpxNZ-nWZu-AaEE-Cx28-hqtB-eKot-lIJNN8"
                        device = "/dev/sdc1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv3 {
                        id = "Xynm7y-q4us-32gx-b2Q2-523C-Fa3C-Um75Gf"
                        device = "unknown device"       # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv4 {
                        id = "onfQu4-Zsz6-DUnh-mhaZ-oZs9-pZSx-3BkNaW"
                        device = "/dev/sde1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv5 {
                        id = "vDmgPx-hZB4-b5ZW-WF7p-WHkH-3Unm-oWMv5V"
                        device = "/dev/sdf1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }

                pv6 {
                        id = "O2QJsf-H8xe-dCYT-grmc-5PFJ-g3aT-UsKR16"
                        device = "/dev/sdg1"    # Hint only

                        status = ["ALLOCATABLE"]
                        pe_start = 384
                        pe_count = 34728        # 135.656 Gigabytes
                }
        }

        logical_volumes {

                gfs0 {
                        id = "qnl7fu-VgYh-OxL0-915L-afYp-zfxR-tZTr8P"
                        status = ["READ", "WRITE", "VISIBLE"]
                        segment_count = 2

                        segment1 {
                                start_extent = 0
                                extent_count = 34728    # 135.656 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 0
                                ]
                        }
                        segment2 {
                                start_extent = 34728
                                extent_count = 13891    # 54.2617 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv1", 0
                                ]
                        }
                }

                gfs1 {
                        id = "Foj7Hl-abxJ-Fo8m-mJeT-GkV4-f56m-U4Phxq"
                        status = ["READ", "WRITE", "VISIBLE"]
                        segment_count = 2

                        segment1 {
                                start_extent = 0
                                extent_count = 20837    # 81.3945 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv1", 13891
                                ]
                        }
                        segment2 {
                                start_extent = 20837
                                extent_count = 27782    # 108.523 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv2", 0
                                ]
                        }
                }

                gfs2 {
                        id = "ELaSVI-qMEm-YTVa-tsdQ-FyEL-qx6a-3414gt"
                        status = ["READ", "WRITE", "VISIBLE"]
                        segment_count = 3

                        segment1 {
                                start_extent = 0
                                extent_count = 6946     # 27.1328 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv2", 27782
                                ]
                        }
                        segment2 {
                                start_extent = 6946
                                extent_count = 34728    # 135.656 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv3", 0
                                ]
                        }
                        segment3 {
                                start_extent = 41674
                                extent_count = 6945     # 27.1289 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv4", 0
                                ]
                        }
                }

                gfs3 {
                        id = "lSU0gN-Zyl7-7aX3-3ECB-FioG-ys3W-pfphNL"
                        status = ["READ", "WRITE", "VISIBLE"]
                        segment_count = 2

                        segment1 {
                                start_extent = 0
                                extent_count = 27783    # 108.527 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv4", 6945
                                ]
                        }
                        segment2 {
                                start_extent = 27783
                                extent_count = 20836    # 81.3906 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv5", 0
                                ]
                        }
                }

                gfs4 {
                        id = "XnBb81-6tMA-0Lbd-cFSO-rxAu-NPeR-lAY9v5"
                        status = ["READ", "WRITE", "VISIBLE"]
                        segment_count = 2

                        segment1 {
                                start_extent = 0
                                extent_count = 13892    # 54.2656 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv5", 20836
                                ]
                        }
                        segment2 {
                                start_extent = 13892
                                extent_count = 34728    # 135.656 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv6", 0
                                ]
                        }
                }
        }
}
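
As a quick cross-check of the numbers in this file: extent_size = 8192 sectors
x 512 bytes = 4 MB per extent; each PV has pe_count = 34728 extents x 4 MB
~ 135.66 GB, matching the pvscan output above; and each LV covers about 48619
extents (e.g. gfs0: 34728 + 13891) x 4 MB ~ 189.92 GB, matching lvscan.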



Version-Release number of selected component (if applicable):
[root@tank-02 archive]# clvmd -V
Cluster LVM daemon version: 2.01.09 (2005-04-04)
Protocol version:           0.2.1

Comment 1 Christine Caulfield 2005-05-03 10:59:03 UTC
Looks like a job for agk.

Comment 2 Kiersten (Kerri) Anderson 2006-09-22 16:53:20 UTC
Devel ACK.  Is this one a cluster problem or a core RHEL bug? If it is core RHEL,
we need to change the product and component fields.

Comment 3 Alasdair Kergon 2006-10-18 18:46:18 UTC
Cluster-specific, but any fix would go into core lvm2.

This simply looks like another manifestation of the 'clvmd internal cache not
getting updated' problem.
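
(For reference, later lvm2 releases added a clvmd -R option that asks all
running clvmd daemons in the cluster to refresh their internal device cache;
whether it would have covered this particular case is not confirmed here:)

  clvmd -R    # refresh device cache on all nodes; added in later lvm2 releases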

Comment 4 Milan Broz 2010-05-14 22:28:11 UTC
I think this was fixed with various changes in lvmcache code. If it is still reproducible, please reopen.