Bug 178384

Summary: Cannot activate logical volumes using physical devices discovered after clvmd start.
Product: [Retired] Red Hat Cluster Suite
Component: lvm2-cluster
Version: 4
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Reporter: Henry Harris <henry.harris>
Assignee: Alasdair Kergon <agk>
QA Contact: Cluster QE <mspqa-list>
CC: agk, ccaulfie
Doc Type: Bug Fix
Last Closed: 2006-01-27 22:30:42 UTC
Attachments: steps taken and results

Description Henry Harris 2006-01-19 22:39:06 UTC
Description of problem:  When new physical devices are discovered after clvmd
has started, these devices can be used in pvcreate and vgcreate, but lvcreate
fails with an "Internal lvm error" reported from every node in the cluster.

Version-Release number of selected component (if applicable):

Cluster LVM daemon version: 2.01.14 (2005-08-04)
Protocol version:           0.2.1

LVM version:     2.01.14 (2005-08-04)
Library version: 1.01.04 (2005-08-02)
Driver version:  4.4.0

How reproducible:
After the new devices have been added, but before clvmd is restarted, the steps
can be executed any number of times with the same results.

Steps to Reproduce:
1.  vgscan -v
2.  vgchange -a y
3.  pvcreate /dev/md16 /dev/md17 /dev/md18 /dev/md19 /dev/md20
4.  vgcreate -s 64k JoesPool /dev/md16 /dev/md17 /dev/md18 /dev/md19 /dev/md20
5.  lvcreate -L 10485760k -i 5 -I 16k -n JoesVol JoesPool
6.  lvremove /dev/JoesPool/JoesVol
7.  vgremove JoesPool
8.  pvremove /dev/md16 /dev/md17 /dev/md18 /dev/md19 /dev/md20
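
For convenience, the sequence can be wrapped in a small script; this is a
sketch only, assuming the md devices above already exist on every node and
that clvmd is running cluster-wide (-f is added to lvremove to avoid its
interactive prompt when scripted):

  #!/bin/sh
  # Run one create/remove cycle and stop at the first failing command.
  set -e
  PVS="/dev/md16 /dev/md17 /dev/md18 /dev/md19 /dev/md20"
  vgscan -v
  vgchange -a y
  pvcreate $PVS
  vgcreate -s 64k JoesPool $PVS
  lvcreate -L 10485760k -i 5 -I 16k -n JoesVol JoesPool  # fails as described below
  lvremove -f /dev/JoesPool/JoesVol
  vgremove JoesPool
  pvremove $PVS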
  
Actual results:
lvcreate produces the following:
  Error locking on node sqazero04: Internal lvm error, check syslog
  Error locking on node sqazero02: Internal lvm error, check syslog
  Error locking on node sqazero01: Internal lvm error, check syslog
  Error locking on node sqazero03: Internal lvm error, check syslog
  Failed to activate new LV.

Expected results:
lvcreate would complete successfully.

Additional info:
clvmd produces debug output such as:
Volume group for uuid not found:
Gcl4svzA7eybxdXNxfbiB3WcI4Kc2pH8qFkFyy5u0cf2p16K2DjGJ4OKa34jYBrr

The UUID for JoesPool is Gcl4sv-zA7e-ybxd-XNxf-biB3-WcI4-Kc2pH8 and the UUID
for JoesVol is qFkFyy-5u0c-f2p1-6K2D-jGJ4-OKa3-4jYBrr; the "uuid" in the
message is the two UUIDs concatenated with their dashes removed.
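
A quick way to confirm that (any POSIX shell with tr will do):

  # Strip the dashes from the VG and LV UUIDs and concatenate them; the result
  # matches the uuid in the clvmd debug output above.
  VG_UUID=Gcl4sv-zA7e-ybxd-XNxf-biB3-WcI4-Kc2pH8
  LV_UUID=qFkFyy-5u0c-f2p1-6K2D-jGJ4-OKa3-4jYBrr
  echo "${VG_UUID}${LV_UUID}" | tr -d '-'
  # Gcl4svzA7eybxdXNxfbiB3WcI4Kc2pH8qFkFyy5u0cf2p16K2DjGJ4OKa34jYBrr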

As soon as clvmd is restarted on any node, that node will no longer produce the
internal lvm error.

After clvmd has been restarted on every node, the logical volume activation
succeeds.
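
Restarting the daemon on each node can be done with its init script, e.g. the
following sketch (node names are taken from the error output above; running it
over ssh from a single host is an assumption):

  # Restart clvmd so it picks up the devices discovered after it first started.
  for node in sqazero01 sqazero02 sqazero03 sqazero04; do
      ssh $node "service clvmd restart"
  done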

Comment 1 Henry Harris 2006-01-19 22:39:06 UTC
Created attachment 123461 [details]
steps taken and results

Comment 2 Christine Caulfield 2006-01-20 09:51:13 UTC
This should have been fixed in the long-closed bug #138396.

Comment 3 Henry Harris 2006-01-23 18:57:12 UTC
Bug #138396 was believed fixed in RHBA-2005-192, but this is happening in
2.01.14, which is later.  The steps are slightly different from those in
138396: we are not restarting clvmd -- restarting clvmd makes the problem go
away.  Also, this behavior is exhibited reliably, not intermittently.

Comment 4 Alasdair Kergon 2006-01-24 15:43:30 UTC
Does 'md' mean these are software RAID devices shared between the nodes?  What
is their configuration?

If so, can you reproduce without using 'md'?

Also, this needs to be tested with the latest U3 beta packages.
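
If it helps, the md configuration could be captured with something like the
following (device names are the ones from the reproduction steps; mdadm is
assumed to be installed):

  # Overall software RAID status, plus per-array detail for the devices used.
  cat /proc/mdstat
  for dev in /dev/md16 /dev/md17 /dev/md18 /dev/md19 /dev/md20; do
      mdadm --detail $dev
  done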


Comment 5 Corey Marthaler 2006-01-27 22:30:42 UTC
This is a duplicate of bz 138396.

*** This bug has been marked as a duplicate of 138396 ***

Comment 6 Corey Marthaler 2006-02-03 23:08:27 UTC
A workaround we have tested for this issue is to stop clvmd on all the nodes
in the cluster, add the new devices, discover them on all the nodes, and then
restart clvmd.
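
In outline it looks like the following sketch (NODES is a hypothetical list of
the cluster members; the rediscovery step depends on how the new devices are
presented -- partprobe is shown because we repartitioned existing disks):

  NODES="node1 node2 node3"
  # 1) Stop clvmd on every node (deactivating an in-use VG will report a
  #    failure, as shown below, but the daemon itself stops).
  for n in $NODES; do ssh $n "service clvmd stop"; done
  # 2) Add the new devices, then have every node rediscover them.
  for n in $NODES; do ssh $n "partprobe"; done
  # 3) Start clvmd again on every node.
  for n in $NODES; do ssh $n "service clvmd start"; done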

What we actually did:
1. We had a 3 disk/PV (400Gb) GFS filesystem with active I/O running from all
the nodes.

2. Stopped clvmd:
[root@link-02 ~]# service clvmd stop
Deactivating VG link1:   Can't deactivate volume group "link1" with 1 open
logical volume(s)
[FAILED]
Deactivating VG link2: [  OK  ]
Stopping clvm:[  OK  ]

(note that the deactivation of link1 fails because of the mounted filesystem
with I/O still running)

3. Took 3 other unused disks, repartitioned them, rediscovered them on all nodes
and then restarted clvmd.

4. Created PVs out of those new partitions

5. Grew the active VG and LV

6. Grew the GFS filesystem
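
Roughly, steps 4-6 correspond to commands like the following sketch (the
partition names, the LV name "lv0", the size, and the mount point are all
hypothetical; link1 is the VG that stayed active, and gfs_grow is the GFS grow
tool on this release):

  # Step 4: turn the new partitions into PVs.
  pvcreate /dev/sdd1 /dev/sde1 /dev/sdf1
  # Step 5: extend the active VG with the new PVs, then extend the LV
  #         (size shown is illustrative only).
  vgextend link1 /dev/sdd1 /dev/sde1 /dev/sdf1
  lvextend -L +100G /dev/link1/lv0
  # Step 6: grow the mounted GFS filesystem into the new LV space.
  gfs_grow /mnt/gfs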