Description of problem:
I'm seeing the following message while doing looping mirror creation/deletions on one node in the cluster and while also running lvs on another node in the cluster:

  Volume group "vg" inconsistent

Version-Release number of selected component (if applicable):
[root@link-07 ~]# rpm -q lvm2
lvm2-2.02.13-1
[root@link-07 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.13-1
[root@link-07 ~]# rpm -q cmirror-kernel
cmirror-kernel-2.6.9-13.0
[root@link-07 ~]# rpm -q device-mapper
device-mapper-1.02.12-3
This can happen in a cluster when using clvmd, but operating on a NON-CLUSTERED vg. IOW, single machine mirroring is employed, but clvmd is the method of activation. Will try just single machine next.
Haven't seen it when bypassing clvmd.
Insufficient info here about what was done, but if VG 'vg' is non-clustered, it should not be accessible across a cluster. If it's visible on more than one machine and not marked CLUSTERED then expect problems.
I've isolated the code to here:

lib/metadata/metadata.c:_vg_read

		/* FIXME Also ensure contents same - checksum compare? */
		if (correct_vg->seqno != vg->seqno) {
			inconsistent = 1;
			if (vg->seqno > correct_vg->seqno)
				correct_vg = vg;
		}

I don't know why the sequence numbers would be different if we are holding a lock... we are, aren't we?
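To make the failure mode concrete, here is a minimal, self-contained sketch of that seqno comparison (the struct, helper, and device names are invented for illustration; this is not the actual _vg_read code). If a writer on another node commits new metadata in between the reads of two PVs, the per-PV copies carry different seqnos and the VG gets flagged inconsistent:

#include <stdio.h>

/* Hypothetical stand-ins for the metadata copy read from each PV. */
struct vg_copy {
	const char *pv_name;
	unsigned seqno;
};

/* Pick the newest copy, flagging the VG inconsistent if any two PVs
 * disagree -- the same logic as the check quoted above. */
static struct vg_copy *pick_correct(struct vg_copy *copies, int n, int *inconsistent)
{
	struct vg_copy *correct_vg = &copies[0];

	for (int i = 1; i < n; i++) {
		struct vg_copy *vg = &copies[i];
		if (correct_vg->seqno != vg->seqno) {
			*inconsistent = 1;
			if (vg->seqno > correct_vg->seqno)
				correct_vg = vg;	/* prefer the newer metadata */
		}
	}
	return correct_vg;
}

int main(void)
{
	/* If the reader's VG lock does not actually exclude the writer,
	 * the writer can commit seqno 806 after the reader has already
	 * read seqno 805 from the first PV. */
	struct vg_copy copies[] = {
		{ "/dev/sdb1", 805 },	/* read before the writer committed */
		{ "/dev/sdc1", 806 },	/* read after the writer committed  */
	};
	int inconsistent = 0;

	struct vg_copy *correct = pick_correct(copies, 2, &inconsistent);
	if (inconsistent)
		printf("Volume group \"vg\" inconsistent (keeping seqno %u from %s)\n",
		       correct->seqno, correct->pv_name);
	return 0;
}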
Adding a simple print, we get:

correct_vg->seqno(805) != vg->seqno(806)
  Volume group "vg" inconsistent
Created attachment 142406
Output of a failed 'lvs'

This 'lvs -vvvv' gives the inconsistent error.
Created attachment 142418
Metadata at seqno 3315
Created attachment 142419
Metadata at seqno 3316

These are the two metadata sets that were committed; nothing seems wrong with them.
I took out the partitions and used the whole device for the PVs. I still get the "inconsistent" message, but it prints things out properly. Now I can't remember if this is equivalent to what I was seeing before...

  Volume group "vg" inconsistent
  LV       VG         Attr   LSize   Origin Snap%  Move Log     Copy%
  LogVol00 VolGroup00 -wi-ao  36.62G
  LogVol01 VolGroup00 -wi-ao 512.00M
  lv       vg         mwi-a-   5.00G                    lv_mlog   1.80

In any case, it is still present.
Here are the locking operations made during 'create/change/delete':

CREATING:
Performing lock operation on V_vg: LCK_WRITE/VG (0x4) [3288]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYNpgq8hAq2sB2Y0ycqavEyuEFPy8KRAeU: LCK_READ/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x99) [3288]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYNpgq8hAq2sB2Y0ycqavEyuEFPy8KRAeU: <UNKNOWN>/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x98) [3288]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYNpgq8hAq2sB2Y0ycqavEyuEFPy8KRAeU: LCK_UNLOCK/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x9e) [3288]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: LCK_READ/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x99) [3288]
  Logical volume "lv" created
Performing lock operation on V_vg: LCK_UNLOCK/VG (0x6) [3288]

CHANGING:
Performing lock operation on V_vg: LCK_WRITE/VG (0x4) [3359]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: <UNKNOWN>/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x98) [3359]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: LCK_UNLOCK/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x9e) [3359]
Performing lock operation on V_vg: LCK_UNLOCK/VG (0x6) [3359]

REMOVING:
Performing lock operation on V_vg: LCK_WRITE/VG (0x4) [3405]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: LCK_EXCL/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x9d) [3405]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: <UNKNOWN>/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x98) [3405]
Performing lock operation on vvi2oi0LlyYIrYI0JdO4CR8otxQBN6oYMxodct0b1xUOEUKIzvaP4jCUiskjt8gx: LCK_UNLOCK/LV/LCK_NONBLOCK/LCK_CLUSTER_VG (0x9e) [3405]
  Logical volume "lv" successfully removed
Performing lock operation on V_vg: LCK_UNLOCK/VG (0x6) [3405]

And here are the lock ops for an 'lvs':

#cluster_locking.c:435   Performing lock operation on V_VolGroup00: LCK_READ/VG (0x1) [31631]
#cluster_locking.c:435   Performing lock operation on V_VolGroup00: LCK_UNLOCK/VG (0x6) [31631]
#cluster_locking.c:435   Performing lock operation on V_vg: LCK_READ/VG (0x1) [31631]
#cluster_locking.c:435   Performing lock operation on V_vg: LCK_UNLOCK/VG (0x6) [31631]
*** #toollib.c:348   Volume group "vg" inconsistent ***
#cluster_locking.c:435   Performing lock operation on V_vg: LCK_WRITE/VG (0x4) [31631]
#cluster_locking.c:435   Performing lock operation on V_vg: LCK_UNLOCK/VG (0x6) [31631]
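To tie that 'lvs' trace back to a code path, here is a hedged sketch of the lock ordering it shows (the function names below are invented stand-ins, not the real toollib/cluster_locking calls): a short-lived read lock around the metadata read, then, once the per-PV copies are found to disagree, a second pass under a write lock, presumably so the metadata can be repaired.

#include <stdbool.h>
#include <stdio.h>

/* Invented stand-ins for clvmd locking and the VG metadata read. */
static void lock_vol(const char *resource, const char *mode)
{
	printf("Performing lock operation on %s: %s/VG\n", resource, mode);
}

static bool vg_read_consistent(const char *vgname)
{
	(void)vgname;
	/* In the failing runs the PVs return different seqnos (805 vs 806),
	 * so the read comes back inconsistent. */
	return false;
}

int main(void)
{
	const char *vgname = "vg";
	bool consistent;

	/* 1. Read pass: the VG read lock is held only while the metadata
	 *    is read, matching the LCK_READ/LCK_UNLOCK pair above. */
	lock_vol("V_vg", "LCK_READ");		/* (0x1) */
	consistent = vg_read_consistent(vgname);
	lock_vol("V_vg", "LCK_UNLOCK");		/* (0x6) */

	/* 2. If the copies disagreed, report it and go around again under
	 *    a write lock, matching the LCK_WRITE/LCK_UNLOCK pair above. */
	if (!consistent) {
		printf("  Volume group \"%s\" inconsistent\n", vgname);
		lock_vol("V_vg", "LCK_WRITE");	/* (0x4) */
		/* ...re-read and write back consistent metadata here... */
		lock_vol("V_vg", "LCK_UNLOCK");	/* (0x6) */
	}
	return 0;
}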
It makes no difference if O_DIRECT_SUPPORT is not defined (O_DIRECT unused); the problem still occurs.
Interesting... I run lvs in ddd (aka gdb) and simply stop right after I acquire the VG read lock. This should have the effect of stopping the create/change/delete loop on the other machine, but it doesn't... it just keeps going.
Locking problem in LVM: the VG locks should have been PR (protected read), not CR (concurrent read).
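For background, below is a minimal sketch of the standard DLM lock-mode compatibility matrix. This is generic DLM behaviour, not clvmd code, and the choice of PW for the writer's lock is an assumption made here for illustration. The point it shows: CR does not conflict with PW, so a reader holding the VG lock in CR cannot keep the other node from committing new metadata mid-read, whereas PR does conflict with PW and would serialise reader and writer. That would also explain why holding the read lock in gdb did not stop the create/change/delete loop on the other node.

#include <stdbool.h>
#include <stdio.h>

/* Classic VMS/DLM lock modes as used by the cluster lock manager. */
enum dlm_mode { NL, CR, CW, PR, PW, EX, NMODES };

static const char *mode_name[NMODES] = { "NL", "CR", "CW", "PR", "PW", "EX" };

/* Standard DLM compatibility matrix: compat[held][requested]. */
static const bool compat[NMODES][NMODES] = {
	/*           NL     CR     CW     PR     PW     EX   */
	/* NL */ { true,  true,  true,  true,  true,  true  },
	/* CR */ { true,  true,  true,  true,  true,  false },
	/* CW */ { true,  true,  true,  false, false, false },
	/* PR */ { true,  true,  false, true,  false, false },
	/* PW */ { true,  true,  false, false, false, false },
	/* EX */ { true,  false, false, false, false, false },
};

int main(void)
{
	/* A reader holding CR does not block a writer requesting PW, so
	 * the metadata can change under the read; a PR read lock would
	 * conflict with PW and serialise them. */
	printf("writer PW granted while reader holds %s? %s\n",
	       mode_name[CR], compat[CR][PW] ? "yes (race possible)" : "no");
	printf("writer PW granted while reader holds %s? %s\n",
	       mode_name[PR], compat[PR][PW] ? "yes (race possible)" : "no");
	return 0;
}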
Have not seen inconsistent errors while executing the test case in comment #0. Marking verified.
Fixed in current release (4.7).