Bug 971765

Summary: pool-refresh fails on machine B when a new volume is created on machine A in the same logical pool
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.0
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: low
Priority: low
Target Milestone: rc
Target Release: ---
Reporter: chhu
Assignee: Libvirt Maintainers <libvirt-maint>
QA Contact: Virtualization Bugs <virt-bugs>
CC: acathrow, agk, ajia, berrange, chhu, dyuan, jdenemar, mzhan, shyu, whuang, zkabelac
Type: Bug
Doc Type: Bug Fix
Regression: ---
Last Closed: 2013-07-02 19:04:10 UTC

Description chhu 2013-06-07 08:47:41 UTC
Description of problem:
pool-refresh fails on machine B when a new volume is created on machine A in the same logical pool.

Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
kernel: 3.9.0-0.55.el7.x86_64

How reproducible:
100%

Steps:
1. On two machines A and B with the same libvirt, qemu-kvm, and kernel versions, create an iSCSI pool using the same shared storage.
# virsh pool-define test-iscsi.xml
<pool type='iscsi'>
  <name>test-iscsi</name>
......
  <source>
    <host name='****'/>
    <device path='****'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>
# virsh pool-start test-iscsi
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------  
default              active     yes 
test-iscsi           active     no
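
For reference, a complete iSCSI pool definition looks roughly like the sketch below; the host name and device path (the target IQN) are hypothetical placeholders, since the real values are masked in the snippet above:

<pool type='iscsi'>
  <name>test-iscsi</name>
  <source>
    <!-- hypothetical portal and target IQN; substitute the real, masked values -->
    <host name='iscsi.example.com'/>
    <device path='iqn.2013-06.com.example:target0'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>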

2. On machine A, create the logical pool vgc on the iSCSI storage.
# virsh pool-define vg.xml
<pool type='logical'>
  <name>vgc</name>
 .......
  <source>
    <device path='/dev/sdb1'/>
    <name>vg</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/vg</path>
.......
</pool>
# virsh pool-build vgc
# virsh pool-start vgc
Pool vgc started
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes      
test-iscsi           active     no        
vgc                  active     no
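
For reference, the elided parts of the logical pool definition above would typically reduce to the following sketch (reusing the /dev/sdb1 PV and the VG name 'vg' from this report; nothing beyond source and target is strictly required):

<pool type='logical'>
  <name>vgc</name>
  <source>
    <device path='/dev/sdb1'/>
    <name>vg</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/vg</path>
  </target>
</pool>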

3. On machine B, pool-define and start the same logical pool vgc.
# virsh pool-define vg.xml
# virsh pool-start vgc

4. On machine A, create a volume in the logical pool vgc.

# more vg-ctest.xml
<volume>
  <name>ctest</name>
  <key>/dev/sdb1</key>
  <source>
    <device path='/dev/sdb1'>
    </device>
  </source>
  <capacity unit='M'>512</capacity>
  <target>
    <path>/dev/vg/ctest</path>
  </target>
</volume>
# virsh vol-create vgc vg-ctest.xml
Vol ctest created from vg-ctest.xml
# virsh vol-list vgc
Name                 Path                                    
-----------------------------------------
ctest                /dev/vg/ctest
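
As an aside, the same volume can also be created without an XML file; a one-line equivalent (assuming the pool name and size used above) is:

# virsh vol-create-as vgc ctest 512M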


5. On machine B, vol-list the volumes of vgc: nothing is listed, pool-refresh vgc fails with an error, and the pool vgc changes to inactive.
# virsh vol-list vgc
Name                 Path                                    
-----------------------------------------

# virsh pool-refresh vgc
error: Failed to refresh pool vgc
error: cannot stat file '/dev/vg/ctest': No such file or directory

# virsh vol-list vgc
error: Failed to list volumes
error: Requested operation is not valid: storage pool 'vgc' is not active

# virsh pool-list --all|grep vgc
vgc                  inactive   no
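
A useful check on machine B at this point (a suggested diagnostic, not part of the original report) is to confirm that the LV exists in the shared VG metadata even though its device node is missing:

# ls -l /dev/vg/ctest   # the path libvirt failed to stat; absent on machine B
# lvs vg                # yet ctest is listed in the VG metadata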

6. On machine B, pool-start the logical pool vgc again; vol-list now returns the volume list.
# virsh pool-start vgc
Pool vgc started

# virsh vol-list vgc
Name                 Path                                    
-----------------------------------------
ctest                /dev/vg/ctest


Actual results:
In step 5, on machine B, vol-list returns no volumes, pool-refresh vgc fails with an error, and the pool vgc changes to inactive.

Expected results:
In step 5, pool-refresh completes without error and vol-list then returns the volume list.

Comment 2 chhu 2013-06-07 10:57:09 UTC
(In reply to chhu from comment #0)
> Version-Release number of selected component (if applicable):
> libvirt-1.0.5-2.el7.x86_64
> qemu-kvm-1.4.0-4.el7.x86_64
> kernel: 3.9.0-0.55.el7.x86_64
> [...]

Retested with newer packages:
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.4.0-4.el7.x86_64

In step 5, on machine B, vol-list does not list ctest.
# virsh vol-list vgc
Name                 Path                                    
-----------------------------------------

# lvs
  LV    VG   Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  home  rhel -wi-ao--- 309.42g                                           
  root  rhel -wi-ao---  50.00g                                           
  swap  rhel -wi-ao---   7.70g                                           
  ctest vg   -wi------ 512.00m                                           

# virsh pool-refresh vgc
Pool vgc refreshed

# virsh vol-list vgc
Name                 Path                                    
-----------------------------------------

Actual results:
In step 5, on machine B, pool-refresh + vol-list does not list the new volume ctest, even though it is shown by lvs.
 
Expected results:
In step 5, on machine B, pool-refresh + vol-list lists the new volume ctest.

Comment 3 Zdenek Kabelac 2013-06-24 14:09:32 UTC
This looks like a misuse of LVM.

To share the same PV between multiple hosts you need a locking mechanism between the nodes, so that metadata access is properly serialized.

It's not quite clear how the virsh terminology maps onto LVM terminology (we use 'pools' for e.g. thin provisioning, while here it seems to have a completely different meaning).

So does machine B see the same 'PV' as machine A?

Is there actually any locking mechanism?

Comment 4 Daniel Berrangé 2013-06-24 14:13:39 UTC
Libvirt does not do any locking of LVM itself. It is the administrator's responsibility to have configured cluster-LVM, or equivalent protection, if they want to manage the storage using libvirt on multiple hosts at once.
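
For completeness, a rough sketch of what "cluster-LVM" involves is below; this is an assumption-laden outline rather than a tested recipe, and on RHEL 7 clvmd is normally run as a resource of the cluster stack (corosync/pacemaker) rather than started by hand:

# lvmconf --enable-cluster   # sets locking_type = 3 (clustered locking) in /etc/lvm/lvm.conf
# vgchange -cy vg            # mark the shared VG as clustered

With clvmd running on every host that shares the VG, LV activation and metadata changes are coordinated across nodes; without it, two libvirt daemons manipulating the same VG is exactly the fragile situation described in comment 3.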

Comment 5 chhu 2013-06-27 06:34:22 UTC
(In reply to Zdenek Kabelac from comment #3)
> So does machine B see the same 'PV' as machine A?

Yes, machine B sees the same 'PV' as machine A.

1. After the logical pool is created in step 2 on machine A, the "PV" can be seen on both machines A and B.
On A:
# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda2  rhel lvm2 a--   87.89g      0
  /dev/sda3       lvm2 a--  100.00g 100.00g
  /dev/sdb1  vg   lvm2 a--   30.00g  30.00g

On B:
# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda5  rhel lvm2 a--  367.12g     0
  /dev/sdb1  vg   lvm2 a--   30.00g 30.00g

2. Then, in step 3, on machine B, pool-define and start the same logical pool. After step 3, on machine B, the "PV" is still /dev/sdb1.
# pvs
  PV         VG   Fmt  Attr PSize   PFree 
  /dev/sda5  rhel lvm2 a--  367.12g     0 
  /dev/sdb1  vg   lvm2 a--   30.00g 30.00g

3. Then, in step 4, on machine A, create a volume in the logical pool vg.
# virsh vol-list vg
Name                 Path                                    
-----------------------------------------
ctest                /dev/vg/ctest

# pvs
  PV         VG   Fmt  Attr PSize   PFree  
  /dev/sda2  rhel lvm2 a--   87.89g      0 
  /dev/sda3       lvm2 a--  100.00g 100.00g
  /dev/sdb1  vg   lvm2 a--   30.00g  26.09g
# lvs
  LV    VG   Attr      LSize  Pool Origin Data%  Move Log Copy%  Convert
  root  rhel -wi-a---- 78.12g                                           
  swap  rhel -wi-ao---  9.77g                                           
  ctest vg   -wi-a----  3.91g

4. In step 5, on machine B, vol-list does not list ctest.
# virsh pool-refresh vg
Pool vg refreshed

# virsh vol-list vg
Name                 Path                                    
-----------------------------------------

# pvs
  PV         VG   Fmt  Attr PSize   PFree 
  /dev/sda5  rhel lvm2 a--  367.12g     0 
  /dev/sdb1  vg   lvm2 a--   30.00g 26.09g

# lvs
  LV    VG   Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  home  rhel -wi-ao--- 309.42g                                           
  root  rhel -wi-ao---  50.00g                                           
  swap  rhel -wi-ao---   7.70g                                           
  ctest vg   -wi------   3.91g

Steps 2 and 4 may be done by person A, and steps 3 and 5 by person B. The question is: on machines A and B, lvs and pvs show the same picture (the LV ctest is listed), but virsh vol-list shows different pictures, one with the LV ctest and one without it. Is that acceptable?
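
One plausible explanation (an inference from the lv_attr column above, not confirmed libvirt behaviour): on machine B the LV ctest exists in the shared VG metadata but is not activated there (the fifth lv_attr character is '-' in "-wi------" on B versus 'a' in "-wi-a----" on A), so there is no /dev/vg/ctest device node and the logical pool does not report the volume. Manually activating the LV, or restarting the pool (which activates the whole VG), should make it appear, e.g.:

# lvs -o lv_name,vg_name,lv_attr vg   # check the activation bit on this host
# lvchange -ay vg/ctest               # activate just this LV, creating /dev/vg/ctest
# virsh pool-refresh vg
# virsh vol-list vg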


> Is there actually any locking mechanism?

Comment 6 Jiri Denemark 2013-07-02 19:04:10 UTC
This use case is unsupported and very fragile unless cluster LVM is properly configured and clvmd is running on all hosts.