Bug 1247521

Summary: RFE: libvirt: support multiple volume hosts for gluster volumes
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.1
Reporter: Ala Hino <ahino>
Assignee: Peter Krempa <pkrempa>
QA Contact: Virtualization Bugs <virt-bugs>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: FutureFeature
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-2.0.0-4.el7
Doc Type: Enhancement
Type: Bug
Last Closed: 2016-11-03 18:20:30 UTC
CC: amureini, bao, dyuan, jinzhao, jsuchane, jtomko, lmen, mzhan, ndevos, pkrempa, prasanna.kalever, pzhang, rbalakri, rs, sabose, sankarshan, sherold, smohan, ssaha, v.astafiev, xuzhang
Clones: 1247933 (view as bug list)
Bug Depends On: 1247933
Bug Blocks: 1022961, 1175800, 1277939, 1288337, 1298558, 1305606, 1313485, 1322852

Description Ala Hino 2015-07-28 08:42:29 UTC
Description of problem:
Sending multiple hosts to libvirt fails with the following error:
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 731, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1902, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: internal error: Expected exactly 1 host for the gluster volume

Version-Release number of selected component (if applicable):
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-client-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a gluster disk of type "network" and provide more than one host in the source element.

Actual results:
libvirtError: internal error: Expected exactly 1 host for the gluster volume

Expected results:
Shouldn't fail

Additional info:
We have to send three gluster hosts because we work with replica 3 volumes.

disk xml sent to libvirt:
<disk device="disk" snapshot="no" type="network">
    <source name="vol4/68df3c0d-f58d-44fa-8cc7-ade9fd0a85da/images/b91d13c7-758c-4fbd-9408-220d1d5c65fb/b6050647-1ac3-47bf-a892-ea90a8519e6f" protocol="gluster">
        <host name="gluster01" port="0" transport="tcp"/>
        <host name="gluster02" port="0" transport="tcp"/>
        <host name="gluster03" port="0" transport="tcp"/>
    </source>
    <target bus="virtio" dev="vda"/>
    <serial>b91d13c7-758c-4fbd-9408-220d1d5c65fb</serial>
    <boot order="2"/>
    <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>
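
For reference, a minimal sketch of the call path vdsm takes when starting the domain (assuming the disk XML above is embedded in a complete domain definition saved as domain.xml; that file name is hypothetical):

import libvirt

# Connect to the local QEMU/KVM hypervisor, as vdsm does.
conn = libvirt.open('qemu:///system')

with open('domain.xml') as f:
    domxml = f.read()

# On libvirt builds without multi-host gluster support this raises:
# libvirtError: internal error: Expected exactly 1 host for the gluster volume
dom = conn.createXML(domxml, 0)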

Comment 1 Ján Tomko 2015-07-28 08:56:28 UTC
libvirt-daemon-1.2.8-16.el7_1.3.x86_64 is not an upstream release, moving to the downstream tracker.

Comment 3 Peter Krempa 2015-07-29 09:28:20 UTC
We are lacking qemu support. I've filed 1247933 to track the qemu addition.

Comment 4 Prasanna Kumar Kalever 2015-09-10 10:08:12 UTC
Patches sent to QEMU (under review):

[Qemu-devel] [PATCH 1/1] block/gluster: add support for multiple gluster
https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg02016.html

[Qemu-devel] [PATCH v2 1/1] block/gluster: add support for multiple glus
https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg02437.html

Comment 5 Prasanna Kumar Kalever 2015-10-05 10:51:36 UTC
Just sent the patch to libvirt:
https://www.redhat.com/archives/libvir-list/2015-October/msg00106.html

Comment 7 Karen Noel 2016-02-07 13:35:17 UTC
Progress upstream?

Comment 8 Peter Krempa 2016-02-08 07:37:10 UTC
Still waiting on qemu upstream to deal with some design issues:

http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg01328.html

Comment 10 Andrejs Baulins 2016-03-08 12:54:21 UTC
Can someone please explain why support for multiple volume hosts became a blocker for libgfapi, when relying on a single FUSE mount point was never considered a blocker?

Is it because some network failover solution was assumed to sit between the FUSE mount point and the qemu process? But such failover was always a workaround, wasn't it?

Why can't the current blocker be worked around the same way?
If "localhost:volume" is the most natural configuration, then the case where gluster dies on localhost while all other processes stay intact is quite rare, I believe.

So the feature could already be released, with a caution.
Or am I wrong?

Comment 11 Andrejs Baulins 2016-03-08 13:00:37 UTC
Sorry for the mistakes and lexical duplicates :/

Comment 12 Niels de Vos 2016-03-12 16:06:21 UTC
(In reply to Andrejs Baulins from comment #10)

From a Gluster point of view, you are not wrong. I do not see a strong requirement to have multiple hosts listed in the libvirt xml. Once one server (could be "localhost" running GlusterD) is connected with qemu/libgfapi, the Gluster client (qemu+gfapi) is aware of all the bricks participating in the volume, and a connection is made to all of those bricks. The host in the libvirt xml is only used for the initial retrieving of the layout of the Gluster volume.
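
For illustration, a minimal single-host disk definition along those lines (the volume name and image path here are hypothetical; 24007 is the default glusterd port) might look like:

<disk type='network' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source protocol='gluster' name='myvol/images/vm1.img'>
        <host name='localhost' port='24007' transport='tcp'/>
    </source>
    <target dev='vda' bus='virtio'/>
</disk>

qemu+gfapi then learns about all bricks of "myvol" from that single host and connects to them directly.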

Comment 13 Ralf Schenk 2016-06-27 09:11:26 UTC
(In reply to Niels de Vos from comment #12)
> (In reply to Andrejs Baulins from comment #10)
> 
> From a Gluster point of view, you are not wrong. I do not see a strong
> requirement to have multiple hosts listed in the libvirt xml. Once one
> server (could be "localhost" running GlusterD) is connected with
> qemu/libgfapi, the Gluster client (qemu+gfapi) is aware of all the bricks
> participating in the volume, and a connection is made to all of those
> bricks. The host in the libvirt xml is only used for the initial retrieving
> of the layout of the Gluster volume.

I'm also really waiting for a way to use the native qemu+gfapi way of mounting my VMs.
I went a long way and switched from OpenNebula (which supports native gluster mounts!) to oVirt, only to find that my VM XMLs still use the FUSE way.
I'm really disappointed and would also use a workaround to be able to say "yes, I want gfapi". Since I started with gluster I have used a round-robin DNS entry for my gluster volume servers, including all "entry points" to my gluster, so I would be OK with specifying only _one_ gluster host.

Now oVirt 4.0 is out and still nothing new?

Comment 16 lijuan men 2016-09-01 09:02:10 UTC
There are 2 scenarios I am confused about.

version:
1) test host:
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64

2) glusterfs server:
glusterfs-server-3.7.9-12.el7rhgs.x86_64


scenario 1:
1. Prepare the glusterfs environment on the glusterfs servers:
[root@localhost ~]# gluster volume status
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.66.4.213:/opt/test1                49154     0          Y       2545 
Brick 10.66.4.105:/opt/test2                49152     0          Y       24386
Brick 10.66.4.148:/opt/test3                49153     0          Y       20994

2. On the test host (10.66.70.107):
[root@localhost ~]# qemu-img create -f qcow2 gluster://10.66.4.213/test/test1.img 200M

3. Start a guest with this disk XML:
<driver name='qemu' type='qcow2' cache='none'/>
      <source protocol='gluster' name='test/test1.img'>
        <host name='10.66.4.2'/>  ---> does not exist
        <host name='10.66.4.105'/>    --->one of the above glusterfs servers
        <host name='10.66.4.148'/>    --->one of the above glusterfs servers
      </source>
      <target dev='vda' bus='virtio'/>
      <boot order='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

[root@localhost ~]# virsh start bios
error: Failed to start domain bios
error: failed to initialize gluster connection (src=0x7f1be48b7130 priv=0x7f1bec254f10): Transport endpoint is not connected

My question:
If the first host can't be reached, shouldn't the second host be tried? Can we add that functionality?



scenario 2:
1. Shut down the first glusterfs server (10.66.4.213).
2. Start the guest with this disk XML:
<driver name='qemu' type='qcow2' cache='none'/>
      <source protocol='gluster' name='test/test1.img'>
        <host name='10.66.4.105'/>
        <host name='10.66.4.148'/>    ---> the two remaining glusterfs servers, both available
      </source>
      <target dev='vda' bus='virtio'/>
      <boot order='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

[root@localhost ~]# virsh start bios      ---> the guest starts, but very slowly, taking about 6 minutes.

3. Destroy the guest:
[root@localhost ~]# virsh destroy bios
error: Disconnected from qemu:///system due to keepalive timeout
error: Failed to destroy domain bios
error: internal error: connection closed due to keepalive timeout      ---> is this error expected?

Additional info:
I have also tried this command on the test host:
[root@localhost ~]# mount -t glusterfs 10.66.4.105:/test /var/lib/libvirt/images/
It is also very slow.

Comment 18 Peter Krempa 2016-09-08 13:50:25 UTC
Both of those originate from the glusterfs library since libvirt just initializes it. There's a discussion on the QEMU bug relevant to this work regarding missing servers and the test cases connected to it.

Comment 19 Niels de Vos 2016-09-08 14:42:06 UTC
http://git.qemu.org/?p=qemu.git;a=commit;h=6c7189bb29de9fa2202f613f3c6caf028f96f261 is included in QEMU v2.7.0.

http://libvirt.org/git/?p=libvirt.git;a=commit;h=f1bbc7df4a9959e09679486c769e12f82d443d9a is included in libvirt v2.1.0

It seems that the test host with these versions is too old:
 - libvirt-2.0.0-6.el7.x86_64
 - qemu-kvm-rhev-2.6.0-22.el7.x86_64

Comment 20 Peter Krempa 2016-09-08 15:18:40 UTC
libvirt backported that from the upstream v2.1.0 release to libvirt-2.0.0-4.el7, thus the -6 version is recent enough.

Comment 21 Niels de Vos 2016-09-08 16:57:22 UTC
(In reply to Peter Krempa from comment #20)
> libvirt backported that from the upstream v2.1.0 version to
> libvirt-2.0.0-4.el7 thus -6 version is good enough.

Yes, the missing part seems to be the patch(es) for QEMU.

Comment 22 lijuan men 2016-09-09 11:38:08 UTC
Hi Peter,

For scenario 1:
I have confirmed with QEMU QE that if the first glusterfs server IP specified for the VM does not exist (or can't be connected to), and the second and third IPs are correct, the guest can boot up successfully using the QEMU CLI.

The CLI is:
...-drive file.driver=gluster,file.volume=test,file.path=/test.img,file.server.0.type=tcp,file.server.0.host=10.66.4.1,file.server.0.port=24007,file.server.1.type=tcp,file.server.1.host=10.66.4.105,file.server.1.port=24007,format=raw,if=none,id=drive-virtio-disk1,cache=none .....

I think there is no difference from what libvirt uses.

But with libvirt, the VM can't boot up, as in scenario 1 in comment 16.

QEMU QE and I used the same glusterfs servers.

Is this behavior in libvirt expected? Do we need to improve it?

For scenario 2:
It was my fault. My test host may have had some issue. I used another test host and can't reproduce the scenario again. Sorry.

Comment 23 lijuan men 2016-09-20 10:33:52 UTC
Scenario 1 in comment 16 will be tracked in bug 1377663.
Changing the bug status to VERIFIED.

Comment 25 errata-xmlrpc 2016-11-03 18:20:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html