Description of problem:
I have a VM backed by a file on a replica 3 gluster volume with libgfapi enabled. When I try to make a snapshot of that VM, libvirt answers:

internal error: protocol 'gluster' accepts only one host

Version-Release number of selected component (if applicable):
libvirt-daemon.x86_64 2.0.0-10.el7_3.9

How reproducible:
Always

Steps to Reproduce:
1. Create a replica 3 gluster volume and place the image file for a VM on it.
2. Run a VM accessing that file using libgfapi and all three hosts.
3. Try to make a snapshot.

Actual results:
An error is returned: "internal error: protocol 'gluster' accepts only one host"

Expected results:
The snapshot should be created.

Additional info:
Here is the XML definition of my VM's disk:

<disk type='network' device='disk' snapshot='no'>
  <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
  <source protocol='gluster' name='data/image'>
    <host name='host1' port='0'/>
    <host name='host2' port='0'/>
    <host name='host3' port='0'/>
  </source>
  <backingStore/>
  <target dev='sda' bus='scsi'/>
  <serial>8a96ac00-96fc-452d-ae42-eb2a8bd27c43</serial>
  <boot order='1'/>
  <alias name='scsi0-0-0-0'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
I can't reproduce the issue.

versions:
libvirt-2.0.0-10.el7_3.9.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.12.x86_64
glusterfs-server-3.8.4-31.el7rhgs.1.1463907.x86_64

steps:
1. Prepare the glusterfs env:
[root@localhost ~]# gluster volume create test replica 3 10.66.70.107:/opt/b1 10.66.4.163:/opt/b2 10.66.7.102:/opt/b3 force
[root@localhost ~]# gluster volume start test
[root@localhost ~]# gluster volume info
Volume Name: test
Type: Replicate
Volume ID: 5502765d-4c5a-41a7-958d-4a6b7f937359
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.66.70.107:/opt/b1
Brick2: 10.66.4.163:/opt/b2
Brick3: 10.66.7.102:/opt/b3
Options Reconfigured:
server.allow-insecure: on
transport.address-family: inet
nfs.disable: on

2. Start a guest with a gluster disk using this XML:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2'/>
  <target dev='hda' bus='ide'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='network' device='disk' snapshot='no'>
  <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
  <source protocol='gluster' name='test/test.img'>
    <host name='10.66.70.107'/>
    <host name='10.66.4.163'/>
    <host name='10.66.7.102'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</disk>

[root@localhost ~]# virsh domblklist test
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2
vda        test/test.qcow2

3. Create all kinds of snapshots:
[root@localhost ~]# virsh snapshot-create-as test s1
error: unsupported configuration: internal snapshots and checkpoints require all disks to be selected for snapshot
[root@localhost ~]# virsh snapshot-create-as test s1 --memspec file=/tmp/s1.mem
Domain snapshot s1 created
[root@localhost ~]# virsh snapshot-create-as test s2 --disk-only --diskspec vda,file=/tmp/s2
Domain snapshot s2 created

I cannot see the error from the bug description, so I want to know:
1) Are the steps above the same as in the bug description?
2) What versions of qemu-kvm-rhev and glusterfs-server did you use?
3) What is the exact snapshot creation command you used?
(In reply to lijuan men from comment #2)
> Domain snapshot s1 created
>
> [root@localhost ~]# virsh snapshot-create-as test s2 --disk-only --diskspec
> vda,file=/tmp/s2
> Domain snapshot s2 created

The overlay image needs to be a 3-brick replica on gluster too. You need to specify it via XML to snapshot-create.

You created a snapshot to a local file, which obviously works.
(In reply to Peter Krempa from comment #3)
> (In reply to lijuan men from comment #2)
> > Domain snapshot s1 created
> >
> > [root@localhost ~]# virsh snapshot-create-as test s2 --disk-only --diskspec
> > vda,file=/tmp/s2
> > Domain snapshot s2 created
>
> The overlay image needs to be a 3 brick replica on gluster too. You need to
> specify it via XML to snapshot-create.
>
> You created a snapshot to a local file, which obviously works.

Thanks, Peter. Based on your comment, I can reproduce it.

steps:
1. Start a guest with 2 disks; hda is a local disk, vda is a gluster disk:
[root@localhost ~]# virsh domblklist test
Target     Source
------------------------------------------------
hda        /var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2
vda        test/test.img

2. For vda, create an external snapshot with a replica 3 gluster overlay:
[root@localhost ~]# cat snapshot.xml
<domainsnapshot>
  <name>my snap name</name>
  <disks>
    <disk name='vda' snapshot='external' type='network'>
      <source protocol='gluster' name='test/sna.qcow2'>
        <host name='10.66.70.107'/>
        <host name='10.66.4.163'/>
        <host name='10.66.7.102'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>
[root@localhost ~]# virsh snapshot-create test snapshot.xml --disk-only
error: internal error: protocol 'gluster' accepts only one host

3. For hda, create an external snapshot with a replica 3 gluster overlay:
[root@localhost ~]# cat snapshot.xml
<domainsnapshot>
  <name>my snap name</name>
  <disks>
    <disk name='hda' snapshot='external' type='network'>
      <source protocol='gluster' name='test/sna.qcow2'>
        <host name='10.66.70.107'/>
        <host name='10.66.4.163'/>
        <host name='10.66.7.102'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>
[root@localhost ~]# virsh snapshot-create test snapshot.xml --disk-only
error: internal error: protocol 'gluster' accepts only one host
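For anyone scripting the reproduction above, the `<domainsnapshot>` XML can be generated from a host list with the Python standard library. This is a minimal illustrative sketch (the `snapshot_xml` helper is hypothetical, not part of libvirt's tooling; values are the ones from the steps above):

```python
# Build the <domainsnapshot> XML for an external snapshot whose overlay
# lives on a multi-host (replica 3) gluster volume.
import xml.etree.ElementTree as ET

def snapshot_xml(disk, volume_path, hosts, name="my snap name"):
    snap = ET.Element("domainsnapshot")
    ET.SubElement(snap, "name").text = name
    disks = ET.SubElement(snap, "disks")
    d = ET.SubElement(disks, "disk",
                      {"name": disk, "snapshot": "external", "type": "network"})
    src = ET.SubElement(d, "source",
                        {"protocol": "gluster", "name": volume_path})
    for h in hosts:
        ET.SubElement(src, "host", {"name": h})  # one <host/> per brick server
    return ET.tostring(snap, encoding="unicode")

xml = snapshot_xml("vda", "test/sna.qcow2",
                   ["10.66.70.107", "10.66.4.163", "10.66.7.102"])
print(xml)
```

The output can be saved as snapshot.xml and passed to `virsh snapshot-create <domain> snapshot.xml --disk-only`.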
Unfortunately qemu does not accept any of the modern JSON-based ways to specify the target for the snapshot directly. The only option is to use blockdev-add together with the JSON-based configuration, which will require more work.
This is going to be finalized in the next major RHEL release.
As of:

commit a3f8abb00315181a24a7c769669f71527020ea92
Author: Peter Krempa <pkrempa>
Date:   Mon Dec 17 18:31:29 2018 +0100

    qemu: Add -blockdev support for external snapshots

    Use the code for creating or attaching new storage source in the
    snapshot code and switch to 'blockdev-snapshot' for creating the
    snapshot itself.

we use the blockdev-add infrastructure to open images in qemu. This allows specifying even multi-host gluster images properly.

The blockdev feature has been enabled since:

commit c6a9e54ce3252196f1fc6aa9e57537a659646d18
Author: Peter Krempa <pkrempa>
Date:   Mon Jan 7 11:45:19 2019 +0100

    qemu: enable blockdev support

    Now that all pieces are in place (hopefully) let's enable -blockdev.

    We base the capability on presence of the fix for 'auto-read-only' on
    files so that blockdev works properly, mandate that qemu supports
    explicit SCSI id strings to avoid ABI regression and that the fix for
    'savevm' is present so that internal snapshots work.

i.e. v5.9.0-390-gc6a9e54ce3, and it requires upstream qemu-4.2 or an appropriate downstream.
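For illustration, with -blockdev the external snapshot is driven by a QMP sequence roughly like the following, sketched here under assumptions: node names, volume, size and host values are made up for the example, and the exact ordering is libvirt's internal business. (The create/add/snapshot commands themselves are real QMP commands; an actual blockdev-create issued by libvirt appears later in this thread.)

```json
{"execute": "blockdev-create",
 "arguments": {"job-id": "create-overlay",
   "options": {"driver": "gluster",
     "location": {"volume": "test", "path": "sna.qcow2",
       "server": [{"type": "inet", "host": "10.66.70.107", "port": "24007"},
                  {"type": "inet", "host": "10.66.4.163", "port": "24007"},
                  {"type": "inet", "host": "10.66.7.102", "port": "24007"}]},
     "size": 10737418240}}}
{"execute": "blockdev-add",
 "arguments": {"driver": "gluster", "node-name": "overlay-storage",
   "volume": "test", "path": "sna.qcow2",
   "server": [{"type": "inet", "host": "10.66.70.107", "port": "24007"},
              {"type": "inet", "host": "10.66.4.163", "port": "24007"},
              {"type": "inet", "host": "10.66.7.102", "port": "24007"}]}}
{"execute": "blockdev-add",
 "arguments": {"driver": "qcow2", "node-name": "overlay-format",
   "file": "overlay-storage"}}
{"execute": "blockdev-snapshot",
 "arguments": {"node": "current-disk-format-node", "overlay": "overlay-format"}}
```

The key point is that the "server" array can carry all three replica hosts, which the old command-line/filename-based path could not express.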
Hi Peter,
A permission denied error is found when creating a disk snapshot to a glusterfs source.

Running the same steps as https://bugzilla.redhat.com/show_bug.cgi?id=1783187#c0:
# virsh snapshot-create q35 snapshot.xml --no-metadata --disk-only
error: operation failed: failed to format image: 'Permission denied'

This is caused by a permission denial when writing the glusterfs client log:
ERROR: failed to create logfile "/var/log/glusterfs/10.66.85.42-gv-27855.log" (Permission denied)

I have set selinux to permissive mode and allowed the qemu user to write to /var/log/glusterfs/, but it still fails:
# setenforce 0
# setfacl -m u:qemu:rw /var/log/glusterfs/

Maybe we should make /var/log/glusterfs/ writable in the qemu namespace?
BTW, it works well when blockdev is disabled:

# virsh dumpxml q35-drive | less
<domain type='kvm' id='4' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>q35-drive</name>
...
  <qemu:capabilities>
    <qemu:add capability='drive'/>
    <qemu:del capability='blockdev'/>
  </qemu:capabilities>
</domain>

# virsh snapshot-create q35-drive snapshot.xml --no-metadata --disk-only
Domain snapshot s1 created from 'snapshot.xml'
Could you please try setting the gluster debug level to 0?

gluster_debug_level = 0

in /etc/libvirt/qemu.conf.

At any rate, if the debug logging differs between blockdev and non-blockdev, it might also be a qemu bug.
At any rate, I don't think we should allow qemu to write the gluster logs wherever it pleases. Allowing it to write to the log directory would mean it could overwrite other logs, which is not acceptable.
(In reply to Peter Krempa from comment #37)
> Could you please try setting the gluster debug level to 0?
>
> gluster_debug_level = 0
>
> in /etc/qemu.conf
>
> At any rate, if the debug logging differs between blockdev and non-blockdev
> it might also be a qemu bug.

Still reproduced with gluster_debug_level = 0:

# augtool get /files//etc/libvirt/qemu.conf/gluster_debug_level
/files//etc/libvirt/qemu.conf/gluster_debug_level = 0
# systemctl restart libvirtd
# virsh snapshot-create q35 snapshot.xml --no-metadata --disk-only
error: operation failed: failed to format image: 'Permission denied'
(In reply to Peter Krempa from comment #38)
> At any rate, I don't think we should allow qemu just to write the gluster
> logs where it pleases. Allowing it to write to the log directory will mean
> that it could overwrite other logs which is not acceptable.

The blockdev-create QMP command issued by libvirt is:

2019-12-13 07:42:29.439+0000: 27733: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7f1df8043810 buf={"execute":"blockdev-create","arguments":{"job-id":"create-libvirt-2-storage","options":{"driver":"gluster","location":{"volume":"gv","path":"q35.s1","server":[{"type":"inet","host":"10.66.85.42","port":"24007"}],"debug":9},"size":3298103296}},"id":"libvirt-367"

No logfile is specified. As the QMP manual says (https://qemu.weilnetz.de/doc/qemu-qmp-ref.html#index-BlockdevOptionsGluster):

logfile: string (optional)
libgfapi log file (default /dev/stderr) (Since 2.8)

The default logfile is stderr, so I am very curious why the log ends up in /var/log/glusterfs/<server IP>-<volume name>-<pid>.log.
So this is a qemu bug. The blockdev code path in qemu results in calling qemu_gluster_glfs_init, which in turn calls glfs_set_logging with the @logfile parameter being NULL. A NULL @logfile results in the following behaviour, cited from the documentation of glfs_set_logging:

@logfile: The logfile to be used for logging. Will be created if it does not already exist (provided system permissions allow). If NULL, a new logfile will be created in default log directory associated with the glusterfs installation.

Thanks to Kevin Wolf for pointing out the qemu bits.
(In reply to Han Han from comment #40)
> (In reply to Peter Krempa from comment #38)
> > At any rate, I don't think we should allow qemu just to write the gluster
> > logs where it pleases. Allowing it to write to the log directory will mean
> > that it could overwrite other logs which is not acceptable.
>
> The block node create qmp called by libvirt is:
> 2019-12-13 07:42:29.439+0000: 27733: info : qemuMonitorIOWrite:453 :
> QEMU_MONITOR_IO_WRITE: mon=0x7f1df8043810
> buf={"execute":"blockdev-create","arguments":{"job-id":"create-libvirt-2-storage","options":{"driver":"gluster","location":{"volume":"gv","path":"q35.s1","server":[{"type":"inet","host":"10.66.85.42","port":"24007"}],"debug":9},"size":3298103296}},"id":"libvirt-367"

This says 'debug: 9'. Did you set debug to 9 explicitly, or is it set to 0 according to my suggestion above?
I filed https://bugzilla.redhat.com/show_bug.cgi?id=1783313 on qemu to track the problem with the permissions and set the dependency. I'm moving this back to ON_QA.
(In reply to Peter Krempa from comment #42)
> (In reply to Han Han from comment #40)
> > The block node create qmp called by libvirt is:
> > 2019-12-13 07:42:29.439+0000: 27733: info : qemuMonitorIOWrite:453 :
> > QEMU_MONITOR_IO_WRITE: mon=0x7f1df8043810
> > buf={"execute":"blockdev-create","arguments":{"job-id":"create-libvirt-2-storage","options":{"driver":"gluster","location":{"volume":"gv","path":"q35.s1","server":[{"type":"inet","host":"10.66.85.42","port":"24007"}],"debug":9},"size":3298103296}},"id":"libvirt-367"
>
> This says 'debug:9'. Did you set debug to 9 explicitly? or is it set to 0
> according to my suggestion above?

The log above is from the operations in comment #35. I also tried gluster_debug_level=0, which resulted in 'Permission denied' too.
Just wanted to comment to say that I'm also affected by this bug in my replica 3 HCI cluster. I'm very interested in a resolution, as I cannot use libgfapi in my environment without snapshot ability and am missing out on major performance improvements as a result.
When this bug is resolved, it would be nice if someone could reopen the following ones:
https://bugzilla.redhat.com/show_bug.cgi?id=1633642
https://bugzilla.redhat.com/show_bug.cgi?id=1484227
(In reply to Guillaume Pavese from comment #46)
> When this bug is Resolved, it would be nice if someone could reopen the
> following ones :
> https://bugzilla.redhat.com/show_bug.cgi?id=1633642
> https://bugzilla.redhat.com/show_bug.cgi?id=1484227

We do not plan to enable libgfapi for oVirt/RHV. We did not find enough performance improvement to justify it.
Setting qa_ack- since the dependent bug BZ1447694 is WONTFIX.
(In reply to Yaniv Kaul from comment #47)
> (In reply to Guillaume Pavese from comment #46)
> > When this bug is Resolved, it would be nice if someone could reopen the
> > following ones :
> > https://bugzilla.redhat.com/show_bug.cgi?id=1633642
> > https://bugzilla.redhat.com/show_bug.cgi?id=1484227
>
> We do not plan to enable libgfapi for oVirt/RHV. We did not find enough
> performance improvement justification for it.

Someone on the ovirt-users mailing list has reported a 4x-5x performance improvement with libgfapi:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FYVTG3NUIXE5LJBBVEGGKHQFOGKJ5CU2/
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZPHBO66WLXTCBW5XD2A3HLBY7AFG3JAJ/
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.