Bug 1373786 - unable to attach gluster json backing image with unix socket
Summary: unable to attach gluster json backing image with unix socket
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Han Han
URL:
Whiteboard:
Depends On:
Blocks: 1375408
 
Reported: 2016-09-07 07:10 UTC by Han Han
Modified: 2017-06-28 07:09 UTC
CC: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-04 14:57:15 UTC



Description Han Han 2016-09-07 07:10:54 UTC
Description of problem:
Unable to attach a gluster JSON backing image that uses a unix socket transport.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64
glusterfs-3.7.9-12.el7rhgs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a glusterfs server on localhost
2. Create glusterfs json backing image and attach the image to a running VM
# qemu-img create -f qcow2 -b 'json:{"file.driver":"gluster", "file.volume":"gluster-vol1", "file.path":"V","file.server":[ { "type":"unix", "socket":"/var/run/glusterd.socket"}]}' /var/lib/libvirt/images/gluster_socket.img
Formatting '/var/lib/libvirt/images/gluster_socket.img', fmt=qcow2 size=524288000 backing_file=json:{"file.driver":"gluster",, "file.volume":"gluster-vol1",, "file.path":"V",,"file.server":[ { "type":"unix",, "socket":"/var/run/glusterd.socket"}]} encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# virsh attach-disk V  /var/lib/libvirt/images/gluster_socket.img vdb --subdriver qcow2
error: Failed to attach disk
error: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized

Actual results:
The disk attach fails as shown in step 2.

Expected results:
Attach successfully and VM run with following xml:
 <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/gluster_multi.img'/>
      <backingStore type='network' index='1'>
        <format type='raw'/>
        <source protocol='gluster' name='gluster-vol1/V'>
        <host transport='unix' socket='/var/run/glusterd.socket'/>
        </source>
        <backingStore/>
      </backingStore>
      <target dev='vdb' bus='virtio'/>
    </disk>

Note that the same XML can be attached and detached successfully when passed to libvirt directly.
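The working direct-attach path mentioned above can be sketched with virsh attach-device (a sketch only, assuming the domain name V and the image/socket paths from this report; it needs a live libvirt and gluster setup to actually run):

```shell
# Write the disk XML from the expected results to a file. The domain name
# "V", image path, and socket path are all taken from this report.
cat > gluster-disk.xml <<'EOF'
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/gluster_multi.img'/>
  <backingStore type='network' index='1'>
    <format type='raw'/>
    <source protocol='gluster' name='gluster-vol1/V'>
      <host transport='unix' socket='/var/run/glusterd.socket'/>
    </source>
    <backingStore/>
  </backingStore>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF
# Attach it to the running domain (commented out: requires a live setup):
# virsh attach-device V gluster-disk.xml --live
```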

Qemu log:
[2016-09-06 01:59:41.830238] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fc7e9960c32] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fc7e972b84e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fc7e972b95e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7fc7e972d2ea] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fc7e972db18] ))))) 0-gfapi: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2016-09-06 01:59:41.829375 (xid=0x1)
[2016-09-06 01:59:41.830388] E [MSGID: 104007] [glfs-mgmt.c:637:glfs_mgmt_getspec_cbk] 0-glfs-mgmt: failed to fetch volume file (key:gluster-vol1) [Invalid argument]
[2016-09-06 01:59:41.830409] E [MSGID: 104024] [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with remote-host: /var/run/glusterd.socket (Transport endpoint is not connected) [Transport endpoint is not connected]
Could not open backing file: Gluster connection for volume gluster-vol1, path V failed to connect

Actual results:
The attach fails with the errors above.

Expected results:
Attach successfully.

Additional info:

Comment 1 Han Han 2016-09-07 07:15:03 UTC
This bug will block some test scenarios for BZ1134878.

Comment 5 Jeff Cody 2017-01-24 01:53:43 UTC
This worked for me when running QEMU by itself.  I was able to reproduce the exact bug description when using libvirt / virsh, however.

The issue is the default permissions of the gluster socket in /var/run.  In the example in the description, "/var/run/glusterd.socket" should be owned by qemu, not by root:
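A quick way to confirm the ownership described here (a sketch; the socket path is from this report, and the command falls back to a message if the socket is absent):

```shell
# Show owner, group, and mode of the glusterd unix socket.
sock=/var/run/glusterd.socket
if [ -e "$sock" ]; then
    stat -c '%U:%G %a %n' "$sock"   # on the failing host this shows root, not qemu
else
    echo "no socket at $sock"
fi
```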

My test image:

qemu-img info --backing-chain /home/jcody/images/gluster_unix.img
image: /home/jcody/images/gluster_unix.img
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 196K
cluster_size: 65536
backing file: json:{
    "file.driver": "gluster",
    "file.path": "test.qcow2",
    "file.server": [
        {
            "type": "unix",
            "socket": "/var/run/glusterd.socket"
        }
    ],
    "file.volume": "gv0"
}
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: json:{"driver": "qcow2", "file": {"driver": "gluster", "path": "test.qcow2", "server.0.type": "unix", "server.0.socket": "/var/run/glusterd.socket", "volume": "gv0"}}
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false




Before fixing the permissions of the socket:

# virsh attach-disk c /home/jcody/images/gluster_unix.img vdc --subdriver qcow2
error: Failed to attach disk
error: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk2' could not be initialized



Changing owner of the gluster socket, and retrying:

# chown qemu.qemu /var/run/glusterd.socket 

# virsh attach-disk c /home/jcody/images/gluster_unix.img vdc --subdriver qcow2
Disk attached successfully


I believe this is the root cause of the error, so I will close this as NOTABUG, since it is likely due to the setup steps in the testing scenario.

However, first I'd like to make sure that libvirt (if it currently supports gluster unix sockets) sets the owner / permissions correctly on the socket in this scenario. Eric, do you know if this is the case?

Comment 6 Eric Blake 2017-01-25 14:30:22 UTC
I think libvirt intends to support unix sockets for gluster (at least this comment in storage_backend_gluster.c is telling:
     /* Accept the same URIs as qemu's block/gluster.c:
      * gluster[+transport]://[server[:port]]/vol/[dir/]image[?socket=...] */
), so it's probably best to reassign this to libvirt for evaluation.  Either libvirt is not properly managing the socket permissions, or it is indeed a test setup bug, but it needs evaluation from the libvirt point of view.
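For reference, the URI form that code comment describes would look like this for the volume and image used in this report (a sketch; the qemu-img call is commented out because it needs a running gluster daemon):

```shell
# unix-transport gluster URI: empty host part, socket path passed as a
# query parameter, matching gluster[+transport]://[server[:port]]/vol/image[?socket=...]
uri='gluster+unix:///gluster-vol1/V?socket=/var/run/glusterd.socket'
echo "$uri"
# qemu-img info "$uri"
```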

Comment 7 Jeff Cody 2017-01-25 14:34:57 UTC
(In reply to Eric Blake from comment #6)
> I think libvirt intends to support unix sockets for gluster (at least this
> comment in storage_backend_gluster.c is telling:
>      /* Accept the same URIs as qemu's block/gluster.c:
>       * gluster[+transport]://[server[:port]]/vol/[dir/]image[?socket=...] */
> ), so it's probably best to reassign this to libvirt for evaluation.  Either
> libvirt is not properly managing the socket permissions, or it is indeed a
> test setup bug, but it needs evaluation from the libvirt point of view.

Thanks Eric.  Re-assigning to libvirt.

Comment 8 Peter Krempa 2017-01-25 14:45:11 UTC
We can't just chown the socket to qemu:qemu, since that could break other configurations. I think it should be left as is, and the admin has to set up the permissions correctly.

The only sane solution I can see here is to use FD passing if that's possible for gluster connections.

Comment 9 Peter Krempa 2017-04-04 14:57:15 UTC
The qemu gluster driver passes the socket path directly to glfs_set_volfile_server so it's not possible to use qemu's fd passing mechanism.

Since libvirt can't modify the socket permissions as the socket is shared, the user needs to set up the access to the gluster socket properly.
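One way an admin could grant the qemu user access without chown-ing the shared socket is a POSIX ACL, which leaves the existing owner and other clients untouched (a sketch demonstrated on a stand-in file; on a real host the path would be /var/run/glusterd.socket and setfacl would need root):

```shell
# Stand-in for /var/run/glusterd.socket so the commands can be shown safely.
sock=demo.socket
touch "$sock"
# Grant the qemu user read/write via an ACL instead of changing ownership;
# "|| true" because the qemu user (or setfacl itself) may not exist here.
setfacl -m u:qemu:rw "$sock" 2>/dev/null || true
# Inspect the result (falls back to ls -l if getfacl is unavailable):
getfacl "$sock" 2>/dev/null || ls -l "$sock"
```

Note that ACLs (or any permission change) on a socket under /var/run would need to be reapplied after a glusterd restart, since the socket is recreated.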

Comment 10 Han Han 2017-06-28 01:42:41 UTC
Hi Peter,
Since the glusterd socket and NBD socket cannot be used by libvirt directly, how about disabling them?

Comment 11 Peter Krempa 2017-06-28 07:09:47 UTC
It can be used directly if you properly configure the permissions of the unix socket. Since it's a shared resource, libvirt can't modify them on the VM's behalf, as that would break other clients connecting.

