Bug 1369390 - qemu gets SIGABRT when using glusterfs
Summary: qemu gets SIGABRT when using glusterfs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: libgfapi
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.3 Async
Assignee: Prasanna Kumar Kalever
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks: 1356372
 
Reported: 2016-08-23 09:32 UTC by Atin Mukherjee
Modified: 2016-09-06 07:06 UTC (History)
21 users

Fixed In Version: glusterfs-3.7.9-12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1356372
Environment:
Last Closed: 2016-09-06 07:06:18 UTC
Target Upstream Version:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1812 0 normal SHIPPED_LIVE Red Hat Gluster Storage 3.1 glusterfs Update 2016-09-06 10:59:06 UTC

Description Atin Mukherjee 2016-08-23 09:32:47 UTC
+++ This bug was initially created as a clone of Bug #1356372 +++

Description of problem:
QEMU receives SIGABRT when using a glusterfs-backed disk image (see summary).

Version-Release number of selected component:
glusterfs-3.7.9-10.el7.x86_64
qemu-kvm-rhev-2.6.0-13.el7.x86_64
libvirt-2.0.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a glusterfs server
2. Start a VM whose image is based on the glusterfs server
# cat run.xml
```
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source protocol='gluster' name='gluster-vol1/c18468.qcow2'>
<host name='10.66.4.164'/>
</source>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
</disk>
```
# virsh create run.xml --console

QEMU gets SIGABRT after the VM loads its kernel.

# abrt-cli ls|head
id 2a10b5c86ea1a63dc1e18bf32e2773a7ee0dde30
reason:         qemu-kvm killed by SIGABRT
time:           Thu 14 Jul 2016 10:24:59 AM CST
cmdline:        /usr/libexec/qemu-kvm -name guest=16572,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-16572/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid fc108f05-29e9-4823-ae05-a601d4481d53 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-21-16572/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=gluster://10.66.4.164/gluster-vol1/RHEL-7.3-latest.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=54,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:4e:f5:57,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channel/target/domain-21-16572/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0 -spice 
port=5900,tls-port=5901,addr=0.0.0.0,disable-ticketing,x509-dir=/etc/pki/libvirt-spice,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
package:        qemu-kvm-rhev-2.6.0-13.el7
uid:            107 (qemu)
count:          1
Directory:      /var/spool/abrt/ccpp-2016-07-14-10:24:59-8845
Run 'abrt-cli report /var/spool/abrt/ccpp-2016-07-14-10:24:59-8845' for creating a case in Red Hat Customer Portal


Additional info:
The same kind of error appears when using qemu-img:
# qemu-img create gluster://10.66.4.164/gluster-vol1/G.qcow2 5G -f qcow2
Formatting 'gluster://10.66.4.164/gluster-vol1/G.qcow2', fmt=qcow2 size=5368709120 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
[2016-07-14 02:50:05.431996] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7fa1b5cee31c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7fa1cd3d981d] -->/lib64/libgfapi.so.0(+0xb736) [0x7fa1cd3d9736] ) 0-gfapi: invalid argument: iovec [Invalid argument]

--- Additional comment from RHEL Product and Program Management on 2016-07-14 00:17:00 EDT ---

Since this bug report was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-07-14 00:17:07 EDT ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases, it is also being proposed as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Jeff Cody on 2016-07-27 19:02:16 EDT ---

This appears to be an issue in the gluster library.

When I use library version 3.7.1-16.el7, everything works fine.

With gluster library version 3.7.9-10.el7, I get the same error message as reported above.

When trying 3.7.9-11.el7, I get even more errors:

[2016-07-27 22:51:58.758094] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758241] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758289] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758310] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758329] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758356] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758373] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758443] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758472] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758488] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758505] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758521] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758536] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758555] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758571] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.758587] E [glfs-fops.c:746:glfs_io_async_cbk] (-->/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so(io_stats_writev_cbk+0x24c) [0x7feebaaf231c] -->/lib64/libgfapi.so.0(+0xb81d) [0x7feed66fe81d] -->/lib64/libgfapi.so.0(+0xb736) [0x7feed66fe736] ) 0-gfapi: invalid argument: iovec [Invalid argument]
[2016-07-27 22:51:58.761104] E [socket.c:1101:__socket_ioq_new] 0-gv0-client-0: msg size (3131969733) bigger than the maximum allowed size on sockets (2147483647)
[2016-07-27 22:51:58.761118] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-gv0-client-0: failed to submit rpc-request (XID: 0x13 Program: GlusterFS 3.3, ProgVers: 330, Proc: 13) to rpc-transport (gv0-client-0)
[2016-07-27 22:51:58.761128] W [MSGID: 114031] [client-rpc-fops.c:907:client3_3_writev_cbk] 0-gv0-client-0: remote operation failed [Transport endpoint is not connected]
[2016-07-27 22:51:58.761148] W [MSGID: 114029] [client-rpc-fops.c:4424:client3_3_writev] 0-gv0-client-0: failed to send the fop
[2016-07-27 22:51:58.761160] E [socket.c:1101:__socket_ioq_new] 0-gv0-client-1: msg size (3131969733) bigger than the maximum allowed size on sockets (2147483647)
[2016-07-27 22:51:58.761166] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-gv0-client-1: failed to submit rpc-request (XID: 0x11c Program: GlusterFS 3.3, ProgVers: 330, Proc: 13) to rpc-transport (gv0-client-1)
[2016-07-27 22:51:58.761173] W [MSGID: 114031] [client-rpc-fops.c:907:client3_3_writev_cbk] 0-gv0-client-1: remote operation failed [Transport endpoint is not connected]
[2016-07-27 22:51:58.761232] W [MSGID: 114029] [client-rpc-fops.c:4424:client3_3_writev] 0-gv0-client-1: failed to send the fop

--- Additional comment from Sankarshan Mukhopadhyay on 2016-08-16 22:40:12 EDT ---

Re-assigning RHBZ to Prasanna. He is the subject-matter expert and can respond to the documentation text request.

--- Additional comment from Prasanna Kumar Kalever on 2016-08-17 13:36:02 EDT ---

This is actually a bug in libgfapi: a side effect of neglecting the count in the glfs_io struct. In detail, glfs_buf_copy assembles all iovecs into a single iovec with count=1, but since gio->count is not updated, the callback dereferences invalid addresses.

This bug was fixed by http://review.gluster.org/#/c/14859/ and went into the gluster 3.7.13 release.
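To illustrate the failure mode described above, here is a minimal sketch (hypothetical struct and function names, not the actual gluster code) of the buffer-coalescing pattern: multiple iovecs are flattened into one contiguous buffer, and the count field must be reset to 1 afterwards, which is the update the patch adds.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the glfs_io struct: a scatter/gather list
 * plus the number of entries in it. */
struct fake_io {
    struct iovec *iov;
    int count;
};

/* Coalesce io->iov[0..count) into one contiguous buffer, as
 * glfs_buf_copy does. The bug was that io->count was left at its old
 * value after flattening, so later code iterated past the single valid
 * entry and dereferenced invalid addresses. Returns 0 on success. */
static int flatten_iovecs(struct fake_io *io)
{
    size_t total = 0;
    for (int i = 0; i < io->count; i++)
        total += io->iov[i].iov_len;

    char *buf = malloc(total);
    if (!buf)
        return -1;

    size_t off = 0;
    for (int i = 0; i < io->count; i++) {
        memcpy(buf + off, io->iov[i].iov_base, io->iov[i].iov_len);
        off += io->iov[i].iov_len;
    }

    struct iovec *single = malloc(sizeof(*single));
    if (!single) {
        free(buf);
        return -1;
    }
    single->iov_base = buf;
    single->iov_len  = total;

    io->iov   = single;
    io->count = 1;   /* the missing update: without this, stale count is used */
    return 0;
}
```

This is only a sketch of the pattern; the real fix is in the review linked above.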

--- Additional comment from Prasanna Kumar Kalever on 2016-08-18 04:15:23 EDT ---

In comment 5, I missed mentioning http://review.gluster.org/#/c/14779/

This bug actually needs two patches.

Cause:
It was a side effect of:
1. Neglecting the count in the glfs_io struct, i.e. gio->count is not updated, so invalid addresses are dereferenced. (http://review.gluster.org/#/c/14859/)
2. In all async ops such as write, fsync, and ftruncate (everything except read), the value of "iovec" is NULL, yet glfs_io_async_cbk checks the value in a common routine, which may end up in failures. (http://review.gluster.org/#/c/14779/)

Consequence:
Invalid addresses are dereferenced while performing fops, so qemu-img and qemu-system-x86_64 do not behave as expected: when the qemu block driver invokes the glfs APIs we see 'iovec [Invalid argument]', which may cause hangs or failures.

Fix:
This bug needs two patches
http://review.gluster.org/#/c/14779/ (BZ#1350880)
http://review.gluster.org/#/c/14859/ (BZ#1352482)

Both went into the gluster 3.7.13 release.
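The second issue (point 2 above) can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual glfs_io_async_cbk code: a completion-validation routine shared by all async ops must not require an iovec unconditionally, because only read completions carry one; for write, fsync, and ftruncate it is NULL by design.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical op tags for the async completion path. */
enum gio_op { GIO_READ, GIO_WRITE, GIO_FSYNC, GIO_FTRUNCATE };

/* Returns 1 if the completion arguments are valid for this op.
 * The buggy common routine effectively required iov != NULL for every
 * op; the fix is to demand an iovec only for reads, which actually
 * return data. */
static int iovec_args_valid(enum gio_op op, const struct iovec *iov, int count)
{
    if (op == GIO_READ)
        return iov != NULL && count > 0;  /* reads must return data */
    return 1;  /* write/fsync/ftruncate legitimately complete with iov == NULL */
}
```

Again, this only names the logic error; the real change is in http://review.gluster.org/#/c/14779/.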

--- Additional comment from Prasanna Kumar Kalever on 2016-08-22 11:13:03 EDT ---

Jiri Herrmann,

Please specify "access through libgfapi", as the issue doesn't exist with FUSE, NFS, etc.

Something like "Accessing gluster storage over libgfapi with Qemu fails" could be better.

--- Additional comment from Atin Mukherjee on 2016-08-23 05:31:14 EDT ---

Based on comment 6, moving the BZ to POST

Comment 2 Atin Mukherjee 2016-08-23 09:36:02 UTC
http://review.gluster.org/#/c/14779/ & http://review.gluster.org/#/c/14859/ need to be backported to the rhgs-3.1.3 branch.

Comment 8 SATHEESARAN 2016-08-30 10:39:46 UTC
Tested with RHEL 7.3 and RHGS 3.1.3 async build ( glusterfs-3.7.9-12.el7rhgs on server and glusterfs-3.7.9-12.el7 on hypervisor )

With qemu-kvm-rhev-2.6.0-22.el7.x86_64 and qemu-kvm-img-2.6.0-22.el7.x86_64

1. Created an image using qemu-img (with both raw and qcow2 formats)
2. Installed RHEL 7 on that VM
3. Attached additional disks from the gluster volume and partitioned them
4. Formatted the partitions with XFS

All works well.

Comment 10 errata-xmlrpc 2016-09-06 07:06:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1812.html

