Bug 1352482 - qemu libgfapi clients hang when doing I/O with 3.7.12
Summary: qemu libgfapi clients hang when doing I/O with 3.7.12
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: 3.7.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On: 1352634
Blocks: glusterfs-3.7.13
Reported: 2016-07-04 09:14 UTC by Kaushal
Modified: 2016-07-20 13:55 UTC (History)

Fixed In Version: glusterfs-3.7.13
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1352632 1352634 (view as bug list)
Environment:
Last Closed: 2016-07-20 13:55:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments
qemu-img create libgfapi 3.7.11 log (13.21 KB, text/plain)
2016-07-04 09:26 UTC, Kaushal
qemu-img create libgfapi 3.7.12 (13.90 KB, text/plain)
2016-07-04 09:28 UTC, Kaushal
tcpdump captured while creating a qcow2 image (19.42 KB, application/octet-stream)
2016-07-04 10:58 UTC, Niels de Vos
tcpdump captured while creating a raw image (4.09 KB, application/octet-stream)
2016-07-04 10:58 UTC, Niels de Vos
qemu-img running under ltrace (passed due to race condition?) (24.88 KB, text/plain)
2016-07-04 11:33 UTC, Niels de Vos
qemu-img running under ltrace (failed due to race condition?) (29.17 KB, text/plain)
2016-07-04 11:34 UTC, Niels de Vos

Description Kaushal 2016-07-04 09:14:09 UTC
qemu and related tools (qemu-img) hang when using libgfapi from glusterfs-3.7.12.

For example, running the following qemu-img command against a single-brick glusterfs-3.7.12 volume causes qemu-img to hang:

# qemu-img create -f qcow2 gluster://localhost/testvol/testimg.qcow2 10G

With qemu-img, at least, the hang occurs when creating qcow2 images. The command does not hang when creating raw images.

When creating a qcow2 image, qemu-img appears to reload the glusterfs graph several times. This can be observed in the attached log where qemu-img is run against glusterfs-3.7.11.

With glusterfs-3.7.12 this does not happen, because an early writev on the brick transport fails with an EFAULT (Bad address) errno (see attached log). Nothing further happens after this; the qemu-img command hangs until the RPC ping-timeout expires and then fails.

The cause of this error is still being investigated.

This issue was originally reported on the gluster-users mailing list by Lindsay Mathieson, Kevin Lemonnier and Dmitry Melekhov. [1][2][3]

[1] https://www.gluster.org/pipermail/gluster-users/2016-June/027144.html
[2] https://www.gluster.org/pipermail/gluster-users/2016-June/027186.html
[3] https://www.gluster.org/pipermail/gluster-users/2016-July/027218.html

Comment 1 Kaushal 2016-07-04 09:26:31 UTC
Created attachment 1175883
qemu-img create libgfapi 3.7.11 log

Comment 2 Kaushal 2016-07-04 09:28:22 UTC
Created attachment 1175888
qemu-img create libgfapi 3.7.12

Comment 3 Niels de Vos 2016-07-04 10:58:12 UTC
Created attachment 1175955
tcpdump captured while creating a qcow2 image

Comment 4 Niels de Vos 2016-07-04 10:58:49 UTC
Created attachment 1175956
tcpdump captured while creating a raw image

Comment 5 Niels de Vos 2016-07-04 11:06:23 UTC
The image is actually created, even though this error was reported:

qemu-img: gluster://localhost/vms/qcow2.img: Could not resize image: Input/output error

[root@vm017 ~]# qemu-img info gluster://localhost/vms/qcow2.img 
image: gluster://localhost/vms/qcow2.img
file format: qcow2
virtual size: 0 (0 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false


[root@vm017 ~]# qemu-img info gluster://localhost/vms/raw.img 
image: gluster://localhost/vms/raw.img
file format: raw
virtual size: 32M (33554432 bytes)
disk size: 4.0K


I could not spot any errors in the tcpdump at a glance.

Comment 6 Niels de Vos 2016-07-04 11:33:40 UTC
Created attachment 1175960
qemu-img running under ltrace (passed due to race condition?)

Comment 7 Niels de Vos 2016-07-04 11:34:24 UTC
Created attachment 1175962
qemu-img running under ltrace (failed due to race condition?)

Comment 8 Poornima G 2016-07-04 12:55:09 UTC
RCA:

Debugged this along with Raghavendra Talur and Kaushal M. It turns out this is caused by http://review.gluster.org/#/c/14148/ .

pub_glfs_pwritev_async(..., iovec, iovec_count...) takes an array of iovecs as input, plus a count parameter that gives the number of iovecs passed. gfapi internally collates all the iovecs into a single iovec and sends it all the way to the RPC (network) layer. Because the iovecs have been collated, the count passed down should also be 1, but the patch was passing the count supplied by the user. For example, if the user passes 3 iovecs with a count of 3, gfapi copies them all into one iovec and should send a count of 1, but it currently sends 3, hence the issue.

A fix will be sent, and we will try to include it in 3.7.13.

Regards,
Poornima

Comment 9 Vijay Bellur 2016-07-04 13:09:11 UTC
REVIEW: http://review.gluster.org/14854 (gfapi: update count when glfs_buf_copy is used) posted (#1) for review on master by Raghavendra Talur (rtalur@redhat.com)

Comment 10 Vijay Bellur 2016-07-05 05:54:52 UTC
REVIEW: http://review.gluster.org/14859 (gfapi: update count when glfs_buf_copy is used) posted (#1) for review on release-3.7 by Poornima G (pgurusid@redhat.com)

Comment 11 Vijay Bellur 2016-07-07 09:01:04 UTC
COMMIT: http://review.gluster.org/14859 committed in release-3.7 by Kaushal M (kaushal@redhat.com) 
------
commit bddf6f8e6909ea1a3a9f240ca3a7515aea4e35b4
Author: Raghavendra Talur <rtalur@redhat.com>
Date:   Mon Jul 4 18:36:26 2016 +0530

    gfapi: update count when glfs_buf_copy is used
    
    Backport of http://review.gluster.org/#/c/14854
    
    glfs_buf_copy collates all iovecs into a iovec with count=1. If
    gio->count is not updated it will lead to dereferencing of invalid
    address.
    
    Change-Id: I7c58071d5c6515ec6fee3ab36af206fa80cf37c3
    BUG: 1352482
    Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
    Signed-off-by: Poornima G <pgurusid@redhat.com>
    Reported-By: Lindsay Mathieson <lindsay.mathieson@gmail.com>
    Reported-By: Dmitry Melekhov <dm@belkam.com>
    Reported-By: Tom Emerson <TEmerson@cyberitas.com>
    Reviewed-on: http://review.gluster.org/14859
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>

Comment 12 Kaushal 2016-07-20 13:55:32 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.13, please open a new bug report.

glusterfs-3.7.13 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-July/027604.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

