Bug 1456225 - gluster-block is not working as expected when shard is enabled
Summary: gluster-block is not working as expected when shard is enabled
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: sharding
Version: 3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1454313 1455301
Blocks:
 
Reported: 2017-05-28 01:33 UTC by Pranith Kumar K
Modified: 2017-05-30 18:53 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.11.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1455301
Environment:
Last Closed: 2017-05-30 18:53:41 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Pranith Kumar K 2017-05-28 01:34:18 UTC
Because gluster-block stores its metadata on the same volume as the data, and metadata updates are multi-client writes, gluster-block create hangs and loops until it eventually dies.
The reason is that the actual file size and the file size reported on the mount differ, so gluster-block cannot tell whether the operation succeeded.
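
(Schematic sketch, not gluster-block's actual code, of why a stale size makes a writer loop: it keeps re-checking the file size and never sees the value it expects. Path and retry count are illustrative.)

/* Hypothetical verification loop that spins when st_size never catches up. */
#include <sys/stat.h>

static int wait_for_size(const char *path, off_t expected, int max_tries)
{
        struct stat st;
        int i;

        for (i = 0; i < max_tries; i++) {
                if (stat(path, &st) == 0 && st.st_size == expected)
                        return 0;   /* size caught up, proceed */
        }
        return -1;  /* with the shard bug the size stays at 101, so we give up */
}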

[root@localhost block-meta]# ls -F /brick1/
block-meta/  block-store/ .glusterfs/  .shard/      .trashcan/
[root@localhost block-meta]# ls -l /brick1/block-meta/1
-rw-------. 2 root root 52304 May 20 19:36 /brick1/block-meta/1 <<<---- true size on the brick.
[root@localhost block-meta]# ls -l 1
-rw-------. 1 root root 101 May 20 19:36 1 <<<---- stale size seen on the mount.

When a file is opened with O_APPEND, the offset is ignored and the write buffer is always appended to the end of the file. The shard xlator, however, does not ignore the offset when the fd has O_APPEND. As a result the size stays stuck at 101 bytes, because that is the largest write that lands on the file:
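
(For reference, the O_APPEND semantics described above can be demonstrated with a few lines of C; the file name is illustrative.)

/* With O_APPEND, write(2) appends regardless of the current offset. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
        int fd = open("demo.txt", O_CREAT | O_TRUNC | O_WRONLY | O_APPEND, 0600);
        struct stat st;

        write(fd, "0123456789", 10);      /* size: 10 */
        lseek(fd, 0, SEEK_SET);           /* rewind the offset to 0 ...  */
        write(fd, "abc", 3);              /* ... yet this still appends  */
        fstat(fd, &st);
        printf("size = %lld\n", (long long)st.st_size);   /* 13, not 10 */
        close(fd);
        return 0;
}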

Thread 2 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200005391c, 
    this=0x61f00001a4c0, fd=0x61100000b21c, vector=0x60800000cee0, count=1, offset=0, 
    flags=0, iobref=0x60d00001d7c0, xdata=0x0) at shard.c:4827
4827	        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
Missing separate debuginfos, use: dnf debuginfo-install json-c-0.12-7.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libstdc++-6.2.1-2.fc25.x86_64 sssd-client-1.14.2-1.fc25.x86_64
(gdb) dis 1
(gdb) c
Continuing.
[Switching to Thread 0x7fffe565a700 (LWP 9037)]

Thread 10 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x612000053c1c, 
    cookie=0x61200005391c, this=0x61f0000196c0, op_ret=101, op_errno=0, 
    prebuf=0x61b00001a68c, postbuf=0x61b00001a6fc, xdata=0x611000052d9c) at trace.c:232
232	        char         preopstr[4096]  = {0, };
(gdb) p postbuf.ia_size
$1 = 101
(gdb) en 1
(gdb) c
Continuing.

Thread 10 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200002841c, 
    this=0x61f00001a4c0, fd=0x61100003cf9c, vector=0x608000020be0, count=1, offset=0, 
    flags=0, iobref=0x60d00003d530, xdata=0x0) at shard.c:4827
4827	        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
(gdb) c
Continuing.
[Switching to Thread 0x7fffe0f08700 (LWP 9038)]

Thread 11 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x61200002871c, 
    cookie=0x61200002841c, this=0x61f0000196c0, op_ret=21, op_errno=0, prebuf=0x61b00000cd8c, 
    postbuf=0x61b00000cdfc, xdata=0x611000064bdc) at trace.c:232
232	        char         preopstr[4096]  = {0, };
(gdb) p postbuf.ia_size
$2 = 101
(gdb) c
Continuing.
[New Thread 0x7fffe04e8700 (LWP 9040)]
[New Thread 0x7fffdfcc4700 (LWP 9041)]
[New Thread 0x7fffdf490700 (LWP 9042)]

Thread 11 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200003dd1c, 
    this=0x61f00001a4c0, fd=0x61100009479c, vector=0x608000032a60, count=1, offset=0, 
    flags=0, iobref=0x60d00006c800, xdata=0x0) at shard.c:4827
4827	        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
(gdb) c
Continuing.
[Switching to Thread 0x7fffe565a700 (LWP 9037)]

Thread 10 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x61200003e01c, 
    cookie=0x61200003dd1c, this=0x61f0000196c0, op_ret=33, op_errno=0, prebuf=0x61b00002b78c, 
    postbuf=0x61b00002b7fc, xdata=0x61100007e5dc) at trace.c:232
232	        char         preopstr[4096]  = {0, };
(gdb) p postbuf.ia_size
$3 = 101
(gdb) q
A debugging session is active.

	Inferior 1 [process 9024] will be killed.

After fixing the issue, the same steps succeed and the sizes on the brick and the mount match:

[root@localhost r3]# gluster-block create r3/12 ha 3 192.168.122.61,192.168.122.123,192.168.122.113 1GiB
IQN: iqn.2016-12.org.gluster-block:1aef8052-2547-482e-9316-e41ba0e4b289
PORTAL(S):  192.168.122.61:3260 192.168.122.123:3260 192.168.122.113:3260
RESULT: SUCCESS
[root@localhost r3]# ls -l /brick1/block-meta/12
-rw-------. 2 root root 315 May 24 22:52 /brick1/block-meta/12
[root@localhost r3]# ls -l /mnt/block-meta/12
-rw-------. 1 root root 315 May 24 22:52 /mnt/block-meta/12

Comment 2 Worker Ant 2017-05-28 01:47:27 UTC
REVIEW: https://review.gluster.org/17404 (features/shard: Handle offset in appending writes) posted (#1) for review on release-3.11 by Pranith Kumar Karampuri (pkarampu)

Comment 3 Worker Ant 2017-05-29 14:12:24 UTC
COMMIT: https://review.gluster.org/17404 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit 1db7887771c748a63f3c46ce72918c98cb6dc208
Author: Pranith Kumar K <pkarampu>
Date:   Wed May 24 22:30:29 2017 +0530

    features/shard: Handle offset in appending writes
    
    When a file is opened with append, all writes are appended at the end of file
    irrespective of the offset given in the write syscall. This needs to be
    considered in shard size update function and also for choosing which shard to
    write to.
    
    At the moment shard piggybacks on queuing from write-behind
    xlator for ordering of the operations. So if write-behind is disabled and
    two parallel appending-writes come both of which can increase the file size
    beyond shard-size the file will be corrupted.
    
     >BUG: 1455301
     >Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
     >Signed-off-by: Pranith Kumar K <pkarampu>
     >Reviewed-on: https://review.gluster.org/17387
     >Smoke: Gluster Build System <jenkins.org>
     >Reviewed-by: Krutika Dhananjay <kdhananj>
     >NetBSD-regression: NetBSD Build System <jenkins.org>
     >CentOS-regression: Gluster Build System <jenkins.org>
    
    BUG: 1456225
    Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: https://review.gluster.org/17404
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

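(A rough sketch of the rule the patch above describes, with hypothetical names; this is not the actual shard.c code. For an fd opened with O_APPEND, the effective offset of a write is the current file size, so both the shard the write lands in and the updated file size must be derived from that, not from the offset carried in the fop.)

/* Hypothetical sketch, not the actual shard.c code. */
#include <stdint.h>

struct shard_file {
        uint64_t size;        /* current file size in bytes */
        uint64_t shard_size;  /* e.g. features.shard-block-size */
        int      o_append;    /* fd opened with O_APPEND? */
};

/* Offset the write actually lands at. */
static uint64_t effective_offset(const struct shard_file *f, uint64_t fop_offset)
{
        return f->o_append ? f->size : fop_offset;
}

/* Index of the shard the write starts in. */
static uint64_t target_shard(const struct shard_file *f, uint64_t fop_offset)
{
        return effective_offset(f, fop_offset) / f->shard_size;
}

/* File size after a write of `count` bytes completes. */
static uint64_t post_write_size(const struct shard_file *f,
                                uint64_t fop_offset, uint64_t count)
{
        uint64_t end = effective_offset(f, fop_offset) + count;
        return end > f->size ? end : f->size;
}

As the commit message notes, this still relies on write-behind to order appending writes; with write-behind disabled, two parallel appends that both cross the shard-size boundary can corrupt the file.
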
Comment 4 Shyamsundar 2017-05-30 18:53:41 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/

